What we learned in general
- If stuck with a problem, try a different approach
- If there is a bug, contact our lecturers as soon as possible
- We are all in the same boat: if we have a network failure, the other team certainly suffers too, because we all work in the same environment
- Use the chance to try out as much new technology as possible
- If too much has gone wrong, just restart
- Take a lot of snapshots
Technical things we learned
- How to create, start and delete instances in OpenStack
- How to add new images to OpenStack
- How to set things up and fix problems using only the Linux terminal
- How to install Mesos
- How to install Hadoop
- How to install Spark
- How to enjoy improved usability when suddenly switching from MapReduce to Spark
- How to work with MLlib
- How to run algorithms in Mahout
- How to work with Spark contexts
- How to integrate IPython and SparkNotebook with Apache Spark
- How to set up smart SSH tunneling to access graphical features
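As a reminder of what the SSH tunneling point looks like in practice, here is a minimal sketch. It assumes plain OpenSSH local port forwarding (`ssh -N -L`); the helper name `build_tunnel_command`, the host `ubuntu@master.example.org` and the port 8888 (a typical notebook port) are hypothetical and not taken from our actual setup.

```python
import shlex


def build_tunnel_command(gateway, local_port, remote_port, remote_host="localhost"):
    """Return the argument list for an SSH local port forward (hypothetical helper)."""
    # The forward spec reads: local_port on this machine -> remote_host:remote_port,
    # where remote_host is resolved from the gateway's point of view.
    forward = f"{local_port}:{remote_host}:{remote_port}"
    # -N: do not run a remote command, only forward ports
    # -L: request local port forwarding with the given spec
    return ["ssh", "-N", "-L", forward, gateway]


if __name__ == "__main__":
    # Assumed values: user, host and port are placeholders for illustration only.
    cmd = build_tunnel_command("ubuntu@master.example.org", 8888, 8888)
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(cmd))
```

With the tunnel running, a web interface listening on the remote port can be opened in a local browser at `localhost` on the chosen local port.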
Hints for next time
- OpenStack is good for learning how to set up Hadoop and Spark, but it is too unstable for concrete algorithm development
- One option would be to apply for an AWS grant (each student would get a $100 budget if the application is successful). AWS offers fast machines, and Spark offers automatic setup scripts. This saves a lot of time and lets students concentrate more on the algorithmic part.