What we learned in general

  1. If stuck with a problem, try a different approach
  2. If there is a bug, contact our lecturers as soon as possible
  3. We are all in the same boat: if we have a network failure, the other team certainly suffers too, because we all work in the same environment
  4. Use the chance to try out as much new technology as possible
  5. If too much went wrong, just restart
  6. Take a lot of snapshots

Technical things we learned

  1. How to create, start and delete instances in OpenStack
  2. How to include new images in OpenStack
  3. How to set things up and fix problems using only the Linux terminal
  4. How to install Mesos
  5. How to install Hadoop
  6. How to install Spark
  7. How to enjoy improved usability when suddenly switching from MapReduce to Spark
  8. How to work with MLlib
  9. How to run algorithms in Mahout
  10. How to work with Spark contexts
  11. How to use IPython and SparkNotebook with Apache Spark
  12. How to set up smart SSH tunneling to access graphical features
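
As an illustration of item 12, here is a minimal sketch of how an SSH local port forward can be assembled to reach a notebook's web interface running on a cluster node. The host name `user@gateway.example.org`, the function name `build_tunnel_command` and the port 8888 are assumptions made for this example and are not taken from our actual setup. The `-N` and `-L` flags are standard OpenSSH options.

```python
import shlex


def build_tunnel_command(gateway, local_port, remote_host, remote_port):
    """Return the argv list for an SSH local port forward (hypothetical helper)."""
    forward = f"{local_port}:{remote_host}:{remote_port}"
    # -N: do not run a remote command, only forward ports
    # -L: forward local_port on this machine to remote_host:remote_port,
    #     as seen from the gateway
    return ["ssh", "-N", "-L", forward, gateway]


# Hypothetical host and port, for illustration only.
cmd = build_tunnel_command("user@gateway.example.org", 8888, "localhost", 8888)

# Print the command as a shell-safe string that could be pasted into a terminal.
print(shlex.join(cmd))
```

While such a tunnel is open, the remote web interface would be reachable in a local browser at `http://localhost:8888`, assuming the notebook really listens on that port on the remote side.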

Hints for next time

  1. OpenStack is good for learning how to set up Hadoop and Spark; however, it is too unstable for concrete algorithm development
  2. One option would be to apply for an AWS grant (each student would get a $100 budget if successful). AWS offers fast machines, and Spark offers automatic setup scripts. This saves a lot of time and allows students to concentrate more on the algorithmic part.