This is our script for the Distributed Data Mining Lab Course at TUM (Winter 14/15). It summarizes our experiences and conclusions and should serve as a reference and guideline for future coursework.
About us
Module 0: Preparation & Setup
Module 1: Intro to Hadoop and Spark
Module 2: Performance Analysis
Module 3: Distributed Machine Learning Frameworks
Module 4: Conclusion
Appendix: Theory