Description
This hands-on course teaches the tools and methods used by data scientists, from researching solutions to scaling prototypes up to Spark clusters. It exposes students to the entire data science pipeline, from data acquisition to extracting valuable insights, applied to real-world problems.
Questions
Questions and discussions about the course are gathered on Mattermost: https://mattermost-dslab.epfl.ch. You will need to register with your EPFL GitLab ID (see Week 3).
Final Project
Lab Sessions
Week 1 - 20.02.2019 - Module 1 - Python for data scientists 1/4
Week 2 - 27.02.2019 - Module 1 - Python for data scientists 2/4
Week 3 - 06.03.2019 - Module 1 - Collaborating with Git 3/4
Week 4 - 13.03.2019 - Module 1 - Graded homework 1
Week 5 - 20.03.2019 - Module 2 - Big data
Week 6 - 27.03.2019 - Module 2 - Big data
Week 7 - 03.04.2019 - Module 3 - Spark
Week 8 - 10.04.2019 - Module 3 - Spark
Week 9 - 17.04.2019 - Module 3 - Spark
Week 10 - 01.05.2019 - Module 4 - Data streams with Kafka and Spark
Week 11 - 08.05.2019 - Module 4 - Data streams with Kafka and Spark
Week 12 - 15.05.2019 - Module 5 - Final assignment
Week 13 - 22.05.2019 - Module 5 - Final assignment
Week 14 - 29.05.2019 - Module 5 - Final assignment