6.S080 Software Systems for Data Science


Undergraduate Level - AUS and II
Units: 3-0-9
Prereqs: 6.00, 6.0001, 6.006
Instructors:  Sam Madden (madden@csail.mit.edu), Tim Kraska (kraska@mit.edu)
Schedule:  Lecture MW 2:30-4, room E25-111
This class will survey techniques and systems for ingesting, efficiently processing, analyzing, and visualizing large data sets. Topics will include data cleaning, data integration, scalable systems (relational databases, NoSQL, Spark, etc.), analytics (data cubes, scalable statistics and machine learning), fundamental statistics and machine learning, and scalable visualization of large data sets. The goal of the class is for students to gain working experience, complemented by in-depth discussion of the topics covered. Students should have a background in programming and algorithms. There will be a semester-long project and paper, as well as hands-on labs designed to give experience with state-of-the-art data processing tools.