6.885 From ASCII to Answers: Advanced Topics in Data Processing (updated title)


Graduate H-Level
Units 3-0-9
Instructor: Prof. Sam Madden, madden@csail.mit.edu
Schedule: TR 1-2:30, Room 4-231
This subject qualifies as a Computer Systems Concentration Subject
This class will survey techniques and systems for ingesting, efficiently processing, analyzing, and visualizing large data sets. Topics will include data cleaning, data integration, scalable systems (relational databases, NoSQL, Hadoop, etc.), analytics (data cubes, scalable statistics and machine learning), and scalable visualization of large data sets. The goal of the class is for students to gain working experience with these techniques and systems, alongside in-depth discussion of the topics covered. Students should have a background in database or distributed systems (6.814/6.830, 6.824, or equivalent). There will be a semester-long project and paper, as well as hands-on labs using real systems. There will be no exams.
For more information, see http://db.csail.mit.edu/6885