Compsci 290: Data Engineering, Fall 2015
Smart, data-driven applications are the future. This class teaches the new engineering principles that have emerged to create and run such applications. Companies are trying hard to extract valuable insights from data. This process is not easy:
To prepare students to meet these challenges, this course brings together topics from multiple areas of Computer Science: database systems, distributed computing, algorithms, and machine learning. A lot of the course material is drawn from recent research literature. This year, we will cover the engineering principles that underpin:
Prerequisites: Good knowledge of Scala or Java is required. Prior exposure to databases will be very helpful. Most of the material that we cover will not be found in textbooks. Be prepared to do a fair amount of web search and reading.
10:05-11:20 AM on Mondays and Wednesdays, in the Sociology Psychology Building, Room 129.
There is no prescribed textbook for the class.
Learning Spark: Lightning-Fast Big Data Analysis, by Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia. O'Reilly Media. Feb 2015. (First edition of the book at Amazon.com)
Hadoop: The Definitive Guide, by Tom White. O'Reilly Media. April 2015. (Fourth edition of the book at Amazon.com)
Database Systems: The Complete Book, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. Prentice Hall. 2008. (Second edition of the book at Amazon.com)
Readings will be posted on the readings page.
Instructor: Shivnath Babu
This class is heavy on programming. Details will be presented in class.
The midterm and final exams are closed-book and closed-notes. Laptops and other electronic devices are also not allowed. Late work will not be accepted without a documented excuse from a physician or dean.
Under the Duke Honor Code, you are expected to submit your own work in this course, including homeworks, projects, and exams. On many occasions when working on homeworks and projects, it is useful to ask others (the instructor or other students) for hints or debugging help, or to talk generally about the written problems or programming strategies. Such activity is both acceptable and encouraged, but you must indicate in your submission any assistance you received. Any assistance received that is not properly cited will be considered a violation of the Honor Code. In any event, you are responsible for understanding and being able to explain on your own all written and programming solutions that you submit. The course staff will aggressively pursue all suspected cases of Honor Code violations, and they will be handled through official University channels.