Data-driven decisions are changing the way organizations and science operate. Many methods that were infeasible a couple of decades ago can now be applied because large amounts of data are increasingly available. Processing this kind of data is difficult not just because of its sheer size, but also because it is generated ever more rapidly, exhibits a more complex structure, and is often noisy. In this course, we look at the backend part of data science, i.e., the kind of technology and systems we need to process and store huge amounts of data efficiently and scalably. On the one hand, we look at principles underlying distributed systems in general; on the other hand, we investigate the functionality of concrete systems. The latter part is complemented by practical (programming) exercises, in which we take a closer look at the architecture of these systems and the programming models they employ.
Teaching Format: The course consists of one lecture and one practical exercise/lab session each week.
Evaluation: There is a written final exam, graded on a scale from 1 to 6 in quarter-grade steps. The exam takes place on Wednesday, 15 June 2022. Details on the course can be found on the OLAT course page (see below for a link).
OLAT: All course-related information (lecture slides, practical exercises, etc.) is published on the corresponding OLAT course page.