Data-driven decisions are changing the way organizations (and science) operate. Relying on increasingly available large amounts of data, organizations leverage quantitative analytics for their operations. Data, however, is growing in volume, velocity (time sensitivity), variety, and veracity, requiring novel approaches to analytics and new capabilities for decision makers to master this avalanche of data. This course is divided into two parts. In the first part, you will learn about general principles and best practices of data science by investigating the different stages of the data science process. This is done not only on a theoretical level but also through practical exercises. In the second part, you will learn about the architectures and programming models of massively parallel data processing systems used in industry and science today. This course will enable you to leverage massively parallel computing systems to write basic big data analysis applications using system APIs and high-level libraries, and it will prepare you for other, more technically oriented resources that you may encounter when working with these systems. During the course you will also implement a data analytics task in the context of a small (group-based) project.