Seminar Database Systems (PhD, MSc, BSc)

Organization:	Michael Böhlen, Sven Helmer and Paolo Penna
Teaching language:	English
Level:	Advanced BSc, MSc and PhD students
Academic Year:	Spring 2019
Dates:	Tuesday 19.2.2019, 16.30 - 18.00, UZH, BIN 0.K.11/12/13; Saturday 13.4.2019, 9.00 - 15.00, UZH, BIN 2.A.01; Saturday 11.5.2019, 9.00 - 15.00, ETH, CAB H.52

Overview and objectives: The area of this year's seminar is Algorithms and Systems for Data Science. Students learn how to critically read and study research papers, how to summarize the contents of a paper, and how to present it in a seminar.

Teaching format: Each participant writes a self-contained report of about 10 pages and gives a 30-minute presentation (on the blackboard, without a computer). Each participant has a buddy. Buddies read the report, suggest improvements, and help with the presentation (e.g., dry runs). The first version of the report is due three weeks before the date of the presentation. This first version of the report and the presentation will be discussed with the buddy and the teacher about two weeks before the presentation. The final version of the report is due one week before the presentation.

Setup and Organization: The setup of the seminar will be discussed on Tuesday, February 19, 2019, from 16:30 until 18:00 in room BIN 0.K.11/12/13 at UZH. At the first meeting, the available slots for the seminar will be distributed and papers will be assigned.

Presentations:

Saturday, April 13, BIN 2.A.01
Saturday, May 11, CAB H.52 (we meet at 8:50 in front of the south entrance, the one closest to the tram stop ETH/Universitätsspital)

Participation in all three meetings is compulsory. The assessment depends on the quality of the report and presentation, active participation during the seminar, and input as a buddy.

Useful links:

Organizational slides (PDF, 64 KB)
How to give talks and read papers: link

Topics

1. Architectures and Systems

2. Column Stores

3. Streams

4. Spark

5. Query Processing

6. Clustering

Saturday, April 13, 2019

Topic	Presenter	Buddy	Advisor
MISO: Souping up Big Data Query Processing with a Multistore System, SIGMOD 2014. (PDF, 2 MB)	Pascal Engeli	Michael Studer	Sven Helmer
RHEEM: Enabling Cross-Platform Data Processing, PVLDB 2018. (PDF, 1 MB)	Mesut Ceylan	Alex Wolf	Sven Helmer
Abstraction for Advanced In-Database Analytics, PVLDB 2018. (PDF, 554 KB)	Sara Decova	Maximilian Wolfertz	Michael Böhlen
Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation, SIGMOD 2018. (PDF, 1 MB)	Catharina Dekker	Clive Charles Javara	Paolo Penna
Column-Stores vs. Row-Stores: How Different Are They Really?, SIGMOD 2008. (PDF, 789 KB)	Peter Giger	Han-Mi Nguyen	Sven Helmer
Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe?, SIGMOD 2017. (PDF, 683 KB)	Mike Suter	Luca Wolf	Michael Böhlen
Incremental Query Processing on Big Data Streams, TKDE 2016. (PDF, 433 KB)	Lorenzo Selvatici	Timon Stampfli	Paolo Penna
The Stratosphere Platform for Big Data Analytics, VLDB Journal 2014. (PDF, 2 MB)	Syed Shahvaiz Ahmed	Donn Edward Anin	Paolo Penna
Drizzle: Fast and Adaptable Stream Processing at Scale, SOSP 2017. (PDF, 767 KB)	Yichun Xie	Emilien Pierre Carlo Pilloud	Michael Böhlen

Saturday, May 11, 2019

Topic	Presenter	Buddy	Advisor
Spark SQL: Relational Data Processing in Spark, SIGMOD 2015. (PDF, 984 KB)	Luca Wolf	Sara Decova	Sven Helmer
SHC: Distributed Query Processing for Non-Relational Data Store, ICDE 2018. (PDF, 892 KB)	Donn Edward Anin	Lorenzo Selvatici	Sven Helmer
Flare: Optimizing Apache Spark with Native Compilation for Scale-Up Architectures and Medium-Size Data, OSDI 2018. (PDF, 711 KB)	Clive Charles Javara	Syed Shahvaiz Ahmed	Sven Helmer
A Minimal Variance Estimator for the Cardinality of Big Data Set Intersection, KDD 2017. (PDF, 720 KB)	Emilien Pierre Carlo Pilloud	Mesut Ceylan	Paolo Penna
Orca: A Modular Query Optimizer Architecture for Big Data, SIGMOD 2014. (PDF, 1 MB)	Maximilian Wolfertz	Mike Suter	Michael Böhlen
Optimizing Big Data Queries Using Program Synthesis, SOSP 2017. (PDF, 1018 KB)	Alex Wolf	Yichun Xie	Michael Böhlen
Clustering with Same-Cluster Queries, NIPS 2016. (PDF, 274 KB)	Michael Studer	Peter Giger	Paolo Penna
A Hierarchical Algorithm for Extreme Clustering, KDD 2017. (PDF, 1 MB)	Han-Mi Nguyen	Pascal Engeli	Paolo Penna
Coconut: A Scalable Bottom-Up Approach for Building Data Series Indexes, VLDB 2018. (PDF, 1 MB)	Timon Stampfli	Catharina Dekker	Michael Böhlen
