Modern Data Analytics 2025

Organization: Prof. Dr. Dan Olteanu, Dr. Andrei Draghici, Dr. Haozhe Zhang, Christoph Mayer, Eden Chmielewski, and Yuchen He.

This seminar provides a deep dive into the recent research developments reshaping the core of modern database systems: query processing and optimization. The performance of virtually every data-driven application hinges on the database's ability to translate declarative queries into efficient, low-level execution plans. However, the sheer complexity of modern analytics, the demand for real-time results, and the scale of today's datasets are pushing classical, heuristic-based optimizers to their breaking point.

Learning outcome: The goal of the seminar is to expose the students to the recent trends in academia and industry on rethinking modern data analytics systems. The students will read and present research published in the top international venues in data management research, in particular ACM Special Interest Group on Management of Data (SIGMOD) and Very Large Data Bases (VLDB). Students will gain a deep understanding of the challenges and state-of-the-art solutions in query optimization, robust execution, and real-time analytics maintenance. The course will equip them to critically analyze and contribute to the development of next-generation, high-performance data systems.

Target audience: MSc in Software Engineering, Data Science and AI students.

Semester: This seminar will be offered in Fall 2025.

Teaching format: Each participant prepares a presentation based on a research paper; answers follow-up technical questions; reads the other papers in the seminar session; and actively participates in the technical discussions in the seminar. Each participant has a buddy, who will help improve their presentation by making suggestions for improvements and attending dry runs of the presentation. The best presentation of the seminar will be selected by the participants and receive a prize.

Registration: Please register as required by the department. In addition, please browse the papers mentioned below. In the kickoff meeting, the papers will be assigned to students, so make sure you get assigned to a paper you want.

Meetings: The first meeting will be on Thursday, September 18, 2025, from 10:15 to 12:00 in room BIN 1.D.29. The organizers will give an overview of the topics to be investigated in the seminar and answer questions from the participants. In this session, students will be assigned to papers.

The student presentations will take place on Saturday, November 8, and Saturday, November 22, 2025, in BIN 2.A.01.

Participation at all three meetings is compulsory. The assessment depends on the quality of the presentation, active participation during the seminar, and input as a buddy.

How to read papers and give talks

How to read papers:

How to give talks:

  • These two articles have a number of good suggestions.
  • This video is pretty good as well.
  • How To Speak by Patrick Winston - a newer version of Patrick's talk

Slides from the kick-off meeting

Here are the slides from the kick-off meeting: Introduction slides. Below you can find the assignments for the two presentation days. If you want to get in contact with your supervisor, here is our list of emails:

Presentations for November 8th

  1. How Good Are Query Optimizers, Really? and Still Asking: How Good Are Query Optimizers, Really?
    • Presented by: Müge Yegin
    • Buddy: Xinyao Cao
    • Supervisor: Christoph Mayer
  2. SQLStorm: Taking Database Benchmarking into the LLM Era
    • Presented by: Xinyao Cao
    • Buddy: Noah Croes
    • Supervisor: Dan Olteanu
  3. How Good are Learned Cost Models, Really? Insights from Query Optimization Tasks
    • Presented by: Noah Croes
    • Buddy: Müge Yegin
    • Supervisor: Andrei Draghici
  4. SafeBound: A Practical System for Generating Cardinality Bounds
    • Presented by: Michael Sigg
    • Buddy: Sofoklis Strompolas
    • Supervisor: Yuchen He
  5. Analyzing the Impact of Cardinality Estimation on Execution Plans in Microsoft SQL Server
    • Presented by: Sofoklis Strompolas
    • Buddy: Michael Sigg
    • Supervisor: Eden Chmielewski
  6. DPconv: Super-Polynomially Faster Join Ordering
    • Presented by: Lihui Zhou
    • Buddy: Annamaria Vass
    • Supervisor: Yuchen He
  7. How to Optimize SQL Queries? A Comparison Between Split, Holistic, and Hybrid Approaches
    • Presented by: Annamaria Vass
    • Buddy: Lihui Zhou
    • Supervisor: Eden Chmielewski

Presentations for November 22nd

  1. Robust Join Processing with Diamond Hardened Joins
    • Presented by: Marcelina Suszczyk
    • Buddy: Elif Deniz İșbuğa
    • Supervisor: Yuchen He
  2. SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning
    • Presented by: Philipp Stoffel
    • Buddy: Uros Dimitrijevic
    • Supervisor: Christoph Mayer
  3. ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement Learning
    • Presented by: Uros Dimitrijevic
    • Buddy: Nishant Kumar
    • Supervisor: Christoph Mayer
  4. Holistic Query Approximation via RL Modeling
    • Presented by: Nishant Kumar
    • Buddy: Philipp Stoffel
    • Supervisor: Andrei Draghici 
  5. Query running too slow? Rewrite it with Quorion!
    • Presented by: Elif Deniz İșbuğa
    • Buddy: Marcelina Suszczyk
    • Supervisor: Eden Chmielewski
  6. Streaming View: An Efficient Data Processing Engine for Modern Real-time Data Warehouse of Alibaba Cloud
    • Presented by: Brighton Thomas
    • Buddy: Akos Istvan Imets
    • Supervisor: Haozhe Zhang
  7. Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables
    • Presented by: Akos Istvan Imets
    • Buddy: Lin Han
    • Supervisor: Dan Olteanu
  8. Automated Generation of Materialized Views in Oracle
    • Presented by: Lin Han
    • Buddy: Brighton Thomas
    • Supervisor: Haozhe Zhang

The following papers are left here to provide a broader context