DaST Research
Research Mission
- Understand computational challenges for data processing, and
- Design simple and scalable solutions towards these challenges.
Theoretical research includes the development of novel data processing algorithms and languages along with the analysis of their complexities.
Systems research focuses on building data systems in academia and industry, grounded in well-understood theory.
Research Highlights
- SIGMOD 2025 Best Paper Award for the LpBound cardinality estimator
- SIGMOD 2024 Research Highlights Award for information-theoretic cardinality estimation
- Gems of PODS 2024 "Recent Increments in Incremental View Maintenance"
- Factorised Databases workshop 2022
- ICDT 2022 Test of Time Award for factorised databases
- ICDT 2019 Best Paper Award for a worst-case optimal dynamic algorithm for counting triangles
- Distinguished papers at: PODS 2024 (cardinality estimation), VLDB 2024 (foundation models for databases), ICDT 2023 (maintenance for queries with access patterns), OOPSLA 2022 (analytics over semiring dictionaries), PODS 2019 (functional aggregate queries with additive inequalities), PODS 2018 (in-database learning with sparse tensors), ICDT 2016 (declarative probabilistic programming), EDBT 2014 (clustering uncertain data)
- Honorable Mention for the 2021 SIGMOD Jim Gray Doctoral Dissertation Award, received by Maximilian Schleich
- Software: LMFAO and F-IVM
- Keynotes on machine learning over relational data at VLDB 2020 and EDBT/ICDT 2019
Current Projects
- Cardinality Estimation with Guaranteed Upper Bounds
Cardinality estimation approach that translates the problem into a linear optimization problem: entropy maximization constrained by Shannon inequalities and by ℓp-norms on the degree sequences of join columns
- Real-Time Analytics over Fast and Continuously Evolving Relational Data
Unified framework for maintaining a wide range of analytics over databases that leverages factorization for underlying queries, output data representation, and bulk updates
- Foundations of Incremental View Maintenance
Update/query time trade-offs and fine-grained complexity analysis for dynamic problems
- Fact Attribution in Database Query Answering
Complexity analysis and algorithms for computing Shapley/Banzhaf values
- Adaptive Query Processing
Adaptive query processing combining worst-case optimal join algorithms and reinforcement learning
- Foundations of Intersection Joins
Charting the tractability of Boolean conjunctive queries with intersection joins via equivalence to disjunctions of Boolean conjunctive queries with equality joins
- In-Database Linear Algebra
Efficient and numerically accurate QR, SVD and PCA decompositions of matrices defined by joins over relational data
- Machine Learning over Relational Data
Scalable techniques for machine learning over databases that exploit the relational structure, push the learning task inside the database query engine, and factorise its computation
- Factorised Databases
Principled approach to avoiding redundancy in the representation and computation of query results in relational databases
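As a toy illustration of the degree-sequence idea mentioned under Cardinality Estimation with Guaranteed Upper Bounds, the sketch below computes ℓp-norms of join-column degree sequences and uses them to bound the size of a single two-way join. It is not the project's estimator, which the description above formulates as a linear optimization problem; it only relies on the textbook Hölder and Cauchy-Schwarz inequalities. The relations `R` and `S` and the helper names `degree_sequence`, `lp_norm`, and `join_size` are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import inf


def degree_sequence(rows, column):
    """Degree sequence of `column`: occurrence count of each distinct value,
    sorted in decreasing order."""
    counts = Counter(row[column] for row in rows)
    return sorted(counts.values(), reverse=True)


def lp_norm(degrees, p):
    """l_p norm of a degree sequence; p = inf yields the maximum degree."""
    if p == inf:
        return max(degrees, default=0)
    return sum(d ** p for d in degrees) ** (1.0 / p)


def join_size(r, s, r_col, s_col):
    """Exact size of the equality join of r and s on r_col = s_col."""
    r_counts = Counter(row[r_col] for row in r)
    s_counts = Counter(row[s_col] for row in s)
    # Each join value v contributes deg_R(v) * deg_S(v) output tuples.
    return sum(c * s_counts[v] for v, c in r_counts.items())


# Hypothetical toy relations R(a, b) and S(b, c), joined on b.
R = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3)]
S = [(1, 7), (1, 8), (2, 7), (3, 7), (3, 9), (4, 7)]

deg_r = degree_sequence(R, column=1)  # degrees of b in R
deg_s = degree_sequence(S, column=0)  # degrees of b in S

true_size = join_size(R, S, r_col=1, s_col=0)

# The join size is sum_v deg_R(v) * deg_S(v), so Hoelder's inequality gives
# an upper bound for every conjugate pair of norms.
bounds = {
    "l1 * linf": lp_norm(deg_r, 1) * lp_norm(deg_s, inf),
    "linf * l1": lp_norm(deg_r, inf) * lp_norm(deg_s, 1),
    "l2 * l2": lp_norm(deg_r, 2) * lp_norm(deg_s, 2),
}

if __name__ == "__main__":
    print("true join size:", true_size)
    for name, value in bounds.items():
        print(f"upper bound via {name}: {value:.2f}")
```

On this toy input every bound is at least the true join size. Which norm pair gives the tightest bound depends on how skewed the degree sequences are, which is one reason to consider several ℓp-norms rather than only the relation size and the maximum degree.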
Past Research Projects
- Probabilistic Databases
- Datalog Engines