Navigation auf uzh.ch

Suche

Department of Informatics Database Technology

Implementation of Database Systems

Lecturer: Andrej Taliun
Teaching language: English
Term: Fall 2012

Objectives. During the course we study cost-based query optimization, which is one of the most critical parts of database management systems. Our primary target is the state-of-the-art join algorithms and data summary structures. In the first part, we learn various join algorithms, understand their weak and strong points and, the most important, how their performance is affected by database properties. In the second part, we learn how a database collects data statistics, which data summary structures it uses and, finally, how it helps to select an optimal query execution plan with the cheapest join algorithms.


Literature. The textbook of the course is Database Systems: the Complete Book. Hector Garcia-Molina, Jeffrey D. Ullman and Jennifer Widom.. Material on data summaries comes from the following scientific publications:

  • Jeffrey S. Vitter. Random sampling with a reservoir. ACM Trans. Math. Softw. 1985.
  • Muralikrishna, M. and DeWitt, D. J. Equi-depth multidimensional histograms. SIGMOD 1988.
  • H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Viswanath Poosala, Kenneth C. Sevcik, and Torsten Suel. Optimal Histograms with Quality Guarantees. VLDB 1998.
  • Yossi Matias, Jeffrey Scott Vitter, and Min Wang. Wavelet-based histograms for selectivity estimation. SIGMOD 1998.
  • Aboulnaga, A. and Chaudhuri, S. Self-tuning histograms: building histograms without looking at data. SIGMOD, 1999.
  • Dimitrios Gunopulos, George Kollios, Vassilis J. Tsotras, and Carlotta Domeniconi. Approximating multi-dimensional aggregate range queries over real attributes, SIGMOD 2000.
  • Kaushik Chakrabarti, Minos N. Garofalakis, Rajeev Rastogi, and Kyuseok Shim. Approximate Query Processing Using Wavelets. VLDB 2000.
  • Venkatesh Ganti, Mong-Li Lee, and Raghu Ramakrishnan. ICICLES: Self-Tuning Samples for Approximate Query Answering. VLDB 2000.
  • Bruno, N., Chaudhuri, S., and Gravano. STHoles: a multidimensional workload-aware histogram. SIGMOD, 2001.
  • Wang, H. and Sevcik, K. C. A multi-dimensional histogram for selectivity estimation and fast approximate query answering. In CASCON 2003.
  • Sudipto Guha. 2005. Space efficiency in synopsis construction algorithms. VLDB 2005.

The lecture notes for this course will become available as we progress through the semester.


Lectures.


Course Projects. The goal of course project is to implement one of the given data summary structures and experimentally evaluate it. The students work on their projects in groups.

There are 3 real-world datasets for the experimental evaluation. All datasets are 2-dimensional and the domain of each attribute are real numbers in the range between 0 and 1.


Examination. The exam is oral and takes place January 21, 2013 at 10:00am, 2.E.11. Students must send their final project report one week before the examination, i.e., before January 14, 2011. The structure in and questions of the examination are found here (PDF, 26 KB).

Weiterführende Informationen

Title

Teaser text