Seminar Database Systems (PhD, MSc, BSc)

Organization:

Michael Böhlen, Przemyslaw Uznanski and Peter Widmayer
Teaching language: English
Level: PhD, MSc and advanced BSc students
Academic Year: Spring 2018
Dates:

Friday 23.2.2018, 14.15 - 16.00h, ETHZ CAB H52
Saturday 28.4.2018, 9.30 - 15.00h, ETHZ CAB H52
Saturday 19.5.2018, 9.00 - 15.00h, BIN 2.A.01


Overview and objectives: The area of this year's seminar is Locality-Sensitive Hashing, Similarity, and Nearest Neighbours. Students learn how to critically read and study research papers, how to summarize the contents of a paper, and how to present it in a seminar.

Teaching format: Each participant writes a self-contained report of about 10 pages and gives a 30-minute presentation (on the blackboard, without a computer). Each participant has a buddy. Buddies read the report, suggest improvements, and help with the presentation (e.g., dry runs). The first version of the report is due three weeks before the date of the presentation and will be discussed with the buddy and the professor about one week before the presentation. The final version of the report is due one week before the date of the seminar.

Setup and Organization: The setup of the seminar will be discussed on Friday, February 23, 2018, from 14:15 until 15:00 in room CAB H 52 at ETHZ. At this first meeting, the available slots for the seminar will be distributed and papers will be assigned.

Presentations:

  • Saturday, April 28, CAB H52
  • Saturday, May 19, BIN 2.A.01

Participation in all three meetings is compulsory. The assessment depends on the quality of the report and the presentation, active participation during the seminar, and input as a buddy.

Saturday, April 28, 2018

Topic | Presenter | Buddy | Professor
Similarity Search in High Dimensions via Hashing | Timothy Pescatore | Olga Klimashevska | Michael Böhlen
Dimensionality Reduction Techniques for Proximity Problems | Dhivyabharathi Ramasamy | Stefan Tiegel | Przemyslaw Uznanski
Stable Distributions, Pseudorandom Generators, Embeddings, and Data Stream Computations | Luise Arn | Nicolas Gordillo | Przemyslaw Uznanski
Approximate nearest neighbors: towards removing the curse of dimensionality | Lukas Arnold | Wanja Chresta | Peter Widmayer
An Improved Data Stream Summary: The Count-Min Sketch and its Applications | Petra Wittwer | Liu Bingyan | Michael Böhlen
Locality-Sensitive Hashing Scheme Based on p-Stable Distributions | Alphonse Mariyagnanaseelan | Sebastian Sanchez | Peter Widmayer
Syntactic clustering of the Web | Amos Madalin Neculau | Michael Studer | Peter Widmayer

Saturday, May 19, 2018

Topic | Presenter | Buddy | Professor
Comparing Data Streams Using Hamming Norms (How to Zero In) | Stefan Tiegel | Lukas Arnold | Michael Böhlen
Sketching and Embedding are Equivalent for Norms | Wanja Chresta | Dhivyabharathi Ramasamy | Peter Widmayer
Set similarity search beyond MinHash | Liu Bingyan | Alphonse Mariyagnanaseelan | Przemyslaw Uznanski
Practical and Optimal LSH for Angular Distance | Nicolas Gordillo | Timothy Pescatore | Przemyslaw Uznanski
Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions | Michael Studer | Petra Wittwer | Peter Widmayer
Earth Mover’s Distance based Similarity Search at Scale | Olga Klimashevska | Amos Madalin Neculau | Michael Böhlen
Hokusai - Sketching Streams in Real Time | Sebastian Sanchez | Luise Arn | Michael Böhlen