Department of Informatics DDIS

Bachelor/Master Theses and Master Project Topics

This page lists the open BSc and MSc thesis descriptions, as well as the master project opportunities, currently available in the DDIS research group.

If you are interested in any of the listed projects, please do not hesitate to contact the person mentioned in the open topic description.

If there are currently no open topics but you are generally interested in our research (see https://www.ifi.uzh.ch/en/ddis/research.html), or if you would like to propose a thesis based on your own idea, you can email us at ddis-theses@ifi.uzh.ch.

Master thesis: Iterative Support Vector Machines

In political science, dimensionality reduction algorithms are often used to visualize the ideological orientation of voters or candidates in a one- or two-dimensional space, a task referred to as ideal point estimation. Common choices are item response theory and principal component analysis, as used, for example, in Smartmap and Voteview.

The goal of this master thesis is to develop a new ideal point estimation algorithm based on machine learning. The focus will be on support vector machines (SVMs), since iteratively training SVMs has worked well in initial experiments. Specifically, the low-dimensional coordinates should be optimized until the model best reconstructs the given data. Throughout the thesis, the implemented code should be parallelized and optimized, and the results should be compared to existing baselines using the political dataset of the Swiss voting advice application Smartvote.
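
As a rough illustration of what "iteratively training SVMs" and "optimizing the low-dimensional coordinates" could look like, the following is a minimal sketch and not the group's actual method. It assumes binary answers (+1/-1), a one-dimensional space, and plain hinge-loss subgradient steps in place of a real SVM solver; the function names, the hyperparameters, and the toy data are all hypothetical.

```python
import random


def fit_ideal_points(votes, n_rounds=50, lr=0.05, reg=0.01, seed=0):
    """Alternate between updating per-item linear SVMs and voter coordinates.

    votes[i][j] is +1 or -1: the answer of voter i to item j (assumed encoding).
    Returns (x, w, b): 1-D voter coordinates and per-item SVM parameters.
    """
    rng = random.Random(seed)
    n_voters, n_items = len(votes), len(votes[0])
    x = [rng.gauss(0.0, 1.0) for _ in range(n_voters)]  # random initial coordinates
    w = [0.0] * n_items
    b = [0.0] * n_items

    for _ in range(n_rounds):
        # Step 1: coordinates fixed, one hinge-loss subgradient step per item SVM.
        for j in range(n_items):
            gw = gb = 0.0
            for i in range(n_voters):
                if votes[i][j] * (w[j] * x[i] + b[j]) < 1.0:  # margin violated
                    gw -= votes[i][j] * x[i]
                    gb -= votes[i][j]
            w[j] -= lr * (gw / n_voters + 2.0 * reg * w[j])
            b[j] -= lr * gb / n_voters

        # Step 2: SVMs fixed, one subgradient step per voter coordinate.
        for i in range(n_voters):
            gx = 0.0
            for j in range(n_items):
                if votes[i][j] * (w[j] * x[i] + b[j]) < 1.0:
                    gx -= votes[i][j] * w[j]
            x[i] -= lr * gx / n_items

        # Step 3: standardise coordinates to remove the shift/scale ambiguity.
        mean = sum(x) / n_voters
        std = (sum((v - mean) ** 2 for v in x) / n_voters) ** 0.5 or 1.0
        x = [(v - mean) / std for v in x]
    return x, w, b


def reconstruction_accuracy(votes, x, w, b):
    """Fraction of answers the fitted model reproduces from the coordinates."""
    correct = total = 0
    for i, row in enumerate(votes):
        for j, y in enumerate(row):
            pred = 1 if w[j] * x[i] + b[j] >= 0 else -1
            correct += pred == y
            total += 1
    return correct / total


# Toy data, invented for illustration: 5 voters answering 4 yes/no items.
toy_votes = [
    [+1, +1, -1, +1],
    [+1, +1, -1, -1],
    [-1, -1, +1, -1],
    [-1, -1, +1, +1],
    [+1, -1, -1, +1],
]
x, w, b = fit_ideal_points(toy_votes)
print(reconstruction_accuracy(toy_votes, x, w, b))
```

The two inner loops are independent across items and across voters respectively, which is where parallelization would naturally apply in a larger implementation.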

If interested, please get in touch with the contact listed below. We can provide a more detailed description during a meeting.

Requirements: Solid understanding of dimensionality reduction, support vector machines, and efficient programming in Python.

Start date: Open now.

Contact: Fynn Bachmann

Master Project: A Platform for Sustainable Web-Sourced Multimodal Data Collection

We have seen impressive advances in many areas of AI in recent years, enabled in no small part by ever-larger training datasets sourced from the web. While such datasets are unquestionably useful, they are often collected without taking the relevant properties of their data sources into account. This not only hurts sustainability and reproducibility, since data integrity cannot be guaranteed, but also limits the data’s applicability in certain areas, since questions of licensing and copyright are largely ignored.

The aim of this project is to develop a platform for the sustainable collection and curation of a multimodal web-sourced dataset. Rather than treating every website equally, the platform is to use a series of platform-specific crawlers that target online content-sharing platforms known to support permissive licenses. For every element of the collection, the platform keeps track of what it is, where it comes from, and what further information is associated with it. In addition to links to the original sources, the platform also keeps a reference copy of the actual content itself.
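
To make the per-element bookkeeping described above more concrete, here is a minimal sketch of what one provenance record might look like. It is an illustration under assumptions, not the platform's actual schema: the class name, the field names, and the choice of a SHA-256 digest for checking the reference copy are all hypothetical, as are the example values.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CollectedItem:
    """Hypothetical provenance record for one element of the collection."""
    media_type: str        # what it is, e.g. "image"
    source_url: str        # where it comes from (link to the original source)
    platform: str          # which content-sharing platform it was crawled from
    license: str           # licence stated by the source platform
    content_sha256: str    # fingerprint of the stored reference copy
    metadata: dict = field(default_factory=dict)  # further associated information


def make_item(content: bytes, media_type: str, source_url: str,
              platform: str, license: str, **metadata) -> CollectedItem:
    """Build a record and fingerprint the reference copy at collection time."""
    digest = hashlib.sha256(content).hexdigest()
    return CollectedItem(media_type, source_url, platform, license, digest, metadata)


def verify_copy(item: CollectedItem, content: bytes) -> bool:
    """Check that a stored reference copy still matches its recorded digest."""
    return hashlib.sha256(content).hexdigest() == item.content_sha256


# Example with placeholder values (not real data).
item = make_item(
    b"raw image bytes",
    media_type="image",
    source_url="https://example.org/photo/123",
    platform="example-platform",
    license="CC BY 4.0",
    title="A placeholder title",
)
print(verify_copy(item, b"raw image bytes"))
```

Storing a digest alongside the reference copy is one simple way to detect later corruption or silent changes, which relates to the data integrity concern raised earlier.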

In the first phase, the focus will be on images and associated textual information, both structured and unstructured. Various mechanisms need to be in place to assess the quality of the collected data and enforce reasonable standards. In later phases, the platform should be extended to other media types, such as audio and video.

Contact: Luca Rossetto