Bachelor/Master Theses and Master Project Topics

This page lists the open BSc and MSc thesis descriptions, as well as the master project opportunities currently available in the DDIS research group.

If you are interested in any of the listed projects, please do not hesitate to contact the person mentioned in the open topic description.

If there are currently no open topics but you are generally interested in our research (see https://www.ifi.uzh.ch/en/ddis/research.html), or if you would like to propose a thesis based on your own idea, you can email us at ddis-theses@ifi.uzh.ch.

Master project: Adaptive Questionnaires Platform Development

Voting Advice Applications (VAA) such as Smartvote or Wahl-O-Mat depend on long questionnaires to recommend parties or candidates to a user. Recently, adaptive questionnaires have been introduced to optimize the data collection process and speed up recommendations in such applications. These adaptive questionnaires select the subsequent question based on the individual response profile of a user and, therefore, avoid redundancies.
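To illustrate the idea of selecting the next question from a user's response profile, here is a minimal sketch of one possible heuristic. It is not the method used by the AQVAA Platform: the party data, the distance measure, and the names `PARTY_POSITIONS`, `distance`, and `next_question` are all assumptions made for illustration. The heuristic asks the unanswered question on which the parties currently closest to the user disagree most.

```python
from statistics import pvariance

# Hypothetical toy data: each party's position on four questions, in [-1, 1].
PARTY_POSITIONS = {
    "A": [1.0, 1.0, -1.0, 1.0],
    "B": [1.0, -1.0, -1.0, -1.0],
    "C": [-1.0, -1.0, 1.0, 1.0],
}


def distance(user_answers, positions):
    """Mean absolute distance over the questions the user has answered."""
    if not user_answers:
        return 0.0
    total = sum(abs(answer - positions[q]) for q, answer in user_answers.items())
    return total / len(user_answers)


def next_question(user_answers, party_positions, top_k=2):
    """Return the unanswered question on which the closest parties disagree most.

    user_answers maps question index -> answer in [-1, 1].
    Returns None once every question has been answered.
    """
    n_questions = len(next(iter(party_positions.values())))
    unanswered = [q for q in range(n_questions) if q not in user_answers]
    if not unanswered:
        return None
    # Parties whose answers so far are most similar to the user's.
    closest = sorted(
        party_positions,
        key=lambda party: distance(user_answers, party_positions[party]),
    )[:top_k]
    # A question the closest parties agree on adds little information, so
    # prefer the one with the highest variance among their positions.
    return max(
        unanswered,
        key=lambda q: pvariance([party_positions[p][q] for p in closest]),
    )
```

In this toy setup, a user who agrees with question 0 is equally close to parties A and B, so the sketch skips question 2 (on which A and B agree) in favor of a question that separates them.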

To demonstrate and test the concept of adaptive questionnaires, the self-hosted AQVAA Platform was built based on Smartvote. Currently, the platform hosts user experiments in a controlled setting. The goal of this Master project is to extend the platform from a research prototype to a live site. This involves understanding and refactoring the code base, implementing additional features, and writing scripts to monitor the performance. 

If interested, please contact us at the email address below. We can provide a more detailed description during a meeting.

Note: The Master project is open now. However, the starting date of the project is flexible (ideally before September 2025). 

Requirements: Proficiency in Python for backend algorithm development, knowledge of PostgreSQL and Redis for database management and caching, and expertise in Angular, NestJS, and Nginx for front-end integration and deployment.

Contact: Fynn Bachmann

Master’s Thesis: Phenotype-to-Orphacode Retrieval with Embeddings

In rare disease diagnosis, observed symptoms, called phenotypes, can be structured using the Human Phenotype Ontology (HPO). These are linked to rare diseases in the Orphanet Rare Disease Ontology (ORDO) via the HPO–ORDO Ontological Module (HOOM), a mapping layer that also encodes how frequently each symptom appears in a disease. Together, these three ontologies form a rich graph. Retrieving the correct disease based on a patient’s symptoms is a complex information retrieval task, made harder by sparse data and low disease prevalence.

This thesis aims to design a domain-specific embedding model (e.g., HPO2Vec+ (GitHub) tailored for rare diseases) that integrates graph structure, phenotype–disease frequency, and prevalence information directly into its learning objective. The goal is to improve on off-the-shelf methods (e.g., Node2Vec, TransE) in retrieving candidate diseases from patient symptoms. A lightweight re-ranking step using a small language model may be explored to further boost performance (see A Survey of Large Language Models in Medicine: Progress, Application, and Challenge (GitHub); RareBench: Can LLMs Serve as Rare Diseases Specialists?).
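As a rough picture of the retrieval step, the following sketch ranks diseases by the cosine similarity between a patient's phenotype embeddings and a frequency-weighted average of each disease's phenotype embeddings. It is only one simple baseline under stated assumptions, not the model the thesis is meant to develop: the two-dimensional vectors, the placeholder identifiers (`HP:A`, `ORPHA:X`, and so on), the frequency weights, and all function names are hypothetical.

```python
import math

# Hypothetical 2-d phenotype embeddings with placeholder identifiers.
PHENOTYPE_EMB = {
    "HP:A": [1.0, 0.0],
    "HP:B": [0.0, 1.0],
    "HP:C": [1.0, 1.0],
}

# Hypothetical annotations: disease -> {phenotype: frequency weight in (0, 1]}.
DISEASES = {
    "ORPHA:X": {"HP:A": 0.9, "HP:C": 0.5},
    "ORPHA:Y": {"HP:B": 0.9},
}


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_mean(vectors, weights):
    """Weighted average of equally sized vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total for i in range(dim)]


def rank_diseases(patient_phenotypes, top_k=5):
    """Return up to top_k disease identifiers, most similar first."""
    query = weighted_mean(
        [PHENOTYPE_EMB[p] for p in patient_phenotypes],
        [1.0] * len(patient_phenotypes),
    )
    scored = []
    for disease, annotations in DISEASES.items():
        # Phenotypes that occur more frequently in a disease pull its
        # representation more strongly towards their embedding.
        disease_vec = weighted_mean(
            [PHENOTYPE_EMB[p] for p in annotations],
            list(annotations.values()),
        )
        scored.append((cosine(query, disease_vec), disease))
    scored.sort(reverse=True)
    return [disease for _, disease in scored[:top_k]]
```

Here the frequency information enters only through the averaging weights, and prevalence is ignored entirely; integrating both into the learning objective itself is the part the thesis sets out to address.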

If interested, feel free to contact me via the email address below. I’m happy to provide a more detailed project description in a short meeting. This thesis contributes to the RareSim project.

Start date: from end of July

Requirements: Strong Python coding skills, interest in embeddings and knowledge graphs, and basic understanding of machine learning. Recommended: the course Advanced Topics in Artificial Intelligence (ATAI).

Contact: Pascal Severin Andermatt

 

Master’s Thesis: Travel Medicine Chatbot

The goal of this thesis is to use the rich, longitudinal TOURIST digital-health dataset together with the Swiss travel guidelines to fine-tune a large language model (LLM) that can power an interactive travel medicine chatbot. This chatbot will provide personalized, destination-specific health advice in real time, flagging both infectious and non-infectious disease risk factors based on the traveler's profile and context.

You will begin by integrating and harmonizing the TOURIST2 data streams (passive GPS and environmental metrics alongside daily traveler-reported symptoms and behaviors) into an anonymized and normalized training dataset. In parallel, the healthytravel.ch recommendations will be parsed and transformed into structured guideline knowledge, either as Q&A pairs or as a knowledge base.

You will use the training dataset to fine-tune an LLM via full supervised fine-tuning (SFT), optionally using low-rank adaptation (LoRA) for efficiency. Additionally, depending on the size of the structured guideline knowledge and how often it changes, you will either a) follow a multitask learning (MTL) approach, tackling a Q&A task augmented with synthetic traveler queries paired with expert-validated responses, or b) build a retrieval-augmented generation (RAG) system that dynamically retrieves guideline content at inference time. To choose the LLM, you will benchmark several state-of-the-art open-source LLMs (such as Llama 3 and Falcon) to identify the best base model for our domain.

You will design prompt templates that personalize responses based on traveler background context such as destination, itinerary, and risk profile. Evaluation includes quantitative metrics such as semantic similarity against expert answers, response latency, and safety-filter effectiveness. To ensure safe, accurate advice, you will implement guardrails to detect and block unsafe or biased suggestions, with special attention to equitable guidance across destinations.

For deployment, you will wrap the chatbot in a lightweight API (e.g., FastAPI). All chatbot interactions will be logged for both performance monitoring and iterative model refinement.
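To make option b) more concrete, the sketch below shows the retrieval and prompt-assembly half of a RAG pipeline in its simplest form. Everything in it is an assumption for illustration: the guideline snippets are placeholders rather than healthytravel.ch content, the token-overlap (Jaccard) scoring stands in for the dense embedding retrieval a real system would more likely use, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import re

# Placeholder snippets; real guideline content would be parsed from the source.
GUIDELINES = [
    "placeholder guideline text on malaria prophylaxis for destination X",
    "placeholder guideline text on altitude sickness for destination Y",
    "placeholder guideline text on food and water hygiene",
]


def tokens(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Return the top_k documents by Jaccard overlap with the query."""
    query_tokens = tokens(query)

    def score(document):
        doc_tokens = tokens(document)
        union = query_tokens | doc_tokens
        return len(query_tokens & doc_tokens) / len(union) if union else 0.0

    return sorted(documents, key=score, reverse=True)[:top_k]


def build_prompt(query, traveler_context, documents, top_k=2):
    """Assemble a prompt from traveler context and retrieved guideline excerpts."""
    excerpts = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        f"Traveler context: {traveler_context}\n"
        f"Guideline excerpts:\n{excerpts}\n"
        f"Question: {query}\n"
        "Answer:"
    )
```

The assembled prompt would then be passed to the fine-tuned LLM. Because the guideline text is retrieved at inference time, updated recommendations can be picked up by re-indexing the documents rather than retraining the model.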

Start date: from end of July

Requirements: Strong Python coding skills, interest in LLMs, and basic understanding of NLG evaluation. Recommended: the course Advanced Topics in Artificial Intelligence (ATAI).

Contact: Selene Baez Santamaria, Andrea Farnham

Master’s Thesis: Evaluating LLMs for Deliberation Quality

Large-scale online deliberation platforms enable broader participation in public discourse by reducing barriers to entry and amplifying diverse viewpoints that might be excluded from traditional in-person meetings. However, this increased accessibility may compromise deliberation quality through information overload from excessive participation or diminished civility inherent to online environments.

In this master's thesis, you will explore a diverse set of tools that aim to measure different dimensions of online deliberation quality. The goal is to evaluate existing models, compare them across different datasets, and improve their assessments as far as possible. These models range from fine-tuned BERT models to prompted LLMs, with the possibility of fine-tuning models ourselves. You should be experienced in running NLP evaluation pipelines with reproducible results.

The thesis is part of a larger project on Digital Deliberative Democracy, which I'm happy to share more information about. If you're interested, please feel free to send me an email.

Start date: mid-August

Requirements: Strong Python coding skills, interest in LLMs, and basic understanding of NLG evaluation. Recommended: the course Advanced Topics in Artificial Intelligence (ATAI).

Contact: Daan van der Weijden