Student Projects
To apply, please send your CV and your BSc and MSc transcripts by email to all the contacts indicated below the project description. Do not apply on SiROP. Since Prof. Davide Scaramuzza is affiliated with ETH, there is no organizational overhead for ETH students. Custom projects are occasionally available: if you would like to do a project with us but cannot find an advertised project that suits you, please contact Prof. Davide Scaramuzza directly to ask for a tailored project (sdavide at ifi.uzh.ch).
Upon successful completion of a project in our lab, students may also have the opportunity to get an internship at one of our numerous industrial and academic partners worldwide (e.g., NASA/JPL, University of Pennsylvania, UCLA, MIT, Stanford, ...).
-
Motion Segmentation with Neuromorphic Sensing

This project investigates how neuromorphic vision can enhance motion understanding for mobile systems. By leveraging high-temporal-resolution sensing from event cameras, it aims to reliably separate ego-motion from independently moving objects for improved perception and navigation.
-
Event-based Perception for Autonomous Driving

This project explores how event-based sensing can enhance perception for autonomous driving systems. It investigates the integration of asynchronous, high-temporal-resolution visual signals with conventional sensing modalities to improve robustness, latency, and reliability under challenging real-world conditions.
-
Vision-Based Reinforcement Learning in the Real World

We aim to learn vision-based policies in the real world using embedded optimization layers within reinforcement learning.
-
Vision-Based Tactile Sensor for Humanoid Hands (in collaboration with Soft Robotics Lab)

Humanoid robots require tactile sensing to achieve robust dexterous manipulation beyond the limits of vision-based perception. This project develops an event-based tactile sensor to provide low-power, high-bandwidth force estimation from material deformation, with the goal of integrating it into a human-scale robotic hand.
-
Surgical HDR Imaging with an Event Camera (in collaboration with Balgrist)

Surgical environments present an extreme HDR challenge, causing standard cameras to lose critical detail due to overexposure. This project introduces a neuromorphic sensor fusion framework integrating RGB sensors with asynchronous event cameras to overcome these limitations. By exploiting the event camera's superior dynamic range (>120 dB) and temporal resolution, our method recovers texture and motion in saturated areas. The result is robust HDR reconstruction and low-latency tracking that outperforms traditional optical image enhancement techniques, ensuring reliable computer vision performance in harsh intraoperative lighting. This project is conducted at RPG and Balgrist under the supervision of Prof. Dr. Fürnstahl.
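To give a flavor of why events survive where frames saturate: under the standard event-generation model, each event reports a fixed log-intensity step, so brightness can be re-integrated in a pixel even after the frame sensor has clipped. The sketch below illustrates this idea with invented numbers (contrast step, timestamps); it is not the fusion framework used in the project.

```python
# Minimal sketch of event-based brightness reconstruction in a
# saturated pixel, assuming the standard event-generation model:
# each event changes log-intensity by a fixed contrast step C.
# All constants here are illustrative, not from a real sensor.

C = 0.2  # contrast sensitivity (log-intensity step per event)

def integrate_events(log_I0, events):
    """Reconstruct log-intensity over time from (timestamp, polarity)."""
    log_I = log_I0
    trace = []
    for t, polarity in events:      # polarity is +1 or -1
        log_I += C * polarity
        trace.append((t, log_I))
    return trace

# The frame pixel clips at log_I = 2.0, but events keep reporting change,
# so the reconstruction continues past the saturation point.
events = [(0.001, +1), (0.002, +1), (0.003, +1), (0.004, -1)]
trace = integrate_events(2.0, events)
```

A real fusion pipeline would additionally anchor the integrated trace to unsaturated frame measurements to suppress drift.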
-
Vision-based Navigation in Dynamic Environments via Reinforcement Learning

In this project, we will develop a vision-based reinforcement learning policy for drone navigation in dynamic environments. The policy must balance two potentially conflicting navigation objectives: maximizing the visibility of a visual target (a perceptual constraint) and avoiding obstacles to ensure safe flight.
-
Learning Robust Agile Flight via Adaptive Curriculum

This project focuses on developing robust reinforcement learning controllers for agile drone navigation using adaptive curricula. Commonly, these controllers are trained with a static, pre-defined curriculum. The goal is to develop a dynamic, adaptive curriculum that evolves online based on the agent's performance to increase the robustness of the controllers.
-
Vision-Based Drone Control with Structured Networks & Symmetry

Vision-based reinforcement learning controllers can achieve impressive drone flight performance, but training them is often slow and data-hungry because standard networks must re-learn the same behaviors across equivalent viewpoints and orientations. In this project, we will speed up vision-based drone control policy learning by using structured (symmetry-aware / equivariant) neural networks that encode physical and geometric symmetries directly into the policy. By enforcing these structural constraints, the controller can generalize better across rotations and scene variations, improving sample efficiency and sim-to-real transfer. Applicants should have a solid understanding of reinforcement learning, machine learning experience (JAX, PyTorch), and programming experience in Python and C++.
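As a toy picture of what "encoding symmetry into the policy" means: in 2-D, any linear map of the form a·I + b·J (J being a 90-degree rotation) commutes with all planar rotations, so a policy built from these maps is rotation-equivariant by construction rather than by data augmentation. The gains below are invented for illustration and are unrelated to any actual controller in this project.

```python
import math

# Hypothetical illustration of a rotation-equivariant 2-D policy:
# u(x) = a*x + b*perp(x) commutes with planar rotations, so
# policy(rot(theta, x)) == rot(theta, policy(x)) for every theta.
# a, b are made-up gains used only to demonstrate the property.

def rot(theta, x):
    """Rotate a 2-D vector by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def perp(x):
    """90-degree rotation of x (the map J)."""
    return (-x[1], x[0])

def policy(x, a=0.7, b=-0.3):
    """Equivariant linear policy: a*I + b*J applied to the state."""
    px = perp(x)
    return (a * x[0] + b * px[0], a * x[1] + b * px[1])
```

An equivariant network stacks layers with this kind of structure (plus equivariant nonlinearities), so the symmetry holds at every depth instead of being approximated from rotated training data.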
-
Drone Racing Meets Differentiable Simulation

In this project, we investigate how DiffSim (differentiable simulation) can accelerate learning for drone racing by enabling end-to-end gradient-based training with Backpropagation Through Time (BPTT). Instead of relying solely on sample-inefficient trial-and-error, we use a differentiable simulator to propagate learning signals through the drone dynamics and control pipeline over time, allowing the controller to improve from trajectories much more efficiently. Our objective is to develop a training framework that learns high-speed, gate-to-gate racing behaviors in significantly less wall-clock time, while maintaining stable and agile flight. Applicants should be comfortable with control and learning for dynamical systems, have machine learning experience (e.g., JAX, PyTorch), and be proficient in C++ and Python.
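The core mechanism can be shown on a toy problem: simulate forward, then walk adjoints backward through the rollout to obtain exact gradients of the loss with respect to controller parameters. Here a 1-D double integrator stands in for the drone, and k1, k2 are invented gains of a linear state-feedback "policy"; a real setup would use a full quadrotor model and a neural policy in an autodiff framework.

```python
# Toy illustration of BPTT through a differentiable simulator.
# The adjoints (ax, av) carry dL/dx_t and dL/dv_t backward in time.

DT = 0.05

def rollout(k1, k2, steps=50, x0=1.0, v0=0.0):
    """Forward pass: simulate, record the trajectory, sum the loss."""
    xs, vs, loss = [x0], [v0], 0.0
    for _ in range(steps):
        x, v = xs[-1], vs[-1]
        u = -k1 * x - k2 * v       # policy: drive the state to zero
        xs.append(x + DT * v)
        vs.append(v + DT * u)
        loss += xs[-1] ** 2        # penalize distance from the origin
    return loss, xs, vs

def bptt_grads(k1, k2, xs, vs):
    """Reverse pass: propagate adjoints back through the rollout."""
    ax = av = gk1 = gk2 = 0.0
    for t in reversed(range(len(xs) - 1)):
        x, v = xs[t], vs[t]
        ax += 2.0 * xs[t + 1]             # loss term dL/dx_{t+1}
        gk1 += av * DT * (-x)             # u_t enters v_{t+1} as DT*u_t
        gk2 += av * DT * (-v)
        # chain rule through x_{t+1} = x + DT*v, v_{t+1} = v + DT*u_t
        ax, av = ax + av * DT * (-k1), ax * DT + av * (1.0 - DT * k2)
    return gk1, gk2

loss, xs, vs = rollout(0.5, 0.2)
gk1, gk2 = bptt_grads(0.5, 0.2, xs, vs)  # feed these to gradient descent
```

In practice the forward and reverse passes are generated automatically by the differentiable simulator; the point of the sketch is that the gradient flows through every timestep of the dynamics, not just the final reward.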
-
Neural Quadrotor Dynamics

This project leverages the Neural Robot Dynamics (NeRD) framework to build a high-fidelity, high-speed neural simulator for a quadrotor UAV. Rather than relying solely on classical rigid-body dynamics and hand-crafted aero/actuation models, we will train a neural network to replace key low-level simulation components (e.g., unmodeled dynamics, actuator response, residual forces), enabling faster rollouts without sacrificing accuracy. The simulator will be fine-tuned on real flight logs to learn a vehicle-specific model and reduce the sim-to-real gap. Thanks to the model’s differentiability, the resulting engine also supports differentiable simulation (DiffSim) for gradient-based system identification, trajectory optimization, and policy learning. Ultimately, we aim to accelerate training of advanced flight control policies and improve zero-shot transfer by matching simulation to the target platform’s true dynamics. Applicants should be comfortable with control and learning for dynamical systems, have ML experience (e.g., JAX/PyTorch), and be proficient in C++ and Python.
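The residual-dynamics idea at the heart of this project can be sketched in a few lines: keep the analytic model and fit only the mismatch to logged transitions. In this toy version the residual is a single drag feature fitted in closed form; a real implementation (as in NeRD) would use a neural network over the full state. All constants below are invented for illustration.

```python
# Toy residual-dynamics sketch: nominal analytic model + a learned
# correction fitted to "flight logs". The true system has quadratic
# drag that the nominal model omits; least squares recovers it.

DT, MASS, DRAG = 0.01, 1.0, 0.3    # DRAG is the true, unknown coefficient

def true_step(v, u):
    """The 'real system': includes quadratic drag."""
    return v + DT * (u / MASS - DRAG * v * abs(v))

def nominal_step(v, u):
    """Analytic simulator model: no drag term."""
    return v + DT * (u / MASS)

# "Logged" transitions from the real system.
log = [(-2.0 + 0.1 * i, 0.5) for i in range(40)]

# Fit residual acceleration r(v) = w * v*|v| by closed-form least squares.
num = den = 0.0
for v, u in log:
    target = (true_step(v, u) - nominal_step(v, u)) / DT
    phi = v * abs(v)
    num += target * phi
    den += phi * phi
w = num / den                       # recovers w close to -DRAG

def hybrid_step(v, u):
    """Nominal model plus the learned residual."""
    return nominal_step(v, u) + DT * w * v * abs(v)
```

The same split is what makes fine-tuning on real flight logs cheap: the network only has to represent what the analytic model gets wrong, and the whole composition stays differentiable for DiffSim-style use.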
-
Evolutionary Optimization Meets Differentiable Simulation

The goal is to investigate how the latest differentiable simulation strategies push the limits of learning real-world tasks such as agile flight.
-
Codesign of shape and control: A study in autonomous perching

We aim to co-design a controller and the shape of a glider for a perching maneuver, involving deployment on the real system.
-
Observability-and-Perception-aware Planning and Control for Event-Based Object Reconstruction

Design a model-based / learning-based controller that is aware of state observability and sensor perception objectives for object reconstruction using a quadrotor with an egocentric camera.
-
Vision-Based Drone Racing from Raw Pixels with Deep Reinforcement Learning

Explore the possibility of high-speed drone racing using raw RGB camera images only.
-
High-Speed Object Pickup During Quadrotor Flight with Reinforcement Learning

Explore the possibility of catching/picking up an object during high-speed agile flight, with potential application in fast turnaround delivery.
-
Event Representation Learning for Control with Visual Distractors

This project develops event-based representation learning methods for control tasks in environments with visual distractors, leveraging sparse, high-temporal-resolution event data to improve robustness and efficiency over traditional frame-based approaches.
-
Rethinking RNNs for Neuromorphic Computing and Event-based Vision

This thesis develops hardware-optimized recurrent neural network architectures with novel parallelization and kernel-level strategies to efficiently process event-based vision data for real-time neuromorphic and GPU-based applications.
-
Spiking Architectures for Advanced Event-Based Temporal Reasoning

This thesis explores novel spiking neural network architectures that leverage event-based vision data and emergent neural synchronization to enable efficient and robust temporal reasoning for dynamic scene understanding and sequential tasks.
-
Reinforcement Learning with World Models

Explore and develop model-based RL algorithms.
-
Event-based Temporal Segmentation & Tracking

Event cameras are revolutionary sensors that capture pixel-level illumination changes with microsecond latency, providing significant advantages in high-speed and high-dynamic-range scenarios where traditional cameras suffer from motion blur. Recently, large-scale foundational segmentation models have been successfully adapted to the event domain. However, these current approaches remain constrained to per-frame analysis, treating continuous event streams as isolated, static snapshots and ignoring temporal consistency. At the same time, existing event-based methods for moving object segmentation can isolate motion but fail to maintain instance identity over time: they can segment moving pixels, but they cannot "track" specific objects.

This project aims to bridge the gap between static foundational segmentation and dynamic motion analysis by developing the first comprehensive tracker for event cameras. The objective is to design a system capable of not only segmenting arbitrary objects but also maintaining their identity consistently across long, high-speed sequences.

The student will extend current spatial feature adaptation strategies to support temporal identity, effectively transforming a frame-by-frame instance segmenter into a robust Video Object Segmentation (VOS) tracker. Furthermore, to handle severe object occlusions and rapid, erratic motion, the project will explore sparse temporal memory mechanisms that prevent identity-switching. Finally, to rigorously test the system's reliability, the student will establish a novel benchmark for dense segmentation in extreme edge cases, such as night driving with severe glare and rapid evasive maneuvers.