Visual-Interactive Data Labeling

Introduction to VIAL

The assignment of labels to data instances is a fundamental prerequisite for many machine learning tasks. Labeling is also a frequently applied process in visual interactive analysis approaches and visual analytics (VA). However, the strategies for creating labels usually differ between these two fields, which raises the question of whether synergies between the approaches can be attained. In this PhD project, we will study the process of labeling data instances with the user in the loop, from both the machine learning and the visual interactive perspective. The project builds upon the “visual interactive labeling” (VIAL) process, which unifies both perspectives. VIAL describes the six major steps of the process and discusses their specific challenges. The PhD project will address general challenges to VIAL and include necessary work for the realization of future VIAL approaches.

The VIAL Process


The VIAL process. Four algorithmic models (green) and two primary visual interfaces (red) are assembled into an iterative labeling process. To reflect the special characteristics of the active learning and visualization perspectives, the VIAL process contains a branch from “Learning Model” to “candidate suggestion” and “result visualization,” since the two are complementary. The VIAL process can be applied to data exploration and labeling tasks. Its output is threefold: labeled data, learned models, and gained knowledge.

Selection of Instances for Labeling by Machines and Humans


Complementary strengths of model-based and user-based instance selection for data labeling. Left: a visualization that explains the decision boundaries of a classifier using bright colors. Uncertainty-based active learning (AL) strategies (model-based) will select instances near the decision boundaries. Right: a visualization that explains the predictions of a classifier for unlabeled data (colors). Users selecting instances may want to resolve the local green class confusion.
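
To make the model-based side of this comparison concrete, the following minimal sketch shows one common uncertainty-based selection criterion, smallest-margin sampling. It is an illustration only and not necessarily the strategy used in VIAL; the function names `margin` and `select_most_uncertain` and the example probabilities are hypothetical, and the sketch assumes a classifier that outputs class probabilities for at least two classes.

```python
from typing import List, Sequence


def margin(probabilities: Sequence[float]) -> float:
    """Difference between the two highest class probabilities.

    A small margin means the classifier hesitates between two classes,
    i.e. the instance lies close to a decision boundary.
    """
    top = sorted(probabilities, reverse=True)
    return top[0] - top[1]


def select_most_uncertain(
    class_probabilities: Sequence[Sequence[float]], k: int = 1
) -> List[int]:
    """Return the indices of the k instances with the smallest margin."""
    ranked = sorted(
        range(len(class_probabilities)),
        key=lambda i: margin(class_probabilities[i]),
    )
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled instances (three classes).
predictions = [
    [0.90, 0.05, 0.05],  # confident prediction, far from the boundary
    [0.40, 0.35, 0.25],  # two classes are close
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.60, 0.30, 0.10],
]

# Indices of the two instances a margin-based strategy would suggest for labeling.
candidates = select_most_uncertain(predictions, k=2)
print(candidates)
```

A user-based selection, by contrast, would not rely on such a score alone: the user picks instances from the visualization, for example inside a region where predicted classes appear mixed.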

Exploratory Data Analysis Through Labeling

To leverage complementary strengths of humans and machines and foster knowledge generation, this PhD project aims at making three main contributions. Together, we will

  • (i) formalize labeling as a VA process
  • (ii) utilize human feedback for instance selection
  • (iii) transform labeling into a visual data exploration process

To evaluate our approaches, we may perform case studies with different datasets, human-subject experiments, data studies, or simulation experiments.

General Information and Contact

For general information about the position, please see the hub page for open PhD positions.