Visual-Interactive Data Labeling

Introduction to VIAL

The assignment of labels to data instances is a fundamental prerequisite for many machine learning tasks. Labeling is also a frequently applied process in visual interactive analysis approaches and visual analytics. However, the strategies for creating labels usually differ between these two fields, which raises the question of whether synergies between the approaches can be attained. In this Ph.D. project, we will study the process of labeling data instances with the user in the loop, from both the machine learning and the visual interactive perspective. The project builds on the “visual interactive labeling” (VIAL) process, which unifies both perspectives. VIAL describes the six major steps of the process and discusses their specific challenges. The Ph.D. project will address general challenges to VIAL and include the work necessary for realizing future VIAL approaches.

The VIAL Process

The VIAL process. Four algorithmic models (green) and two primary visual interfaces (red) are assembled into an iterative labeling process. To reflect the special characteristics of the active learning and the visualization perspectives, the VIAL process contains a branch from “learning model” to “candidate suggestion” and “result visualization,” since the two are complementary. The VIAL process can be applied to data exploration and labeling tasks. Its output is threefold: labeled data, learned models, and gained knowledge.
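
As a rough illustration of how such an iterative labeling loop can be wired together, the following Python sketch runs three labeling rounds on toy one-dimensional data. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the boundary-based candidate suggestion, the simulated user, the data, and all function names. The result visualization branch is omitted, and the sketch is not the VIAL implementation.

```python
from __future__ import annotations


def learn_model(labeled: dict[float, str]) -> dict[str, float]:
    """Learning model: one centroid per class (toy nearest-centroid classifier)."""
    groups: dict[str, list[float]] = {}
    for x, label in labeled.items():
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def suggest_candidate(model: dict[str, float], pool: list[float]) -> float:
    """Candidate suggestion: the pool instance closest to the class boundary."""
    def margin(x: float) -> float:
        # Difference between the distances to the two nearest centroids.
        nearest, second = sorted(abs(x - c) for c in model.values())[:2]
        return second - nearest
    return min(pool, key=margin)


def simulated_user(x: float) -> str:
    """Stand-in for the labeling interface: a hypothetical ground-truth rule."""
    return "B" if x >= 5.0 else "A"


labeled = {0.0: "A", 10.0: "B"}        # initial labeled instances (made up)
pool = [1.0, 3.0, 4.5, 5.5, 7.0, 9.0]  # unlabeled instances (made up)

for _ in range(3):                     # three labeling iterations
    model = learn_model(labeled)
    candidate = suggest_candidate(model, pool)
    labeled[candidate] = simulated_user(candidate)  # feedback for the next round
    pool.remove(candidate)

model = learn_model(labeled)           # outputs: labeled data and a learned model
```

In a real system the simulated user would be replaced by an interactive labeling interface, and the model's predictions would additionally be shown to the user, who could then pick instances independently of the suggestion step.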

Selection of Instances for Labeling by Machines and Humans

Complementary strengths of model-based and user-based instance selection for data labeling. Left: visualization that explains the decision boundaries of a classifier using bright colors. Uncertainty-based active learning (AL) strategies, which are model-based, will select instances near the decision boundaries. Right: visualization that explains the predictions of a classifier for unlabeled data (colors). Users selecting instances may instead want to resolve the local confusion around the green class.
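
To make the model-based side concrete, here is a minimal Python sketch of one common uncertainty-based strategy, margin sampling: instances whose two most likely classes have nearly equal predicted probability are treated as lying near a decision boundary and are suggested first. The probabilities, instance ids, and function names are hypothetical; the source does not specify which uncertainty measure is used.

```python
from __future__ import annotations

from typing import Sequence


def margin(probabilities: Sequence[float]) -> float:
    """Gap between the two most likely classes; a small gap suggests a nearby boundary."""
    top_two = sorted(probabilities, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_uncertain(predictions: dict[str, Sequence[float]], k: int) -> list[str]:
    """Return the k instance ids whose predictions have the smallest margin."""
    ranked = sorted(predictions, key=lambda instance_id: margin(predictions[instance_id]))
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled instances (three classes).
predictions = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.40, 0.35, 0.25],
    "c": [0.50, 0.48, 0.02],
    "d": [0.70, 0.20, 0.10],
}
candidates = select_uncertain(predictions, k=2)
print(candidates)  # prints ['c', 'b']
```

A user looking at the visualization might pick different instances, for example several from one visibly confused region, which is why the two selection modes are considered complementary.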

General Information and Contact

For general information about the position, please see the hub page for open Ph.D. positions.

Operational Information

  • Project start: 2022 (or by arrangement)
  • Project duration: 3 years