The «Test of Time Award» recognizes highly influential papers published ten years earlier at the «European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)».
The study «Fair and balanced? Bias in Bug-fix Datasets» examined the possible impact of biased («unfair») datasets on the performance of defect prediction techniques.
Software Engineering (SE) increasingly relies on data from development processes, e.g. from version control systems such as Git or SVN, but also from bug trackers such as Jira or Bugzilla. This data helps to better understand and steer projects and to predict the occurrence of problems. Unfortunately, this data is not always perfect, and the information it contains is sometimes incomplete or misleading.
The study showed that data from publicly accessible software repositories may contain systematic errors and statistical biases. Analyzing the data without taking these problems into account can therefore lead to misleading conclusions.
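One such bias arises when bug reports are linked to fixing commits by scanning commit messages for bug IDs: only a subset of fixes is linkable, and that subset may not be representative. The following sketch illustrates the idea with entirely hypothetical data (the bug IDs, severities, and commit messages are invented for illustration, not taken from the study):

```python
import re

# Hypothetical sample data: bug reports and commit messages (illustrative only).
bugs = {
    101: {"severity": "critical"},
    102: {"severity": "minor"},
    103: {"severity": "minor"},
    104: {"severity": "critical"},
}
commits = [
    "Fix crash on startup (bug 101)",
    "Refactor parser",
    "Resolve issue 104: null pointer in scheduler",
]

# Link commits to bug reports by scanning messages for numeric bug IDs,
# as many repository-mining studies do.
linked = set()
for msg in commits:
    for bug_id in map(int, re.findall(r"\b(\d{3})\b", msg)):
        if bug_id in bugs:
            linked.add(bug_id)

# Only half the bugs are linkable, and the linked subset over-represents
# critical bugs -- a dataset built from it would be systematically biased.
link_rate = len(linked) / len(bugs)
critical_in_linked = sum(bugs[b]["severity"] == "critical" for b in linked) / len(linked)
critical_overall = sum(v["severity"] == "critical" for v in bugs.values()) / len(bugs)
print(f"link rate: {link_rate:.0%}")
print(f"critical share (linked): {critical_in_linked:.0%} vs overall: {critical_overall:.0%}")
```

In this toy example only 50% of the fixed bugs can be linked, and all of the linked bugs happen to be critical, so any prediction model trained on the linked subset would see a skewed picture of the project.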
The study informed the discussion of systematic data errors in version control systems and bug trackers and of how to address them. In doing so, it helped improve the robustness of studies in this area.
Many software developers today use tools and techniques derived from SE research, and they benefit from more robust research results.
The study «Cross-project Defect Prediction: A Large Scale Experiment on Data vs. Domain vs. Process» identified factors that influence the success of cross-project defect predictions, providing practical guidance for software engineers.