CrowdAlytics: Large-Scale Human-Machine Systems for Data Science
Artificial Intelligence (AI) is increasingly taking over areas of work that only humans used to be able to handle. This fuels fears of job loss. However, humans and machines have different abilities: while AI is faster and better than humans at certain well-defined tasks, people deal better with poorly defined situations. Given these different capabilities, the question that arises is: How can we combine groups of humans and machines to effectively perform ill-structured tasks that are not achievable by either party alone?
One field that is emerging as being in direct need of combined, cooperative human-machine intelligence is data-driven knowledge discovery, or Data Science. Data-driven findings are shaping the way organizations make decisions, new empirical scientific disciplines have emerged, citizens are constantly exposed to data journalism, and participants in civic coding events analyze and visualize data to better understand and improve their own environment. At the same time, this rise in data and methods introduces several challenges for data scientists, who must decide which questions are worth pursuing and how to find, integrate, interpret, summarize, and analyze a vast number of diverse sources at different stages of their multidisciplinary work. Yet data science professionals are in very short supply. To address this need, scientists have proposed statistical expert systems or intelligent discovery systems. In practice, though, the success of these systems is limited, as well-trained data scientists are still an indispensable part of the process.
The aim of this research project is to investigate how people and AI can work together to solve data science tasks. In particular, we would like to develop new methods of human-machine cooperation that allow both novice and expert users, as well as machines, to collaborate on complex data science tasks. We combine findings from statistics, data science, swarm intelligence research, and computer-supported group work. The findings of this study will help us to better understand how people and machines work together, a goal that is becoming increasingly important to our lives and work in the age of AI.
- Prof. Abraham Bernstein
- Dr. Cristina Sarasua
- Dhivyabharathi Ramasamy
- Florian Ruosch
- Rosni Kottekulam Vasu
- August 2019: This project is now supported by the Swiss National Science Foundation (SNF) under contract number 200020_184994. Its lay summary is published at http://p3.snf.ch/project-184994
Florian Ruosch, Cristina Sarasua and Abraham Bernstein, 2022. BAM: Benchmarking Argument Mining on Scientific Documents. In Proceedings of the AAAI-22 Workshop on Scientific Document Understanding at the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), held online due to COVID-19, 2022-03-01. CEUR Workshop Proceedings.
Dhivyabharathi Ramasamy, Cristina Sarasua, Alberto Bacchelli and Abraham Bernstein, 2023. Workflow analysis of data science code in public GitHub repositories. Empirical Software Engineering 28, 7. https://doi.org/10.1007/s10664-022-10229-z