Software archives such as source control systems, archived communications between project personnel, and defect tracking systems are used to help manage the progress of software projects. The software engineering community is beginning to recognize the potential benefit of mining this information to support the evolution of software systems, improve software design and reuse, and empirically validate novel ideas and techniques. Research is now proceeding to uncover the ways in which mining these archives can help to understand software development, to support predictions about software properties, and to plan software projects.
State of research
Every software system has a history: a history of changes made during development, a history of executions during testing and production, and a history of the successes and failures that occurred during these executions. More and more of this history is recorded in archives that can be analyzed and explored, making it possible to understand how the software came to be and how it behaves today.
One issue that researchers must cope with is the sheer volume of data: programs experience thousands of changes, are tested in millions of runs, and execute in billions of cycles. This calls for advanced data mining techniques that assist in extracting suitable abstractions from programs and software development processes.
Focus of the symposium
Research in software engineering is uncovering the ways in which mining these repositories can help to understand software development, to support predictions about it, and to plan various aspects of software projects.
The aim is to strengthen the community of researchers who recover and use the data stored in software repositories to deepen the understanding of software development practices. To that end, the symposium addresses topics such as empirical data analysis, user studies, the integration of mining data into integrated development environments, techniques for software analysis and visualization, comparisons of open source and closed source project characteristics, and the derivation of prediction models, for example for failure prediction.