Automatic Search Term Identification for Change Tasks

At the beginning of a change task, software developers search the source code to locate the places relevant to the task. As previous research and a small exploratory study that we conducted show, developers perform poorly in identifying good search terms and therefore waste a lot of time querying and exploring irrelevant code. To support developers in this step, we present an approach to automatically identify good search terms. Based on existing work and an analysis of change tasks, we derived heuristics, determined their relevancy and used the results to develop our approach. For a preliminary evaluation, we conducted a study with ten developers working on open source change tasks. Our approach was able to identify good search terms for all tasks and outperformed the searches of the participants, illustrating the potential of our approach. In addition, since the heuristics used are based solely on textual features of change tasks, our approach is easy to apply, generally applicable, and can leverage much of the existing work on feature location.

NIER2014 (PDF, 257 KB)

"Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org."

Prototypical Implementation

We enhanced Sando, a state-of-the-art code search tool for Visual Studio, with a search term recommender based on our approach. Our prototype supports the identified heuristics and integrates with the Microsoft Team Foundation Server (TFS). The TFS integration is necessary to retrieve the corpus of all change tasks of the project and to calculate the tf-idf measure of terms.
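To illustrate why the corpus of all change tasks is needed, the following minimal Python sketch scores the terms of one change task by tf-idf against the other tasks of a project. It is not the prototype's code: the function names (tokenize, tf_idf_scores, rank_terms), the simple alphabetic tokenizer, and the smoothed idf variant are assumptions made for this example, and the additional heuristics of our approach are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_scores(task_text, corpus):
    """Score each term of one change task by tf-idf against a corpus of change tasks.

    Assumption: idf is smoothed as log((1 + N) / (1 + df)) + 1, a common variant;
    the exact formula used by the prototype is not specified here.
    """
    tokens = tokenize(task_text)
    if not tokens:
        return {}
    term_counts = Counter(tokens)
    doc_token_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_token_sets)
    scores = {}
    for term, count in term_counts.items():
        tf = count / len(tokens)  # relative frequency within the task
        df = sum(1 for doc in doc_token_sets if term in doc)  # tasks containing the term
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    return scores


def rank_terms(task_text, corpus, top_n=5):
    """Return the top_n terms, highest score first (ties broken alphabetically)."""
    scores = tf_idf_scores(task_text, corpus)
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]
```

For example, calling rank_terms with the text of one work item and a list of the texts of the project's other work items favors terms that are frequent in the task but rare across the project, which is why terms common to many change tasks receive low scores.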

Once a user has connected to a TFS via the Visual Studio Team Explorer, she can use our extended version of Sando to retrieve search term recommendations for work items. Our prototype can retrieve search terms for the currently open work item or for another work item specified by its ID (see figure 1).

After specifying the work item, the user clicks Load to retrieve search term recommendations, which are listed in the view as shown in figure 2.

The recommendations are sorted according to their relevancy. Double-clicking a recommendation invokes a Sando code search.