Authors
Blake Bassett, Nicholas A Kraft
Publication date
2013/5/20
Conference
2013 21st International Conference on Program Comprehension (ICPC)
Pages
133-141
Publisher
IEEE
Description
Many recent feature location techniques (FLTs) apply text retrieval (TR) techniques to corpora built from text embedded in source code. Term weighting is a standard preprocessing step in TR and is used to adjust the importance of a term within a document or corpus. Common term weighting schemes such as tf-idf may not be optimal for use with source code, because they originate from a natural language context and were designed for use with unstructured documents. In this paper we propose a new approach to term weighting in which term weights are assigned using the structural information from the source code. We then evaluate the proposed approach by conducting an empirical study of a TR-based FLT. In all, we study over 400 bugs and features from five open source Java systems and find that structural term weighting can cause a statistically significant improvement in the accuracy of the FLT.
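As a rough illustration of the idea in the description, the following is a minimal sketch of tf-idf in which each term occurrence is scaled by a multiplier tied to where it appears in the source code. The abstract does not give the paper's actual weighting scheme, so the location names (`method_name`, `comment`, `body`), the multiplier values, and the function names below are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical multipliers per structural location; NOT taken from the paper.
STRUCTURAL_WEIGHTS = {"method_name": 3.0, "comment": 1.0, "body": 1.0}


def weighted_term_counts(document):
    """Sum structural multipliers per term.

    `document` is a list of (term, location) pairs extracted from one
    source code unit, e.g. a method.
    """
    counts = Counter()
    for term, location in document:
        # Unknown locations fall back to a neutral multiplier of 1.0.
        counts[term] += STRUCTURAL_WEIGHTS.get(location, 1.0)
    return counts


def structural_tf_idf(corpus):
    """Return one {term: weight} dict per document in `corpus`."""
    per_doc = [weighted_term_counts(doc) for doc in corpus]
    n_docs = len(per_doc)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for counts in per_doc:
        doc_freq.update(counts.keys())

    vectors = []
    for counts in per_doc:
        total = sum(counts.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


corpus = [
    [("parse", "method_name"), ("file", "body")],
    [("file", "body"), ("write", "method_name")],
]
vectors = structural_tf_idf(corpus)
```

With these assumed multipliers, a term in a method name contributes three times as much to the term frequency as the same term in the method body, so `parse` receives a larger share of the first document's weight than it would under plain tf-idf. A term that occurs in every document, such as `file` here, still gets a weight of zero because its inverse document frequency is zero.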
Total citations
[Chart: citations per year, 2013 to 2024]