Authors
Simon Butler, Michel Wermelinger, Yijun Yu, Helen Sharp
Publication date
2011
Conference
ECOOP 2011–Object-Oriented Programming: 25th European Conference, Lancaster, UK, July 25-29, 2011 Proceedings 25
Pages
130-154
Publisher
Springer Berlin Heidelberg
Description
Identifier names are the main vehicle for semantic information during program comprehension. Identifier names are tokenised into their semantic constituents by tools supporting program comprehension tasks, including concept location and requirements traceability. We present an approach to the automated tokenisation of identifier names that improves on existing techniques in two ways. First, it improves tokenisation accuracy for identifier names of a single case and those containing digits. Second, performance gains over existing techniques are achieved using smaller oracles. Accuracy was evaluated by comparing the output of our algorithm to manual tokenisations of 28,000 identifier names drawn from 60 open source Java projects totalling 16.5 MSLOC. We also undertook a study of the typographical features of identifier names (single case, use of digits, etc.) per object-oriented construct (class …
Total citations
[Chart: citations per year, 2011-2024]