Authors
Nghi DQ Bui, Yijun Yu, Lingxiao Jiang
Publication date
2021/5/22
Conference
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)
Pages
1186-1197
Publisher
IEEE
Description
Learning code representations has found many uses in software engineering, such as code classification, code search, comment generation, and bug prediction. Although representations of code as tokens, syntax trees, dependency graphs, paths in trees, or combinations of their variants have been proposed, existing learning techniques have a major limitation: the models are often trained on datasets labeled for specific downstream tasks, so the resulting code representations may not be suitable for other tasks. Even though some techniques generate representations from unlabeled code, they are far from satisfactory when applied to downstream tasks. To overcome this limitation, this paper proposes InferCode, which adapts the self-supervised learning idea from natural language processing to the abstract syntax trees (ASTs) of code. The novelty lies in the training of code representations …
Total citations
[Chart: citations per year, 2021-2024]