Authors
Paul Cook, Jey Han Lau, Diana McCarthy, Timothy Baldwin
Publication date
2014/8
Conference
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Pages
1624-1635
Description
Automatic lexical acquisition has been an active area of research in computational linguistics for over two decades, but the automatic identification of new word-senses has received attention only very recently. Previous work on this topic has been limited by the availability of appropriate evaluation resources. In this paper we present the largest corpus-based dataset of diachronic sense differences to date, which we believe will encourage further work in this area. We then describe several extensions to a state-of-the-art topic modelling approach for identifying new word-senses. This adapted method shows superior performance on our dataset of two different corpus pairs to that of the original method for both: (a) types having taken on a novel sense over time; and (b) the token instances of such novel senses.
Total citations
[Chart: citations per year, 2015 to 2024]