Authors
Jerome R Bellegarda, John W Butzberger, Yen-Lu Chow, Noah B Coccaro, Devang Naik
Publication date
1996/5/9
Conference
1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings
Volume
1
Pages
172-175
Publisher
IEEE
Description
A new approach is proposed for the clustering of words in a given vocabulary. The method is based on a paradigm first formulated in the context of information retrieval, called latent semantic analysis. This paradigm leads to a parsimonious vector representation of each word in a suitable vector space, where familiar clustering techniques can be applied. The distance measure selected in this space arises naturally from the problem formulation. Preliminary experiments indicate that the clusters produced are intuitively satisfactory. Because these clusters are semantic in nature, this approach may prove useful as a complement to conventional class-based statistical language modeling techniques.
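The pipeline outlined in the description (word vectors from latent semantic analysis, followed by clustering in that space) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy word-document counts, the rank of 2, the use of raw counts without any weighting, cosine similarity between singular-value-scaled word vectors, and the greedy "leader" grouping with a 0.9 threshold are all assumptions made here, since the abstract does not specify the distance measure or the clustering algorithm.

```python
import numpy as np

# Hypothetical toy data: rows are words, columns are documents.
# Documents 0-1 are about finance, documents 2-3 about sports.
words = ["stock", "bond", "market", "goal", "team", "coach"]
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)


def lsa_word_vectors(matrix, rank):
    """Return one `rank`-dimensional vector per word (row) via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    # Scale the left singular vectors by the singular values (assumed choice).
    return u[:, :rank] * s[:rank]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def leader_clusters(vectors, threshold):
    """Greedy grouping: a word joins the first cluster whose leader
    (first member) is at least `threshold` similar, else starts a new one."""
    clusters = []  # each cluster is a list of row indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


vectors = lsa_word_vectors(counts, rank=2)
clusters = [[words[i] for i in c] for c in leader_clusters(vectors, threshold=0.9)]
print(clusters)
```

On this toy matrix the finance words and the sports words end up in separate groups, because words that occur in the same documents point in nearly the same direction in the reduced space. A real vocabulary would need a much larger corpus, a higher rank, and a more robust clustering method such as k-means.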
Total citations
[Chart: citations per year, 1996 to 2022; per-year counts not recoverable]
Scholar articles
JR Bellegarda, JW Butzberger, YL Chow, NB Coccaro… - 1996 IEEE International Conference on Acoustics …, 1996