Authors
Sanjeev Arora, Rong Ge, Ankur Moitra
Publication date
2012/10/20
Conference
2012 IEEE 53rd Annual Symposium on Foundations of Computer Science
Pages
1-10
Publisher
IEEE
Description
Topic modeling is an approach used for automatic comprehension and classification of data in a variety of settings; perhaps the canonical application is uncovering thematic structure in a corpus of documents. A number of foundational works, both in machine learning and in theory, have suggested a probabilistic model for documents, whereby documents arise as a convex combination of (i.e. distribution on) a small number of topic vectors, each topic vector being a distribution on words (i.e. a vector of word-frequencies). Similar models have since been used in a variety of application areas; the Latent Dirichlet Allocation (LDA) model of Blei et al. is especially popular. Theoretical studies of topic modeling focus on learning the model's parameters assuming the data is actually generated from it. Existing approaches for the most part rely on Singular Value Decomposition (SVD), and consequently have one of …
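The generative model described above can be illustrated with a minimal sketch. It is not taken from the paper: the vocabulary size, number of topics, document length, and Dirichlet parameters are all hypothetical, and the Dirichlet prior on topic weights is just one choice, in the style of LDA. Each topic is a distribution on words, and a document's word distribution is a convex combination of the topics.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one point from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, topic_weights, num_words, rng):
    """Sample word ids from the convex combination of the topic vectors."""
    vocab_size = len(topics[0])
    # Word distribution of the document: sum over topics of weight * topic vector.
    word_dist = [
        sum(w * topic[v] for w, topic in zip(topic_weights, topics))
        for v in range(vocab_size)
    ]
    words = rng.choices(range(vocab_size), weights=word_dist, k=num_words)
    return word_dist, words


rng = random.Random(0)

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE, NUM_TOPICS, DOC_LENGTH = 8, 3, 20

# Each topic vector is a distribution on words.
topics = [sample_dirichlet([0.5] * VOCAB_SIZE, rng) for _ in range(NUM_TOPICS)]

# Topic weights for one document (assumed Dirichlet prior, as in LDA).
topic_weights = sample_dirichlet([0.3] * NUM_TOPICS, rng)

word_dist, words = generate_document(topics, topic_weights, DOC_LENGTH, rng)
```

Learning, in the sense of the abstract, is the inverse problem: recovering `topics` from many sampled documents, which this sketch does not attempt.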
Total citations
[Chart: citations per year, 2012 to 2024]
Scholar articles
S Arora, R Ge, A Moitra - 2012 IEEE 53rd annual symposium on foundations of …, 2012