Authors
Daniel Bär, Torsten Zesch, Iryna Gurevych
Publication date
2011
Journal
Proceedings of the International Conference on Recent Advances in Natural Language Processing
Pages
515-520
Description
While the concept of similarity is well grounded in psychology, text similarity is less well-defined. Thus, we analyze text similarity with respect to its definition and the datasets used for evaluation. We formalize text similarity based on the geometric model of conceptual spaces along three dimensions inherent to texts: structure, style, and content. We empirically ground these dimensions in a set of annotation studies, and categorize applications according to these dimensions. Furthermore, we analyze the characteristics of the existing evaluation datasets, and use those datasets to assess the performance of common text similarity measures.
Total citations
Citations per year, 2012-2024 (chart; per-year counts not recoverable)
Scholar articles
D Bär, T Zesch, I Gurevych - Proceedings of the International Conference Recent …, 2011