Pitfalls in the Evaluation of Sentence Embeddings

Authors

Steffen Eger, Andreas Rücklé, Iryna Gurevych

Publication date

2019

Conference

Repl4NLP 2019

Description

Deep learning models continuously break new records across different NLP tasks. At the same time, their success exposes weaknesses of model evaluation. Here, we compile several key pitfalls of evaluation of sentence embeddings, a currently very popular NLP paradigm. These pitfalls include the comparison of embeddings of different sizes, normalization of embeddings, and the low (and diverging) correlations between transfer and probing tasks. Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research. Based on our insights, we also recommend better practices for better future evaluations of sentence embeddings.

Total citations

Cited by 20

2019: 3, 2020: 4, 2021: 4, 2022: 2, 2023: 5, 2024: 1

Scholar articles

Pitfalls in the evaluation of sentence embeddings

S Eger, A Rücklé, I Gurevych - arXiv preprint arXiv:1906.01575, 2019

Pitfalls in the Evaluation of Sentence Embeddings. arXiv e-prints*

S Eger, A Rücklé, I Gurevych - 2019
