arXiv:1906.01575 [cs.CL]

Pitfalls in the Evaluation of Sentence Embeddings

Steffen Eger, Andreas Rücklé, Iryna Gurevych

Published 2019-06-04, Version 1

Deep learning models continuously break new records across different NLP tasks. At the same time, their success exposes weaknesses of model evaluation. Here, we compile several key pitfalls in the evaluation of sentence embeddings, a currently very popular NLP paradigm. These pitfalls include the comparison of embeddings of different sizes, the normalization of embeddings, and the low (and diverging) correlations between transfer and probing tasks. Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research. Based on our insights, we also recommend better practices for future evaluations of sentence embeddings.
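As an illustrative sketch (not code from the paper), the normalization pitfall can be made concrete: cosine similarity is invariant to L2 normalization of the embeddings, whereas dot-product comparisons are not, so reported similarity scores can hinge on an often-unstated normalization choice. The vectors below are hypothetical stand-ins for sentence embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=300)                   # hypothetical sentence embedding
b = 5.0 * a + 0.1 * rng.normal(size=300)   # similar direction, much larger norm

def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    return v / np.linalg.norm(v)

# Cosine similarity: identical whether or not we normalize first.
cos_raw = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
cos_norm = l2_normalize(a) @ l2_normalize(b)
assert np.isclose(cos_raw, cos_norm)

# Dot product: changes drastically with normalization, because it
# conflates direction with vector magnitude.
dot_raw = a @ b
dot_norm = l2_normalize(a) @ l2_normalize(b)
print(f"cosine: {cos_raw:.3f}, raw dot: {dot_raw:.1f}, normalized dot: {dot_norm:.3f}")
```

Two evaluations that differ only in this preprocessing step can thus rank the same embedding models differently, which is one reason the paper argues for making such choices explicit.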

Related articles:
arXiv:2305.13192 [cs.CL] (Published 2023-05-22)
ImSimCSE: Improving Contrastive Learning for Sentence Embeddings from Two Perspectives
arXiv:1605.04655 [cs.CL] (Published 2016-05-16)
Joint Learning of Sentence Embeddings for Relevance and Entailment
arXiv:2311.03881 [cs.CL] (Published 2023-11-07)
Sparse Contrastive Learning of Sentence Embeddings