arXiv:2103.06628 [cs.CL]

Evaluation of Morphological Embeddings for the Russian Language

Vitaly Romanov, Albina Khusainova

Published 2021-03-11, Version 1

A number of morphology-based word embedding models have been introduced in recent years. However, their evaluation has mostly been limited to English, which is known to be a morphologically simple language. In this paper, we explore whether and to what extent incorporating morphology into word embeddings improves performance on downstream NLP tasks for Russian, a morphologically rich language. The NLP tasks we chose are POS tagging, chunking, and NER; for Russian, all three can largely be solved using morphology alone, without understanding the semantics of words. Our experiments show that morphology-based embeddings trained with the Skipgram objective do not outperform FastText, an existing embedding model. Moreover, BERT, a more complex but morphology-unaware model, achieves significantly better performance on tasks that presumably require an understanding of a word's morphology.

Comments: Published in Proceedings of the 2019 3rd International Conference on Natural Language Processing and Information Retrieval
Categories: cs.CL, cs.LG
Related articles:
arXiv:2410.13086 [cs.CL] (Published 2024-10-16)
Reverse-Engineering the Reader
arXiv:1910.11834 [cs.CL] (Published 2019-10-25)
Evaluation of Sentence Representations in Polish
arXiv:1904.04307 [cs.CL] (Published 2019-04-08)
Word Similarity Datasets for Thai: Construction and Evaluation