{ "id": "2106.00143", "version": "v1", "published": "2021-05-31T23:21:10.000Z", "updated": "2021-05-31T23:21:10.000Z", "title": "An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers", "authors": [ "Tharindu Ranasinghe", "Constantin Orasan", "Ruslan Mitkov" ], "comment": "Accepted to appear at the ACL-IJCNLP 2021 Main conference", "categories": [ "cs.CL", "cs.AI", "cs.LG" ], "abstract": "Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual, word-level QE. We show that these QE models perform on par with the current language-specific models. In the cases of zero-shot and few-shot QE, we demonstrate that it is possible to accurately predict word-level quality for any given new language pair from models trained on other language pairs. Our findings suggest that the word-level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.", "revisions": [ { "version": "v1", "updated": "2021-05-31T23:21:10.000Z" } ], "analyses": { "keywords": [ "multilingual word-level quality estimation", "exploratory analysis", "cross-lingual transformers", "language pair", "language-specific models" ], "tags": [ "conference paper" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }