arXiv:1809.00252 [cs.CL]

Parameter Sharing Methods for Multilingual Self-Attentional Translation Models

Devendra Singh Sachan, Graham Neubig

Published 2018-09-01 (Version 1)

In multilingual neural machine translation, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained models. However, these improvements are not uniform; multilingual parameter sharing often results in a decrease in accuracy because translation models cannot accommodate different languages in their limited parameter space. In this work, we examine parameter sharing techniques that strike a happy medium between full sharing and individual training, specifically focusing on the self-attentional Transformer model. We find that the full parameter sharing approach leads to increases in BLEU scores mainly when the target languages are from a similar language family. However, even when the target languages come from different families and full parameter sharing leads to a noticeable drop in BLEU scores, our proposed methods for partial sharing of parameters can lead to substantial improvements in translation accuracy.
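The abstract does not specify which Transformer components are shared and which are kept language-specific, so the following is only a minimal sketch of the general idea of partial parameter sharing in a one-to-many setting. The split chosen here (a self-attention block shared across target languages, a feed-forward block per target language) and the names `PartiallySharedDecoderLayer`, `target_langs`, and `lang` are illustrative assumptions, not the authors' exact scheme.

```python
# Sketch of partial parameter sharing in a multilingual Transformer decoder layer.
# NOTE: which sublayers are shared vs. per-language is an assumption for illustration,
# not the configuration proposed in the paper.
import torch
import torch.nn as nn


class PartiallySharedDecoderLayer(nn.Module):
    def __init__(self, d_model, n_heads, d_ff, target_langs):
        super().__init__()
        # Shared across all target languages: the self-attention block.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_attn = nn.LayerNorm(d_model)
        # Language-specific: one feed-forward block per target language.
        self.ffn = nn.ModuleDict({
            lang: nn.Sequential(
                nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
            )
            for lang in target_langs
        })
        self.norm_ffn = nn.LayerNorm(d_model)

    def forward(self, x, lang):
        # Shared self-attention with residual connection and layer norm.
        attn_out, _ = self.self_attn(x, x, x, need_weights=False)
        x = self.norm_attn(x + attn_out)
        # Route through the feed-forward parameters of the current target language.
        x = self.norm_ffn(x + self.ffn[lang](x))
        return x


layer = PartiallySharedDecoderLayer(d_model=64, n_heads=4, d_ff=256,
                                    target_langs=["de", "nl"])
x = torch.randn(2, 10, 64)   # (batch, target length, d_model)
out = layer(x, lang="de")    # shared attention, German-specific feed-forward
print(out.shape)             # torch.Size([2, 10, 64])
```

Full sharing would correspond to a single feed-forward block for all languages, and individual training to separate copies of every sublayer; the partial scheme sits between the two by keeping only some parameters language-specific.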
