arXiv Analytics

arXiv:1811.00739 [cs.CL]

An Empirical Exploration of Curriculum Learning for Neural Machine Translation

Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J. Martindale, Paul McNamee, Kevin Duh, Marine Carpuat

Published 2018-11-02 (Version 1)

Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training, so as to train better models faster. We adopt a probabilistic view of curriculum learning, which lets us flexibly evaluate the impact of curriculum design, and perform an extensive exploration on a German-English translation task. Results show that it is possible to improve convergence time at no loss in translation quality. However, results are highly sensitive to the choice of sample difficulty criteria, curriculum schedule, and other hyperparameters.
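To make the idea concrete, below is a minimal, illustrative Python sketch of a probabilistic curriculum sampler of the general kind the abstract describes: training examples are grouped into difficulty bins, and the probability of drawing from harder bins grows over the course of training. The length-based difficulty proxy, the number of bins, and the linear schedule are assumptions for illustration, not the paper's exact formulation.

```python
import random

def curriculum_sample(examples, difficulty, step, total_steps, n_bins=4):
    """Draw one training example under a simple probabilistic curriculum.

    Examples are ranked by a difficulty function and split into bins.
    Early in training, only the easiest bins are eligible; as `step`
    approaches `total_steps`, harder bins become eligible as well.
    Illustrative sketch only, not the authors' exact method.
    """
    # Rank examples from easy to hard and split into equal-sized bins.
    ranked = sorted(examples, key=difficulty)
    bin_size = max(1, len(ranked) // n_bins)
    bins = [ranked[i * bin_size:(i + 1) * bin_size] for i in range(n_bins)]
    bins = [b for b in bins if b]  # drop any empty bins

    # "Competence" grows linearly from 0 to 1 over training; it controls
    # how many of the easier bins the sampler may currently draw from.
    competence = min(1.0, (step + 1) / total_steps)
    eligible = max(1, round(competence * len(bins)))

    # Pick uniformly among the eligible bins, then uniformly within the bin.
    chosen_bin = random.choice(bins[:eligible])
    return random.choice(chosen_bin)


if __name__ == "__main__":
    # Toy corpus with sentence length as the (assumed) difficulty proxy.
    corpus = ["a b", "a b c d", "a b c d e f", "a b c d e f g h"]
    for step in range(4):
        print(f"step {step}:", curriculum_sample(corpus, len, step, total_steps=4))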

Related articles:
arXiv:1707.09533 [cs.CL] (Published 2017-07-29)
Curriculum Learning and Minibatch Bucketing in Neural Machine Translation
arXiv:1712.05690 [cs.CL] (Published 2017-12-15)
Sockeye: A Toolkit for Neural Machine Translation
arXiv:1410.8206 [cs.CL] (Published 2014-10-30)
Addressing the Rare Word Problem in Neural Machine Translation