arXiv:1910.07370 [cs.CL]

Evolution of transfer learning in natural language processing

Aditya Malte, Pratik Ratadiya

Published 2019-10-16, Version 1

In this paper, we present a study of the recent advancements that have helped bring transfer learning to NLP through the use of semi-supervised training. We discuss cutting-edge methods and architectures such as BERT, GPT, ELMo, and ULMFiT, among others. Classically, tasks in natural language processing were performed through rule-based and statistical methodologies. However, owing to the vast nature of natural languages, these methods did not generalise well and failed to learn the nuances of language. Machine learning algorithms such as Naive Bayes and decision trees, coupled with traditional representations such as Bag-of-Words and N-grams, were therefore used to overcome this problem. Eventually, with the advent of advanced recurrent neural network architectures such as the LSTM, state-of-the-art performance became achievable on several natural language processing tasks such as text classification and machine translation. We discuss how transfer learning has brought about the well-known 'ImageNet moment' for NLP: advanced architectures such as the Transformer and its variants allow practitioners to leverage knowledge gained from an unrelated task to drastically hasten convergence and improve performance on the target task. This survey aims to provide a succinct yet complete understanding of recent advances in natural language processing using deep learning, with a special focus on transfer learning and its potential advantages.
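
To make the abstract's progression concrete, here is a minimal, hypothetical sketch (not taken from the paper) of the classical pipeline it describes: Bag-of-Words features over unigrams and bigrams feeding a Naive Bayes classifier, written with scikit-learn. The toy texts, labels, and test sentence are illustrative assumptions.

# Classical baseline: Bag-of-Words (with N-grams) + Naive Bayes.
# Toy data is assumed for illustration; a real task needs a labelled corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["great movie", "terrible movie", "loved it", "hated it"]
labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

# CountVectorizer builds the Bag-of-Words/N-gram features;
# MultinomialNB is the Naive Bayes classifier trained on those counts.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["loved the movie"]))  # likely [1] on this toy data

By contrast, a transfer-learning workflow starts from a pretrained language model and fine-tunes it on the target task, reusing knowledge gained during pretraining. The sketch below assumes the Hugging Face transformers library and PyTorch; the checkpoint name, toy data, learning rate, and step count are illustrative choices rather than settings prescribed by the paper.

# Transfer learning: fine-tune a pretrained BERT encoder for classification.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The pretrained encoder carries general language knowledge;
# a fresh two-class classification head is attached on top of it.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

texts = ["a delightful, well-paced film", "dull plot and wooden acting"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# A few gradient steps adapt the pretrained weights to the target task,
# typically converging far faster than training the same model from scratch.
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    print(model(**batch).logits.argmax(dim=-1))  # per-sentence class predictions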

Related articles:
arXiv:1906.12039 [cs.CL] (Published 2019-06-28)
Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks
arXiv:2011.08272 [cs.CL] (Published 2020-11-16)
NLPGym -- A toolkit for evaluating RL agents on Natural Language Processing Tasks
arXiv:2005.00870 [cs.CL] (Published 2020-05-02)
Predicting Performance for Natural Language Processing Tasks