arXiv:1611.00196 [cs.CL]

Recurrent Neural Network Language Model Adaptation Derived Document Vector

Wei Li, Brian Kan, Wing Mak

Published 2016-11-01, Version 1

In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the frequency-based TF-IDF feature vector is that it ignores word order, which carries syntactic and semantic relationships among the words in a document and can be important in some NLP tasks such as genre classification. This paper proposes a novel distributed vector representation of a document: a simple recurrent-neural-network language model (RNN-LM) or a long short-term memory RNN language model (LSTM-LM) is first created from all documents in a task; some of the LM parameters are then adapted to each document, and the adapted parameters are vectorized to represent the document. The new document vectors are labeled DV-RNN and DV-LSTM, respectively. We believe that our new document vectors can capture some high-level sequential information in the documents that other current document representations fail to capture. The new document vectors were evaluated on genre classification of documents in three corpora: the Brown Corpus, the BNC Baby Corpus, and an artificially created Penn Treebank dataset. Their classification performance is compared with that of the TF-IDF vector and the state-of-the-art distributed memory model of paragraph vector (PV-DM). The results show that DV-LSTM significantly outperforms TF-IDF and PV-DM in most cases, and that combining the proposed document vectors with TF-IDF or PV-DM may further improve performance.
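The word-order shortcoming of TF-IDF noted in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `tfidf_vectors`, the toy sentences, and the particular weighting (raw term count times log(N / df)) are assumptions chosen for illustration. Two documents that contain the same words in a different order receive identical vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) using raw term count times log(N / df).

    Hypothetical helper for illustration; real toolkits use other
    smoothing and normalization variants.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([tf[w] * math.log(n_docs / df[w]) for w in vocab])
    return vocab, vectors


# Toy corpus (assumed): the first two documents differ only in word order.
docs = [
    "the dog bit the man",
    "the man bit the dog",
    "a cat sat quietly",
]
vocab, vectors = tfidf_vectors(docs)

# Word order is invisible to the bag-of-words representation.
order_ignored = vectors[0] == vectors[1]
print(order_ignored)
```

Because the vector depends only on word counts, any representation meant to separate these two documents has to draw on sequential information, which is the motivation the abstract gives for DV-RNN and DV-LSTM.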

Related articles:
arXiv:1506.01192 [cs.CL] (Published 2015-06-03)
Personalizing a Universal Recurrent Neural Network Language Model with User Characteristic Features by Crowdsouring over Social Networks
arXiv:2007.11794 [cs.CL] (Published 2020-07-23)
Applying GPGPU to Recurrent Neural Network Language Model based Fast Network Search in the Real-Time LVCSR
arXiv:1904.04163 [cs.CL] (Published 2019-04-08)
Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization