arXiv:1611.00196 [cs.CL]

Recurrent Neural Network Language Model Adaptation Derived Document Vector

Published 2016-11-01, Version 1

In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the frequency-based TF-IDF feature vector is that it ignores word order, which carries syntactic and semantic relationships among the words in a document and can be important in some NLP tasks such as genre classification. This paper proposes a novel distributed vector representation of a document: a simple recurrent-neural-network language model (RNN-LM) or a long short-term memory RNN language model (LSTM-LM) is first trained on all documents in a task; some of the LM parameters are then adapted to each document, and the adapted parameters are vectorized to represent the document. The new document vectors are labeled DV-RNN and DV-LSTM, respectively. We believe that our new document vectors can capture some high-level sequential information in the documents, which other current document representations fail to capture. The new document vectors were evaluated on genre classification of documents in three corpora: the Brown Corpus, the BNC Baby Corpus, and an artificially created Penn Treebank dataset. Their classification performance is compared with that of the TF-IDF vector and the state-of-the-art distributed memory model of paragraph vector (PV-DM). The results show that DV-LSTM significantly outperforms TF-IDF and PV-DM in most cases, and that combining the proposed document vectors with TF-IDF or PV-DM may further improve performance.
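The word-order shortcoming of TF-IDF noted in the abstract can be illustrated with a minimal sketch. This is not code from the paper: the function name `tfidf_vectors`, the toy sentences, and the particular IDF variant (`log(N / df) + 1`) are all assumptions made for illustration. It shows only that two documents containing the same words in a different order receive identical TF-IDF vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) for a list of tokenized documents.

    Hypothetical helper for illustration only. Uses relative term
    frequency and the IDF variant log(N / df) + 1, which is one of
    several common definitions (an assumption, not the paper's choice).
    """
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([
            (counts[word] / len(doc)) * (math.log(n_docs / df[word]) + 1.0)
            for word in vocab
        ])
    return vocab, vectors


# Toy corpus: the first two documents differ only in word order.
docs = [
    "the dog bit the man".split(),
    "the man bit the dog".split(),
    "stocks fell sharply".split(),
]
vocab, vecs = tfidf_vectors(docs)

# Same multiset of words, so the TF-IDF vectors coincide.
print(vecs[0] == vecs[1])
# A document with different words gets a different vector.
print(vecs[0] == vecs[2])
```

The first two sentences describe opposite events, yet a bag-of-words representation cannot tell them apart. A representation derived from a sequence model, as the abstract proposes, is at least able in principle to distinguish them.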

Categories: cs.CL

Keywords: recurrent neural network language model, neural network language model adaptation, model adaptation derived document vector, language model adaptation derived document

Related articles:

arXiv:1506.01192 [cs.CL] (Published 2015-06-03)

Personalizing a Universal Recurrent Neural Network Language Model with User Characteristic Features by Crowdsouring over Social Networks

Bo-Hsiang Tseng, Hung-Yi Lee, Lin-Shan Lee

arXiv:2007.11794 [cs.CL] (Published 2020-07-23)

Applying GPGPU to Recurrent Neural Network Language Model based Fast Network Search in the Real-Time LVCSR

Kyungmin Lee, Chiyoun Park, Ilhwan Kim, Namhoon Kim, Jaewon Lee

arXiv:1904.04163 [cs.CL] (Published 2019-04-08)

Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization

Yangyang Shi, Mei-Yuh Hwang, Xin Lei, Haoyu Sheng