arXiv Analytics

arXiv:1908.01817 [cs.CL]

Sparsity Emerges Naturally in Neural Language Models

Naomi Saphra, Adam Lopez

Published 2019-07-22 (Version 1)

Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally become roughly sparse? Using the Taxi-Euclidean norm to measure sparsity, we find that frequent input words are associated with concentrated or sparse activations, while frequent target words are associated with dispersed activations but concentrated gradients. We find that gradients associated with function words are more concentrated than the gradients of content words, even when controlling for word frequency.
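A minimal sketch of the sparsity measure, assuming "Taxi-Euclidean norm" refers to the ratio of a vector's taxicab (L1) norm to its Euclidean (L2) norm, a common concentration measure; the function name and example vectors below are illustrative, not taken from the paper. The ratio runs from 1 for a one-hot (maximally concentrated) vector to sqrt(d) for a uniform (maximally dispersed) vector of dimension d:

    import numpy as np

    def taxi_euclidean_ratio(x, eps=1e-12):
        """Ratio of the taxicab (L1) norm to the Euclidean (L2) norm.

        Ranges from 1 (all mass in one coordinate, concentrated/sparse)
        to sqrt(d) (mass spread uniformly, dispersed) for a d-dim vector.
        """
        x = np.asarray(x, dtype=float)
        l1 = np.abs(x).sum()
        l2 = np.sqrt((x ** 2).sum())
        return l1 / (l2 + eps)

    # A one-hot-like activation is maximally concentrated (ratio ~ 1) ...
    concentrated = np.array([0.0, 0.0, 5.0, 0.0])
    # ... while a uniform activation is maximally dispersed (ratio ~ sqrt(4) = 2).
    dispersed = np.array([1.0, 1.0, 1.0, 1.0])

    print(taxi_euclidean_ratio(concentrated))  # ~1.0
    print(taxi_euclidean_ratio(dispersed))     # ~2.0

Under this reading, lower ratios correspond to the "concentrated" activations and gradients described above, and higher ratios to "dispersed" ones.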

Comments: Published in the ICML 2019 Workshop on Identifying and Understanding Deep Learning Phenomena: https://openreview.net/forum?id=H1ets1h56E
Categories: cs.CL, cs.LG, stat.ML
Related articles:
arXiv:1901.00398 [cs.CL] (Published 2019-01-02)
Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation
arXiv:1708.00781 [cs.CL] (Published 2017-08-02)
Dynamic Entity Representations in Neural Language Models
arXiv:1811.00998 [cs.CL] (Published 2018-11-02)
Analysing Dropout and Compounding Errors in Neural Language Models