arXiv Analytics

arXiv:1908.01817 [cs.CL]

Sparsity Emerges Naturally in Neural Language Models

Naomi Saphra, Adam Lopez

Published 2019-07-22 (Version 1)

Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally become roughly sparse? Using the Taxi-Euclidean norm to measure sparsity, we find that frequent input words are associated with concentrated or sparse activations, while frequent target words are associated with dispersed activations but concentrated gradients. We find that gradients associated with function words are more concentrated than the gradients of content words, even when controlling for word frequency.
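A minimal sketch of the sparsity measure, assuming "Taxi-Euclidean norm" refers to the ratio of a vector's taxicab (L1) norm to its Euclidean (L2) norm, a common concentration measure; the function name and example vectors below are illustrative, not taken from the paper. The ratio runs from 1 for a one-hot (maximally concentrated) vector to sqrt(d) for a uniform (maximally dispersed) vector of dimension d:

    import numpy as np

    def taxi_euclidean_ratio(x, eps=1e-12):
        """Ratio of the taxicab (L1) norm to the Euclidean (L2) norm.

        Ranges from 1 (all mass in one coordinate, concentrated/sparse)
        to sqrt(d) (mass spread uniformly, dispersed) for a d-dim vector.
        """
        x = np.asarray(x, dtype=float)
        l1 = np.abs(x).sum()
        l2 = np.sqrt((x ** 2).sum())
        return l1 / (l2 + eps)

    # A one-hot-like activation is maximally concentrated (ratio ~ 1) ...
    concentrated = np.array([0.0, 0.0, 5.0, 0.0])
    # ... while a uniform activation is maximally dispersed (ratio ~ sqrt(4) = 2).
    dispersed = np.array([1.0, 1.0, 1.0, 1.0])

    print(taxi_euclidean_ratio(concentrated))  # ~1.0
    print(taxi_euclidean_ratio(dispersed))     # ~2.0

Under this reading, lower ratios correspond to the "concentrated" activations and gradients described above, and higher ratios to "dispersed" ones.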

Comments: Published in the ICML 2019 Workshop on Identifying and Understanding Deep Learning Phenomena: https://openreview.net/forum?id=H1ets1h56E
Categories: cs.CL, cs.LG, stat.ML
Related articles:
arXiv:1901.00398 [cs.CL] (Published 2019-01-02)
Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation
arXiv:1708.00781 [cs.CL] (Published 2017-08-02)
Dynamic Entity Representations in Neural Language Models
arXiv:1811.00998 [cs.CL] (Published 2018-11-02)
Analysing Dropout and Compounding Errors in Neural Language Models