arXiv Analytics

arXiv:1611.10176 [cs.LG]

Effective Quantization Methods for Recurrent Neural Networks

Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, Yuheng Zou

Published 2016-11-30 (Version 1)

Reducing the bit-widths of the weights, activations, and gradients of a neural network can shrink its storage size and memory usage, and also allows faster training and inference by exploiting bitwise operations. However, previous attempts to quantize RNNs show considerable performance degradation when low bit-width weights and activations are used. In this paper, we propose methods to quantize the structure of gates and interlinks in LSTM and GRU cells. In addition, we propose balanced quantization methods for weights to further reduce performance degradation. Experiments on the PTB and IMDB datasets confirm the effectiveness of our methods, as the performance of our models matches or surpasses the previous state of the art for quantized RNNs.
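
For context, the sketch below shows k-bit uniform weight quantization in NumPy. It is an illustrative DoReFa-style affine mapping, not the paper's exact balanced scheme (which additionally equalizes the weight distribution before quantization); the function names and the max-absolute-value scaling are assumptions made for this example.

import numpy as np

def uniform_quantize(x, k):
    """Quantize values in [0, 1] onto 2^k evenly spaced levels."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def quantize_weights(w, k=2):
    """Illustrative k-bit weight quantization (not the paper's balanced method).

    Maps weights affinely into [0, 1], quantizes uniformly, then maps back
    to the original range [-scale, scale].
    """
    scale = np.max(np.abs(w)) + 1e-8      # per-tensor scale; assumption for this sketch
    w01 = w / (2 * scale) + 0.5           # affine map into [0, 1]
    return (2 * uniform_quantize(w01, k) - 1) * scale

# Example: 2-bit quantization snaps weights to 4 levels in [-scale, scale].
w = np.random.randn(4, 4).astype(np.float32)
print(quantize_weights(w, k=2))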

Related articles:
arXiv:1710.06319 [cs.LG] (Published 2017-10-17)
Beat by Beat: Classifying Cardiac Arrhythmias with Recurrent Neural Networks
arXiv:1804.01653 [cs.LG] (Published 2018-04-05, updated 2018-08-28)
Review of Deep Learning
arXiv:1506.03099 [cs.LG] (Published 2015-06-09)
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks