arXiv Analytics

arXiv:1806.01248 [cs.LG]

Dynamically Hierarchy Revolution: DirNet for Compressing Recurrent Neural Network on Mobile Devices

Jie Zhang, Xiaolong Wang, Dawei Li, Yalin Wang

Published 2018-06-04, Version 1

Recurrent neural networks (RNNs) achieve cutting-edge performance on a variety of problems. However, their high computational and memory demands make deploying RNNs on resource-constrained mobile devices challenging. To guarantee minimal accuracy loss at a higher compression rate, and driven by mobile resource requirements, we introduce DirNet, a novel model compression approach based on an optimized fast dictionary learning algorithm, which 1) dynamically mines the dictionary atoms of the projection dictionary matrix within each layer to adjust the compression rate, and 2) adaptively changes the sparsity of the sparse codes across the hierarchical layers. Experimental results on a language model and an ASR model trained on a 1000-hour speech dataset demonstrate that our method significantly outperforms prior approaches. Evaluated on off-the-shelf mobile devices, we reduce the size of the original model by a factor of eight with real-time model inference and negligible accuracy loss.
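The core idea, factorizing a dense weight matrix into a small dictionary and sparse codes, can be sketched as below. This is a minimal illustration, not the paper's algorithm: the matrix sizes, the number of atoms `k`, the per-column `sparsity`, and the alternating least-squares loop are all illustrative stand-ins for DirNet's optimized fast dictionary learning, chosen here so that the parameter count drops by roughly the 8x reported in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical RNN weight matrix (sizes are illustrative, not from the paper).
m, n = 64, 256          # dense weight matrix is m x n
k = 16                  # number of dictionary atoms (controls compression rate)
W = rng.standard_normal((m, n))

# Factorize W ~= D @ S with dictionary D (m x k) and sparse codes S (k x n).
# A few alternating least-squares steps with hard thresholding stand in for
# the paper's optimized dictionary-learning algorithm.
D = rng.standard_normal((m, k))
S = np.zeros((k, n))
sparsity = 4            # nonzeros kept per code column (layer-adaptive in DirNet)

for _ in range(10):
    # Code update: least squares, then keep the largest entries per column.
    S, *_ = np.linalg.lstsq(D, W, rcond=None)
    smallest = np.argsort(-np.abs(S), axis=0)[sparsity:]
    np.put_along_axis(S, smallest, 0.0, axis=0)
    # Dictionary update: least squares on the transposed problem.
    D = np.linalg.lstsq(S.T, W.T, rcond=None)[0].T

dense_params = W.size
compressed_params = D.size + sparsity * n   # stored values; indices ignored here
print(f"dense: {dense_params}, compressed: {compressed_params}")
print(f"relative error: {np.linalg.norm(W - D @ S) / np.linalg.norm(W):.3f}")
```

With these illustrative sizes the compressed representation stores 2048 values against 16384 dense weights, an 8x reduction; shrinking `k` or `sparsity` trades more compression for higher reconstruction error, which is the knob the paper adjusts per layer.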

Related articles:
arXiv:1902.05690 [cs.LG] (Published 2019-02-15)
AutoQB: AutoML for Network Quantization and Binarization on Mobile Devices
arXiv:2011.04232 [cs.LG] (Published 2020-11-09)
SplitEasy: A Practical Approach for Training ML models on Mobile Devices in a split second
arXiv:2102.06336 [cs.LG] (Published 2021-02-12)
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Yuhong Song et al.