arXiv Analytics

arXiv:1611.05162 [cs.LG]

Net-Trim: A Layer-wise Convex Pruning of Deep Neural Networks

Alireza Aghasi, Nam Nguyen, Justin Romberg

Published 2016-11-16 (Version 1)

Model reduction is a highly desirable process for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect the prediction accuracy and model variance. Net-Trim is a layer-wise convex framework for pruning (sparsifying) deep neural networks. The method applies to neural networks that use the rectified linear unit (ReLU) as the nonlinear activation. The basic idea is to retrain the network layer by layer, keeping each layer's inputs and outputs close to those of the originally trained model while seeking a sparse transform matrix. We present both parallel and cascade versions of the algorithm. While the former can be distributed across layers, the latter is capable of achieving simpler models. In both cases, we mathematically establish consistency between the retrained model and the initially trained network. We also derive general sufficient conditions for the recovery of a sparse transform matrix. When standard Gaussian training samples of dimension $N$ are fed to a layer, and $s$ is the maximum number of nonzero terms across all columns of the transform matrix, we show that $\mathcal{O}(s\log N)$ samples are enough to accurately learn the layer model.
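As a rough illustration of the layer-wise convex step described in the abstract, the sketch below formulates one layer's pruning problem with the cvxpy modeling library. The function name `net_trim_layer`, the tolerance `eps`, and the exact constraint form are illustrative assumptions for a single ReLU layer, not the authors' released implementation.

```python
import numpy as np
import cvxpy as cp

def net_trim_layer(X, Y, eps=0.1):
    """Illustrative layer-wise convex pruning step (assumed formulation).

    X : (d_in, m)  inputs fed to the layer (m training samples)
    Y : (d_out, m) original layer outputs, Y = ReLU(W0.T @ X)
    Returns a sparse weight matrix U of shape (d_in, d_out).
    """
    d_in, _ = X.shape
    d_out = Y.shape[0]

    U = cp.Variable((d_in, d_out))   # sparse replacement for the layer weights
    Z = U.T @ X                      # pre-activations of the retrained layer

    on = (Y > 0).astype(float)       # units active in the original network
    off = 1.0 - on                   # units zeroed out by the ReLU

    constraints = [
        # stay close to the original outputs where the ReLU was active
        cp.norm(cp.multiply(on, Z - Y), 'fro') <= eps,
        # keep pre-activations (approximately) non-positive where the output was zero
        cp.multiply(off, Z) <= eps,
    ]
    # l1 objective promotes a sparse transform matrix
    prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(U))), constraints)
    prob.solve()
    return U.value
```

Roughly, the parallel variant would solve such a program independently for every layer using the original network's activations, while the cascade variant would feed the retrained output of one layer into the program for the next; see the paper for the exact formulations and guarantees.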

Related articles:
arXiv:1603.09260 [cs.LG] (Published 2016-03-30)
Degrees of Freedom in Deep Neural Networks
arXiv:1711.06104 [cs.LG] (Published 2017-11-16)
A unified view of gradient-based attribution methods for Deep Neural Networks
arXiv:1710.10570 [cs.LG] (Published 2017-10-29)
Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics