arXiv Analytics

arXiv:1611.05162 [cs.LG]

Net-Trim: A Layer-wise Convex Pruning of Deep Neural Networks

Alireza Aghasi, Nam Nguyen, Justin Romberg

Published 2016-11-16 (Version 1)

Model reduction is a highly desirable process for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect the prediction accuracy and model variance. Net-Trim is a layer-wise convex framework for pruning (sparsifying) deep neural networks. The method applies to neural networks that use the rectified linear unit (ReLU) as the nonlinear activation. The basic idea is to retrain the network layer by layer, keeping each layer's inputs and outputs close to those of the originally trained model while seeking a sparse transform matrix. We present both parallel and cascade versions of the algorithm. While the former can be distributed across layers, the latter is capable of achieving simpler models. In both cases, we mathematically establish consistency between the retrained model and the initially trained network. We also derive general sufficient conditions for the recovery of a sparse transform matrix. When standard Gaussian training samples of dimension $N$ are fed to a layer, and $s$ is the maximum number of nonzero terms across all columns of the transform matrix, we show that $\mathcal{O}(s\log N)$ samples are enough to accurately learn the layer model.
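As a rough illustration of the layer-wise convex step described in the abstract, the sketch below formulates one layer's pruning problem with the cvxpy modeling library. The function name `net_trim_layer`, the tolerance `eps`, and the exact constraint form are illustrative assumptions for a single ReLU layer, not the authors' released implementation.

```python
import numpy as np
import cvxpy as cp

def net_trim_layer(X, Y, eps=0.1):
    """Illustrative layer-wise convex pruning step (assumed formulation).

    X : (d_in, m)  inputs fed to the layer (m training samples)
    Y : (d_out, m) original layer outputs, Y = ReLU(W0.T @ X)
    Returns a sparse weight matrix U of shape (d_in, d_out).
    """
    d_in, _ = X.shape
    d_out = Y.shape[0]

    U = cp.Variable((d_in, d_out))   # sparse replacement for the layer weights
    Z = U.T @ X                      # pre-activations of the retrained layer

    on = (Y > 0).astype(float)       # units active in the original network
    off = 1.0 - on                   # units zeroed out by the ReLU

    constraints = [
        # stay close to the original outputs where the ReLU was active
        cp.norm(cp.multiply(on, Z - Y), 'fro') <= eps,
        # keep pre-activations (approximately) non-positive where the output was zero
        cp.multiply(off, Z) <= eps,
    ]
    # l1 objective promotes a sparse transform matrix
    prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(U))), constraints)
    prob.solve()
    return U.value
```

Roughly, the parallel variant would solve such a program independently for every layer using the original network's activations, while the cascade variant would feed the retrained output of one layer into the program for the next; see the paper for the exact formulations and guarantees.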

Related articles:
arXiv:1603.09260 [cs.LG] (Published 2016-03-30)
Degrees of Freedom in Deep Neural Networks
arXiv:1711.06104 [cs.LG] (Published 2017-11-16)
A unified view of gradient-based attribution methods for Deep Neural Networks
arXiv:1710.10570 [cs.LG] (Published 2017-10-29)
Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics