arXiv:1509.08745 [cs.LG]AbstractReferencesReviewsResources
Compression of Deep Neural Networks on the Fly
Guillaume Soulié, Vincent Gripon, Maëlys Robert
Published 2015-09-29Version 1
Because of their performance, deep neural networks are increasingly used for object recognition. They are particularly attractive because of their ability to 'absorb' great quantities of labeled data through millions of parameters. However, as the accuracy and the model sizes increase, so does the memory requirements of the classifiers. This prohibits their usage on resource limited hardware, including cell phones or other embedded devices. We introduce a novel compression method for deep neural networks that performs during the learning phase. It consists in adding an extra regularization term to the cost function of fully-connected layers. We combine this method with Product Quantization (PQ) of the trained weights for higher savings in memory and storage consumption. We evaluate our method on two data sets (MNIST and CIFAR10), on which we achieve significantly larger compression than state-of-the-art methods.