
arXiv:2009.08576 [cs.LG]

Pruning Neural Networks at Initialization: Why are We Missing the Mark?

Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

Published 2020-09-18 (Version 1)

Recent work has explored the possibility of pruning neural networks at initialization. We assess proposals for doing so: SNIP (Lee et al., 2019), GraSP (Wang et al., 2020), SynFlow (Tanaka et al., 2020), and magnitude pruning. Although these methods surpass the trivial baseline of random pruning, they remain below the accuracy of magnitude pruning after training, and we endeavor to understand why. We show that, unlike pruning after training, these methods achieve the same or higher accuracy when we randomly shuffle which weights they prune within each layer or sample new initial values. As such, the per-weight pruning decisions made by these methods can be replaced by a per-layer choice of the fraction of weights to prune. This property undermines the claimed justifications for these methods and suggests broader challenges with the underlying pruning heuristics, the desire to prune at initialization, or both.
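To make the shuffling ablation concrete, the sketch below (not the authors' released code) computes SNIP-style saliency scores |w · ∂L/∂w| on a single minibatch at initialization, builds a pruning mask by keeping the highest-scoring weights globally (one common convention), and then randomly shuffles each layer's mask while preserving that layer's pruning fraction. The model, loss function, and minibatch handles are assumed placeholders.

```python
# Illustrative sketch of the paper's within-layer shuffling ablation,
# assuming a PyTorch model, loss function, and one minibatch are available.
import torch
import torch.nn as nn


def snip_scores(model: nn.Module, loss_fn, inputs, targets):
    """SNIP-style scores |w * dL/dw| from a single minibatch at initialization."""
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return {
        name: (param * param.grad).abs()
        for name, param in model.named_parameters()
        if param.grad is not None and param.dim() > 1  # weight tensors only
    }


def global_masks(scores, sparsity: float):
    """Keep the top (1 - sparsity) fraction of weights by score across all layers.
    Ties at the threshold may keep slightly more weights; fine for a sketch."""
    all_scores = torch.cat([s.flatten() for s in scores.values()])
    k = int((1.0 - sparsity) * all_scores.numel())
    threshold = torch.topk(all_scores, k, largest=True).values.min()
    return {name: (s >= threshold).float() for name, s in scores.items()}


def shuffle_within_layers(masks, generator=None):
    """The ablation: keep each layer's pruning fraction, but randomly reassign
    which individual weights within that layer are pruned."""
    shuffled = {}
    for name, mask in masks.items():
        flat = mask.flatten()
        perm = torch.randperm(flat.numel(), generator=generator)
        shuffled[name] = flat[perm].reshape(mask.shape)
    return shuffled
```

Because the shuffle only permutes mask entries inside each layer, the per-layer sparsities induced by the original scores are preserved exactly; the abstract's claim is that this coarser per-layer information is what these methods effectively contribute, since accuracy does not drop under the shuffle.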

Related articles:
arXiv:2012.08749 [cs.LG] (Published 2020-12-16)
Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks
arXiv:2209.08554 [cs.LG] (Published 2022-09-18)
Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions
arXiv:2209.07263 [cs.LG] (Published 2022-09-15)
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)