arXiv:1810.08591 [cs.LG]

A Modern Take on the Bias-Variance Tradeoff in Neural Networks

Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas

Published 2018-10-19 (Version 1)

We revisit the bias-variance tradeoff for neural networks in light of modern empirical findings. The traditional bias-variance tradeoff in machine learning suggests that as model complexity grows, variance increases. Classical bounds in statistical learning theory point to the number of parameters in a model as a measure of model complexity, so the tradeoff would predict that variance increases with the size of neural networks. However, we find empirically that variance due to training-set sampling is roughly constant (with both width and depth) in practice. Variance caused by the non-convexity of the loss landscape behaves differently: in our setting, it decreases with width and increases with depth. We provide a theoretical analysis, in a simplified setting inspired by linear models, that is consistent with our empirical findings for width. We view bias-variance as a useful lens through which to study generalization, and encourage further theoretical explanation from this perspective.
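The two variance sources the abstract distinguishes can be estimated empirically by holding one source of randomness fixed while varying the other: resample the training set with a fixed initialization seed to measure sampling variance, and vary the seed on a fixed training set to measure variance from the non-convex optimization. A minimal sketch of that measurement protocol (the toy task, network, and hyperparameters are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

def make_data(rng, n=200):
    # Hypothetical toy regression task: y = sin(3x) + noise.
    x = rng.uniform(-1, 1, size=(n, 1))
    y = np.sin(3 * x) + 0.1 * rng.standard_normal((n, 1))
    return x, y

def train_net(x, y, width, seed, steps=500, lr=0.1):
    # One-hidden-layer ReLU net trained with full-batch gradient descent on MSE.
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((1, width))
    b1 = np.zeros(width)
    w2 = rng.standard_normal((width, 1)) / np.sqrt(width)
    for _ in range(steps):
        h = np.maximum(x @ w1 + b1, 0.0)       # hidden activations
        pred = h @ w2
        g = 2 * (pred - y) / len(x)            # dL/dpred for mean squared error
        gw2 = h.T @ g                          # backprop through output layer
        gh = (g @ w2.T) * (h > 0)              # backprop through ReLU
        gw1 = x.T @ gh
        gb1 = gh.sum(axis=0)
        w1 -= lr * gw1
        b1 -= lr * gb1
        w2 -= lr * gw2
    return lambda xq: np.maximum(xq @ w1 + b1, 0.0) @ w2

x_test = np.linspace(-1, 1, 50).reshape(-1, 1)

# Variance from training-set sampling: fixed init seed, resampled data.
preds = np.stack([
    train_net(*make_data(np.random.default_rng(i)), width=32, seed=0)(x_test)
    for i in range(10)
])
var_sampling = preds.var(axis=0).mean()

# Variance from initialization (non-convexity): fixed data, varied seed.
x, y = make_data(np.random.default_rng(0))
preds = np.stack([
    train_net(x, y, width=32, seed=i)(x_test)
    for i in range(10)
])
var_init = preds.var(axis=0).mean()

print(var_sampling, var_init)
```

Repeating the second measurement across several widths and depths is what would probe the abstract's claim that initialization variance shrinks with width and grows with depth; the constants above are small so the sketch runs in seconds.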
