arXiv:2205.15549 [stat.ML]

VC Theoretical Explanation of Double Descent

Eng Hock Lee, Vladimir Cherkassky

Published 2022-05-31 (Version 1)

There has been growing interest in the generalization performance of large multilayer neural networks that can be trained to achieve zero training error while still generalizing well on test data. This regime is known as 'second descent', and it appears to contradict the conventional view that optimal model complexity should reflect an optimal balance between underfitting and overfitting, i.e., the bias-variance trade-off. This paper presents a VC-theoretical analysis of double descent and shows that it can be fully explained by classical VC generalization bounds. We illustrate an application of analytic VC-bounds to modeling double descent for classification problems, using empirical results for several learning methods, such as SVM, Least Squares, and Multilayer Perceptron classifiers. In addition, we discuss several possible reasons for the misinterpretation of VC-theoretical results in the machine learning community.
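
For context, one commonly quoted form of the classical VC generalization bound referenced in the abstract is sketched below. The precise analytic bounds used in the paper may differ; the notation (training-set size n, VC dimension h, confidence level 1 - \eta) is standard rather than taken from the paper.

\[
R(\alpha) \;\le\; R_{\mathrm{emp}}(\alpha) \;+\; \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) \;-\; \ln\frac{\eta}{4}}{n}}
\]

Here \(R(\alpha)\) is the expected (test) error of a classifier with parameters \(\alpha\), \(R_{\mathrm{emp}}(\alpha)\) is its empirical (training) error, and the bound holds with probability at least \(1 - \eta\). In the zero-training-error regime the first term vanishes, so the bound is governed entirely by the capacity term, which depends on the VC dimension h rather than on the raw number of model parameters.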

Related articles:
arXiv:1911.05822 [stat.ML] (Published 2019-11-13)
A Model of Double Descent for High-dimensional Binary Linear Classification
arXiv:2010.02681 [stat.ML] (Published 2020-10-06)
Kernel regression in high dimension: Refined analysis beyond double descent
arXiv:2110.06910 [stat.ML] (Published 2021-10-13, updated 2022-05-28)
On the Double Descent of Random Features Models Trained with SGD