arXiv:2205.15549 [stat.ML]

VC Theoretical Explanation of Double Descent

Eng Hock Lee, Vladimir Cherkassky

Published 2022-05-31 (Version 1)

There has been growing interest in the generalization performance of large multilayer neural networks that can be trained to achieve zero training error while still generalizing well on test data. This regime is known as 'second descent', and it appears to contradict the conventional view that optimal model complexity should reflect an optimal balance between underfitting and overfitting, i.e., the bias-variance trade-off. This paper presents a VC-theoretical analysis of double descent and shows that it can be fully explained by classical VC generalization bounds. We illustrate an application of analytic VC-bounds to modeling double descent for classification problems, using empirical results for several learning methods, such as SVM, Least Squares, and Multilayer Perceptron classifiers. In addition, we discuss several possible reasons for the misinterpretation of VC-theoretical results in the machine learning community.
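
For context, one commonly quoted form of the classical VC generalization bound referenced in the abstract is sketched below. The precise analytic bounds used in the paper may differ; the notation (training-set size n, VC dimension h, confidence level 1 - \eta) is standard rather than taken from the paper.

\[
R(\alpha) \;\le\; R_{\mathrm{emp}}(\alpha) \;+\; \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) \;-\; \ln\frac{\eta}{4}}{n}}
\]

Here \(R(\alpha)\) is the expected (test) error of a classifier with parameters \(\alpha\), \(R_{\mathrm{emp}}(\alpha)\) is its empirical (training) error, and the bound holds with probability at least \(1 - \eta\). In the zero-training-error regime the first term vanishes, so the bound is governed entirely by the capacity term, which depends on the VC dimension h rather than on the raw number of model parameters.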

Related articles:
arXiv:1911.05822 [stat.ML] (Published 2019-11-13)
A Model of Double Descent for High-dimensional Binary Linear Classification
arXiv:2010.02681 [stat.ML] (Published 2020-10-06)
Kernel regression in high dimension: Refined analysis beyond double descent
arXiv:2110.06910 [stat.ML] (Published 2021-10-13, updated 2022-05-28)
On the Double Descent of Random Features Models Trained with SGD