arXiv:2407.07670 [stat.ML]
Stochastic Gradient Descent for Two-layer Neural Networks
Dinghao Cao, Zheng-Chu Guo, Lei Shi
Published 2024-07-10 (Version 1)
This paper presents a comprehensive study of the convergence rates of the stochastic gradient descent (SGD) algorithm applied to overparameterized two-layer neural networks. Our approach combines the Neural Tangent Kernel (NTK) approximation with a convergence analysis in the Reproducing Kernel Hilbert Space (RKHS) generated by the NTK, providing a detailed account of the convergence behavior of SGD in this setting. This framework lets us explore the intricate interplay between kernel methods and optimization, shedding light on the optimization dynamics and convergence properties of neural networks. In this study, we establish sharp convergence rates for the last iterate of the SGD algorithm in overparameterized two-layer neural networks. In addition, we substantially relax the requirement on the number of neurons, reducing it from exponential to polynomial dependence on the sample size or the number of iterations. This improvement allows more flexibility in the design and scaling of neural networks and deepens our theoretical understanding of neural network models trained with SGD.
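To make the setting concrete, the sketch below runs plain one-pass SGD on an overparameterized two-layer ReLU network in an NTK-style parameterization with 1/sqrt(m) output scaling, returning the last iterate. This is a minimal illustration only, not the paper's code: the synthetic data, fixed output signs, step size, width, and all variable names are assumptions made for the example.

```python
# Minimal sketch (illustrative assumptions, not the paper's method):
# last-iterate SGD on an overparameterized two-layer ReLU network,
# f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x), with only the
# hidden weights W trained and the output signs a_r held fixed.
import numpy as np

rng = np.random.default_rng(0)

d, m, n, T = 5, 2048, 500, 2000          # input dim, width, sample size, iterations (all illustrative)
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # synthetic regression data

W = rng.standard_normal((m, d))           # hidden weights (trained)
a = rng.choice([-1.0, 1.0], size=m)       # fixed random output signs, common in NTK analyses

def predict(x, W):
    """Two-layer ReLU network with 1/sqrt(m) output scaling."""
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(m)

eta = 0.1                                 # constant step size (a simplifying assumption)
for t in range(T):
    i = rng.integers(n)                   # draw one sample per step (online / one-pass SGD)
    x_i, y_i = X[i], y[i]
    pred = predict(x_i, W)
    # Gradient of the squared loss w.r.t. W; relu'(z) = 1{z > 0}
    grad = ((pred - y_i) / np.sqrt(m)) * (a * (W @ x_i > 0.0))[:, None] * x_i[None, :]
    W -= eta * grad                       # plain SGD update; the last iterate W_T is the output

mse = np.mean([(predict(x, W) - yy) ** 2 for x, yy in zip(X, y)])
print(f"training MSE of the last iterate: {mse:.4f}")
```

In the overparameterized regime the trained weights stay close to their random initialization, which is what makes the NTK/RKHS analysis of the last SGD iterate tractable; the sketch only illustrates the training loop, not that analysis.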