arXiv:2005.10815 [cs.LG]

Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective

Stephan Wojtowytsch, Weinan E

Published 2020-05-21, Version 1

We prove that the gradient descent training of a two-layer neural network on empirical or population risk may not decrease population risk at an order faster than $t^{-4/(d-2)}$ under mean field scaling. Thus gradient descent training for fitting reasonably smooth, but truly high-dimensional data may be subject to the curse of dimensionality. We present numerical evidence that gradient descent training with general Lipschitz target functions becomes slower and slower as the dimension increases, but converges at approximately the same rate in all dimensions when the target function lies in the natural function space for two-layer ReLU networks.
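
To get a feel for how restrictive the rate $t^{-4/(d-2)}$ is, the short Python sketch below does the arithmetic for a few dimensions. It is only an illustration: it assumes the risk decays exactly like $t^{-4/(d-2)}$ with no constants (the abstract states only that the decay may not be faster than this), and the function names `rate_exponent` and `time_factor_to_halve` are hypothetical helpers, not taken from the paper.

```python
def rate_exponent(d: int) -> float:
    """Exponent 4/(d-2) in the rate t^{-4/(d-2)} quoted in the abstract."""
    if d <= 2:
        raise ValueError("the rate t^{-4/(d-2)} is only meaningful for d > 2")
    return 4.0 / (d - 2)


def time_factor_to_halve(d: int) -> float:
    """Factor by which t must grow so that t^{-4/(d-2)} drops by one half.

    Solving (c * t)^{-4/(d-2)} = 0.5 * t^{-4/(d-2)} for c gives
    c = 2^{(d-2)/4}.
    """
    if d <= 2:
        raise ValueError("the rate t^{-4/(d-2)} is only meaningful for d > 2")
    return 2.0 ** ((d - 2) / 4.0)


if __name__ == "__main__":
    # Assumed example dimensions, chosen only for illustration.
    for d in (3, 6, 10, 42, 102):
        print(
            f"d = {d:3d}: exponent = {rate_exponent(d):.3f}, "
            f"time factor to halve = {time_factor_to_halve(d):.3g}"
        )
```

Under this idealized assumption, halving the risk requires doubling the training time when d = 6, but multiplying it by 2^{10} = 1024 when d = 42, which is the sense in which such a rate reflects the curse of dimensionality.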

Related articles:
arXiv:1903.02154 [cs.LG] (Published 2019-03-06)
A Priori Estimates of the Population Risk for Residual Networks
arXiv:1301.2269 [cs.LG] (Published 2013-01-10)
Learning the Dimensionality of Hidden Variables
arXiv:1701.00831 [cs.LG] (Published 2017-01-03)
Collapsing of dimensionality