arXiv:1903.02154 [cs.LG]

A Priori Estimates of the Population Risk for Residual Networks

Weinan E, Chao Ma, Qingcan Wang

Published 2019-03-06, Version 1

Optimal a priori estimates are derived for the population risk of a regularized residual network model. The key is the design of a new path norm, called the weighted path norm, which serves as the regularization term in the regularized model. The weighted path norm treats the skip connections and the nonlinearities differently, so that paths passing through more nonlinearities receive larger weights. The error estimates are a priori in the sense that they depend only on the target function and not on the parameters obtained in the training process. The estimates are optimal in the sense that the bound scales as O(1/L) with the network depth L and the estimation error is comparable to the Monte Carlo error rates. In particular, optimal error bounds are obtained, for the first time, in terms of the depth of the network model. Comparisons are made with existing norm-based generalization error bounds.
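
The abstract does not state the definition of the weighted path norm, so the following is only an illustrative sketch of the idea that paths with more nonlinearities are weighted more heavily. It assumes a residual network of the form h_0 = V x, h_l = h_{l-1} + U_l σ(W_l h_{l-1}), f(x) = u · h_L, and it assumes each pass through a nonlinear branch multiplies a path's weight by a constant c. The function name, the argument layout, and the default value of c are all assumptions made for illustration, not taken from the text above.

```python
import numpy as np

def weighted_path_norm(V, Ws, Us, u, c=3.0):
    """Sketch of a weighted path norm for an assumed residual network.

    Assumed architecture (hypothetical, for illustration only):
        h_0 = V x,  h_l = h_{l-1} + U_l sigma(W_l h_{l-1}),  f(x) = u . h_L
    Shapes: V (D, d), each W_l (m, D), each U_l (D, m), u (D,).
    c is an assumed per-nonlinearity weight: a path that passes through
    k nonlinear branches is counted with weight c**k.
    """
    # Accumulated absolute path weights from the input to the current layer.
    acc = np.abs(V)
    for W, U in zip(Ws, Us):
        # Skip connection keeps acc as is (weight 1); the nonlinear branch
        # adds c * |U_l| |W_l| acc, so it is penalized more heavily.
        acc = acc + c * (np.abs(U) @ (np.abs(W) @ acc))
    # Sum over all input-to-output paths, weighted by |u|.
    return float(np.sum(np.abs(u) @ acc))


# Scalar example: two residual blocks with every parameter equal to 1.
one = np.ones((1, 1))
norm = weighted_path_norm(one, [one, one], [one, one], np.ones(1), c=3.0)
print(norm)
```

In this scalar example each block multiplies the accumulated weight by (1 + c), so setting c = 0 would count only the pure skip path, while larger c inflates the contribution of paths that use the nonlinear branches.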

Related articles:
arXiv:2103.16355 [cs.LG] (Published 2021-03-30)
Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks
arXiv:1803.09357 [cs.LG] (Published 2018-03-25, updated 2018-10-17)
On the Local Minima of the Empirical Risk
arXiv:2206.00846 [cs.LG] (Published 2022-06-02)
Faster Rates of Convergence to Stationary Points in Differentially Private Optimization