arXiv Analytics

arXiv:2010.12711 [cs.LG]

On Convergence and Generalization of Dropout Training

Poorya Mianjy, Raman Arora

Published 2020-10-23 (Version 1)

We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations. Under mild overparametrization and assuming that the limiting kernel can separate the data distribution with a positive margin, we show that dropout training with logistic loss achieves $\epsilon$-suboptimality in test error in $O(1/\epsilon)$ iterations.
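Below is a minimal sketch of the setting the abstract describes: a two-layer ReLU network trained with dropout on the hidden layer and the logistic loss. This is not the authors' code; the width, dropout rate, step size, and the toy margin-separable data are illustrative assumptions.

```python
# Sketch of dropout training for a two-layer ReLU network with logistic loss.
# All hyperparameters and the synthetic data are assumptions, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

d, m = 20, 1024           # input dimension; hidden width (overparametrized)
p_drop = 0.5              # dropout probability (assumed)
lr = 0.1                  # step size (assumed)

model = nn.Sequential(
    nn.Linear(d, m),
    nn.ReLU(),
    nn.Dropout(p=p_drop),  # dropout applied to hidden-layer activations
    nn.Linear(m, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=lr)

# Toy data that is linearly separable with a positive margin (illustrative only).
X = torch.randn(256, d)
w_star = torch.randn(d)
y = torch.sign(X @ w_star)         # labels in {-1, +1}

for step in range(500):
    opt.zero_grad()
    logits = model(X).squeeze(-1)
    # Logistic loss: log(1 + exp(-y * f(x))).
    loss = F.softplus(-y * logits).mean()
    loss.backward()
    opt.step()
```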

Journal: Advances in Neural Information Processing Systems (NeurIPS), 2020
Categories: cs.LG, stat.ML
Related articles:
arXiv:1810.00122 [cs.LG] (Published 2018-09-29)
On the Convergence and Robustness of Batch Normalization
arXiv:1811.09358 [cs.LG] (Published 2018-11-23)
A Sufficient Condition for Convergences of Adam and RMSProp
arXiv:2109.03194 [cs.LG] (Published 2021-09-07)
On the Convergence of Decentralized Adaptive Gradient Methods