arXiv:2010.12711 [cs.LG]
On Convergence and Generalization of Dropout Training
Published 2020-10-23 (Version 1)
We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations. Under mild overparametrization and assuming that the limiting kernel can separate the data distribution with a positive margin, we show that dropout training with logistic loss achieves $\epsilon$-suboptimality in test error in $O(1/\epsilon)$ iterations.
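The setting described in the abstract (a two-layer ReLU network with dropout on the hidden layer, trained under the logistic loss) can be illustrated with the minimal sketch below. This is not the paper's algorithm or analysis; the width, dropout rate, learning rate, and toy data are illustrative assumptions only.

```python
# Minimal sketch (assumptions, not the paper's method): dropout training of a
# two-layer ReLU network with logistic loss on synthetic, separable data.
import torch
import torch.nn as nn

torch.manual_seed(0)

d, m, n = 20, 1024, 512           # input dim, hidden width (overparametrized), sample count
X = torch.randn(n, d)
y = (X[:, 0] > 0).float()         # toy labels; the paper assumes margin-separable data

model = nn.Sequential(
    nn.Linear(d, m),
    nn.ReLU(),
    nn.Dropout(p=0.5),            # dropout applied to the hidden layer
    nn.Linear(m, 1),
)
loss_fn = nn.BCEWithLogitsLoss()  # logistic loss on the network output
opt = torch.optim.SGD(model.parameters(), lr=0.1)

model.train()                     # dropout is active during training
for step in range(200):
    opt.zero_grad()
    logits = model(X).squeeze(1)
    loss = loss_fn(logits, y)
    loss.backward()
    opt.step()

model.eval()                      # dropout is disabled at evaluation time
with torch.no_grad():
    acc = ((model(X).squeeze(1) > 0).float() == y).float().mean()
print(f"final train loss {loss.item():.3f}, accuracy {acc.item():.3f}")
```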
Journal: Advances in Neural Information Processing Systems (NeurIPS), 2020
Keywords: dropout training, convergence, generalization, two-layer neural networks, logistic loss
Tags: journal article
Related articles:
arXiv:1810.00122 [cs.LG] (Published 2018-09-29)
On the Convergence and Robustness of Batch Normalization
arXiv:1811.09358 [cs.LG] (Published 2018-11-23)
A Sufficient Condition for Convergences of Adam and RMSProp
arXiv:2109.03194 [cs.LG] (Published 2021-09-07)
On the Convergence of Decentralized Adaptive Gradient Methods