arXiv:1603.06861 [stat.ML]

Trading-off variance and complexity in stochastic gradient descent

Vatsal Shah, Megasthenis Asteris, Anastasios Kyrillidis, Sujay Sanghavi

Published 2016-03-22 (Version 1)

Stochastic gradient descent is the method of choice for large-scale machine learning problems, by virtue of its light per-iteration complexity. However, it lags behind its non-stochastic counterparts in convergence rate, due to the high variance introduced by the stochastic updates. The popular Stochastic Variance-Reduced Gradient (SVRG) method mitigates this shortcoming by introducing a new update rule that requires infrequent passes over the entire input dataset to compute the full gradient. In this work, we propose CheapSVRG, a stochastic variance-reduction optimization scheme. Our algorithm is similar to SVRG, but instead of the full gradient it uses a surrogate that can be efficiently computed on a small subset of the input data. It achieves a linear convergence rate, up to some error level that depends on the nature of the optimization problem, and features a trade-off between computational complexity and convergence rate. Empirical evaluation shows that CheapSVRG performs at least competitively with the state of the art.
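
To make the idea concrete, here is a minimal sketch of an SVRG-style loop in which the full-gradient anchor is replaced by a surrogate averaged over a small random subset, as the abstract describes. It assumes a simple least-squares objective; the function name cheap_svrg, the parameter names, and all defaults are illustrative choices of ours, not details taken from the paper.

```python
import numpy as np

def cheap_svrg(A, b, x0, epochs=20, inner_iters=None, subset_size=None,
               step=0.01, seed=0):
    """Sketch of a CheapSVRG-style scheme for 0.5 * ||A x - b||^2 / n."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = inner_iters or n                 # inner-loop length, as in SVRG
    s = subset_size or max(1, n // 10)   # size of the surrogate-gradient batch
    x_tilde = x0.copy()

    def grad_i(x, i):
        # Gradient of the i-th example's loss.
        return (A[i] @ x - b[i]) * A[i]

    for _ in range(epochs):
        # Surrogate "full" gradient: averaged over a random subset of size s
        # instead of a full pass over all n examples (the SVRG anchor point).
        S = rng.choice(n, size=s, replace=False)
        g_tilde = A[S].T @ (A[S] @ x_tilde - b[S]) / s

        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # SVRG-style variance-reduced step around the surrogate anchor.
            v = grad_i(x, i) - grad_i(x_tilde, i) + g_tilde
            x -= step * v
        x_tilde = x                      # new anchor for the next epoch
    return x_tilde
```

The subset size s controls the trade-off the abstract refers to: a larger subset makes the surrogate closer to the true full gradient (lower residual error) at a higher per-epoch cost, while a smaller subset is cheaper but leaves more variance in the anchor.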

Related articles:
arXiv:2105.01650 [stat.ML] (Published 2021-05-04)
Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis
arXiv:1806.05438 [stat.ML] (Published 2018-06-14)
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
arXiv:1805.07960 [stat.ML] (Published 2018-05-21)
Stochastic Gradient Descent for Stochastic Doubly-Nonconvex Composite Optimization