
arXiv:1506.06840 [cs.LG]

On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants

Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola

Published 2015-06-23, Version 1

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through the development of algorithms like SAG, SVRG, and SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms, a crucial requirement for modern large-scale applications, have not been studied. We bridge this gap by presenting a unifying framework for many variance reduction techniques. Subsequently, we propose an asynchronous algorithm grounded in our framework, and prove its fast convergence. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG and SAGA as a byproduct. Our method achieves near-linear speedup in sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.
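For readers unfamiliar with the variance reduction idea the abstract refers to, the following is a minimal sketch of the standard serial (synchronous) SVRG update on a small least-squares problem. It is not the asynchronous algorithm or the unifying framework proposed in the paper, which the abstract does not spell out. The function names (`grad_i`, `full_grad`, `svrg`), the step size, the epoch counts, and the toy data are all assumptions chosen for illustration.

```python
import random

def grad_i(w, x, y):
    # Gradient of the per-example loss 0.5 * (w.x - y)^2 with respect to w.
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj for xj in x]

def full_grad(w, data):
    # Average of the per-example gradients over the whole dataset.
    n = len(data)
    g = [0.0] * len(w)
    for x, y in data:
        g = [a + b / n for a, b in zip(g, grad_i(w, x, y))]
    return g

def svrg(data, dim, step=0.05, epochs=30, inner_steps=None, seed=0):
    # Hypothetical defaults; tune step and epochs for real problems.
    rng = random.Random(seed)
    inner_steps = inner_steps or 2 * len(data)
    w = [0.0] * dim
    for _ in range(epochs):
        snapshot = list(w)               # reference point for this epoch
        mu = full_grad(snapshot, data)   # full gradient at the snapshot
        for _ in range(inner_steps):
            x, y = data[rng.randrange(len(data))]
            g_cur = grad_i(w, x, y)
            g_snap = grad_i(snapshot, x, y)
            # Variance-reduced direction: unbiased for the full gradient at w,
            # with variance that shrinks as w and the snapshot approach the optimum.
            w = [wj - step * (gc - gs + m)
                 for wj, gc, gs, m in zip(w, g_cur, g_snap, mu)]
    return w

# Toy noiseless data generated from the weights [2, -1] (illustrative only).
toy_data = [([1.0, t / 10.0], 2.0 - t / 10.0) for t in range(-10, 11)]
w_hat = svrg(toy_data, dim=2)
```

The key difference from plain SGD is the correction term `- g_snap + mu`: it leaves the expected update unchanged while cancelling much of the sampling noise, which is what allows a constant step size.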

Categories: cs.LG, stat.ML

Keywords: stochastic gradient descent, asynchronous variants, variance reduction algorithms, variance reduction techniques

Related articles:

arXiv:1802.08009 [cs.LG] (Published 2018-02-22)

Iterate averaging as regularization for stochastic gradient descent

Gergely Neu, Lorenzo Rosasco

arXiv:1808.01204 [cs.LG] (Published 2018-08-03)

Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data

Yuanzhi Li, Yingyu Liang

arXiv:1809.04564 [cs.LG] (Published 2018-09-12)

On the Stability and Convergence of Stochastic Gradient Descent with Momentum

Ali Ramezani-Kebrya, Ashish Khisti, Ben Liang