arXiv:2007.00878 [cs.LG]

On the Outsized Importance of Learning Rates in Local Update Methods

Zachary Charles, Jakub Konečný

Published 2020-07-02 (Version 1)

We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms. We prove that for quadratic objectives, local update methods perform stochastic gradient descent on a surrogate loss function which we exactly characterize. We show that the choice of client learning rate controls the condition number of that surrogate loss, as well as the distance between the minimizers of the surrogate and true loss functions. We use this theory to derive novel convergence rates for federated averaging that showcase this trade-off between the condition number of the surrogate loss and its alignment with the true loss function. We validate our results empirically, showing that in communication-limited settings, proper learning rate tuning is often sufficient to reach near-optimal behavior. We also present a practical method for automatic learning rate decay in local update methods that helps reduce the need for learning rate tuning, and highlight its empirical performance on a variety of tasks and datasets.
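To make the setting concrete, below is a minimal, illustrative sketch of a FedAvg-style local update method on synthetic quadratic client objectives. This is not the paper's algorithm or code; the names and parameters (fedavg, client_lr, local_steps, the matrices A_i and centers c_i) are assumptions chosen for illustration. The sketch only shows where the client learning rate enters and how its value can bias the point the method converges to, which is the trade-off the abstract describes.

    import numpy as np

    rng = np.random.default_rng(0)
    d, num_clients, local_steps, rounds = 5, 10, 20, 200

    # Synthetic heterogeneous quadratics: f_i(x) = 0.5 * (x - c_i)^T A_i (x - c_i).
    A = [np.diag(rng.uniform(0.5, 5.0, size=d)) for _ in range(num_clients)]
    c = [rng.normal(size=d) for _ in range(num_clients)]

    # Minimizer of the average loss: (sum_i A_i)^{-1} (sum_i A_i c_i).
    x_star = np.linalg.solve(sum(A), sum(Ai @ ci for Ai, ci in zip(A, c)))

    def fedavg(client_lr, server_lr=1.0):
        """Generic FedAvg-style local update method (illustrative, full-batch gradients)."""
        x = np.zeros(d)
        for _ in range(rounds):
            deltas = []
            for Ai, ci in zip(A, c):
                y = x.copy()
                for _ in range(local_steps):
                    y -= client_lr * (Ai @ (y - ci))  # local gradient step with the client learning rate
                deltas.append(y - x)                  # client's model delta sent to the server
            x = x + server_lr * np.mean(deltas, axis=0)  # server applies the averaged update
        return x

    for lr in [0.3, 0.05, 0.005]:
        x_hat = fedavg(client_lr=lr)
        print(f"client_lr={lr:<6}  ||x_hat - x_star|| = {np.linalg.norm(x_hat - x_star):.4f}")

Under this toy setup, a large client learning rate with many local steps tends to converge near the average of the client minimizers rather than the minimizer of the average loss, while a small client learning rate tracks the true minimizer more closely but makes slower progress per round; this mirrors the conditioning-versus-alignment trade-off stated in the abstract.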

Related articles:
arXiv:2107.08686 [cs.LG] (Published 2021-07-19)
Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints
arXiv:2003.02389 [cs.LG] (Published 2020-03-05)
Comparing Rewinding and Fine-tuning in Neural Network Pruning
arXiv:1612.05086 [cs.LG] (Published 2016-12-15)
Coupling Adaptive Batch Sizes with Learning Rates