arXiv:1411.1134 [cs.LG]
Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems
Christopher De Sa, Kunle Olukotun, Christopher Ré
Published 2014-11-05 (Version 1)
The Burer-Monteiro decomposition ($X = Y Y^T$) combined with stochastic gradient descent is commonly employed to speed up and scale up matrix problems, including matrix completion, subspace tracking, and SDP relaxation. Although the method is widely used in practice, no global convergence results for it were previously known. In this paper, we prove that, under broad sampling conditions, a first-order rank-1 stochastic gradient descent (SGD) matrix recovery scheme converges globally from a random starting point at an $O(\epsilon^{-1} n \log n)$ rate with constant probability. We demonstrate our method experimentally.
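To make the setting concrete, the following is a minimal, hedged sketch of a rank-1 Burer-Monteiro SGD scheme for matrix recovery: it is a generic illustration of the factored approach ($X = y y^T$, updated one sampled entry at a time), not the exact algorithm or step-size schedule analyzed in the paper. All parameter values (step size, iteration count, initialization scale) are illustrative assumptions.

```python
import numpy as np

# Sketch only: recover a rank-1 matrix A = y* y*^T from uniformly sampled
# entries by running SGD on the factored (Burer-Monteiro) variable y,
# so that y y^T approximates A. Hyperparameters are assumptions.

rng = np.random.default_rng(0)
n = 50
y_true = rng.normal(size=n)
A = np.outer(y_true, y_true)      # ground-truth rank-1 matrix

y = 0.1 * rng.normal(size=n)      # random starting point (small scale)
eta = 0.01                        # constant step size (illustrative)

for step in range(200_000):
    # Sample one entry (i, j) uniformly at random.
    i, j = rng.integers(n), rng.integers(n)
    r = y[i] * y[j] - A[i, j]     # residual on the sampled entry
    # SGD step on the per-entry loss (y_i * y_j - A_ij)^2 / 2.
    gi, gj = r * y[j], r * y[i]
    y[i] -= eta * gi
    y[j] -= eta * gj

# y is recoverable only up to sign, so compare against both +y* and -y*.
rel_err = min(np.linalg.norm(y - y_true),
              np.linalg.norm(y + y_true)) / np.linalg.norm(y_true)
print(rel_err)  # small relative error after enough samples
```

Because the factored iterate has $n$ parameters rather than $n^2$, each SGD step touches only two coordinates, which is the source of the speed and scalability the abstract refers to.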
Related articles:
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
arXiv:1905.13210 [cs.LG] (Published 2019-05-30)
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
arXiv:1810.09418 [cs.LG] (Published 2018-10-22)
Optimality of the final model found via Stochastic Gradient Descent