{ "id": "2102.13643", "version": "v1", "published": "2021-02-26T18:40:58.000Z", "updated": "2021-02-26T18:40:58.000Z", "title": "Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums", "authors": [ "Chaobing Song", "Stephen J. Wright", "Jelena Diakonikolas" ], "comment": "32 pages, 18 figures", "categories": [ "math.OC", "cs.LG", "cs.NA", "math.NA" ], "abstract": "We study structured nonsmooth convex finite-sum optimization that appears widely in machine learning applications, including support vector machines and least absolute deviation. For the primal-dual formulation of this problem, we propose a novel algorithm called \\emph{Variance Reduction via Primal-Dual Accelerated Dual Averaging (\\vrpda)}. In the nonsmooth and general convex setting, \\vrpda~has the overall complexity $O(nd\\log\\min \\{1/\\epsilon, n\\} + d/\\epsilon )$ in terms of the primal-dual gap, where $n$ denotes the number of samples, $d$ the dimension of the primal variables, and $\\epsilon$ the desired accuracy. In the nonsmooth and strongly convex setting, the overall complexity of \\vrpda~becomes $O(nd\\log\\min\\{1/\\epsilon, n\\} + d/\\sqrt{\\epsilon})$ in terms of both the primal-dual gap and the distance between iterate and optimal solution. Both these results for \\vrpda~improve significantly on state-of-the-art complexity estimates, which are $O(nd\\log \\min\\{1/\\epsilon, n\\} + \\sqrt{n}d/\\epsilon)$ for the nonsmooth and general convex setting and $O(nd\\log \\min\\{1/\\epsilon, n\\} + \\sqrt{n}d/\\sqrt{\\epsilon})$ for the nonsmooth and strongly convex setting, in a much more simple and straightforward way. Moreover, both complexities are better than \\emph{lower} bounds for general convex finite sums that lack the particular (common) structure that we consider. Our theoretical results are supported by numerical experiments, which confirm the competitive performance of \\vrpda~compared to state-of-the-art.", "revisions": [ { "version": "v1", "updated": "2021-02-26T18:40:58.000Z" } ], "analyses": { "keywords": [ "primal-dual accelerated dual averaging", "variance reduction", "convex setting", "general convex", "overall complexity" ], "note": { "typesetting": "TeX", "pages": 32, "language": "en", "license": "arXiv", "status": "editable" } } }