{
  "id": "1709.04073",
  "version": "v1",
  "published": "2017-09-12T22:34:09.000Z",
  "updated": "2017-09-12T22:34:09.000Z",
  "title": "Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging",
  "authors": [
    "Chandrashekar Lakshminarayanan",
    "Csaba Szepesvári"
  ],
  "comment": "16 pages, 2 figures, was submitted to NIPS 2017",
  "categories": [
    "cs.LG",
    "cs.SY",
    "stat.ML"
  ],
  "abstract": "We consider $d$-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate $\\theta_{*} \\in \\mathbb{R}^d$ (that is an optimum or a fixed point) using noisy data and $O(d)$ updates per iteration. In this paper, we are motivated by the problem (in RL) of policy evaluation from experience replay using the \\emph{temporal difference} (TD) class of learning algorithms that are also LSAs. For LSAs with a constant step-size, and PR averaging, we provide bounds for the mean squared error (MSE) after $t$ iterations. We assume that data is \\iid with finite variance (underlying distribution being $P$) and that the expected dynamics is Hurwitz. For a given LSA with PR averaging, and data distribution $P$ satisfying the said assumptions, we show that there exists a range of constant step-sizes such that its MSE decays as $O(\\frac{1}{t})$. We examine the conditions under which a constant step-size can be chosen uniformly for a class of data distributions $\\mathcal{P}$, and show that not all data distributions `admit' such a uniform constant step-size. We also suggest a heuristic step-size tuning algorithm to choose a constant step-size of a given LSA for a given data distribution $P$. We compare our results with related work and also discuss the implication of our results in the context of TD algorithms that are LSAs.",
  "revisions": [
    {
      "version": "v1",
      "updated": "2017-09-12T22:34:09.000Z"
    }
  ],
  "analyses": {
    "keywords": [
      "constant step-size",
      "data distribution",
      "iterate averaging",
      "dimensional linear stochastic approximation algorithms",
      "mse decays"
    ],
    "note": {
      "typesetting": "TeX",
      "pages": 16,
      "language": "en",
      "license": "arXiv",
      "status": "editable"
    }
  }
}