Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?

Chandrashekar Lakshminarayanan, Csaba Szepesvari
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1347-1355, 2018.

Abstract

Temporal difference learning algorithms such as TD(0) and GTD in reinforcement learning (RL), as well as stochastic gradient descent (SGD) for linear prediction, are linear stochastic approximation (LSA) algorithms. These algorithms perform only $O(d)$ computations per iteration, where $d$ is the parameter dimension. In the design of LSA algorithms, the choice of step-size is critical and is often tuned in an ad hoc manner. In this paper, we study a constant step-size averaged linear stochastic approximation (CALSA) algorithm and, for a given class of problems, ask whether the properties of $i)$ a universal constant step-size and $ii)$ a uniform fast rate of $\frac{C}{t}$ for the mean square error hold for all instances of the class, where the constant $C>0$ does not depend on the problem instance. We show that the answer to these questions is, in general, no. However, we show that the TD(0) and CAGTD algorithms, with a problem-independent universal constant step-size and iterate averaging, achieve an asymptotic fast rate of $O(\frac{1}{t})$.
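To make the setup concrete, the following is a minimal sketch of a constant step-size LSA iteration combined with iterate (Polyak-Ruppert) averaging, instantiated as SGD/LMS for linear prediction. The step-size value, problem dimensions, and data-generating model below are illustrative assumptions for the sketch, not values or code from the paper.

```python
import numpy as np

# Sketch: constant step-size LSA with iterate averaging (CALSA-style),
# instantiated as LMS/SGD for linear prediction. Here A_t = x_t x_t^T and
# b_t = y_t x_t, so the update is theta_{t+1} = theta_t + alpha*(b_t - A_t theta_t).
# alpha, d, T, and the noise model are assumed for illustration.

rng = np.random.default_rng(0)
d, T, alpha = 5, 10_000, 0.1          # dimension, iterations, constant step-size (assumed)
theta_star = rng.normal(size=d)       # unknown linear predictor to recover
noise_std = 0.1

theta = np.zeros(d)                   # current LSA iterate
theta_bar = np.zeros(d)               # running average of the iterates

for t in range(1, T + 1):
    x = rng.normal(size=d)                         # random feature vector
    y = x @ theta_star + noise_std * rng.normal()  # noisy linear observation
    theta += alpha * (y - x @ theta) * x           # constant step-size LSA/SGD update
    theta_bar += (theta - theta_bar) / t           # Polyak-Ruppert iterate averaging

print("error of last iterate    :", np.linalg.norm(theta - theta_star))
print("error of averaged iterate:", np.linalg.norm(theta_bar - theta_star))
```

In this sketch the averaged iterate is typically much closer to theta_star than the last iterate, which is the behavior the paper's mean square error analysis quantifies.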

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-lakshminarayanan18a,
  title     = {Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?},
  author    = {Lakshminarayanan, Chandrashekar and Szepesvari, Csaba},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {1347--1355},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/lakshminarayanan18a/lakshminarayanan18a.pdf},
  url       = {https://proceedings.mlr.press/v84/lakshminarayanan18a.html},
  abstract  = {Temporal difference learning algorithms such as TD(0) and GTD in reinforcement learning (RL) and the stochastic gradient descent (SGD) for linear prediction are linear stochastic approximation (LSA) algorithms. These algorithms make only $O(d)$ ($d$ is parameter dimension) computations per iteration. In the design of LSA algorithms, step-size choice is critical, and is often tuned in an ad-hoc manner. In this paper, we study a constant step-size averaged linear stochastic approximation (CALSA) algorithm, and for a given class of problems, we ask whether properties of $i)$ a universal constant step-size and $ii)$ a uniform fast rate of $\frac{C}{t}$ for the mean square-error hold for all instance of the class, where the constant $C>0$ does not depend on the problem instance. We show that the answer to these question, in general, is no. However, we show the TD(0) and CAGTD algorithms with a problem independent universal constant step-size and iterate averaging, achieve an asymptotic fast rate of $O(\frac{1}{t})$.}
}
Endnote
%0 Conference Paper
%T Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?
%A Chandrashekar Lakshminarayanan
%A Csaba Szepesvari
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-lakshminarayanan18a
%I PMLR
%P 1347--1355
%U https://proceedings.mlr.press/v84/lakshminarayanan18a.html
%V 84
%X Temporal difference learning algorithms such as TD(0) and GTD in reinforcement learning (RL) and the stochastic gradient descent (SGD) for linear prediction are linear stochastic approximation (LSA) algorithms. These algorithms make only $O(d)$ ($d$ is parameter dimension) computations per iteration. In the design of LSA algorithms, step-size choice is critical, and is often tuned in an ad-hoc manner. In this paper, we study a constant step-size averaged linear stochastic approximation (CALSA) algorithm, and for a given class of problems, we ask whether properties of $i)$ a universal constant step-size and $ii)$ a uniform fast rate of $\frac{C}{t}$ for the mean square-error hold for all instance of the class, where the constant $C>0$ does not depend on the problem instance. We show that the answer to these question, in general, is no. However, we show the TD(0) and CAGTD algorithms with a problem independent universal constant step-size and iterate averaging, achieve an asymptotic fast rate of $O(\frac{1}{t})$.
APA
Lakshminarayanan, C. & Szepesvari, C. (2018). Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go? Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:1347-1355. Available from https://proceedings.mlr.press/v84/lakshminarayanan18a.html.