Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1347-1355, 2018.
Abstract
Temporal difference learning algorithms such as TD(0) and GTD in reinforcement learning (RL), and stochastic gradient descent (SGD) for linear prediction, are linear stochastic approximation (LSA) algorithms. These algorithms make only $O(d)$ computations per iteration, where $d$ is the parameter dimension. In the design of LSA algorithms, the step-size choice is critical and is often tuned in an ad-hoc manner. In this paper, we study a constant step-size averaged linear stochastic approximation (CALSA) algorithm and, for a given class of problems, ask whether the properties of (i) a universal constant step-size and (ii) a uniform fast rate of $\frac{C}{t}$ for the mean square-error hold for all instances of the class, where the constant $C>0$ does not depend on the problem instance. We show that the answer to these questions, in general, is no. However, we show that the TD(0) and CA-GTD algorithms, with a problem-independent universal constant step-size and iterate averaging, achieve an asymptotic fast rate of $O(\frac{1}{t})$.
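The abstract's CALSA scheme can be sketched on a toy problem. The snippet below is a minimal illustration, not the paper's method: it runs the generic LSA iteration $\theta_{t+1} = \theta_t + \alpha(b_t - A_t\theta_t)$ with a constant step-size $\alpha$ on noisy samples $(A_t, b_t)$ of a fixed linear system $A\theta = b$, and maintains the Polyak-Ruppert iterate average $\bar{\theta}_t = \frac{1}{t}\sum_{s\le t}\theta_s$. All problem parameters (the matrix construction, noise scale, and step-size value) are illustrative assumptions.

```python
import numpy as np

# Toy sketch of constant step-size LSA with iterate averaging (assumed setup):
#   theta_{t+1} = theta_t + alpha * (b_t - A_t @ theta_t)
#   theta_bar_t = (1/t) * sum_{s<=t} theta_s   (Polyak-Ruppert average)
rng = np.random.default_rng(0)
d = 5

# A positive-definite A makes the mean iteration stable (an assumption here);
# the fixed point of the noiseless update is theta_star = A^{-1} b.
M = rng.standard_normal((d, d))
A = M @ M.T + d * np.eye(d)
theta_star = rng.standard_normal(d)
b = A @ theta_star

alpha = 0.01                  # constant step-size, not tuned to the instance
theta = np.zeros(d)           # raw iterate
theta_bar = np.zeros(d)       # running iterate average
T = 20_000

for t in range(1, T + 1):
    # Noisy observations of (A, b), standing in for stochastic samples.
    A_t = A + 0.1 * rng.standard_normal((d, d))
    b_t = b + 0.1 * rng.standard_normal(d)
    theta = theta + alpha * (b_t - A_t @ theta)   # O(d^2) here; TD(0) is O(d)
    theta_bar += (theta - theta_bar) / t          # incremental average

err = np.linalg.norm(theta_bar - theta_star)
print(f"averaged-iterate error after {T} steps: {err:.4f}")
```

The raw iterate oscillates in a noise ball around $\theta^*$ whose size scales with $\alpha$, while the averaged iterate smooths this noise out; the $O(1/t)$ mean square-error rate discussed in the abstract concerns this averaged iterate.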