Finite-Time Error Bounds for Linear Stochastic Approximation and TD Learning
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2803-2830, 2019.
Abstract
We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE). We obtain finite-time bounds on the mean-square error in the case of constant step-size algorithms by considering the drift of an appropriately chosen Lyapunov function. The Lyapunov function can be interpreted either in terms of Stein's method to obtain bounds on steady-state performance or in terms of Lyapunov stability theory for linear ODEs. We also provide a comprehensive treatment of the moments of the square of the 2-norm of the approximation error. Our analysis yields the following results: (i) for a given step-size, we show that the lower-order moments can be made small as a function of the step-size and can be upper-bounded by the moments of a Gaussian random variable; (ii) we show that the higher-order moments beyond a threshold may be infinite in steady-state; and (iii) we characterize the number of samples needed for the finite-time bounds to be of the same order as the steady-state bounds. As a by-product of our analysis, we also solve the open problem of obtaining finite-time bounds for the performance of temporal difference learning algorithms with linear function approximation and a constant step-size, without requiring a projection step or an i.i.d. noise assumption.
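The setting analyzed in the abstract can be illustrated with a minimal sketch of TD(0) with linear function approximation and a constant step-size, run on the single trajectory of a Markov chain (Markovian, not i.i.d., noise) and without a projection step. The chain, rewards, features, discount factor, and step-size below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-state Markov chain (transition matrix P) and rewards r.
P = np.array([[0.1, 0.6, 0.2, 0.1],
              [0.3, 0.1, 0.5, 0.1],
              [0.2, 0.2, 0.1, 0.5],
              [0.4, 0.2, 0.2, 0.2]])
r = np.array([1.0, 0.0, -1.0, 0.5])

# Linear function approximation: value estimate V(s) = phi(s)^T theta,
# with an illustrative feature matrix Phi (one row per state).
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0],
                [0.5, -0.5]])
gamma = 0.9      # discount factor (assumed)
eps = 0.01       # constant step-size (assumed)
theta = np.zeros(2)

s = 0
for _ in range(50_000):
    # Sample the next state from the chain: the noise driving the
    # iteration is Markovian, matching the paper's setting.
    s_next = rng.choice(4, p=P[s])
    # TD(0) update, a linear stochastic approximation iteration:
    # theta <- theta + eps * (temporal difference) * phi(s).
    delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += eps * delta * Phi[s]
    s = s_next

print(theta)  # iterate stays in a neighborhood of the TD fixed point
```

With a constant step-size the iterate does not converge to a point; as the abstract's bounds suggest, it instead hovers near the equilibrium of the associated ODE, with a steady-state error controlled by the step-size.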