Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning

Guannan Qu, Adam Wierman
; Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:3185-3205, 2020.

Abstract

We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v125-qu20a, title = {Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning}, author = {Qu, Guannan and Wierman, Adam}, pages = {3185--3205}, year = {2020}, editor = {Jacob Abernethy and Shivani Agarwal}, volume = {125}, series = {Proceedings of Machine Learning Research}, address = {}, month = {09--12 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v125/qu20a/qu20a.pdf}, url = {http://proceedings.mlr.press/v125/qu20a.html}, abstract = { We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.} }
Endnote
%0 Conference Paper %T Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning %A Guannan Qu %A Adam Wierman %B Proceedings of Thirty Third Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2020 %E Jacob Abernethy %E Shivani Agarwal %F pmlr-v125-qu20a %I PMLR %J Proceedings of Machine Learning Research %P 3185--3205 %U http://proceedings.mlr.press %V 125 %W PMLR %X We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.
APA
Qu, G. & Wierman, A.. (2020). Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning. Proceedings of Thirty Third Conference on Learning Theory, in PMLR 125:3185-3205

Related Material