Central Limit Theorems for Asynchronous Averaged Q-Learning

Xingtu Liu
Proceedings of The 8th Annual Learning for Dynamics and Control Conference, PMLR 331:2207-2230, 2026.

Abstract

This paper establishes central limit theorems for Polyak–Ruppert averaged Q-learning under asynchronous updates. We present a non-asymptotic central limit theorem, where the convergence rate in Wasserstein distance explicitly reflects the dependence on the number of iterations, state–action space size, the discount factor, and the quality of exploration. In addition, we derive a functional central limit theorem, showing that the partial-sum process converges weakly to a Brownian motion.
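
For orientation, the objects named in the abstract can be written down in their usual textbook form. The block below is a sketch under assumed notation, not the paper's own statement: the step size \alpha_t, the limiting covariance \Sigma, the optimal Q-function Q^\star, the time index u, and the exact scaling of the partial-sum process are all illustrative assumptions, and the paper's non-asymptotic Wasserstein rate is not reproduced here.

```latex
% Standard forms only; all symbols are assumptions, not taken from the paper.
\begin{align*}
% Asynchronous Q-learning: only the visited pair (s_t, a_t) is updated.
Q_{t+1}(s_t,a_t) &= Q_t(s_t,a_t)
  + \alpha_t\Bigl(r_t + \gamma \max_{a'} Q_t(s_{t+1},a') - Q_t(s_t,a_t)\Bigr), \\
% Polyak--Ruppert average of the iterates.
\bar{Q}_T &= \frac{1}{T}\sum_{t=1}^{T} Q_t, \\
% Generic shape of a central limit theorem for the averaged iterate.
\sqrt{T}\,\bigl(\bar{Q}_T - Q^\star\bigr) &\xrightarrow{d} \mathcal{N}(0,\Sigma), \\
% Generic shape of a functional CLT for the partial-sum process.
\frac{1}{\sqrt{T}}\sum_{t=1}^{\lfloor uT\rfloor}\bigl(Q_t - Q^\star\bigr)
  &\Rightarrow \Sigma^{1/2} B(u), \qquad u \in [0,1].
\end{align*}
```

Here B denotes a standard Brownian motion of matching dimension, and entries of Q_t other than (s_t, a_t) are left unchanged at step t.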

Cite this Paper


BibTeX
@InProceedings{pmlr-v331-liu26b,
  title = {Central Limit Theorems for Asynchronous Averaged Q-Learning},
  author = {Liu, Xingtu},
  booktitle = {Proceedings of The 8th Annual Learning for Dynamics and Control Conference},
  pages = {2207--2230},
  year = {2026},
  editor = {Sukhatme, Gaurav and Lindemann, Lars and Tu, Stephen and Wierman, Adam and Atanasov, Nikolay},
  volume = {331},
  series = {Proceedings of Machine Learning Research},
  month = {17--19 Jun},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v331/main/assets/liu26b/liu26b.pdf},
  url = {https://proceedings.mlr.press/v331/liu26b.html},
  abstract = {This paper establishes central limit theorems for Polyak–Ruppert averaged Q-learning under asynchronous updates. We present a non-asymptotic central limit theorem, where the convergence rate in Wasserstein distance explicitly reflects the dependence on the number of iterations, state–action space size, the discount factor, and the quality of exploration. In addition, we derive a functional central limit theorem, showing that the partial-sum process converges weakly to a Brownian motion.}
}
Endnote
%0 Conference Paper
%T Central Limit Theorems for Asynchronous Averaged Q-Learning
%A Xingtu Liu
%B Proceedings of The 8th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2026
%E Gaurav Sukhatme
%E Lars Lindemann
%E Stephen Tu
%E Adam Wierman
%E Nikolay Atanasov
%F pmlr-v331-liu26b
%I PMLR
%P 2207--2230
%U https://proceedings.mlr.press/v331/liu26b.html
%V 331
%X This paper establishes central limit theorems for Polyak–Ruppert averaged Q-learning under asynchronous updates. We present a non-asymptotic central limit theorem, where the convergence rate in Wasserstein distance explicitly reflects the dependence on the number of iterations, state–action space size, the discount factor, and the quality of exploration. In addition, we derive a functional central limit theorem, showing that the partial-sum process converges weakly to a Brownian motion.
APA
Liu, X. (2026). Central Limit Theorems for Asynchronous Averaged Q-Learning. Proceedings of The 8th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 331:2207-2230. Available from https://proceedings.mlr.press/v331/liu26b.html.
