On the Heterogeneity of Independent Learning Dynamics in Zero-sum Stochastic Games

Muhammed Sayin, Kemal Cetiner
Proceedings of The 4th Annual Learning for Dynamics and Control Conference, PMLR 168:994-1005, 2022.

Abstract

We analyze the convergence properties of two-timescale fictitious play, which combines classical fictitious play with Q-learning, for two-player zero-sum stochastic games with player-dependent learning rates. We show that it converges almost surely under the standard assumptions of two-timescale stochastic approximation methods when the discount factor is less than the product of the ratios of the player-dependent step sizes. To this end, we construct a novel Lyapunov function and present a one-sided asynchronous convergence result.

Cite this Paper


BibTeX
@InProceedings{pmlr-v168-sayin22a,
  title = {On the Heterogeneity of Independent Learning Dynamics in Zero-sum Stochastic Games},
  author = {Sayin, Muhammed and Cetiner, Kemal},
  booktitle = {Proceedings of The 4th Annual Learning for Dynamics and Control Conference},
  pages = {994--1005},
  year = {2022},
  editor = {Firoozi, Roya and Mehr, Negar and Yel, Esen and Antonova, Rika and Bohg, Jeannette and Schwager, Mac and Kochenderfer, Mykel},
  volume = {168},
  series = {Proceedings of Machine Learning Research},
  month = {23--24 Jun},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v168/sayin22a/sayin22a.pdf},
  url = {https://proceedings.mlr.press/v168/sayin22a.html},
  abstract = {We analyze the convergence properties of the two-timescale fictitious play combining the classical fictitious play with the Q-learning for two-player zero-sum stochastic games with player-dependent learning rates. We show its almost sure convergence under the standard assumptions in two-timescale stochastic approximation methods when the discount factor is less than the product of the ratios of player-dependent step sizes. To this end, we formulate a novel Lyapunov function formulation and present a one-sided asynchronous convergence result.}
}
Endnote
%0 Conference Paper
%T On the Heterogeneity of Independent Learning Dynamics in Zero-sum Stochastic Games
%A Muhammed Sayin
%A Kemal Cetiner
%B Proceedings of The 4th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Roya Firoozi
%E Negar Mehr
%E Esen Yel
%E Rika Antonova
%E Jeannette Bohg
%E Mac Schwager
%E Mykel Kochenderfer
%F pmlr-v168-sayin22a
%I PMLR
%P 994--1005
%U https://proceedings.mlr.press/v168/sayin22a.html
%V 168
%X We analyze the convergence properties of the two-timescale fictitious play combining the classical fictitious play with the Q-learning for two-player zero-sum stochastic games with player-dependent learning rates. We show its almost sure convergence under the standard assumptions in two-timescale stochastic approximation methods when the discount factor is less than the product of the ratios of player-dependent step sizes. To this end, we formulate a novel Lyapunov function formulation and present a one-sided asynchronous convergence result.
APA
Sayin, M. & Cetiner, K. (2022). On the Heterogeneity of Independent Learning Dynamics in Zero-sum Stochastic Games. Proceedings of The 4th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 168:994-1005. Available from https://proceedings.mlr.press/v168/sayin22a.html.
