On the Heterogeneity of Independent Learning Dynamics in Zero-sum Stochastic Games
Proceedings of The 4th Annual Learning for Dynamics and Control Conference, PMLR 168:994-1005, 2022.
Abstract
We analyze the convergence properties of two-timescale fictitious play, which combines classical fictitious play with Q-learning, in two-player zero-sum stochastic games with player-dependent learning rates. We show its almost sure convergence under the standard assumptions of two-timescale stochastic approximation methods, provided that the discount factor is less than the product of the ratios of the player-dependent step sizes. To this end, we formulate a novel Lyapunov function and present a one-sided asynchronous convergence result.
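To make the dynamics concrete, below is a minimal, illustrative Python sketch of a two-timescale fictitious-play/Q-learning loop with player-dependent step sizes. It is not the paper's exact algorithm: the random game, the step-size exponents, and the use of an expected (model-based) value update in place of sampled Q-learning are all assumptions made for brevity. The exponents are chosen so that each player's Q-update is asymptotically slower than their belief update (beta_k/alpha_k -> 0), reflecting the two-timescale structure the abstract refers to, and the discount factor is kept small in the spirit of the paper's step-size-ratio condition.

```python
import numpy as np

# Illustrative sketch (not the paper's exact algorithm) of two-timescale
# fictitious play with Q-learning on a random two-player zero-sum
# stochastic game. All sizes, step-size exponents, and the model-based
# expectation below are assumptions made for this example.

rng = np.random.default_rng(0)
nS, nA = 3, 2                                  # states, actions per player (assumed)
R = rng.uniform(-1, 1, (nS, nA, nA))           # player 1's reward; player 2 gets -R
P = rng.dirichlet(np.ones(nS), (nS, nA, nA))   # transition kernel p(s' | s, a1, a2)
gamma = 0.2                                    # discount factor, kept small

def alpha(k, i):
    # Fast timescale: player i's belief step size (assumed exponents).
    return 1.0 / (k + 1) ** (0.6 if i == 0 else 0.7)

def beta(k, i):
    # Slow timescale: player i's Q-value step size; beta/alpha -> 0.
    return 1.0 / (k + 1) ** (0.9 if i == 0 else 1.0)

# pi_hat[i][s]: player i's belief about the opponent's mixed action at s.
# Q[i][s, own, opp]: player i's local Q-function over joint actions at s.
pi_hat = [np.full((nS, nA), 1.0 / nA) for _ in range(2)]
Q = [np.zeros((nS, nA, nA)) for _ in range(2)]

for k in range(10000):
    for s in range(nS):
        # Each player's best response to their current belief at state s.
        br = [int(np.argmax(Q[i][s] @ pi_hat[i][s])) for i in range(2)]
        for i in range(2):
            # Fictitious-play belief update toward the opponent's action.
            e = np.zeros(nA)
            e[br[1 - i]] = 1.0
            pi_hat[i][s] += alpha(k, i) * (e - pi_hat[i][s])
            # Value of each state under current beliefs: max over own
            # actions of the expected local Q-value.
            v = np.array([np.max(Q[i][sp] @ pi_hat[i][sp]) for sp in range(nS)])
            cont = P[s] @ v                    # E[v(s') | s, a1, a2]
            if i == 0:
                target = R[s] + gamma * cont
            else:
                # Player 2 maximizes -R; transpose to (own, opp) order.
                target = -R[s].T + gamma * cont.T
            # Slow-timescale update of the auxiliary stage-game payoff.
            Q[i][s] += beta(k, i) * (target - Q[i][s])

print("player 1 beliefs per state:\n", np.round(pi_hat[0], 3))
```

In this sketch the heterogeneity lies in the per-player exponents of alpha and beta; the paper's convergence guarantee constrains how far these rates may differ, via the condition relating the discount factor to the product of the step-size ratios.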