Finite-sample guarantees for Nash Q-learning with linear function approximation

Pedro Cisneros-Velarde; Sanmi Koyejo

Finite-sample guarantees for Nash Q-learning with linear function approximation

Pedro Cisneros-Velarde, Sanmi Koyejo

Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:424-432, 2023.

Abstract

Nash Q-learning may be considered one of the first and most known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original proof provided asymptotic guarantees and was for the tabular case. Recently, finite-sample guarantees have been provided using more modern RL techniques for the tabular case. Our work analyzes Nash Q-learning using linear function approximation – a representation regime introduced when the state space is large or continuous – and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation and has a polynomial gap when compared to the best-known result for the tabular case.

Cite this Paper

BibTeX


@InProceedings{pmlr-v216-cisneros-velarde23a,
  title = 	 {Finite-sample guarantees for {N}ash {Q}-learning with linear function approximation},
  author =       {Cisneros-Velarde, Pedro and Koyejo, Sanmi},
  booktitle = 	 {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {424--432},
  year = 	 {2023},
  editor = 	 {Evans, Robin J. and Shpitser, Ilya},
  volume = 	 {216},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {31 Jul--04 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v216/cisneros-velarde23a/cisneros-velarde23a.pdf},
  url = 	 {https://proceedings.mlr.press/v216/cisneros-velarde23a.html},
  abstract = 	 {Nash Q-learning may be considered one of the first and most known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original proof provided asymptotic guarantees and was for the tabular case. Recently, finite-sample guarantees have been provided using more modern RL techniques for the tabular case. Our work analyzes Nash Q-learning using linear function approximation – a representation regime introduced when the state space is large or continuous – and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation and has a polynomial gap when compared to the best-known result for the tabular case.}
}

Endnote

%0 Conference Paper
%T Finite-sample guarantees for Nash Q-learning with linear function approximation
%A Pedro Cisneros-Velarde
%A Sanmi Koyejo
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser	
%F pmlr-v216-cisneros-velarde23a
%I PMLR
%P 424--432
%U https://proceedings.mlr.press/v216/cisneros-velarde23a.html
%V 216
%X Nash Q-learning may be considered one of the first and most known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original proof provided asymptotic guarantees and was for the tabular case. Recently, finite-sample guarantees have been provided using more modern RL techniques for the tabular case. Our work analyzes Nash Q-learning using linear function approximation – a representation regime introduced when the state space is large or continuous – and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation and has a polynomial gap when compared to the best-known result for the tabular case.

APA


Cisneros-Velarde, P. & Koyejo, S.. (2023). Finite-sample guarantees for Nash Q-learning with linear function approximation. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:424-432 Available from https://proceedings.mlr.press/v216/cisneros-velarde23a.html.

Finite-sample guarantees for Nash Q-learning with linear function approximation

Abstract

Cite this Paper

Related Material