On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring

Dean Foster; Dylan J. Foster; Noah Golowich; Alexander Rakhlin

On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring

Dean Foster, Dylan J. Foster, Noah Golowich, Alexander Rakhlin

Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:2678-2792, 2023.

Abstract

A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees, and how these considerations change as we move from few to many agents. We study this question in a general framework for interactive decision making with multiple agents, encompassing Markov games with function approximation and normal-form games with bandit feedback. We focus on equilibrium computation, in which a centralized learning algorithm aims to compute an equilibrium by controlling multiple agents that interact with an (unknown) environment. Our main contributions are:• We provide upper and lower bounds on the optimal sample complexity for multi-agent decision making based on a multi-agent generalization of the Decision-Estimation Coefficient, a complexity measure introduced by Foster et al. (2021) in the single-agent counterpart to our setting. Compared to the best results for the single-agent setting, our upper and lower bounds have additional gaps. We show that no “reasonable” complexity measure can close these gaps, highlighting a striking separation between single and multiple agents.• We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making, but with hidden (unobserved) rewards, a framework that subsumes variants of the partial monitoring problem. As a consequence of this connection, we characterize the statistical complexity for hidden-reward interactive decision making to the best extent possible.Building on this development, we provide several new structural results, including 1) conditions under which the statistical complexity of multi-agent decision making can be reduced to that of single-agent, and 2) conditions under which the so-called curse of multiple agents can be avoided.

Cite this Paper

BibTeX


@InProceedings{pmlr-v195-foster23a,
  title = 	 {On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring},
  author =       {Foster, Dean and Foster, Dylan J. and Golowich, Noah and Rakhlin, Alexander},
  booktitle = 	 {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages = 	 {2678--2792},
  year = 	 {2023},
  editor = 	 {Neu, Gergely and Rosasco, Lorenzo},
  volume = 	 {195},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {12--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v195/foster23a/foster23a.pdf},
  url = 	 {https://proceedings.mlr.press/v195/foster23a.html},
  abstract = 	 {A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees, and how these considerations change as we move from few to many agents. We study this question in a general framework for interactive decision making with multiple agents, encompassing Markov games with function approximation and normal-form games with bandit feedback. We focus on equilibrium computation, in which a centralized learning algorithm aims to compute an equilibrium by controlling multiple agents that interact with an (unknown) environment. Our main contributions are:• We provide upper and lower bounds on the optimal sample complexity for multi-agent decision making based on a multi-agent generalization of the Decision-Estimation Coefficient, a complexity measure introduced by Foster et al. (2021) in the single-agent counterpart to our setting. Compared to the best results for the single-agent setting, our upper and lower bounds have additional gaps. We show that no “reasonable” complexity measure can close these gaps, highlighting a striking separation between single and multiple agents.• We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making, but with hidden (unobserved) rewards, a framework that subsumes variants of the partial monitoring problem. As a consequence of this connection, we characterize the statistical complexity for hidden-reward interactive decision making to the best extent possible.Building on this development, we provide several new structural results, including 1) conditions under which the statistical complexity of multi-agent decision making can be reduced to that of single-agent, and 2) conditions under which the so-called curse of multiple agents can be avoided.}
}

Endnote

%0 Conference Paper
%T On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring
%A Dean Foster
%A Dylan J. Foster
%A Noah Golowich
%A Alexander Rakhlin
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco	
%F pmlr-v195-foster23a
%I PMLR
%P 2678--2792
%U https://proceedings.mlr.press/v195/foster23a.html
%V 195
%X A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees, and how these considerations change as we move from few to many agents. We study this question in a general framework for interactive decision making with multiple agents, encompassing Markov games with function approximation and normal-form games with bandit feedback. We focus on equilibrium computation, in which a centralized learning algorithm aims to compute an equilibrium by controlling multiple agents that interact with an (unknown) environment. Our main contributions are:• We provide upper and lower bounds on the optimal sample complexity for multi-agent decision making based on a multi-agent generalization of the Decision-Estimation Coefficient, a complexity measure introduced by Foster et al. (2021) in the single-agent counterpart to our setting. Compared to the best results for the single-agent setting, our upper and lower bounds have additional gaps. We show that no “reasonable” complexity measure can close these gaps, highlighting a striking separation between single and multiple agents.• We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making, but with hidden (unobserved) rewards, a framework that subsumes variants of the partial monitoring problem. As a consequence of this connection, we characterize the statistical complexity for hidden-reward interactive decision making to the best extent possible.Building on this development, we provide several new structural results, including 1) conditions under which the statistical complexity of multi-agent decision making can be reduced to that of single-agent, and 2) conditions under which the so-called curse of multiple agents can be avoided.

APA


Foster, D., Foster, D.J., Golowich, N. & Rakhlin, A.. (2023). On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:2678-2792 Available from https://proceedings.mlr.press/v195/foster23a.html.

Related Material

Download PDF