On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring

Dean Foster, Dylan J. Foster, Noah Golowich, Alexander Rakhlin
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:2678-2792, 2023.

Abstract

A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees, and how these considerations change as we move from few to many agents. We study this question in a general framework for interactive decision making with multiple agents, encompassing Markov games with function approximation and normal-form games with bandit feedback. We focus on equilibrium computation, in which a centralized learning algorithm aims to compute an equilibrium by controlling multiple agents that interact with an (unknown) environment. Our main contributions are:

• We provide upper and lower bounds on the optimal sample complexity for multi-agent decision making based on a multi-agent generalization of the Decision-Estimation Coefficient, a complexity measure introduced by Foster et al. (2021) in the single-agent counterpart to our setting. Compared to the best results for the single-agent setting, our upper and lower bounds have additional gaps. We show that no “reasonable” complexity measure can close these gaps, highlighting a striking separation between single and multiple agents.

• We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making, but with hidden (unobserved) rewards, a framework that subsumes variants of the partial monitoring problem. As a consequence of this connection, we characterize the statistical complexity for hidden-reward interactive decision making to the best extent possible.

Building on this development, we provide several new structural results, including 1) conditions under which the statistical complexity of multi-agent decision making can be reduced to that of single-agent decision making, and 2) conditions under which the so-called curse of multiple agents can be avoided.
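For reference, the multi-agent complexity measure in this paper generalizes the single-agent Decision-Estimation Coefficient of Foster et al. (2021). A sketch of that single-agent definition, using the standard notation of that work (model class $\mathcal{M}$, reference model $\overline{M}$, decision space $\Pi$, mean-reward function $f^{M}$ with maximizer $\pi_M$, squared Hellinger distance $D^2_{\mathrm{H}}$, and scale parameter $\gamma > 0$):

\[
\mathsf{dec}_{\gamma}(\mathcal{M}, \overline{M})
  = \inf_{p \in \Delta(\Pi)} \sup_{M \in \mathcal{M}}
    \mathbb{E}_{\pi \sim p}\!\left[
      f^{M}(\pi_M) - f^{M}(\pi)
      - \gamma \cdot D^2_{\mathrm{H}}\bigl(M(\pi), \overline{M}(\pi)\bigr)
    \right].
\]

Intuitively, the inf player (the learner) trades off the regret it incurs against the information it acquires, measured as divergence from the reference model; Foster et al. (2021) show this quantity governs the optimal sample complexity of single-agent interactive decision making, and the present paper studies its multi-agent analogue.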

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-foster23a,
  title     = {On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring},
  author    = {Foster, Dean and Foster, Dylan J. and Golowich, Noah and Rakhlin, Alexander},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {2678--2792},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/foster23a/foster23a.pdf},
  url       = {https://proceedings.mlr.press/v195/foster23a.html}
}
Endnote
%0 Conference Paper
%T On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring
%A Dean Foster
%A Dylan J. Foster
%A Noah Golowich
%A Alexander Rakhlin
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-foster23a
%I PMLR
%P 2678--2792
%U https://proceedings.mlr.press/v195/foster23a.html
%V 195
APA
Foster, D., Foster, D.J., Golowich, N. & Rakhlin, A. (2023). On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:2678-2792. Available from https://proceedings.mlr.press/v195/foster23a.html.
