Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning

Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:10092-10103, 2020.

Abstract

Multi-agent reinforcement learning (MARL) has achieved significant empirical success. However, MARL suffers from the curse of many agents. In this paper, we exploit the symmetry of agents in MARL and, in its most generic form, study a mean-field MARL problem. Such mean-field MARL is defined on mean-field states, which are distributions supported on a continuous space. Based on the mean embedding of these distributions, we propose the MF-FQI algorithm, which solves mean-field MARL, and we establish a non-asymptotic analysis for MF-FQI. We highlight that MF-FQI enjoys a “blessing of many agents” property in the sense that a larger number of observed agents improves the performance of MF-FQI.
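The abstract's key object is the mean embedding of the agents' state distribution. As a minimal, illustrative sketch (not the paper's implementation), the empirical kernel mean embedding of N observed agent states with a Gaussian RBF kernel can be estimated as follows; the kernel choice, bandwidth, and state dimension here are assumptions for illustration:

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian RBF kernel: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))
    diff = x - y
    return np.exp(-np.dot(diff, diff) / (2.0 * bandwidth ** 2))

def empirical_mean_embedding(states, bandwidth=1.0):
    """Return a function z -> (1/N) * sum_i k(s_i, z), i.e. the empirical
    kernel mean embedding of the agents' state distribution, evaluated at z."""
    def embedding(z):
        return np.mean([rbf_kernel(s, z, bandwidth) for s in states])
    return embedding

# Hypothetical data: states of N = 1000 observed agents in a 2-D state space.
rng = np.random.default_rng(0)
states = rng.normal(size=(1000, 2))
mu_hat = empirical_mean_embedding(states)
value_at_origin = mu_hat(np.zeros(2))
```

The Monte Carlo error of this empirical embedding shrinks as the number of observed agents N grows, which is one intuition behind the "blessing of many agents" highlighted in the abstract.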

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-wang20z,
  title     = {Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning},
  author    = {Wang, Lingxiao and Yang, Zhuoran and Wang, Zhaoran},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {10092--10103},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/wang20z/wang20z.pdf},
  url       = {http://proceedings.mlr.press/v119/wang20z.html},
  abstract  = {Multi-agent reinforcement learning (MARL) achieves significant empirical successes. However, MARL suffers from the curse of many agents. In this paper, we exploit the symmetry of agents in MARL. In the most generic form, we study a mean-field MARL problem. Such a mean-field MARL is defined on mean-field states, which are distributions that are supported on continuous space. Based on the mean embedding of the distributions, we propose MF-FQI algorithm, which solves the mean-field MARL and establishes a non-asymptotic analysis for MF-FQI algorithm. We highlight that MF-FQI algorithm enjoys a “blessing of many agents” property in the sense that a larger number of observed agents improves the performance of MF-FQI algorithm.}
}
Endnote
%0 Conference Paper
%T Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
%A Lingxiao Wang
%A Zhuoran Yang
%A Zhaoran Wang
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-wang20z
%I PMLR
%P 10092--10103
%U http://proceedings.mlr.press/v119/wang20z.html
%V 119
%X Multi-agent reinforcement learning (MARL) achieves significant empirical successes. However, MARL suffers from the curse of many agents. In this paper, we exploit the symmetry of agents in MARL. In the most generic form, we study a mean-field MARL problem. Such a mean-field MARL is defined on mean-field states, which are distributions that are supported on continuous space. Based on the mean embedding of the distributions, we propose MF-FQI algorithm, which solves the mean-field MARL and establishes a non-asymptotic analysis for MF-FQI algorithm. We highlight that MF-FQI algorithm enjoys a “blessing of many agents” property in the sense that a larger number of observed agents improves the performance of MF-FQI algorithm.
APA
Wang, L., Yang, Z. & Wang, Z. (2020). Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:10092-10103. Available from http://proceedings.mlr.press/v119/wang20z.html.
