MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

Jeewon Jeon; Woojun Kim; Whiyoung Jung; Youngchul Sung

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:10041-10052, 2022.

Abstract

In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse rewards. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely used assumptions of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-values and the total Q-value. MASER then designs an individual intrinsic reward for each agent, based on an actionable representation relevant to Q-learning, so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-jeon22a,
  title = 	 {{MASER}: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer},
  author =       {Jeon, Jeewon and Kim, Woojun and Jung, Whiyoung and Sung, Youngchul},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {10041--10052},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/jeon22a/jeon22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/jeon22a.html},
  abstract = 	 {In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse rewards. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely used assumptions of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-values and the total Q-value. MASER then designs an individual intrinsic reward for each agent, based on an actionable representation relevant to Q-learning, so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.}
}

Endnote

%0 Conference Paper
%T MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
%A Jeewon Jeon
%A Woojun Kim
%A Whiyoung Jung
%A Youngchul Sung
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-jeon22a
%I PMLR
%P 10041--10052
%U https://proceedings.mlr.press/v162/jeon22a.html
%V 162
%X In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse rewards. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely used assumptions of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-values and the total Q-value. MASER then designs an individual intrinsic reward for each agent, based on an actionable representation relevant to Q-learning, so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.

APA


Jeon, J., Kim, W., Jung, W. & Sung, Y. (2022). MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:10041-10052. Available from https://proceedings.mlr.press/v162/jeon22a.html.
