Sample-Efficient Multiagent Reinforcement Learning with Reset Replay

Yaodong Yang, Guangyong Chen, Jianye Hao, Pheng-Ann Heng
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:55961-55975, 2024.

Abstract

The popularity of multiagent reinforcement learning (MARL) is growing rapidly with the demand for real-world tasks that require swarm intelligence. However, a notable drawback of MARL is its low sample efficiency, which requires a huge number of interactions with the environment. Surprisingly, few MARL works address this practical problem, especially in the parallel environment setting, which greatly hampers the application of MARL to the real world. In response to this gap, this paper proposes Multiagent Reinforcement Learning with Reset Replay (MARR), which greatly improves the sample efficiency of MARL by, for the first time, enabling MARL training at a high replay ratio in the parallel environment setting. To achieve this, MARR first introduces a reset strategy that maintains network plasticity, ensuring that MARL continues to learn under a high replay ratio. Second, MARR incorporates a data augmentation technique to further boost sample efficiency. Extensive experiments in SMAC and MPE show that MARR significantly improves the performance of various MARL approaches with far fewer environment interactions.
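For intuition, "replay ratio" here means the number of gradient updates performed per environment step collected. The minimal single-agent Python sketch below is an illustration of the two mechanisms the abstract names, training from a replay buffer at a high replay ratio and periodically resetting network parameters to restore plasticity; it is not the authors' MARR implementation, and all names (QNet, REPLAY_RATIO, RESET_INTERVAL, the gym-style env interface) are assumptions. MARR's actual reset schedule, data augmentation, and multiagent machinery are described in the paper.

# Illustrative sketch only (not the authors' MARR code): high replay ratio
# training plus periodic parameter resets. Assumes a gym-style env whose
# reset() returns an observation vector and whose step(a) returns
# (next_obs, reward, done, info).
import random
import torch
import torch.nn as nn

REPLAY_RATIO = 8         # assumed: gradient updates per environment step
RESET_INTERVAL = 20_000  # assumed: env steps between parameter resets
BATCH_SIZE = 64

class QNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def reset_parameters(model: nn.Module) -> None:
    # Re-initialise every linear layer; partial resets (e.g. only the last
    # layers) are a common variant in the plasticity/reset literature.
    for m in model.modules():
        if isinstance(m, nn.Linear):
            m.reset_parameters()

def train(env, obs_dim, n_actions, total_steps=100_000):
    qnet = QNet(obs_dim, n_actions)
    opt = torch.optim.Adam(qnet.parameters(), lr=3e-4)
    buffer = []  # (obs, action, reward, next_obs, done) tuples

    obs = env.reset()
    for step in range(total_steps):
        # Collect one transition (exploration policy omitted for brevity).
        action = random.randrange(n_actions)
        next_obs, reward, done, _ = env.step(action)
        buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs

        # High replay ratio: several gradient updates per environment step.
        if len(buffer) >= BATCH_SIZE:
            for _ in range(REPLAY_RATIO):
                batch = random.sample(buffer, BATCH_SIZE)
                o, a, r, no, d = map(torch.tensor, zip(*batch))
                q = qnet(o.float()).gather(1, a.long().unsqueeze(1)).squeeze(1)
                with torch.no_grad():
                    target = r.float() + 0.99 * (1 - d.float()) * qnet(no.float()).max(1).values
                loss = nn.functional.mse_loss(q, target)
                opt.zero_grad()
                loss.backward()
                opt.step()

        # Periodic reset to restore plasticity; the replay buffer is kept,
        # so the reset network relearns quickly from stored experience.
        if (step + 1) % RESET_INTERVAL == 0:
            reset_parameters(qnet)

Note the design point this sketch makes concrete: a high replay ratio tends to erode network plasticity, and resets counteract that while the retained buffer prevents losing the collected experience.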

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-yang24c,
  title     = {Sample-Efficient Multiagent Reinforcement Learning with Reset Replay},
  author    = {Yang, Yaodong and Chen, Guangyong and Hao, Jianye and Heng, Pheng-Ann},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {55961--55975},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/yang24c/yang24c.pdf},
  url       = {https://proceedings.mlr.press/v235/yang24c.html}
}
Endnote
%0 Conference Paper
%T Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
%A Yaodong Yang
%A Guangyong Chen
%A Jianye Hao
%A Pheng-Ann Heng
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-yang24c
%I PMLR
%P 55961--55975
%U https://proceedings.mlr.press/v235/yang24c.html
%V 235
APA
Yang, Y., Chen, G., Hao, J. & Heng, P. (2024). Sample-Efficient Multiagent Reinforcement Learning with Reset Replay. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:55961-55975. Available from https://proceedings.mlr.press/v235/yang24c.html.

Related Material

Download PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/yang24c/yang24c.pdf