Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration

Andreas Kontogiannis, Konstantinos Papathanasiou, Yi Shen, Giorgos Stamou, Michael M. Zavlanos, George Vouros
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:31437-31466, 2025.

Abstract

Learning to cooperate in distributed partially observable environments with no communication abilities poses significant challenges for multi-agent deep reinforcement learning (MARL). This paper addresses key concerns in this domain, focusing on inferring state representations from individual agent observations and leveraging these representations to enhance agents’ exploration and collaborative task execution policies. To this end, we propose a novel state modelling framework for cooperative MARL, where agents infer meaningful belief representations of the non-observable state, with respect to optimizing their own policies, while filtering redundant and less informative joint state information. Building upon this framework, we propose the MARL SMPE$^2$ algorithm. In SMPE$^2$, agents enhance their own policy’s discriminative abilities under partial observability, explicitly by incorporating their beliefs into the policy network, and implicitly by adopting an adversarial type of exploration policy which encourages agents to discover novel, high-value states while improving the discriminative abilities of others. Experimentally, we show that SMPE$^2$ outperforms a plethora of state-of-the-art MARL algorithms in complex fully cooperative tasks from the MPE, LBF, and RWARE benchmarks.
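
To make the abstract's description concrete, the sketch below illustrates, for a single agent, the mechanisms it names: a belief encoder that infers a representation of the non-observable state from local observations, a policy network that explicitly takes that belief as input, and one plausible form of the adversarial intrinsic bonus. This is a minimal sketch under stated assumptions, not the authors' implementation; all module names, network sizes, dimensions, and the exact bonus formula are illustrative.

```python
# Hypothetical sketch in PyTorch; names and formulas are assumptions,
# not the SMPE^2 reference code.
import torch
import torch.nn as nn


class BeliefEncoder(nn.Module):
    """Infers a compact belief embedding of the non-observable state
    from an agent's local observation (the 'state modelling' step)."""

    def __init__(self, obs_dim: int, belief_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64),
            nn.ReLU(),
            nn.Linear(64, belief_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class BeliefConditionedPolicy(nn.Module):
    """Policy network that explicitly consumes [observation, belief],
    sharpening the agent's ability to discriminate states under
    partial observability."""

    def __init__(self, obs_dim: int, belief_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + belief_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor, belief: torch.Tensor):
        logits = self.net(torch.cat([obs, belief], dim=-1))
        return torch.distributions.Categorical(logits=logits)


def adversarial_bonus(own_model_error: torch.Tensor,
                      others_model_error: torch.Tensor) -> torch.Tensor:
    """One plausible reading of the 'adversarial' exploration reward:
    favour states that the other agents' state models currently
    reconstruct poorly (novel to them, so visiting such states gives
    them training signal) while the agent's own model handles them
    well. This exact form is an assumption."""
    return others_model_error - own_model_error


# Usage sketch for a single agent with made-up dimensions.
obs_dim, belief_dim, n_actions = 12, 8, 5
encoder = BeliefEncoder(obs_dim, belief_dim)
policy = BeliefConditionedPolicy(obs_dim, belief_dim, n_actions)

obs = torch.randn(1, obs_dim)          # local observation
belief = encoder(obs)                  # inferred belief over the state
action = policy(obs, belief).sample()  # belief-aware action selection
```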

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-kontogiannis25a,
  title     = {Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration},
  author    = {Kontogiannis, Andreas and Papathanasiou, Konstantinos and Shen, Yi and Stamou, Giorgos and Zavlanos, Michael M. and Vouros, George},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {31437--31466},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/kontogiannis25a/kontogiannis25a.pdf},
  url       = {https://proceedings.mlr.press/v267/kontogiannis25a.html}
}
Endnote
%0 Conference Paper
%T Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration
%A Andreas Kontogiannis
%A Konstantinos Papathanasiou
%A Yi Shen
%A Giorgos Stamou
%A Michael M. Zavlanos
%A George Vouros
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-kontogiannis25a
%I PMLR
%P 31437--31466
%U https://proceedings.mlr.press/v267/kontogiannis25a.html
%V 267
APA
Kontogiannis, A., Papathanasiou, K., Shen, Y., Stamou, G., Zavlanos, M.M. & Vouros, G. (2025). Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:31437-31466. Available from https://proceedings.mlr.press/v267/kontogiannis25a.html.
