Autotelic Reinforcement Learning in Multi-Agent Environments

Eleni Nisioti, Elias Masquil, Gautier Hamon, Clément Moulin-Frier
Proceedings of The 2nd Conference on Lifelong Learning Agents, PMLR 232:137-161, 2023.

Abstract

In the intrinsically motivated skills acquisition problem, the agent is set in an environment without any pre-defined goals and needs to acquire an open-ended repertoire of skills. To do so, the agent needs to be autotelic (from the Greek auto (self) and telos (end goal)): it needs to generate goals and learn to achieve them following its own intrinsic motivation rather than external supervision. Autotelic agents have so far been considered in isolation, but many applications of open-ended learning entail groups of agents. Multi-agent environments pose an additional challenge for autotelic agents: to discover and master goals that require cooperation, agents must pursue them simultaneously, but they have low chances of doing so if they sample them independently. In this work, we propose a new learning paradigm for modelling such settings, the Decentralized Intrinsically Motivated Skills Acquisition Problem (Dec-IMSAP), and employ it to solve cooperative navigation tasks. First, we show that agents setting their goals independently fail to master the full diversity of goals. Then, we show that a sufficient condition for achieving this is to ensure that a group aligns its goals, i.e., the agents pursue the same cooperative goal. Our empirical analysis shows that alignment enables specialization, an efficient strategy for cooperation. Finally, we introduce the Goal-coordination game, a fully-decentralized emergent communication algorithm, where goal alignment emerges from the maximization of individual rewards in multi-goal cooperative environments, and show that it reaches performance equal to that of a centralized training baseline that guarantees aligned goals. To our knowledge, this is the first contribution addressing the problem of intrinsically motivated multi-agent goal exploration in a decentralized training paradigm.
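The abstract's claim that independently sampling agents rarely achieve the simultaneous pursuit that cooperative goals require has a simple combinatorial core: if N agents each draw uniformly from a space of G goals, all of them draw the same goal with probability G^(1-N). The following minimal Python sketch (ours, for illustration; not code from the paper, and NUM_GOALS, NUM_AGENTS are assumed values) checks this empirically:

import random

NUM_GOALS = 20      # assumed size of a finite, uniformly sampled goal space
NUM_AGENTS = 3      # assumed group size
EPISODES = 100_000  # Monte Carlo episodes

aligned = 0
for _ in range(EPISODES):
    # Independent sampling: each agent draws its own goal, as in the
    # failure case the abstract describes.
    goals = [random.randrange(NUM_GOALS) for _ in range(NUM_AGENTS)]
    if len(set(goals)) == 1:  # every agent drew the same cooperative goal
        aligned += 1

empirical = aligned / EPISODES
analytical = NUM_GOALS ** (1 - NUM_AGENTS)  # P(all N agents agree) = G**(1-N)
print(f"aligned episodes: {empirical:.4%} (analytical: {analytical:.4%})")
# Aligned sampling (a shared sampler, or alignment learned through a
# mechanism such as the paper's Goal-coordination game) would instead
# make every episode pursue a common goal.

With 20 goals and 3 agents, the agents align on only 0.25% of episodes, and the fraction vanishes as the goal space or the group grows; this is the gap that the Goal-coordination game, like a centralized goal sampler, is meant to close.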

Cite this Paper


BibTeX
@InProceedings{pmlr-v232-nisioti23a,
  title     = {Autotelic Reinforcement Learning in Multi-Agent Environments},
  author    = {Nisioti, Eleni and Masquil, Elias and Hamon, Gautier and Moulin-Frier, Cl\'ement},
  booktitle = {Proceedings of The 2nd Conference on Lifelong Learning Agents},
  pages     = {137--161},
  year      = {2023},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Sedghi, Hanie and Precup, Doina},
  volume    = {232},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--25 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v232/nisioti23a/nisioti23a.pdf},
  url       = {https://proceedings.mlr.press/v232/nisioti23a.html},
  abstract  = {In the intrinsically motivated skills acquisition problem, the agent is set in an environment without any pre-defined goals and needs to acquire an open-ended repertoire of skills. To do so, the agent needs to be autotelic (from the Greek auto (self) and telos (end goal)): it needs to generate goals and learn to achieve them following its own intrinsic motivation rather than external supervision. Autotelic agents have so far been considered in isolation, but many applications of open-ended learning entail groups of agents. Multi-agent environments pose an additional challenge for autotelic agents: to discover and master goals that require cooperation, agents must pursue them simultaneously, but they have low chances of doing so if they sample them independently. In this work, we propose a new learning paradigm for modelling such settings, the Decentralized Intrinsically Motivated Skills Acquisition Problem (Dec-IMSAP), and employ it to solve cooperative navigation tasks. First, we show that agents setting their goals independently fail to master the full diversity of goals. Then, we show that a sufficient condition for achieving this is to ensure that a group aligns its goals, i.e., the agents pursue the same cooperative goal. Our empirical analysis shows that alignment enables specialization, an efficient strategy for cooperation. Finally, we introduce the Goal-coordination game, a fully-decentralized emergent communication algorithm, where goal alignment emerges from the maximization of individual rewards in multi-goal cooperative environments, and show that it reaches performance equal to that of a centralized training baseline that guarantees aligned goals. To our knowledge, this is the first contribution addressing the problem of intrinsically motivated multi-agent goal exploration in a decentralized training paradigm.}
}
Endnote
%0 Conference Paper
%T Autotelic Reinforcement Learning in Multi-Agent Environments
%A Eleni Nisioti
%A Elias Masquil
%A Gautier Hamon
%A Clément Moulin-Frier
%B Proceedings of The 2nd Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2023
%E Sarath Chandar
%E Razvan Pascanu
%E Hanie Sedghi
%E Doina Precup
%F pmlr-v232-nisioti23a
%I PMLR
%P 137--161
%U https://proceedings.mlr.press/v232/nisioti23a.html
%V 232
%X In the intrinsically motivated skills acquisition problem, the agent is set in an environment without any pre-defined goals and needs to acquire an open-ended repertoire of skills. To do so, the agent needs to be autotelic (from the Greek auto (self) and telos (end goal)): it needs to generate goals and learn to achieve them following its own intrinsic motivation rather than external supervision. Autotelic agents have so far been considered in isolation, but many applications of open-ended learning entail groups of agents. Multi-agent environments pose an additional challenge for autotelic agents: to discover and master goals that require cooperation, agents must pursue them simultaneously, but they have low chances of doing so if they sample them independently. In this work, we propose a new learning paradigm for modelling such settings, the Decentralized Intrinsically Motivated Skills Acquisition Problem (Dec-IMSAP), and employ it to solve cooperative navigation tasks. First, we show that agents setting their goals independently fail to master the full diversity of goals. Then, we show that a sufficient condition for achieving this is to ensure that a group aligns its goals, i.e., the agents pursue the same cooperative goal. Our empirical analysis shows that alignment enables specialization, an efficient strategy for cooperation. Finally, we introduce the Goal-coordination game, a fully-decentralized emergent communication algorithm, where goal alignment emerges from the maximization of individual rewards in multi-goal cooperative environments, and show that it reaches performance equal to that of a centralized training baseline that guarantees aligned goals. To our knowledge, this is the first contribution addressing the problem of intrinsically motivated multi-agent goal exploration in a decentralized training paradigm.
APA
Nisioti, E., Masquil, E., Hamon, G. & Moulin-Frier, C. (2023). Autotelic Reinforcement Learning in Multi-Agent Environments. Proceedings of The 2nd Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 232:137-161. Available from https://proceedings.mlr.press/v232/nisioti23a.html.
