Decentralized Multi-Agents by Imitation of a Centralized Controller

Alex Tong Lin, Mark Debord, Katia Estabridis, Gary Hewer, Guido Montufar, Stanley Osher
Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:619-651, 2022.

Abstract

We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agents' policies, and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert, as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.
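
To make the two-phase framework from the abstract concrete, here is a minimal sketch (ours, not the authors' code): phase one trains a single centralized expert on the joint observation and action space with any reinforcement learning algorithm; phase two trains independent decentralized agents by behavioral cloning, each imitating its own slice of the expert's joint action from its local observation alone. Everything below is assumed for illustration: linear policies, a random linear map standing in for the trained expert, and hypothetical names such as centralized_expert and agents.

    import numpy as np

    rng = np.random.default_rng(0)
    N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2

    # Phase 1 (assumed already done): a policy trained on the joint space by
    # any RL algorithm. A fixed random linear map stands in for it here.
    W_EXPERT = rng.normal(size=(N_AGENTS * OBS_DIM, N_AGENTS * ACT_DIM))

    def centralized_expert(joint_obs):
        """Map the concatenated observations of all agents to a joint action."""
        return joint_obs @ W_EXPERT

    # Phase 2: behavioral cloning. Each decentralized agent i fits its slice
    # of the expert's joint action using only its own local observation.
    agents = [np.zeros((OBS_DIM, ACT_DIM)) for _ in range(N_AGENTS)]
    lr = 1e-2
    for step in range(1000):
        joint_obs = rng.normal(size=N_AGENTS * OBS_DIM)
        joint_act = centralized_expert(joint_obs)
        for i, W_i in enumerate(agents):
            obs_i = joint_obs[i * OBS_DIM:(i + 1) * OBS_DIM]     # local view only
            target_i = joint_act[i * ACT_DIM:(i + 1) * ACT_DIM]  # expert's label
            pred_i = obs_i @ W_i
            # Gradient step on the squared imitation loss ||pred - target||^2.
            W_i -= lr * np.outer(obs_i, pred_i - target_i)

    # At execution time each agent acts from its own observation alone:
    # action_i = obs_i @ W_i, with no access to the other agents' policies.

Since each imitator conditions only on its local observation while the expert's action may depend on all observations, the imitation loss generally cannot reach zero; this residual gap is the price of decentralized execution.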

Cite this Paper

BibTeX
@InProceedings{pmlr-v145-lin22a,
  title     = {Decentralized Multi-Agents by Imitation of a Centralized Controller},
  author    = {Lin, Alex Tong and Debord, Mark and Estabridis, Katia and Hewer, Gary and Montufar, Guido and Osher, Stanley},
  booktitle = {Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference},
  pages     = {619--651},
  year      = {2022},
  editor    = {Bruna, Joan and Hesthaven, Jan and Zdeborova, Lenka},
  volume    = {145},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--19 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v145/lin22a/lin22a.pdf},
  url       = {https://proceedings.mlr.press/v145/lin22a.html},
  abstract  = {We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.}
}
Endnote
%0 Conference Paper
%T Decentralized Multi-Agents by Imitation of a Centralized Controller
%A Alex Tong Lin
%A Mark Debord
%A Katia Estabridis
%A Gary Hewer
%A Guido Montufar
%A Stanley Osher
%B Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Joan Bruna
%E Jan Hesthaven
%E Lenka Zdeborova
%F pmlr-v145-lin22a
%I PMLR
%P 619--651
%U https://proceedings.mlr.press/v145/lin22a.html
%V 145
%X We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.
APA
Lin, A.T., Debord, M., Estabridis, K., Hewer, G., Montufar, G. & Osher, S. (2022). Decentralized Multi-Agents by Imitation of a Centralized Controller. Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, in Proceedings of Machine Learning Research 145:619-651. Available from https://proceedings.mlr.press/v145/lin22a.html.
