Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions

Michael Chang, Sid Kaushik, S. Matthew Weinberg, Tom Griffiths, Sergey Levine
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1437-1447, 2020.

Abstract

This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems. What makes it challenging to use a decentralized approach to collectively optimize a central objective is the difficulty in characterizing the equilibrium strategy profile of non-cooperative games. To overcome this challenge, we design a mechanism for defining the learning environment of each agent for which we know that the optimal solution for the global objective coincides with a Nash equilibrium strategy profile of the agents optimizing their own local objectives. The society functions as an economy of agents that learn the credit assignment process itself by buying and selling to each other the right to operate on the environment state. We derive a class of decentralized reinforcement learning algorithms that are broadly applicable not only to standard reinforcement learning but also for selecting options in semi-MDPs and dynamically composing computation graphs. Lastly, we demonstrate the potential advantages of a society’s inherent modular structure for more efficient transfer learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chang20b,
  title     = {Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions},
  author    = {Chang, Michael and Kaushik, Sid and Weinberg, S. Matthew and Griffiths, Tom and Levine, Sergey},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1437--1447},
  year      = {2020},
  editor    = {Hal Daumé III and Aarti Singh},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chang20b/chang20b.pdf},
  url       = {http://proceedings.mlr.press/v119/chang20b.html},
  abstract  = {This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems. What makes it challenging to use a decentralized approach to collectively optimize a central objective is the difficulty in characterizing the equilibrium strategy profile of non-cooperative games. To overcome this challenge, we design a mechanism for defining the learning environment of each agent for which we know that the optimal solution for the global objective coincides with a Nash equilibrium strategy profile of the agents optimizing their own local objectives. The society functions as an economy of agents that learn the credit assignment process itself by buying and selling to each other the right to operate on the environment state. We derive a class of decentralized reinforcement learning algorithms that are broadly applicable not only to standard reinforcement learning but also for selecting options in semi-MDPs and dynamically composing computation graphs. Lastly, we demonstrate the potential advantages of a society’s inherent modular structure for more efficient transfer learning.}
}
Endnote
%0 Conference Paper
%T Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
%A Michael Chang
%A Sid Kaushik
%A S. Matthew Weinberg
%A Tom Griffiths
%A Sergey Levine
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chang20b
%I PMLR
%P 1437--1447
%U http://proceedings.mlr.press/v119/chang20b.html
%V 119
%X This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems. What makes it challenging to use a decentralized approach to collectively optimize a central objective is the difficulty in characterizing the equilibrium strategy profile of non-cooperative games. To overcome this challenge, we design a mechanism for defining the learning environment of each agent for which we know that the optimal solution for the global objective coincides with a Nash equilibrium strategy profile of the agents optimizing their own local objectives. The society functions as an economy of agents that learn the credit assignment process itself by buying and selling to each other the right to operate on the environment state. We derive a class of decentralized reinforcement learning algorithms that are broadly applicable not only to standard reinforcement learning but also for selecting options in semi-MDPs and dynamically composing computation graphs. Lastly, we demonstrate the potential advantages of a society’s inherent modular structure for more efficient transfer learning.
APA
Chang, M., Kaushik, S., Weinberg, S. M., Griffiths, T., & Levine, S. (2020). Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1437-1447. Available from http://proceedings.mlr.press/v119/chang20b.html.
