Interactive Inverse Reinforcement Learning for Cooperative Games

Thomas Kleine Büning; Anne-Marie George; Christos Dimitrakakis

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2393-2413, 2022.

Abstract

We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. This problem is modeled as a cooperative episodic two-agent Markov decision process. We assume control over only the first of the two agents in a Stackelberg formulation of the game, where the second agent is acting so as to maximise expected utility given the first agent’s policy. How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible? We analyse how knowledge about the reward function can be gained in this interactive two-agent scenario. We show that when the learning agent’s policies have a significant effect on the transition function, the reward function can be learned efficiently.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-buning22a,
  title = 	 {Interactive Inverse Reinforcement Learning for Cooperative Games},
  author =       {B{\"u}ning, Thomas Kleine and George, Anne-Marie and Dimitrakakis, Christos},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {2393--2413},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/buning22a/buning22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/buning22a.html},
  abstract = 	 {We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. This problem is modeled as a cooperative episodic two-agent Markov decision process. We assume control over only the first of the two agents in a Stackelberg formulation of the game, where the second agent is acting so as to maximise expected utility given the first agent’s policy. How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible? We analyse how knowledge about the reward function can be gained in this interactive two-agent scenario. We show that when the learning agent’s policies have a significant effect on the transition function, the reward function can be learned efficiently.}
}

Endnote

%0 Conference Paper
%T Interactive Inverse Reinforcement Learning for Cooperative Games
%A Thomas Kleine Büning
%A Anne-Marie George
%A Christos Dimitrakakis
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-buning22a
%I PMLR
%P 2393--2413
%U https://proceedings.mlr.press/v162/buning22a.html
%V 162
%X We study the problem of designing autonomous agents that can learn to cooperate effectively with a potentially suboptimal partner while having no access to the joint reward function. This problem is modeled as a cooperative episodic two-agent Markov decision process. We assume control over only the first of the two agents in a Stackelberg formulation of the game, where the second agent is acting so as to maximise expected utility given the first agent’s policy. How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible? We analyse how knowledge about the reward function can be gained in this interactive two-agent scenario. We show that when the learning agent’s policies have a significant effect on the transition function, the reward function can be learned efficiently.

APA


Büning, T.K., George, A. & Dimitrakakis, C. (2022). Interactive Inverse Reinforcement Learning for Cooperative Games. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:2393-2413. Available from https://proceedings.mlr.press/v162/buning22a.html.
