Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

Muhammad A Rahman; Niklas Hopner; Filippos Christianos; Stefano V Albrecht

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:8776-8786, 2021.

Abstract

Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the environment without prior notification. Our solution builds on graph neural networks to learn agent models and joint-action value models under varying team compositions. We contribute a novel action-value computation that integrates the agent model and joint-action value model to produce action-value estimates. We empirically demonstrate that our approach successfully models the effects other agents have on the learner, leading to policies that robustly adapt to dynamic team compositions and significantly outperform several alternative methods.

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-rahman21a,
  title     = {Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning},
  author    = {Rahman, Muhammad A and Hopner, Niklas and Christianos, Filippos and Albrecht, Stefano V},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {8776--8786},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/rahman21a/rahman21a.pdf},
  url       = {https://proceedings.mlr.press/v139/rahman21a.html},
  abstract = 	 {Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the environment without prior notification. Our solution builds on graph neural networks to learn agent models and joint-action value models under varying team compositions. We contribute a novel action-value computation that integrates the agent model and joint-action value model to produce action-value estimates. We empirically demonstrate that our approach successfully models the effects other agents have on the learner, leading to policies that robustly adapt to dynamic team compositions and significantly outperform several alternative methods.}
}

Endnote

%0 Conference Paper
%T Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning
%A Muhammad A Rahman
%A Niklas Hopner
%A Filippos Christianos
%A Stefano V Albrecht
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-rahman21a
%I PMLR
%P 8776--8786
%U https://proceedings.mlr.press/v139/rahman21a.html
%V 139
%X Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the environment without prior notification. Our solution builds on graph neural networks to learn agent models and joint-action value models under varying team compositions. We contribute a novel action-value computation that integrates the agent model and joint-action value model to produce action-value estimates. We empirically demonstrate that our approach successfully models the effects other agents have on the learner, leading to policies that robustly adapt to dynamic team compositions and significantly outperform several alternative methods.

APA


Rahman, M.A., Hopner, N., Christianos, F., & Albrecht, S.V. (2021). Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:8776-8786. Available from https://proceedings.mlr.press/v139/rahman21a.html.
