Influencing Towards Stable Multi-Agent Interactions

Woodrow Zhouyuan Wang; Andy Shih; Annie Xie; Dorsa Sadigh

Proceedings of the 5th Conference on Robot Learning, PMLR 164:1132-1143, 2022.

Abstract

Learning in multi-agent environments is difficult due to the non-stationarity introduced by an opponent’s or partner’s changing behaviors. Instead of reactively adapting to the other agent’s (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent’s strategy to stabilize – which can restrain the non-stationarity caused by the other agent. We learn a low-dimensional latent representation of the other agent’s strategy and the dynamics of how the latent strategy evolves with respect to our robot’s behavior. With this learned dynamics model, we can define an unsupervised stability reward to train our robot to deliberately influence the other agent to stabilize towards a single strategy. We demonstrate the effectiveness of stabilizing in improving efficiency of maximizing the task reward in a variety of simulated environments, including autonomous driving, emergent communication, and robotic manipulation. We show qualitative results on our website.
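The abstract does not state the form of the stability reward. As a minimal sketch, assuming the reward penalizes the distance between consecutive predicted latent strategies, one plausible version could look like the following. The function names, the Euclidean distance, and the `weight` parameter are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def stability_reward(z_prev: Sequence[float], z_next: Sequence[float]) -> float:
    """Hypothetical stability reward (an assumption, not the paper's definition).

    Returns the negative Euclidean distance between two consecutive latent
    strategies, so the reward is highest (zero) when the other agent's
    latent strategy does not change between interactions.
    """
    if len(z_prev) != len(z_next):
        raise ValueError("latent strategies must have the same dimension")
    return -math.dist(z_prev, z_next)


def combined_reward(
    task_reward: float,
    z_prev: Sequence[float],
    z_next: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Task reward plus a weighted stability term (the weighting is assumed)."""
    return task_reward + weight * stability_reward(z_prev, z_next)


if __name__ == "__main__":
    # An unchanged latent strategy incurs no penalty.
    print(stability_reward([0.2, -0.1], [0.2, -0.1]))
    # A shifting latent strategy is penalized by the size of the shift.
    print(combined_reward(1.0, [0.0, 0.0], [3.0, 4.0], weight=0.1))
```

In this sketch the latent vectors would come from the learned dynamics model described above; here they are passed in directly so the reward computation can be read on its own.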

Cite this Paper

BibTeX


@InProceedings{pmlr-v164-wang22f,
  title = {Influencing Towards Stable Multi-Agent Interactions},
  author = {Wang, Woodrow Zhouyuan and Shih, Andy and Xie, Annie and Sadigh, Dorsa},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages = {1132--1143},
  year = {2022},
  editor = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = {164},
  series = {Proceedings of Machine Learning Research},
  month = {08--11 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v164/wang22f/wang22f.pdf},
  url = {https://proceedings.mlr.press/v164/wang22f.html},
  abstract = {Learning in multi-agent environments is difficult due to the non-stationarity introduced by an opponent’s or partner’s changing behaviors. Instead of reactively adapting to the other agent’s (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent’s strategy to stabilize – which can restrain the non-stationarity caused by the other agent. We learn a low-dimensional latent representation of the other agent’s strategy and the dynamics of how the latent strategy evolves with respect to our robot’s behavior. With this learned dynamics model, we can define an unsupervised stability reward to train our robot to deliberately influence the other agent to stabilize towards a single strategy. We demonstrate the effectiveness of stabilizing in improving efficiency of maximizing the task reward in a variety of simulated environments, including autonomous driving, emergent communication, and robotic manipulation. We show qualitative results on our website.}
}

Endnote

%0 Conference Paper
%T Influencing Towards Stable Multi-Agent Interactions
%A Woodrow Zhouyuan Wang
%A Andy Shih
%A Annie Xie
%A Dorsa Sadigh
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-wang22f
%I PMLR
%P 1132--1143
%U https://proceedings.mlr.press/v164/wang22f.html
%V 164
%X Learning in multi-agent environments is difficult due to the non-stationarity introduced by an opponent’s or partner’s changing behaviors. Instead of reactively adapting to the other agent’s (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent’s strategy to stabilize – which can restrain the non-stationarity caused by the other agent. We learn a low-dimensional latent representation of the other agent’s strategy and the dynamics of how the latent strategy evolves with respect to our robot’s behavior. With this learned dynamics model, we can define an unsupervised stability reward to train our robot to deliberately influence the other agent to stabilize towards a single strategy. We demonstrate the effectiveness of stabilizing in improving efficiency of maximizing the task reward in a variety of simulated environments, including autonomous driving, emergent communication, and robotic manipulation. We show qualitative results on our website.

APA


Wang, W.Z., Shih, A., Xie, A. & Sadigh, D. (2022). Influencing Towards Stable Multi-Agent Interactions. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:1132-1143. Available from https://proceedings.mlr.press/v164/wang22f.html.
