Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous

Rose Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Lee, Tingnan Zhang, Brian Ichter, Jie Tan, Aleksandra Faust
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:711-725, 2021.

Abstract

Collaboration requires agents to align their goals on the fly. Underlying the human ability to align goals with other agents is the ability to predict others' intentions and actively update one's own plans. We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous. Starting with pretrained, single-agent point-to-point navigation policies and using noisy, high-dimensional sensor inputs such as lidar, we first learn, via self-supervision, motion prediction models for all agents on the team. HPP then uses these prediction models to propose and evaluate navigation subgoals that complete the rendezvous task without explicit communication among agents. We evaluate HPP in a suite of unseen environments of increasing complexity and obstacle count, and show that it outperforms reinforcement learning, path planning, and heuristic baselines in these challenging settings. Real-world experiments demonstrate successful transfer of the prediction models from simulation to the real world without any additional fine-tuning. Altogether, HPP removes the need for a centralized operator in multiagent systems by combining model-based RL and inference methods, enabling agents to dynamically align their plans.
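
As a rough sketch of the mechanism the abstract describes, the Python snippet below shows one way an agent could propose candidate rendezvous subgoals, score them with learned motion prediction models of its teammates, and refit the proposal distribution (a cross-entropy-style update). This is illustrative only, not the authors' implementation; the model interface (predict_pose), the observation layout, and all parameter defaults are assumptions.

import numpy as np

def select_subgoal(prediction_models, observations, horizon=10,
                   n_candidates=64, n_elites=8, n_iters=3):
    """Pick a 2D subgoal the agent expects the whole team can reach,
    judged entirely by its learned prediction models (no communication)."""
    # Start the proposal distribution at the mean of the last observed
    # teammate positions; the 2 m spread is an assumed initial value.
    mean = np.mean([obs["position"] for obs in observations.values()], axis=0)
    std = np.ones(2) * 2.0

    for _ in range(n_iters):
        candidates = np.random.normal(mean, std, size=(n_candidates, 2))
        scores = []
        for goal in candidates:
            # Hypothetical model call: predict where each teammate would
            # be after `horizon` steps if it navigated toward `goal`.
            predicted = [
                model.predict_pose(observations[agent], goal, horizon)
                for agent, model in prediction_models.items()
            ]
            # A rendezvous subgoal is good when all predicted positions
            # cluster tightly around it, so score by negative spread.
            spread = np.mean([np.linalg.norm(p - goal) for p in predicted])
            scores.append(-spread)
        # Refit the proposal distribution to the best-scoring candidates.
        elites = candidates[np.argsort(scores)[-n_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3

    # The chosen subgoal would be handed to the pretrained point-to-point
    # navigation policy, which handles low-level obstacle avoidance.
    return mean

In this sketch each agent runs the same loop independently, so plan alignment emerges from the shared prediction models rather than from messages, consistent with the abstract's "without explicit communication" setting.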

Cite this Paper


BibTeX
@InProceedings{pmlr-v155-wang21d,
  title     = {Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous},
  author    = {Wang, Rose and Kew, J. Chase and Lee, Dennis and Lee, Tsang-Wei and Zhang, Tingnan and Ichter, Brian and Tan, Jie and Faust, Aleksandra},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages     = {711--725},
  year      = {2021},
  editor    = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume    = {155},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v155/wang21d/wang21d.pdf},
  url       = {https://proceedings.mlr.press/v155/wang21d.html}
}
APA
Wang, R., Kew, J.C., Lee, D., Lee, T., Zhang, T., Ichter, B., Tan, J. & Faust, A. (2021). Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:711-725. Available from https://proceedings.mlr.press/v155/wang21d.html.
