Hierarchical Policy Blending As Optimal Transport

An Thai Le, Kay Hansel, Jan Peters, Georgia Chalvatzaki
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:797-812, 2023.

Abstract

We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot
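To make the phrase "policy blending as unbalanced optimal transport" more concrete, the following is a minimal sketch of generic entropic unbalanced optimal transport with KL-relaxed marginals, solved by scaling iterations. It is not the authors' implementation. The function name `unbalanced_sinkhorn`, the parameters `eps` and `rho`, the random cost matrix, and the reading of row masses as blending weights are all assumptions made for illustration only.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, cost, eps=0.1, rho=1.0, n_iters=200):
    """Generic entropic unbalanced OT (KL-relaxed marginals) via scaling iterations.

    a, b  : non-negative source/target masses (they need not sum to the same value)
    cost  : (len(a), len(b)) cost matrix
    eps   : entropic regularization strength (assumed value)
    rho   : marginal relaxation strength (assumed value)
    """
    K = np.exp(-cost / eps)        # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = rho / (rho + eps)      # exponent < 1 relaxes the marginal constraints
    for _ in range(n_iters):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]   # transport plan


# Hypothetical setup: 3 expert policies and 4 candidate look-ahead targets.
rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, size=(3, 4))   # placeholder costs, not from the paper
a = np.full(3, 1.0 / 3.0)                   # uniform prior mass over experts
b = np.full(4, 0.25)                        # uniform mass over targets

plan = unbalanced_sinkhorn(a, b, cost)

# Illustrative assumption: read the mass each expert transports as its blending weight.
weights = plan.sum(axis=1)
weights = weights / weights.sum()
print(weights)
```

Because the marginals are only softly enforced, experts whose costs are high end up transporting less mass, which is one plausible way such a scheme could down-weight an expert rather than force every expert to contribute equally.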

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-le23a,
  title = {Hierarchical Policy Blending As Optimal Transport},
  author = {Le, An Thai and Hansel, Kay and Peters, Jan and Chalvatzaki, Georgia},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = {797--812},
  year = {2023},
  editor = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = {211},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Jun},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v211/le23a/le23a.pdf},
  url = {https://proceedings.mlr.press/v211/le23a.html},
  abstract = {We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot}
}
Endnote
%0 Conference Paper
%T Hierarchical Policy Blending As Optimal Transport
%A An Thai Le
%A Kay Hansel
%A Jan Peters
%A Georgia Chalvatzaki
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-le23a
%I PMLR
%P 797--812
%U https://proceedings.mlr.press/v211/le23a.html
%V 211
%X We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot
APA
Le, A. T., Hansel, K., Peters, J., & Chalvatzaki, G. (2023). Hierarchical Policy Blending As Optimal Transport. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:797-812. Available from https://proceedings.mlr.press/v211/le23a.html.
