Hierarchical Policy Blending As Optimal Transport

An Thai Le; Kay Hansel; Jan Peters; Georgia Chalvatzaki

Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:797-812, 2023.

Abstract

We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot
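The abstract describes blending expert policies by casting the weight selection as unbalanced optimal transport. As a rough illustration of that idea only, the sketch below runs the standard entropic unbalanced Sinkhorn scaling iterations (KL-relaxed marginals) and reads blending weights off the row masses of the transport plan. This is not the formulation from the paper: the cost matrix, the "look-ahead candidates", the parameters `eps` and `rho`, and the function names `unbalanced_sinkhorn` and `blending_weights` are all assumptions made for this example.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, cost, eps=0.1, rho=1.0, n_iters=200):
    """Entropic unbalanced OT with KL-relaxed marginals (generic scaling scheme).

    a, b : non-negative mass vectors over rows and columns
    cost : (len(a), len(b)) cost matrix
    eps  : entropic regularization strength (assumed value)
    rho  : marginal relaxation strength (assumed value)
    """
    K = np.exp(-cost / eps)      # Gibbs kernel
    fi = rho / (rho + eps)       # exponent that softens the marginal constraints
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        v = (b / (K.T @ u)) ** fi
        u = (a / (K @ v)) ** fi
    # Transport plan: diag(u) K diag(v)
    return u[:, None] * K * v[None, :]


def blending_weights(plan):
    """Hypothetical read-out: normalized mass each expert sends out."""
    mass = plan.sum(axis=1)
    return mass / mass.sum()


# Hypothetical setup: 3 expert policies (rows), 4 look-ahead candidates (columns).
# Lower cost means the expert is a better match for that candidate.
cost = np.array([
    [0.1, 0.8, 0.8, 0.7],   # expert 0: well matched to candidate 0
    [0.9, 0.2, 0.9, 0.8],   # expert 1: well matched to candidate 1
    [0.9, 0.8, 0.9, 0.9],   # expert 2: poorly matched everywhere
])
a = np.full(3, 1.0 / 3.0)   # uniform prior mass over experts
b = np.full(4, 1.0 / 4.0)   # uniform mass over candidates

plan = unbalanced_sinkhorn(a, b, cost)
weights = blending_weights(plan)
print(weights)
```

Because the marginals are only softly enforced, an expert whose costs are uniformly high ends up transporting less mass, so its blending weight shrinks instead of being forced to match the prior. In this toy setup that is expert 2.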

Cite this Paper

BibTeX


@InProceedings{pmlr-v211-le23a,
  title =     {Hierarchical Policy Blending As Optimal Transport},
  author =    {Le, An Thai and Hansel, Kay and Peters, Jan and Chalvatzaki, Georgia},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages =     {797--812},
  year =      {2023},
  editor =    {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume =    {211},
  series =    {Proceedings of Machine Learning Research},
  month =     {15--16 Jun},
  publisher = {PMLR},
  pdf =       {https://proceedings.mlr.press/v211/le23a/le23a.pdf},
  url =       {https://proceedings.mlr.press/v211/le23a.html},
  abstract = 	 {We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot}
}

Endnote

%0 Conference Paper
%T Hierarchical Policy Blending As Optimal Transport
%A An Thai Le
%A Kay Hansel
%A Jan Peters
%A Georgia Chalvatzaki
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-le23a
%I PMLR
%P 797--812
%U https://proceedings.mlr.press/v211/le23a.html
%V 211
%X We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot

APA


Le, A.T., Hansel, K., Peters, J. & Chalvatzaki, G. (2023). Hierarchical Policy Blending As Optimal Transport. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:797-812. Available from https://proceedings.mlr.press/v211/le23a.html.
