Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning

brian ichter; Pierre Sermanet; Corey Lynch

Proceedings of the 5th Conference on Robot Learning, PMLR 164:59-69, 2022.

Abstract

Long-horizon planning in realistic environments requires the ability to reason over sequential tasks in high-dimensional state spaces with complex dynamics. Classical motion planning algorithms, such as rapidly-exploring random trees (RRTs), are capable of efficiently exploring large state spaces and computing long-horizon, sequential plans. However, these algorithms are generally challenged by complex, stochastic, and high-dimensional state spaces, as well as by small, topologically complex goal regions, which naturally emerge in tasks that interact with the environment. Machine learning offers a promising solution through its ability to learn general policies that can handle complex interactions and high-dimensional observations. However, these policies are generally limited in horizon length. Our approach, Broadly-Exploring, Local-policy Trees (BELT), merges these two approaches to leverage the strengths of both through a task-conditioned, model-based tree search. BELT uses an RRT-inspired tree search to efficiently explore the state space. Locally, the exploration is guided by a task-conditioned, learned policy capable of performing general short-horizon tasks. This task space can be quite general and abstract; its only requirements are that it can be sampled and that it covers the space of useful tasks well. This search is aided by a task-conditioned model that temporally extends dynamics propagation to allow long-horizon search and sequential reasoning over tasks. BELT is demonstrated experimentally to be able to plan long-horizon, sequential trajectories with a goal-conditioned policy and to generate plans that are robust.
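The search loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 2D point state, the definition of a task as a target point, the nearest-node selection rule, and the names `toy_model` and `plan` are all assumptions made for illustration. The hand-written `toy_model` stands in for the learned task-conditioned model, which in the paper would predict the outcome of running the learned local policy on a sampled task.

```python
import math
import random


def toy_model(state, task, max_step=1.0):
    """Hypothetical stand-in for a learned task-conditioned model.

    Predicts the state reached after a local policy pursues `task`
    (here simply a target point) for one short-horizon segment.
    """
    dx, dy = task[0] - state[0], task[1] - state[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return task
    scale = max_step / dist
    return (state[0] + dx * scale, state[1] + dy * scale)


def plan(start, goal, goal_radius=1.0, bounds=(0.0, 10.0),
         max_iters=2000, seed=0):
    """RRT-inspired search over sequences of sampled tasks (toy version)."""
    rng = random.Random(seed)
    nodes, parents, tasks = [start], [None], [None]
    for _ in range(max_iters):
        # Assumed selection rule: expand the node nearest to a random
        # state sample, which biases growth toward unexplored regions.
        sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        idx = min(range(len(nodes)),
                  key=lambda i: math.dist(nodes[i], sample))
        # Sample a task and propagate through the task-conditioned model.
        task = (rng.uniform(*bounds), rng.uniform(*bounds))
        new_state = toy_model(nodes[idx], task)
        nodes.append(new_state)
        parents.append(idx)
        tasks.append(task)
        if math.dist(new_state, goal) <= goal_radius:
            # Walk back to the root to recover the task sequence.
            sequence, i = [], len(nodes) - 1
            while parents[i] is not None:
                sequence.append(tasks[i])
                i = parents[i]
            return sequence[::-1]
    return None  # no plan found within the iteration budget
```

Calling `plan((1.0, 1.0), (8.0, 8.0))` returns a list of tasks to hand to the local policy in order, or `None` if the budget runs out. The returned plan is a sequence of tasks rather than low-level actions, which mirrors the abstract's point that the model temporally extends each propagation step.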

Cite this Paper

BibTeX

@InProceedings{pmlr-v164-ichter22a,
  title = 	 {Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning},
  author =       {ichter, brian and Sermanet, Pierre and Lynch, Corey},
  booktitle = 	 {Proceedings of the 5th Conference on Robot Learning},
  pages = 	 {59--69},
  year = 	 {2022},
  editor = 	 {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = 	 {164},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {08--11 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v164/ichter22a/ichter22a.pdf},
  url = 	 {https://proceedings.mlr.press/v164/ichter22a.html},
  abstract = 	 {Long-horizon planning in realistic environments requires the ability to reason over sequential tasks in high-dimensional state spaces with complex dynamics. Classical motion planning algorithms, such as rapidly-exploring random trees, are capable of efficiently exploring large state spaces and computing long-horizon, sequential plans. However, these algorithms are generally challenged with complex, stochastic, and high-dimensional state spaces as well as in the presence of small, topologically complex goal regions, which naturally emerge in tasks that interact with the environment. Machine learning offers a promising solution for its ability to learn general policies that can handle complex interactions and high-dimensional observations. However, these policies are generally limited in horizon length. Our approach, Broadly-Exploring, Local-policy Trees (BELT), merges these two approaches to leverage the strengths of both through a task-conditioned, model-based tree search. BELT uses an RRT-inspired tree search to efficiently explore the state space. Locally, the exploration is guided by a task-conditioned, learned policy capable of performing general short-horizon tasks. This task space can be quite general and abstract; its only requirements are to be sampleable and to well-cover the space of useful tasks. This search is aided by a task-conditioned model that temporally extends dynamics propagation to allow long-horizon search and sequential reasoning over tasks. BELT is demonstrated experimentally to be able to plan long-horizon, sequential trajectories with a goal conditioned policy and generate plans that are robust.}
}

Endnote

%0 Conference Paper
%T Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning
%A brian ichter
%A Pierre Sermanet
%A Corey Lynch
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-ichter22a
%I PMLR
%P 59--69
%U https://proceedings.mlr.press/v164/ichter22a.html
%V 164
%X Long-horizon planning in realistic environments requires the ability to reason over sequential tasks in high-dimensional state spaces with complex dynamics. Classical motion planning algorithms, such as rapidly-exploring random trees, are capable of efficiently exploring large state spaces and computing long-horizon, sequential plans. However, these algorithms are generally challenged with complex, stochastic, and high-dimensional state spaces as well as in the presence of small, topologically complex goal regions, which naturally emerge in tasks that interact with the environment. Machine learning offers a promising solution for its ability to learn general policies that can handle complex interactions and high-dimensional observations. However, these policies are generally limited in horizon length. Our approach, Broadly-Exploring, Local-policy Trees (BELT), merges these two approaches to leverage the strengths of both through a task-conditioned, model-based tree search. BELT uses an RRT-inspired tree search to efficiently explore the state space. Locally, the exploration is guided by a task-conditioned, learned policy capable of performing general short-horizon tasks. This task space can be quite general and abstract; its only requirements are to be sampleable and to well-cover the space of useful tasks. This search is aided by a task-conditioned model that temporally extends dynamics propagation to allow long-horizon search and sequential reasoning over tasks. BELT is demonstrated experimentally to be able to plan long-horizon, sequential trajectories with a goal conditioned policy and generate plans that are robust.

APA

ichter, b., Sermanet, P. & Lynch, C. (2022). Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:59-69. Available from https://proceedings.mlr.press/v164/ichter22a.html.
