Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning

Robert Gieselmann; Florian T. Pokorny

Proceedings of The 7th Conference on Robot Learning, PMLR 229:1-22, 2023.

Abstract

Sampling-based motion planning algorithms excel at searching for global solution paths in geometrically complex settings. However, classical approaches, such as RRT, are difficult to scale beyond low-dimensional search spaces and rely on privileged knowledge, e.g., about collision detection and underlying state distances. In this work, we take a step towards the integration of sampling-based planning into the reinforcement learning framework to solve sparse-reward control tasks from high-dimensional inputs. Our method, called VELAP, determines sequences of waypoints through sampling-based exploration in a learned state embedding. Unlike other sampling-based techniques, we iteratively expand a tree-based memory of visited latent areas, which is leveraged to explore a larger portion of the latent space for a given number of search iterations. We demonstrate state-of-the-art results in learning control from offline data in the context of vision-based manipulation under sparse reward feedback. Our method extends the set of available planning tools in model-based reinforcement learning by adding a latent planner that searches globally for feasible paths instead of being bound to a fixed prediction horizon.
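
The idea of iteratively expanding a tree-based memory of visited areas can be illustrated with a toy sketch. This is not the authors' VELAP implementation: it works on plain coordinate tuples rather than a learned state embedding, and the function name `expand_tree`, the callback `is_valid` (a stand-in for whatever learned feasibility or dynamics model a real planner would use), and all parameters are assumptions made for illustration. Nodes in sparsely visited regions are selected more often, so the tree tends to spread into unexplored space.

```python
import math
import random


def expand_tree(start, goal, is_valid, step=0.2, radius=0.3,
                goal_tol=0.25, iters=2000, seed=0):
    """Toy expansive tree search (illustrative only, not VELAP).

    `is_valid(parent, child)` is a hypothetical feasibility check.
    Returns a list of waypoints from `start` to a point within
    `goal_tol` of `goal`, or None if the iteration budget runs out.
    """
    rng = random.Random(seed)
    nodes = [tuple(start)]   # visited points (the tree memory)
    parents = [None]         # parent index of each node
    counts = [1]             # neighbours within `radius`, including self
    if math.dist(nodes[0], goal) < goal_tol:
        return [nodes[0]]
    for _ in range(iters):
        # Prefer nodes in sparsely visited regions (inverse local density).
        weights = [1.0 / c for c in counts]
        i = rng.choices(range(len(nodes)), weights=weights, k=1)[0]
        base = nodes[i]
        # Propose a new point by a bounded random perturbation.
        cand = tuple(x + rng.uniform(-step, step) for x in base)
        if not is_valid(base, cand):
            continue
        # Update local density counts incrementally.
        c_new = 1
        for k, q in enumerate(nodes):
            if math.dist(q, cand) < radius:
                counts[k] += 1
                c_new += 1
        nodes.append(cand)
        parents.append(i)
        counts.append(c_new)
        if math.dist(cand, goal) < goal_tol:
            # Backtrack through parent pointers to recover waypoints.
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parents[j]
            return path[::-1]
    return None
```

A call such as `expand_tree((0.0, 0.0), (0.5, 0.5), lambda a, b: all(-1.0 <= x <= 1.0 for x in b))` searches inside a square and returns the waypoint sequence from the start to the first node that lands near the goal. The search has no fixed prediction horizon; it stops when the goal region is reached or the iteration budget is exhausted.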

Cite this Paper

BibTeX


@InProceedings{pmlr-v229-gieselmann23a,
  title     = {Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning},
  author    = {Gieselmann, Robert and Pokorny, Florian T.},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {1--22},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/gieselmann23a/gieselmann23a.pdf},
  url       = {https://proceedings.mlr.press/v229/gieselmann23a.html},
  abstract  = {Sampling-based motion planning algorithms excel at searching global solution paths in geometrically complex settings. However, classical approaches, such as RRT, are difficult to scale beyond low-dimensional search spaces and rely on privileged knowledge e.g. about collision detection and underlying state distances. In this work, we take a step towards the integration of sampling-based planning into the reinforcement learning framework to solve sparse-reward control tasks from high-dimensional inputs. Our method, called VELAP, determines sequences of waypoints through sampling-based exploration in a learned state embedding. Unlike other sampling-based techniques, we iteratively expand a tree-based memory of visited latent areas, which is leveraged to explore a larger portion of the latent space for a given number of search iterations. We demonstrate state-of-the-art results in learning control from offline data in the context of vision-based manipulation under sparse reward feedback. Our method extends the set of available planning tools in model-based reinforcement learning by adding a latent planner that searches globally for feasible paths instead of being bound to a fixed prediction horizon.}
}

Endnote

%0 Conference Paper
%T Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning
%A Robert Gieselmann
%A Florian T. Pokorny
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-gieselmann23a
%I PMLR
%P 1--22
%U https://proceedings.mlr.press/v229/gieselmann23a.html
%V 229
%X Sampling-based motion planning algorithms excel at searching global solution paths in geometrically complex settings. However, classical approaches, such as RRT, are difficult to scale beyond low-dimensional search spaces and rely on privileged knowledge e.g. about collision detection and underlying state distances. In this work, we take a step towards the integration of sampling-based planning into the reinforcement learning framework to solve sparse-reward control tasks from high-dimensional inputs. Our method, called VELAP, determines sequences of waypoints through sampling-based exploration in a learned state embedding. Unlike other sampling-based techniques, we iteratively expand a tree-based memory of visited latent areas, which is leveraged to explore a larger portion of the latent space for a given number of search iterations. We demonstrate state-of-the-art results in learning control from offline data in the context of vision-based manipulation under sparse reward feedback. Our method extends the set of available planning tools in model-based reinforcement learning by adding a latent planner that searches globally for feasible paths instead of being bound to a fixed prediction horizon.

APA


Gieselmann, R. & Pokorny, F.T. (2023). Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:1-22. Available from https://proceedings.mlr.press/v229/gieselmann23a.html.
