Sub-Goal Trees a Framework for Goal-Based Reinforcement Learning

Tom Jurgenson; Or Avner; Edward Groshev; Aviv Tamar

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:5020-5030, 2020.

Abstract

Many AI problems, in robotics and other domains, are goal-directed, essentially seeking a trajectory leading to some goal state. Reinforcement learning (RL), building on Bellman’s optimality equation, naturally optimizes for a single goal, yet can be made goal-directed by augmenting the state with the goal. Instead, we propose a new RL framework, derived from a dynamic programming equation for the all-pairs shortest path (APSP) problem, which naturally solves goal-directed queries. We show that this approach has computational benefits for both standard and approximate dynamic programming. Interestingly, our formulation prescribes a novel protocol for computing a trajectory: instead of predicting the next state given its predecessor, as in standard RL, a goal-conditioned trajectory is constructed by first predicting an intermediate state between start and goal, partitioning the trajectory into two, and then recursively predicting intermediate points on each sub-segment until a complete trajectory is obtained. We call this trajectory structure a sub-goal tree. Building on it, we additionally extend the policy gradient methodology to recursively predict sub-goals, resulting in novel goal-based algorithms. Finally, we apply our method to neural motion planning, where we demonstrate significant improvements compared to standard RL on navigating a 7-DoF robot arm between obstacles.
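The recursive construction described in the abstract can be illustrated with a small tabular sketch. It assumes a finite state space with known one-step costs and uses the standard min-plus doubling recurrence for all-pairs shortest paths, V_k(s, g) = min_m [V_{k-1}(s, m) + V_{k-1}(m, g)]. Whether this matches the paper's exact equation is an assumption, since the abstract does not state it. The function names (`sub_goal_tree_values`, `build_trajectory`) and the toy cost matrix are hypothetical and are not taken from the paper, whose algorithms rely on function approximation and policy gradients rather than tables.

```python
import math

INF = math.inf


def sub_goal_tree_values(cost, num_levels):
    """Return value tables V_0..V_num_levels for a one-step cost matrix.

    cost[s][g] is the one-step cost from s to g (INF if there is no edge,
    0 on the diagonal). V_k[s][g] is the cheapest cost over paths that use
    at most 2**k edges (assumed min-plus doubling recurrence).
    """
    n = len(cost)
    values = [[row[:] for row in cost]]
    for _ in range(num_levels):
        prev = values[-1]
        values.append([
            [min(prev[s][m] + prev[m][g] for m in range(n)) for g in range(n)]
            for s in range(n)
        ])
    return values


def build_trajectory(values, level, start, goal):
    """Predict a midpoint between start and goal, then recurse on both halves."""
    if level == 0:
        return [start, goal]
    prev = values[level - 1]
    mid = min(range(len(prev)), key=lambda m: prev[start][m] + prev[m][goal])
    left = build_trajectory(values, level - 1, start, mid)
    right = build_trajectory(values, level - 1, mid, goal)
    return left + right[1:]  # drop the duplicated midpoint


def collapse_repeats(states):
    """Remove consecutive duplicates that arise from zero-cost self-loops."""
    return [s for i, s in enumerate(states) if i == 0 or s != states[i - 1]]


# Toy example (hypothetical): a chain 0 -> 1 -> 2 -> 3 with unit costs,
# plus an expensive direct edge 0 -> 3.
cost = [
    [0, 1, INF, 5],
    [INF, 0, 1, INF],
    [INF, INF, 0, 1],
    [INF, INF, INF, 0],
]
values = sub_goal_tree_values(cost, num_levels=2)
trajectory = collapse_repeats(build_trajectory(values, 2, start=0, goal=3))
```

In this toy graph, two levels cover paths of up to four edges, so the sketch finds the three-step chain rather than the direct edge. Each level of recursion halves the remaining segment, which is why the number of levels grows only logarithmically with trajectory length in this tabular setting.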

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-jurgenson20a,
  title = 	 {Sub-Goal Trees a Framework for Goal-Based Reinforcement Learning},
  author =       {Jurgenson, Tom and Avner, Or and Groshev, Edward and Tamar, Aviv},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {5020--5030},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/jurgenson20a/jurgenson20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/jurgenson20a.html},
  abstract = 	 {Many AI problems, in robotics and other domains, are goal-directed, essentially seeking a trajectory leading to some goal state. Reinforcement learning (RL), building on Bellman’s optimality equation, naturally optimizes for a single goal, yet can be made goal-directed by augmenting the state with the goal. Instead, we propose a new RL framework, derived from a dynamic programming equation for the all pairs shortest path (APSP) problem, which naturally solves goal-directed queries. We show that this approach has computational benefits for both standard and approximate dynamic programming. Interestingly, our formulation prescribes a novel protocol for computing a trajectory: instead of predicting the next state given its predecessor, as in standard RL, a goal-conditioned trajectory is constructed by first predicting an intermediate state between start and goal, partitioning the trajectory into two. Then, recursively, predicting intermediate points on each sub-segment, until a complete trajectory is obtained. We call this trajectory structure a sub-goal tree. Building on it, we additionally extend the policy gradient methodology to recursively predict sub-goals, resulting in novel goal-based algorithms. Finally, we apply our method to neural motion planning, where we demonstrate significant improvements compared to standard RL on navigating a 7-DoF robot arm between obstacles.}
}

Endnote

%0 Conference Paper
%T Sub-Goal Trees a Framework for Goal-Based Reinforcement Learning
%A Tom Jurgenson
%A Or Avner
%A Edward Groshev
%A Aviv Tamar
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-jurgenson20a
%I PMLR
%P 5020--5030
%U https://proceedings.mlr.press/v119/jurgenson20a.html
%V 119
%X Many AI problems, in robotics and other domains, are goal-directed, essentially seeking a trajectory leading to some goal state. Reinforcement learning (RL), building on Bellman’s optimality equation, naturally optimizes for a single goal, yet can be made goal-directed by augmenting the state with the goal. Instead, we propose a new RL framework, derived from a dynamic programming equation for the all pairs shortest path (APSP) problem, which naturally solves goal-directed queries. We show that this approach has computational benefits for both standard and approximate dynamic programming. Interestingly, our formulation prescribes a novel protocol for computing a trajectory: instead of predicting the next state given its predecessor, as in standard RL, a goal-conditioned trajectory is constructed by first predicting an intermediate state between start and goal, partitioning the trajectory into two. Then, recursively, predicting intermediate points on each sub-segment, until a complete trajectory is obtained. We call this trajectory structure a sub-goal tree. Building on it, we additionally extend the policy gradient methodology to recursively predict sub-goals, resulting in novel goal-based algorithms. Finally, we apply our method to neural motion planning, where we demonstrate significant improvements compared to standard RL on navigating a 7-DoF robot arm between obstacles.

APA


Jurgenson, T., Avner, O., Groshev, E., & Tamar, A. (2020). Sub-Goal Trees a Framework for Goal-Based Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:5020-5030. Available from https://proceedings.mlr.press/v119/jurgenson20a.html.
