Abstract Value Iteration for Hierarchical Reinforcement Learning

Kishor Jothimurugan; Osbert Bastani; Rajeev Alur

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1162-1170, 2021.

Abstract

We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the ADP may not be Markov; we propose two algorithms for planning in the ADP that address this issue. Our first algorithm is conservative, allowing us to prove theoretical guarantees on its performance, which help inform the design of subgoal regions. Our second algorithm is a practical one that interweaves planning at the abstract level and learning at the concrete level. In our experiments, we demonstrate that our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-jothimurugan21a,
  title = 	 {Abstract Value Iteration for Hierarchical Reinforcement Learning},
  author =       {Jothimurugan, Kishor and Bastani, Osbert and Alur, Rajeev},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1162--1170},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/jothimurugan21a/jothimurugan21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/jothimurugan21a.html},
  abstract = 	 { We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the ADP may not be Markov; we propose two algorithms for planning in the ADP that address this issue. Our first algorithm is conservative, allowing us to prove theoretical guarantees on its performance, which help inform the design of subgoal regions. Our second algorithm is a practical one that interweaves planning at the abstract level and learning at the concrete level. In our experiments, we demonstrate that our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks. }
}

Endnote

%0 Conference Paper
%T Abstract Value Iteration for Hierarchical Reinforcement Learning
%A Kishor Jothimurugan
%A Osbert Bastani
%A Rajeev Alur
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-jothimurugan21a
%I PMLR
%P 1162--1170
%U https://proceedings.mlr.press/v130/jothimurugan21a.html
%V 130
%X  We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the ADP may not be Markov; we propose two algorithms for planning in the ADP that address this issue. Our first algorithm is conservative, allowing us to prove theoretical guarantees on its performance, which help inform the design of subgoal regions. Our second algorithm is a practical one that interweaves planning at the abstract level and learning at the concrete level. In our experiments, we demonstrate that our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.

APA


Jothimurugan, K., Bastani, O. & Alur, R. (2021). Abstract Value Iteration for Hierarchical Reinforcement Learning. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1162-1170. Available from https://proceedings.mlr.press/v130/jothimurugan21a.html.
