Skill Discovery for Exploration and Planning using Deep Skill Graphs

Akhil Bagaria; Jason K Senthil; George Konidaris

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:521-531, 2021.

Abstract

We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. The agent constructs this graph during an unsupervised training phase where it interleaves discovering skills and planning using them to gain coverage over ever-increasing portions of the state-space. Given a novel goal at test time, the agent plans with the acquired skill graph to reach a nearby state, then switches to learning to reach the goal. We show that the resulting algorithm, Deep Skill Graphs, outperforms both flat and existing hierarchical reinforcement learning methods on four difficult continuous control tasks.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-bagaria21a,
  title     = {Skill Discovery for Exploration and Planning using Deep Skill Graphs},
  author    = {Bagaria, Akhil and Senthil, Jason K and Konidaris, George},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {521--531},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/bagaria21a/bagaria21a.pdf},
  url       = {https://proceedings.mlr.press/v139/bagaria21a.html},
  abstract = 	 {We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. The agent constructs this graph during an unsupervised training phase where it interleaves discovering skills and planning using them to gain coverage over ever-increasing portions of the state-space. Given a novel goal at test time, the agent plans with the acquired skill graph to reach a nearby state, then switches to learning to reach the goal. We show that the resulting algorithm, Deep Skill Graphs, outperforms both flat and existing hierarchical reinforcement learning methods on four difficult continuous control tasks.}
}

Endnote

%0 Conference Paper
%T Skill Discovery for Exploration and Planning using Deep Skill Graphs
%A Akhil Bagaria
%A Jason K Senthil
%A George Konidaris
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-bagaria21a
%I PMLR
%P 521--531
%U https://proceedings.mlr.press/v139/bagaria21a.html
%V 139
%X We introduce a new skill-discovery algorithm that builds a discrete graph representation of large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill policies. The agent constructs this graph during an unsupervised training phase where it interleaves discovering skills and planning using them to gain coverage over ever-increasing portions of the state-space. Given a novel goal at test time, the agent plans with the acquired skill graph to reach a nearby state, then switches to learning to reach the goal. We show that the resulting algorithm, Deep Skill Graphs, outperforms both flat and existing hierarchical reinforcement learning methods on four difficult continuous control tasks.

APA

Bagaria, A., Senthil, J.K. & Konidaris, G. (2021). Skill Discovery for Exploration and Planning using Deep Skill Graphs. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:521-531. Available from https://proceedings.mlr.press/v139/bagaria21a.html.
