Self-Paced Contextual Reinforcement Learning

Pascal Klink; Hany Abdulsamad; Boris Belousov; Jan Peters

Proceedings of the Conference on Robot Learning, PMLR 100:513-529, 2020.

Abstract

Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.

Cite this Paper

BibTeX

@InProceedings{pmlr-v100-klink20a,
  title = 	 {Self-Paced Contextual Reinforcement Learning},
  author =       {Klink, Pascal and Abdulsamad, Hany and Belousov, Boris and Peters, Jan},
  booktitle = 	 {Proceedings of the Conference on Robot Learning},
  pages = 	 {513--529},
  year = 	 {2020},
  editor = 	 {Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei},
  volume = 	 {100},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {30 Oct--01 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v100/klink20a/klink20a.pdf},
  url = 	 {https://proceedings.mlr.press/v100/klink20a.html},
  abstract = 	 {Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.}
}

Endnote

%0 Conference Paper
%T Self-Paced Contextual Reinforcement Learning
%A Pascal Klink
%A Hany Abdulsamad
%A Boris Belousov
%A Jan Peters
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-klink20a
%I PMLR
%P 513--529
%U https://proceedings.mlr.press/v100/klink20a.html
%V 100
%X Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally.

APA

Klink, P., Abdulsamad, H., Belousov, B., & Peters, J. (2020). Self-Paced Contextual Reinforcement Learning. Proceedings of the Conference on Robot Learning, in Proceedings of Machine Learning Research 100:513-529. Available from https://proceedings.mlr.press/v100/klink20a.html.
