Sample-efficient Cross-Entropy Method for Real-time Planning

Cristina Pinneri; Shambhuraj Sawant; Sebastian Blaes; Jan Achterhold; Joerg Stueckler; Michal Rolinek; Georg Martius

Proceedings of the 2020 Conference on Robot Learning, PMLR 155:1049-1065, 2021.

Abstract

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
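
For readers unfamiliar with CEM as a trajectory optimizer, the following is a minimal sketch of the basic method, not the paper's improved algorithm. It samples action sequences from a Gaussian, refits the Gaussian to the lowest-cost samples, and repeats. Carrying a few elites into the next iteration is included only as a simple illustration of the "memory" idea mentioned in the abstract; temporally-correlated sampling is omitted. The names `cem_plan`, `toy_cost`, and all parameters are hypothetical and chosen for this example.

```python
import numpy as np


def cem_plan(cost_fn, horizon, action_dim, iterations=10, population=64,
             num_elites=8, keep_elites=2, seed=0):
    """Plain CEM over action sequences; a few elites are reused each iteration."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    # Elites carried over between iterations (illustrative assumption, not the paper's scheme).
    kept = np.empty((0, horizon, action_dim))
    best_seq, best_cost = None, np.inf

    for _ in range(iterations):
        # Sample independent Gaussian action sequences around the current mean.
        noise = rng.standard_normal((population, horizon, action_dim))
        samples = np.concatenate([mean + std * noise, kept], axis=0)
        costs = np.array([cost_fn(s) for s in samples])

        # Select the lowest-cost sequences as the elite set.
        elite_idx = np.argsort(costs)[:num_elites]
        elites = samples[elite_idx]
        if costs[elite_idx[0]] < best_cost:
            best_cost = float(costs[elite_idx[0]])
            best_seq = elites[0].copy()

        # Refit the sampling distribution to the elites.
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        kept = elites[:keep_elites]

    return best_seq, best_cost


def toy_cost(actions):
    """Toy 1-D task: position is the running sum of actions; the target is 1.0."""
    positions = np.cumsum(actions[:, 0])
    return float(np.sum((positions - 1.0) ** 2))


plan, cost = cem_plan(toy_cost, horizon=5, action_dim=1)
```

In a receding-horizon controller, only the first action of `plan` would typically be executed before replanning from the new state.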

Cite this Paper

BibTeX

@InProceedings{pmlr-v155-pinneri21a,
  title     = {Sample-efficient Cross-Entropy Method for Real-time Planning},
  author    = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages     = {1049--1065},
  year      = {2021},
  editor    = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume    = {155},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v155/pinneri21a/pinneri21a.pdf},
  url       = {https://proceedings.mlr.press/v155/pinneri21a.html},
  abstract  = {Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.}
}

Endnote

%0 Conference Paper
%T Sample-efficient Cross-Entropy Method for Real-time Planning
%A Cristina Pinneri
%A Shambhuraj Sawant
%A Sebastian Blaes
%A Jan Achterhold
%A Joerg Stueckler
%A Michal Rolinek
%A Georg Martius
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-pinneri21a
%I PMLR
%P 1049--1065
%U https://proceedings.mlr.press/v155/pinneri21a.html
%V 155
%X Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.

APA

Pinneri, C., Sawant, S., Blaes, S., Achterhold, J., Stueckler, J., Rolinek, M. & Martius, G. (2021). Sample-efficient Cross-Entropy Method for Real-time Planning. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:1049-1065. Available from https://proceedings.mlr.press/v155/pinneri21a.html.
