Smooth Exploration for Robotic Reinforcement Learning

Antonin Raffin, Jens Kober, Freek Stulp
Proceedings of the 5th Conference on Robot Learning, PMLR 164:1634-1644, 2022.

Abstract

Reinforcement learning (RL) enables robots to learn skills from interactions with the real world. In practice, the unstructured step-based exploration used in Deep RL (often very successful in simulation) leads to jerky motion patterns on real robots. The resulting shaky behavior causes poor exploration and can even damage the robot. We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms. To enable this adaptation, we propose two extensions to the original SDE, using more general features and re-sampling the noise periodically, which leads to a new exploration method, generalized state-dependent exploration (gSDE). We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped, and an RC car. The noise sampling interval of gSDE enables a compromise between performance and smoothness, which allows training directly on the real robots without loss of performance.
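
The abstract compresses the method into one idea: rather than adding independent Gaussian noise to the action at every step, gSDE makes the noise a function of policy features, eps_t = theta_eps * z(s_t), and re-samples the matrix theta_eps only every n steps, so the perturbation varies smoothly with the state in between. The following is a minimal sketch of that sampling scheme, assuming NumPy; the class and parameter names (GSDENoise, resample_interval) are illustrative rather than taken from the paper, whose reference implementation ships with Stable-Baselines3.

import numpy as np

class GSDENoise:
    """Minimal sketch of generalized state-dependent exploration (gSDE).

    The exploration noise is eps_t = theta_eps @ z(s_t): a deterministic
    function of the policy features z(s_t), where the matrix theta_eps is
    itself Gaussian but only re-sampled every `resample_interval` steps.
    Between re-samples the perturbation varies smoothly with the state,
    avoiding the jerky high-frequency noise of step-based exploration.
    Names and defaults here are illustrative, not taken from the paper.
    """

    def __init__(self, feature_dim, action_dim, log_std=-2.0,
                 resample_interval=16, rng=None):
        self.rng = rng if rng is not None else np.random.default_rng()
        # Per-element noise scale; in the paper this is a learned parameter.
        self.sigma = np.exp(log_std) * np.ones((feature_dim, action_dim))
        self.resample_interval = resample_interval
        self.steps_since_resample = 0
        self.resample()

    def resample(self):
        # Draw the exploration matrix once; it stays fixed for the interval.
        self.theta_eps = self.rng.normal(0.0, self.sigma)
        self.steps_since_resample = 0

    def __call__(self, features):
        # gSDE extension: re-sample periodically instead of once per episode.
        if self.steps_since_resample >= self.resample_interval:
            self.resample()
        self.steps_since_resample += 1
        # State-dependent, temporally coherent perturbation for the action.
        return features @ self.theta_eps

Used inside a rollout, the noisy action would be something like action = policy_mean + noise(policy_features), where policy_mean and policy_features are hypothetical outputs of the policy network. Setting resample_interval to 1 recovers ordinary per-step noise, while larger values trade exploration variability for smoothness, matching the performance/smoothness compromise described in the abstract.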

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-raffin22a,
  title     = {Smooth Exploration for Robotic Reinforcement Learning},
  author    = {Raffin, Antonin and Kober, Jens and Stulp, Freek},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages     = {1634--1644},
  year      = {2022},
  editor    = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume    = {164},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--11 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v164/raffin22a/raffin22a.pdf},
  url       = {https://proceedings.mlr.press/v164/raffin22a.html},
  abstract  = {Reinforcement learning (RL) enables robots to learn skills from interactions with the real world. In practice, the unstructured step-based exploration used in Deep RL – often very successful in simulation – leads to jerky motion patterns on real robots. Consequences of the resulting shaky behavior are poor exploration, or even damage to the robot. We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms. To enable this adaptation, we propose two extensions to the original SDE, using more general features and re-sampling the noise periodically, which leads to a new exploration method generalized state-dependent exploration (gSDE). We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped and an RC car. The noise sampling interval of gSDE enables a compromise between performance and smoothness, which allows training directly on the real robots without loss of performance.}
}
Endnote
%0 Conference Paper
%T Smooth Exploration for Robotic Reinforcement Learning
%A Antonin Raffin
%A Jens Kober
%A Freek Stulp
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-raffin22a
%I PMLR
%P 1634--1644
%U https://proceedings.mlr.press/v164/raffin22a.html
%V 164
%X Reinforcement learning (RL) enables robots to learn skills from interactions with the real world. In practice, the unstructured step-based exploration used in Deep RL – often very successful in simulation – leads to jerky motion patterns on real robots. Consequences of the resulting shaky behavior are poor exploration, or even damage to the robot. We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms. To enable this adaptation, we propose two extensions to the original SDE, using more general features and re-sampling the noise periodically, which leads to a new exploration method generalized state-dependent exploration (gSDE). We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped and an RC car. The noise sampling interval of gSDE enables a compromise between performance and smoothness, which allows training directly on the real robots without loss of performance.
APA
Raffin, A., Kober, J. & Stulp, F. (2022). Smooth Exploration for Robotic Reinforcement Learning. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:1634-1644. Available from https://proceedings.mlr.press/v164/raffin22a.html.
