Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes

Joe Watson, Jan Peters
Proceedings of The 6th Conference on Robot Learning, PMLR 205:67-79, 2023.

Abstract

Monte Carlo methods have become increasingly relevant for control of non-differentiable systems, approximate dynamics models, and learning from data. These methods scale to high-dimensional spaces and are effective at the non-convex optimization often seen in robot learning. We look at sample-based methods from the perspective of inference-based control, specifically posterior policy iteration. From this perspective, we highlight how Gaussian noise priors produce rough control actions that are unsuitable for physical robot deployment. Considering smoother Gaussian process priors, as used in episodic reinforcement learning and motion planning, we demonstrate how smoother model predictive control can be achieved using online sequential inference. This inference is realized through an efficient factorization of the action distribution, and novel means of optimizing the likelihood temperature to improve importance sampling accuracy. We evaluate this approach on several high-dimensional robot control tasks, matching the sample efficiency of prior heuristic methods while also ensuring smoothness. Simulation results can be seen at monte-carlo-ppi.github.io.
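The contrast the abstract draws between rough Gaussian noise priors and smoother Gaussian process priors can be illustrated with a small sketch (this is not the paper's implementation; the horizon length, lengthscale, and roughness measure below are illustrative choices): a white-noise prior samples each action independently, while a GP prior with a squared-exponential kernel correlates nearby timesteps, yielding visibly smoother action sequences.

```python
import numpy as np

# Illustrative sketch: action sequences from a white Gaussian noise prior
# versus a Gaussian process prior with a squared-exponential kernel.
rng = np.random.default_rng(0)
T = 50                       # horizon length (illustrative)
t = np.arange(T, dtype=float)
sigma = 1.0                  # marginal action standard deviation

# White-noise prior: u ~ N(0, sigma^2 I) -- samples are rough.
white = rng.normal(0.0, sigma, size=T)

# GP prior: squared-exponential kernel with lengthscale ell correlates
# nearby timesteps, so sampled action sequences vary smoothly.
ell = 5.0
K = sigma**2 * np.exp(-0.5 * (t[:, None] - t[None, :])**2 / ell**2)
L = np.linalg.cholesky(K + 1e-8 * np.eye(T))  # jitter for stability
smooth = L @ rng.normal(size=T)

def roughness(u):
    """Mean squared first difference -- larger means rougher actions."""
    return float(np.mean(np.diff(u) ** 2))

print(roughness(white), roughness(smooth))
```

With these settings the GP sample's roughness is orders of magnitude lower than the white-noise sample's, which is the property that matters when actions are sent to physical actuators.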

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-watson23a, title = {Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes}, author = {Watson, Joe and Peters, Jan}, booktitle = {Proceedings of The 6th Conference on Robot Learning}, pages = {67--79}, year = {2023}, editor = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff}, volume = {205}, series = {Proceedings of Machine Learning Research}, month = {14--18 Dec}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v205/watson23a/watson23a.pdf}, url = {https://proceedings.mlr.press/v205/watson23a.html}, abstract = {Monte Carlo methods have become increasingly relevant for control of non-differentiable systems, approximate dynamics models, and learning from data. These methods scale to high-dimensional spaces and are effective at the non-convex optimization often seen in robot learning. We look at sample-based methods from the perspective of inference-based control, specifically posterior policy iteration. From this perspective, we highlight how Gaussian noise priors produce rough control actions that are unsuitable for physical robot deployment. Considering smoother Gaussian process priors, as used in episodic reinforcement learning and motion planning, we demonstrate how smoother model predictive control can be achieved using online sequential inference. This inference is realized through an efficient factorization of the action distribution, and novel means of optimizing the likelihood temperature to improve importance sampling accuracy. We evaluate this approach on several high-dimensional robot control tasks, matching the sample efficiency of prior heuristic methods while also ensuring smoothness. Simulation results can be seen at monte-carlo-ppi.github.io.} }
Endnote
%0 Conference Paper %T Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes %A Joe Watson %A Jan Peters %B Proceedings of The 6th Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2023 %E Karen Liu %E Dana Kulic %E Jeff Ichnowski %F pmlr-v205-watson23a %I PMLR %P 67--79 %U https://proceedings.mlr.press/v205/watson23a.html %V 205 %X Monte Carlo methods have become increasingly relevant for control of non-differentiable systems, approximate dynamics models, and learning from data. These methods scale to high-dimensional spaces and are effective at the non-convex optimization often seen in robot learning. We look at sample-based methods from the perspective of inference-based control, specifically posterior policy iteration. From this perspective, we highlight how Gaussian noise priors produce rough control actions that are unsuitable for physical robot deployment. Considering smoother Gaussian process priors, as used in episodic reinforcement learning and motion planning, we demonstrate how smoother model predictive control can be achieved using online sequential inference. This inference is realized through an efficient factorization of the action distribution, and novel means of optimizing the likelihood temperature to improve importance sampling accuracy. We evaluate this approach on several high-dimensional robot control tasks, matching the sample efficiency of prior heuristic methods while also ensuring smoothness. Simulation results can be seen at monte-carlo-ppi.github.io.
APA
Watson, J. & Peters, J. (2023). Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:67-79. Available from https://proceedings.mlr.press/v205/watson23a.html.