Cautious Bayesian Optimization for Efficient and Scalable Policy Search

Lukas P. Fröhlich, Melanie N. Zeilinger, Edgar D. Klenske
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:227-240, 2021.

Abstract

Sample efficiency is one of the key factors when applying policy search to real-world problems. In recent years, Bayesian Optimization (BO) has become prominent in the field of robotics due to its sample efficiency and little prior knowledge needed. However, one drawback of BO is its poor performance on high-dimensional search spaces as it focuses on global search. In the policy search setting, local optimization is typically sufficient as initial policies are often available, e.g., via meta-learning, kinesthetic demonstrations or sim-to-real approaches. In this paper, we propose to constrain the policy search space to a sublevel-set of the Bayesian surrogate model’s predictive uncertainty. This simple yet effective way of constraining the policy update enables BO to scale to high-dimensional spaces (>100) as well as reduces the risk of damaging the system. We demonstrate the effectiveness of our approach on a wide range of problems, including a motor skills task, adapting deep RL agents to new reward signals and a sim-to-real task for an inverted pendulum system.

Cite this Paper


BibTeX
@InProceedings{pmlr-v144-frohlich21a, title = {Cautious Bayesian Optimization for Efficient and Scalable Policy Search}, author = {Fr\"ohlich, Lukas P. and Zeilinger, Melanie N. and Klenske, Edgar D.}, booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control}, pages = {227--240}, year = {2021}, editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and A. Parrilo, Pablo and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.}, volume = {144}, series = {Proceedings of Machine Learning Research}, month = {07 -- 08 June}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v144/frohlich21a/frohlich21a.pdf}, url = {https://proceedings.mlr.press/v144/frohlich21a.html}, abstract = {Sample efficiency is one of the key factors when applying policy search to real-world problems. In recent years, Bayesian Optimization (BO) has become prominent in the field of robotics due to its sample efficiency and little prior knowledge needed. However, one drawback of BO is its poor performance on high-dimensional search spaces as it focuses on global search. In the policy search setting, local optimization is typically sufficient as initial policies are often available, e.g., via meta-learning, kinesthetic demonstrations or sim-to-real approaches. In this paper, we propose to constrain the policy search space to a sublevel-set of the Bayesian surrogate model’s predictive uncertainty. This simple yet effective way of constraining the policy update enables BO to scale to high-dimensional spaces (>100) as well as reduces the risk of damaging the system. We demonstrate the effectiveness of our approach on a wide range of problems, including a motor skills task, adapting deep RL agents to new reward signals and a sim-to-real task for an inverted pendulum system.} }
Endnote
%0 Conference Paper %T Cautious Bayesian Optimization for Efficient and Scalable Policy Search %A Lukas P. Fröhlich %A Melanie N. Zeilinger %A Edgar D. Klenske %B Proceedings of the 3rd Conference on Learning for Dynamics and Control %C Proceedings of Machine Learning Research %D 2021 %E Ali Jadbabaie %E John Lygeros %E George J. Pappas %E Pablo A. Parrilo %E Benjamin Recht %E Claire J. Tomlin %E Melanie N. Zeilinger %F pmlr-v144-frohlich21a %I PMLR %P 227--240 %U https://proceedings.mlr.press/v144/frohlich21a.html %V 144 %X Sample efficiency is one of the key factors when applying policy search to real-world problems. In recent years, Bayesian Optimization (BO) has become prominent in the field of robotics due to its sample efficiency and little prior knowledge needed. However, one drawback of BO is its poor performance on high-dimensional search spaces as it focuses on global search. In the policy search setting, local optimization is typically sufficient as initial policies are often available, e.g., via meta-learning, kinesthetic demonstrations or sim-to-real approaches. In this paper, we propose to constrain the policy search space to a sublevel-set of the Bayesian surrogate model’s predictive uncertainty. This simple yet effective way of constraining the policy update enables BO to scale to high-dimensional spaces (>100) as well as reduces the risk of damaging the system. We demonstrate the effectiveness of our approach on a wide range of problems, including a motor skills task, adapting deep RL agents to new reward signals and a sim-to-real task for an inverted pendulum system.
APA
Fröhlich, L.P., Zeilinger, M.N. & Klenske, E.D.. (2021). Cautious Bayesian Optimization for Efficient and Scalable Policy Search. Proceedings of the 3rd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 144:227-240 Available from https://proceedings.mlr.press/v144/frohlich21a.html.

Related Material