Safe Exploration for Optimization with Gaussian Processes

Yanan Sui; Alkis Gotovos; Joel Burdick; Andreas Krause

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:997-1005, 2015.

Abstract

We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation, and therapeutic spinal cord stimulation.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-sui15,
  title = 	 {Safe Exploration for Optimization with Gaussian Processes},
  author = 	 {Sui, Yanan and Gotovos, Alkis and Burdick, Joel and Krause, Andreas},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {997--1005},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/sui15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/sui15.html},
  abstract = 	 {We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation, and therapeutic spinal cord stimulation.}
}

Endnote

%0 Conference Paper
%T Safe Exploration for Optimization with Gaussian Processes
%A Yanan Sui
%A Alkis Gotovos
%A Joel Burdick
%A Andreas Krause
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-sui15
%I PMLR
%P 997--1005
%U https://proceedings.mlr.press/v37/sui15.html
%V 37
%X We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation, and therapeutic spinal cord stimulation.

RIS


TY  - CPAPER
TI  - Safe Exploration for Optimization with Gaussian Processes
AU  - Yanan Sui
AU  - Alkis Gotovos
AU  - Joel Burdick
AU  - Andreas Krause
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-sui15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 997
EP  - 1005
L1  - http://proceedings.mlr.press/v37/sui15.pdf
UR  - https://proceedings.mlr.press/v37/sui15.html
AB  - We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation, and therapeutic spinal cord stimulation.
ER  -

APA


Sui, Y., Gotovos, A., Burdick, J., & Krause, A. (2015). Safe Exploration for Optimization with Gaussian Processes. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:997-1005. Available from https://proceedings.mlr.press/v37/sui15.html.
