Optimization as Estimation with Gaussian Processes in Bandit Settings

Zi Wang; Bolei Zhou; Stefanie Jegelka

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1022-1031, 2016.

Abstract

Recently, there has been rising interest in Bayesian optimization – the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy for a range of performance criteria.
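As a rough illustration of the tradeoff parameters the abstract refers to, the sketch below scores a handful of candidate points with the standard GP-UCB rule (posterior mean plus a multiple `lam` of the posterior standard deviation) and the standard GP-PI rule (probability of exceeding a target `theta`). It is not the authors' algorithm or code. The posterior values, `theta`, and all function names are hypothetical and chosen only for illustration. The last step relies on a simple identity: if `lam` is set to the smallest value of `(theta - mu) / sigma` over the candidates, GP-UCB picks the same point as GP-PI with target `theta`.

```python
from math import erf, sqrt


def gp_ucb(mu, sigma, lam):
    """GP-UCB scores: posterior mean plus lam times posterior std."""
    return [m + lam * s for m, s in zip(mu, sigma)]


def gp_pi(mu, sigma, theta):
    """GP-PI scores: probability that f(x) exceeds the target theta."""
    return [0.5 * (1.0 + erf((m - theta) / (s * sqrt(2.0))))
            for m, s in zip(mu, sigma)]


def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical GP posterior at five candidate points (illustrative numbers).
mu = [0.2, 0.5, 0.1, 0.4, 0.3]
sigma = [0.3, 0.1, 0.6, 0.2, 0.5]

# Hypothetical target value for GP-PI (assumption, not from the paper).
theta = 1.0
pi_choice = argmax(gp_pi(mu, sigma, theta))

# Tie the GP-UCB parameter to the same target: lam = min (theta - mu) / sigma.
lam = min((theta - m) / s for m, s in zip(mu, sigma))
ucb_choice = argmax(gp_ucb(mu, sigma, lam))

print(pi_choice, ucb_choice, round(lam, 3))
```

With hand-picked `lam` or `theta`, the two rules can disagree. Deriving both from a single target value, as above, is one way to see how the parameters might be linked, under the assumption that every posterior standard deviation is strictly positive.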

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-wang16f,
  title = 	 {Optimization as Estimation with Gaussian Processes in Bandit Settings},
  author = 	 {Wang, Zi and Zhou, Bolei and Jegelka, Stefanie},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1022--1031},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/wang16f.pdf},
  url = 	 {https://proceedings.mlr.press/v51/wang16f.html},
  abstract = 	 {Recently, there has been rising interest in Bayesian optimization – the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy for a range of performance criteria.}
}

Endnote

%0 Conference Paper
%T Optimization as Estimation with Gaussian Processes in Bandit Settings
%A Zi Wang
%A Bolei Zhou
%A Stefanie Jegelka
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-wang16f
%I PMLR
%P 1022--1031
%U https://proceedings.mlr.press/v51/wang16f.html
%V 51
%X Recently, there has been rising interest in Bayesian optimization – the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy for a range of performance criteria.

RIS


TY  - CPAPER
TI  - Optimization as Estimation with Gaussian Processes in Bandit Settings
AU  - Zi Wang
AU  - Bolei Zhou
AU  - Stefanie Jegelka
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-wang16f
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1022
EP  - 1031
L1  - http://proceedings.mlr.press/v51/wang16f.pdf
UR  - https://proceedings.mlr.press/v51/wang16f.html
AB  - Recently, there has been rising interest in Bayesian optimization – the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy for a range of performance criteria.
ER  -

APA


Wang, Z., Zhou, B. & Jegelka, S. (2016). Optimization as Estimation with Gaussian Processes in Bandit Settings. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1022-1031. Available from https://proceedings.mlr.press/v51/wang16f.html.
