Significance of Gradient Information in Bayesian Optimization

Shubhanshu Shekhar; Tara Javidi

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2836-2844, 2021.

Abstract

We consider the problem of Bayesian Optimization (BO) in which the goal is to design an adaptive querying strategy to optimize a function $f:[0,1]^d\mapsto \mathbb{R}$. The function is assumed to be drawn from a Gaussian Process, and can only be accessed through noisy oracle queries. The most commonly used oracle in the BO literature is the noisy Zeroth-Order-Oracle (ZOO), which returns a noise-corrupted function value $y = f(x) + \eta$ at any point $x \in [0,1]^d$ queried by the agent. A less studied oracle in BO is the First-Order-Oracle (FOO), which also returns a noisy gradient value at the queried point. In this paper we consider the fundamental question of quantifying the possible improvement in regret that can be achieved under FOO access as compared to the case in which only ZOO access is available. Under some regularity assumptions on the covariance kernel $K$ of the Gaussian Process, we first show that the expected cumulative regret with ZOO of any algorithm must satisfy a lower bound of $\Omega(\sqrt{2^d n})$, where $n$ is the query budget. This lower bound captures the appropriate scaling of the regret on both dimension $d$ and budget $n$, and relies on a novel reduction from BO to a multi-armed bandit (MAB) problem. We then propose a two-phase algorithm which, with some additional prior knowledge, achieves a vastly improved $\mathcal{O}\left( d (\log n)^2 \right)$ regret when given access to a FOO. Together, these two results highlight the significant value of incorporating gradient information in BO algorithms.
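
As a rough illustration of the two oracle models and the two regret rates quoted in the abstract, the following Python sketch may help. It is not the paper's algorithm. The toy objective, the Gaussian noise model, the shared noise level for values and gradients, and all function names are assumptions made here for illustration, and the constants hidden in the $\Omega$ and $\mathcal{O}$ notation are dropped.

```python
import math
import random


def toy_objective(x):
    """Hypothetical smooth test function on [0, 1]^d (not from the paper)."""
    return -sum((xi - 0.5) ** 2 for xi in x)


def toy_gradient(x):
    """Exact gradient of toy_objective."""
    return [-2.0 * (xi - 0.5) for xi in x]


def zeroth_order_oracle(x, noise_std, rng):
    """ZOO: return y = f(x) + eta. Gaussian eta is an assumption."""
    return toy_objective(x) + rng.gauss(0.0, noise_std)


def first_order_oracle(x, noise_std, rng):
    """FOO: return a noisy function value and a noisy gradient.

    Using the same noise level for both is an assumption.
    """
    y = toy_objective(x) + rng.gauss(0.0, noise_std)
    g = [gi + rng.gauss(0.0, noise_std) for gi in toy_gradient(x)]
    return y, g


def zoo_lower_bound_rate(d, n):
    """sqrt(2^d * n): the ZOO lower-bound rate with constants dropped."""
    return math.sqrt(2 ** d * n)


def foo_upper_bound_rate(d, n):
    """d * (log n)^2: the FOO upper-bound rate with constants dropped."""
    return d * math.log(n) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.2, 0.9]
    print("ZOO query:", zeroth_order_oracle(x, 0.1, rng))
    print("FOO query:", first_order_oracle(x, 0.1, rng))
    for n in (10**3, 10**6):
        print(n, zoo_lower_bound_rate(2, n), foo_upper_bound_rate(2, n))
```

Because the constants are ignored, the printed numbers only indicate growth rates. The polylogarithmic rate drops below the square-root rate only once the budget $n$ is large enough for the given dimension $d$.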

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-shekhar21a,
  title = 	 { Significance of Gradient Information in Bayesian Optimization },
  author =       {Shekhar, Shubhanshu and Javidi, Tara},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2836--2844},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/shekhar21a/shekhar21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/shekhar21a.html},
  abstract = 	 { We consider the problem of Bayesian Optimization (BO) in which the goal is to design an adaptive querying strategy to optimize a function $f:[0,1]^d\mapsto \mathbb{R}$. The function is assumed to be drawn from a Gaussian Process, and can only be accessed through noisy oracle queries. The most commonly used oracle in BO literature is the noisy Zeroth-Order-Oracle (ZOO) which returns noise-corrupted function value $y = f(x) + \eta$ at any point $x \in [0,1]^d$ queried by the agent. A less studied oracle in BO is the First-Order-Oracle (FOO) which also returns noisy gradient value at the queried point. In this paper we consider the fundamental question of quantifying the possible improvement in regret that can be achieved under FOO access as compared to the case in which only ZOO access is available. Under some regularity assumptions on $K$, we first show that the expected cumulative regret with ZOO of any algorithm must satisfy a lower bound of $\Omega(\sqrt{2^d n})$, where $n$ is the query budget. This lower bound captures the appropriate scaling of the regret on both dimension $d$ and budget $n$, and relies on a novel reduction from BO to a multi-armed bandit (MAB) problem. We then propose a two-phase algorithm which, with some additional prior knowledge, achieves a vastly improved $\mathcal{O}\left( d (\log n)^2 \right)$ regret when given access to a FOO. Together, these two results highlight the significant value of incorporating gradient information in BO algorithms. }
}

Endnote

%0 Conference Paper
%T Significance of Gradient Information in Bayesian Optimization
%A Shubhanshu Shekhar
%A Tara Javidi
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-shekhar21a
%I PMLR
%P 2836--2844
%U https://proceedings.mlr.press/v130/shekhar21a.html
%V 130
%X We consider the problem of Bayesian Optimization (BO) in which the goal is to design an adaptive querying strategy to optimize a function $f:[0,1]^d\mapsto \mathbb{R}$. The function is assumed to be drawn from a Gaussian Process, and can only be accessed through noisy oracle queries. The most commonly used oracle in BO literature is the noisy Zeroth-Order-Oracle (ZOO) which returns noise-corrupted function value $y = f(x) + \eta$ at any point $x \in [0,1]^d$ queried by the agent. A less studied oracle in BO is the First-Order-Oracle (FOO) which also returns noisy gradient value at the queried point. In this paper we consider the fundamental question of quantifying the possible improvement in regret that can be achieved under FOO access as compared to the case in which only ZOO access is available. Under some regularity assumptions on $K$, we first show that the expected cumulative regret with ZOO of any algorithm must satisfy a lower bound of $\Omega(\sqrt{2^d n})$, where $n$ is the query budget. This lower bound captures the appropriate scaling of the regret on both dimension $d$ and budget $n$, and relies on a novel reduction from BO to a multi-armed bandit (MAB) problem. We then propose a two-phase algorithm which, with some additional prior knowledge, achieves a vastly improved $\mathcal{O}\left( d (\log n)^2 \right)$ regret when given access to a FOO. Together, these two results highlight the significant value of incorporating gradient information in BO algorithms.

APA


Shekhar, S. & Javidi, T. (2021). Significance of Gradient Information in Bayesian Optimization. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2836-2844. Available from https://proceedings.mlr.press/v130/shekhar21a.html.
