High Dimensional Bayesian Optimization with Elastic Gaussian Process
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2883-2891, 2017.
Abstract
Bayesian optimization is an efficient way to optimize expensive black-box functions, such as designing a new product with the highest quality or tuning the hyperparameters of a machine learning algorithm. However, it has a serious limitation when the parameter space is high-dimensional, as Bayesian optimization crucially depends on solving a global optimization of a surrogate utility function in the same number of dimensions. The surrogate utility function, commonly known as the acquisition function, is a continuous function but can be extremely sharp in high dimensions, having only a few peaks marooned in a large terrain of almost flat surface. Global optimization algorithms such as DIRECT are infeasible at higher dimensions, and gradient-dependent methods cannot move if initialized in the flat terrain. We propose an algorithm that enables local gradient-dependent algorithms to move through the flat terrain by using a sequence of gross-to-finer Gaussian process priors on the objective function, leveraging two underlying facts: a) there exists a large enough length-scale for which the acquisition function can be made to have a significant gradient at any location in the parameter space, and b) the extrema of consecutive acquisition functions are close when those functions differ only by a small change in the length-scale. Theoretical guarantees are provided, and experiments clearly demonstrate the utility of the proposed method in high dimensions using both benchmark test functions and real-world case studies.
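The idea described in the abstract, starting the gradient search at a large length-scale where the acquisition surface has a usable gradient, then re-optimizing under progressively smaller length-scales while warm-starting from the previous optimum, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the expected-improvement acquisition, the fixed length-scale schedule in `elastic_acq_opt`, the unit-hypercube domain, and the finite-difference gradient ascent are all simplifying assumptions made for the sketch.

```python
import math
import numpy as np

def rbf(A, B, ls):
    """Squared-exponential kernel with a shared length-scale `ls`."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior(x, X, y, ls, noise=1e-6):
    """Posterior mean and std at a single point x under a zero-mean GP."""
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    k = rbf(x[None, :], X, ls)[0]
    mu = k @ np.linalg.solve(K, y)
    var = max(1.0 - k @ np.linalg.solve(K, k), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(x, X, y, ls):
    """EI for minimization: expected drop below the best observed value."""
    mu, sd = gp_posterior(x, X, y, ls)
    z = (y.min() - mu) / sd
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # normal PDF
    return (y.min() - mu) * Phi + sd * phi

def grad_ascent(f, x0, steps=50, lr=0.05, eps=1e-4):
    """Plain finite-difference gradient ascent, clipped to [0, 1]^d."""
    x = x0.copy()
    for _ in range(steps):
        g = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                      for e in np.eye(len(x))])
        x = np.clip(x + lr * g, 0.0, 1.0)
    return x

def elastic_acq_opt(X, y, x0, ls_schedule=(2.0, 1.0, 0.5, 0.2)):
    """Optimize the acquisition under gross-to-finer length-scales,
    warm-starting each local search from the previous optimum."""
    x = x0.copy()
    for ls in ls_schedule:
        x = grad_ascent(lambda z: expected_improvement(z, X, y, ls), x)
    return x

# Toy demo: a 10-dimensional problem with a handful of observations.
rng = np.random.default_rng(0)
X = rng.random((5, 10))          # observed inputs in the unit hypercube
y = rng.standard_normal(5)       # observed objective values
x0 = np.full(10, 0.5)            # start in the (likely flat) interior
x_star = elastic_acq_opt(X, y, x0)
```

At the smallest length-scale in the schedule the acquisition would be nearly flat at `x0`, so the large initial length-scale is what lets the local search make progress at all; the warm starts rely on consecutive optima being close, as stated in the abstract.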