Active Area Search via Bayesian Quadrature
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:595-603, 2014.
The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals.