Active Area Search via Bayesian Quadrature

Yifei Ma, Roman Garnett, Jeff Schneider
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:595-603, 2014.

Abstract

The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals.

Cite this Paper


BibTeX
@InProceedings{pmlr-v33-ma14, title = {{Active Area Search via Bayesian Quadrature}}, author = {Ma, Yifei and Garnett, Roman and Schneider, Jeff}, booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics}, pages = {595--603}, year = {2014}, editor = {Kaski, Samuel and Corander, Jukka}, volume = {33}, series = {Proceedings of Machine Learning Research}, address = {Reykjavik, Iceland}, month = {22--25 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v33/ma14.pdf}, url = {https://proceedings.mlr.press/v33/ma14.html}, abstract = {The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals.} }
Endnote
%0 Conference Paper %T Active Area Search via Bayesian Quadrature %A Yifei Ma %A Roman Garnett %A Jeff Schneider %B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2014 %E Samuel Kaski %E Jukka Corander %F pmlr-v33-ma14 %I PMLR %P 595--603 %U https://proceedings.mlr.press/v33/ma14.html %V 33 %X The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals.
RIS
TY - CPAPER TI - Active Area Search via Bayesian Quadrature AU - Yifei Ma AU - Roman Garnett AU - Jeff Schneider BT - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics DA - 2014/04/02 ED - Samuel Kaski ED - Jukka Corander ID - pmlr-v33-ma14 PB - PMLR DP - Proceedings of Machine Learning Research VL - 33 SP - 595 EP - 603 L1 - http://proceedings.mlr.press/v33/ma14.pdf UR - https://proceedings.mlr.press/v33/ma14.html AB - The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals. ER -
APA
Ma, Y., Garnett, R. & Schneider, J.. (2014). Active Area Search via Bayesian Quadrature. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:595-603 Available from https://proceedings.mlr.press/v33/ma14.html.

Related Material