Active Area Search via Bayesian Quadrature

Yifei Ma; Roman Garnett; Jeff Schneider

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:595-603, 2014.

Abstract

The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms. Typical objectives are to map an unknown function, optimize it, or find level sets in it. Each of these objectives focuses on an assessment of individual points. The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points. In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier. We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions. Our method is the first which directly solves the active area search problem. In experiments it outperforms previous algorithms that were developed for other active search goals.
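The core computation the abstract describes (a Gaussian process prior on the latent function, Bayesian quadrature for a region's average, and the probability that this average exceeds a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-dimensional input, a zero-mean prior with a squared-exponential kernel, Gaussian observation noise, and interval-shaped regions, and it approximates the kernel integrals by averaging over a dense grid instead of using closed forms. All function names, parameters, and default values below are hypothetical.

```python
import math
import numpy as np


def sq_exp_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential kernel matrix between 1-D point arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def region_average_posterior(x_obs, y_obs, lo, hi, noise_var=1e-2, n_grid=200):
    """Posterior mean and variance of the average of f over [lo, hi].

    The integrals of the kernel over the region are approximated by
    means over an evenly spaced grid (an assumption made for brevity).
    """
    grid = np.linspace(lo, hi, n_grid)
    # Gram matrix of the noisy observations.
    K = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    # Covariance between the region average and each observation.
    w = sq_exp_kernel(grid, x_obs).mean(axis=0)
    # Prior variance of the region average.
    Z = sq_exp_kernel(grid, grid).mean()
    mean = w @ np.linalg.solve(K, y_obs)
    var = Z - w @ np.linalg.solve(K, w)
    # Clip tiny negative values caused by round-off.
    return float(mean), float(max(var, 0.0))


def prob_above(mean, var, threshold):
    """P(region average > threshold) under a Gaussian posterior."""
    if var == 0.0:
        return float(mean > threshold)
    z = (threshold - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical data: ten observations of value 2.0 on [0, 1].
x_obs = np.linspace(0.0, 1.0, 10)
y_obs = np.full(10, 2.0)

m_near, v_near = region_average_posterior(x_obs, y_obs, 0.0, 1.0)
m_far, v_far = region_average_posterior(x_obs, y_obs, 5.0, 6.0)
p_near = prob_above(m_near, v_near, threshold=1.0)
p_far = prob_above(m_far, v_far, threshold=1.0)
```

Under these assumptions, a region would be labeled positive once its probability exceeds a chosen confidence level. The well-sampled region `[0, 1]` gets a high probability, while the unobserved region `[5, 6]` stays close to its prior. How the next data collection location is selected is the subject of the paper and is not sketched here.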

Cite this Paper

BibTeX


@InProceedings{pmlr-v33-ma14,
  title = 	 {{Active Area Search via Bayesian Quadrature}},
  author = 	 {Ma, Yifei and Garnett, Roman and Schneider, Jeff},
  booktitle = 	 {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {595--603},
  year = 	 {2014},
  editor = 	 {Kaski, Samuel and Corander, Jukka},
  volume = 	 {33},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Reykjavik, Iceland},
  month = 	 {22--25 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v33/ma14.pdf},
  url = 	 {https://proceedings.mlr.press/v33/ma14.html},
  abstract = 	 {The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms.  Typical objectives are to map an unknown function, optimize it, or find level sets in it.  Each of these objectives focuses on an assessment of individual points.  The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points.  In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier.  We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions.  Our method is the first which directly solves the active area search problem.  In experiments it outperforms previous algorithms that were developed for other active search goals.}
}

Endnote

%0 Conference Paper
%T Active Area Search via Bayesian Quadrature
%A Yifei Ma
%A Roman Garnett
%A Jeff Schneider
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-ma14
%I PMLR
%P 595--603
%U https://proceedings.mlr.press/v33/ma14.html
%V 33
%X The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms.  Typical objectives are to map an unknown function, optimize it, or find level sets in it.  Each of these objectives focuses on an assessment of individual points.  The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points.  In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier.  We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions.  Our method is the first which directly solves the active area search problem.  In experiments it outperforms previous algorithms that were developed for other active search goals.

RIS


TY  - CPAPER
TI  - Active Area Search via Bayesian Quadrature
AU  - Yifei Ma
AU  - Roman Garnett
AU  - Jeff Schneider
BT  - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA  - 2014/04/02
ED  - Samuel Kaski
ED  - Jukka Corander
ID  - pmlr-v33-ma14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 33
SP  - 595
EP  - 603
L1  - http://proceedings.mlr.press/v33/ma14.pdf
UR  - https://proceedings.mlr.press/v33/ma14.html
AB  - The selection of data collection locations is a problem that has received significant research attention from classical design of experiments to various recent active learning algorithms.  Typical objectives are to map an unknown function, optimize it, or find level sets in it.  Each of these objectives focuses on an assessment of individual points.  The introduction of set kernels has led to algorithms that instead consider labels assigned to sets of data points.  In this paper we combine these two concepts and consider the problem of choosing data collection locations when the goal is to identify regions whose set of collected data would be labeled positively by a set classifier.  We present an algorithm for the case where the positive class is defined in terms of a region’s average function value being above some threshold with high probability, a problem we call active area search. To this end, we model the latent function using a Gaussian process and use Bayesian quadrature to estimate its integral on predefined regions.  Our method is the first which directly solves the active area search problem.  In experiments it outperforms previous algorithms that were developed for other active search goals.
ER  -

APA


Ma, Y., Garnett, R. & Schneider, J. (2014). Active Area Search via Bayesian Quadrature. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:595-603. Available from https://proceedings.mlr.press/v33/ma14.html.
