Baseline Methods for Active Learning

Gavin C. Cawley
Active Learning and Experimental Design workshop, in conjunction with AISTATS 2010; JMLR Workshop and Conference Proceedings 16:47-57, 2011.

Abstract

In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labelling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.
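The random baseline the abstract describes can be sketched in a few lines. The following is an illustrative NumPy sketch, not the author's actual submission: a pool of unlabelled patterns, a purely random query strategy, and a closed-form regularised linear ridge regression fit on the labels bought so far. The function names, the fixed regularisation parameter, and the toy oracle are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, y, lam=1.0):
    """Closed-form linear ridge regression: w = (X'X + lam*I)^{-1} X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def random_active_learning(X_pool, oracle, n_queries, lam=1.0):
    """Pool-based active learning with a purely random query strategy:
    repeatedly pick an unlabelled pattern at random, ask the oracle for
    its label, then fit a ridge-regression model on the labelled set."""
    unlabelled = list(range(len(X_pool)))
    labelled, labels = [], []
    for _ in range(n_queries):
        i = unlabelled.pop(rng.integers(len(unlabelled)))  # random choice
        labels.append(oracle(X_pool[i]))  # labelling cost incurred here
        labelled.append(i)
    idx = np.array(labelled)
    w = fit_ridge(X_pool[idx], np.array(labels), lam)
    return w, idx

# Toy demonstration: labels are a noisy linear function of the inputs.
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
oracle = lambda x: x @ true_w + 0.01 * rng.normal()
w_hat, queried = random_active_learning(X, oracle, n_queries=50)
```

In practice the interest lies in how quickly the model improves as queries accumulate; the paper's finding is that this random strategy is a surprisingly strong reference point for more elaborate query-selection schemes.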

Cite this Paper


BibTeX
@InProceedings{pmlr-v16-cawley11a,
  title     = {Baseline Methods for Active Learning},
  author    = {Gavin C. Cawley},
  booktitle = {Active Learning and Experimental Design workshop In conjunction with AISTATS 2010},
  pages     = {47--57},
  year      = {2011},
  editor    = {Isabelle Guyon and Gavin Cawley and Gideon Dror and Vincent Lemaire and Alexander Statnikov},
  volume    = {16},
  series    = {Proceedings of Machine Learning Research},
  address   = {Sardinia, Italy},
  month     = {16 May},
  publisher = {JMLR Workshop and Conference Proceedings},
  pdf       = {http://proceedings.mlr.press/v16/cawley11a/cawley11a.pdf},
  url       = {http://proceedings.mlr.press/v16/cawley11a.html},
  abstract  = {In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labeling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods, that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.}
}
Endnote
%0 Conference Paper
%T Baseline Methods for Active Learning
%A Gavin C. Cawley
%B Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
%C Proceedings of Machine Learning Research
%D 2011
%E Isabelle Guyon
%E Gavin Cawley
%E Gideon Dror
%E Vincent Lemaire
%E Alexander Statnikov
%F pmlr-v16-cawley11a
%I PMLR
%J Proceedings of Machine Learning Research
%P 47--57
%U http://proceedings.mlr.press
%V 16
%W PMLR
%X In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labeling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods, that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.
RIS
TY - CPAPER
TI - Baseline Methods for Active Learning
AU - Gavin C. Cawley
BT - Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
PY - 2011/04/21
DA - 2011/04/21
ED - Isabelle Guyon
ED - Gavin Cawley
ED - Gideon Dror
ED - Vincent Lemaire
ED - Alexander Statnikov
ID - pmlr-v16-cawley11a
PB - PMLR
SP - 47
DP - PMLR
EP - 57
L1 - http://proceedings.mlr.press/v16/cawley11a/cawley11a.pdf
UR - http://proceedings.mlr.press/v16/cawley11a.html
AB - In many potential applications of machine learning, unlabelled data are abundantly available at low cost, but there is a paucity of labelled data, and labeling unlabelled examples is expensive and/or time-consuming. This motivates the development of active learning methods, that seek to direct the collection of labelled examples such that the greatest performance gains can be achieved using the smallest quantity of labelled data. In this paper, we describe some simple pool-based active learning strategies, based on optimally regularised linear [kernel] ridge regression, providing a set of baseline submissions for the Active Learning Challenge. A simple random strategy, where unlabelled patterns are submitted to the oracle purely at random, is found to be surprisingly effective, being competitive with more complex approaches.
ER -
APA
Cawley, G.C. (2011). Baseline Methods for Active Learning. Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, in PMLR 16:47-57
