An Active Learning Algorithm Based on Parzen Window Classification

Liang Lan; Haidong Shi; Zhuang Wang; Slobodan Vucetic

Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, PMLR 16:99-112, 2011.

Abstract

This paper describes the active learning algorithm used in the AISTATS 2010 Active Learning Challenge, as well as several of its extensions evaluated in post-competition experiments. The algorithm consists of a pair of Regularized Parzen Window Classifiers, one trained on the full set of features and the other on features filtered using Pearson correlation. Predictions of the two classifiers are averaged to obtain the ensemble classifier. The Parzen Window classifier was chosen because it is an easy-to-implement lazy algorithm and has a single parameter, the kernel window size, which is determined by cross-validation. The labeling schedule started by selecting 20 random examples and then continued by doubling the number of labeled examples in each round of active learning. A combination of random sampling and uncertainty sampling was used for querying. For the random sampling, examples were first clustered using either all features or the filtered features (whichever resulted in higher cross-validated accuracy), and then the same number of random examples was selected from each cluster. Our algorithm ranked 5th overall and was consistently ranked in the upper half of the competing algorithms. The challenge results show that Parzen Window classifiers are less accurate than several competing learning algorithms used in the competition, but also indicate the success of the simple querying strategy that was employed. In the post-competition experiments, we were able to improve the accuracy by using an ensemble of 5 Parzen Window classifiers, each trained on features selected by a different filter. We also explored how more involved querying during the initial stages of active learning and the pre-clustering querying strategy would influence the performance of the proposed algorithm.
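As a rough illustration of the ingredients named in the abstract, the sketch below shows a Parzen window class-probability estimate, a doubling labeling schedule, and a query step that mixes random and uncertainty sampling. It is not the authors' implementation. The Gaussian kernel, the +1/-1 label encoding, the function names, and the `random_fraction` parameter are assumptions made for this example; the regularization term and the per-cluster random sampling mentioned in the abstract are omitted, with plain uniform random sampling used in their place.

```python
import numpy as np


def parzen_window_scores(X_train, y_train, X_query, sigma):
    """Estimate P(y = +1 | x) for each row of X_query.

    Assumes labels are +1 / -1 and a Gaussian kernel whose window
    size is sigma (both are assumptions of this sketch).
    """
    # Squared Euclidean distance between every query and training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    pos = k[:, y_train == 1].sum(axis=1)
    total = k.sum(axis=1)
    # Small constants give 0.5 when every kernel value underflows to zero.
    return (pos + 1e-12) / (total + 2e-12)


def doubling_schedule(n_initial, n_pool):
    """Yield the total number of labeled examples per round: n, 2n, 4n, ..."""
    n = n_initial
    while n < n_pool:
        yield n
        n *= 2
    yield n_pool  # final round labels whatever remains


def select_queries(p_pos, n_queries, rng, random_fraction=0.5):
    """Pick n_queries pool indices: some at random, the rest most uncertain.

    random_fraction is a hypothetical knob; the abstract does not state
    how the two sampling strategies were balanced.
    """
    n_pool = len(p_pos)
    n_queries = min(n_queries, n_pool)
    n_random = int(random_fraction * n_queries)
    chosen = set(rng.permutation(n_pool)[:n_random].tolist())
    # Uncertainty: estimated probability closest to 0.5 comes first.
    for i in np.argsort(np.abs(p_pos - 0.5)):
        if len(chosen) >= n_queries:
            break
        chosen.add(int(i))
    return np.array(sorted(chosen))
```

In a full loop, `sigma` would be chosen by cross-validation on the currently labeled examples, and the size of each query batch would follow from consecutive values of `doubling_schedule`.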

Cite this Paper

BibTeX


@InProceedings{pmlr-v16-lan11a,
  title = 	 {An Active Learning Algorithm Based on Parzen Window Classification},
  author = 	 {Lan, Liang and Shi, Haidong and Wang, Zhuang and Vucetic, Slobodan},
  booktitle = 	 {Active Learning and Experimental Design workshop In conjunction with AISTATS 2010},
  pages = 	 {99--112},
  year = 	 {2011},
  editor = 	 {Guyon, Isabelle and Cawley, Gavin and Dror, Gideon and Lemaire, Vincent and Statnikov, Alexander},
  volume = 	 {16},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Sardinia, Italy},
  month = 	 {16 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v16/lan11a/lan11a.pdf},
  url = 	 {https://proceedings.mlr.press/v16/lan11a.html},
  abstract = 	 {This paper describes the active learning algorithm used in the AISTATS 2010 Active Learning Challenge, as well as several of its extensions evaluated in post-competition experiments. The algorithm consists of a pair of Regularized Parzen Window Classifiers, one trained on the full set of features and the other on features filtered using Pearson correlation. Predictions of the two classifiers are averaged to obtain the ensemble classifier. The Parzen Window classifier was chosen because it is an easy-to-implement lazy algorithm and has a single parameter, the kernel window size, which is determined by cross-validation. The labeling schedule started by selecting 20 random examples and then continued by doubling the number of labeled examples in each round of active learning. A combination of random sampling and uncertainty sampling was used for querying. For the random sampling, examples were first clustered using either all features or the filtered features (whichever resulted in higher cross-validated accuracy), and then the same number of random examples was selected from each cluster. Our algorithm ranked 5th overall and was consistently ranked in the upper half of the competing algorithms. The challenge results show that Parzen Window classifiers are less accurate than several competing learning algorithms used in the competition, but also indicate the success of the simple querying strategy that was employed. In the post-competition experiments, we were able to improve the accuracy by using an ensemble of 5 Parzen Window classifiers, each trained on features selected by a different filter. We also explored how more involved querying during the initial stages of active learning and the pre-clustering querying strategy would influence the performance of the proposed algorithm.}
}

Endnote

%0 Conference Paper
%T An Active Learning Algorithm Based on Parzen Window Classification
%A Liang Lan
%A Haidong Shi
%A Zhuang Wang
%A Slobodan Vucetic
%B Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
%C Proceedings of Machine Learning Research
%D 2011
%E Isabelle Guyon
%E Gavin Cawley
%E Gideon Dror
%E Vincent Lemaire
%E Alexander Statnikov
%F pmlr-v16-lan11a
%I PMLR
%P 99--112
%U https://proceedings.mlr.press/v16/lan11a.html
%V 16
%X This paper describes the active learning algorithm used in the AISTATS 2010 Active Learning Challenge, as well as several of its extensions evaluated in post-competition experiments. The algorithm consists of a pair of Regularized Parzen Window Classifiers, one trained on the full set of features and the other on features filtered using Pearson correlation. Predictions of the two classifiers are averaged to obtain the ensemble classifier. The Parzen Window classifier was chosen because it is an easy-to-implement lazy algorithm and has a single parameter, the kernel window size, which is determined by cross-validation. The labeling schedule started by selecting 20 random examples and then continued by doubling the number of labeled examples in each round of active learning. A combination of random sampling and uncertainty sampling was used for querying. For the random sampling, examples were first clustered using either all features or the filtered features (whichever resulted in higher cross-validated accuracy), and then the same number of random examples was selected from each cluster. Our algorithm ranked 5th overall and was consistently ranked in the upper half of the competing algorithms. The challenge results show that Parzen Window classifiers are less accurate than several competing learning algorithms used in the competition, but also indicate the success of the simple querying strategy that was employed. In the post-competition experiments, we were able to improve the accuracy by using an ensemble of 5 Parzen Window classifiers, each trained on features selected by a different filter. We also explored how more involved querying during the initial stages of active learning and the pre-clustering querying strategy would influence the performance of the proposed algorithm.

RIS


TY  - CPAPER
TI  - An Active Learning Algorithm Based on Parzen Window Classification
AU  - Liang Lan
AU  - Haidong Shi
AU  - Zhuang Wang
AU  - Slobodan Vucetic
BT  - Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
DA  - 2011/04/21
ED  - Isabelle Guyon
ED  - Gavin Cawley
ED  - Gideon Dror
ED  - Vincent Lemaire
ED  - Alexander Statnikov
ID  - pmlr-v16-lan11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 16
SP  - 99
EP  - 112
L1  - http://proceedings.mlr.press/v16/lan11a/lan11a.pdf
UR  - https://proceedings.mlr.press/v16/lan11a.html
AB  - This paper describes the active learning algorithm used in the AISTATS 2010 Active Learning Challenge, as well as several of its extensions evaluated in post-competition experiments. The algorithm consists of a pair of Regularized Parzen Window Classifiers, one trained on the full set of features and the other on features filtered using Pearson correlation. Predictions of the two classifiers are averaged to obtain the ensemble classifier. The Parzen Window classifier was chosen because it is an easy-to-implement lazy algorithm and has a single parameter, the kernel window size, which is determined by cross-validation. The labeling schedule started by selecting 20 random examples and then continued by doubling the number of labeled examples in each round of active learning. A combination of random sampling and uncertainty sampling was used for querying. For the random sampling, examples were first clustered using either all features or the filtered features (whichever resulted in higher cross-validated accuracy), and then the same number of random examples was selected from each cluster. Our algorithm ranked 5th overall and was consistently ranked in the upper half of the competing algorithms. The challenge results show that Parzen Window classifiers are less accurate than several competing learning algorithms used in the competition, but also indicate the success of the simple querying strategy that was employed. In the post-competition experiments, we were able to improve the accuracy by using an ensemble of 5 Parzen Window classifiers, each trained on features selected by a different filter. We also explored how more involved querying during the initial stages of active learning and the pre-clustering querying strategy would influence the performance of the proposed algorithm.
ER  -

APA


Lan, L., Shi, H., Wang, Z. & Vucetic, S. (2011). An Active Learning Algorithm Based on Parzen Window Classification. Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, in Proceedings of Machine Learning Research 16:99-112. Available from https://proceedings.mlr.press/v16/lan11a.html.
