Active Learning from Multiple Knowledge Sources

Yan Yan, Romer Rosales, Glenn Fung, Faisal Farooq, Bharat Rao, Jennifer Dy
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1350-1357, 2012.

Abstract

Some supervised learning tasks do not fit the usual single-annotator scenario. In these problems, ground truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting, active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple-annotator AL, it is not sufficient to select a sample for labeling; an optimal annotator must also be selected. This setting is of great interest because annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, the information provided by some annotators should be more valuable than that provided by others, and its value could vary across data points. We propose an AL approach for this new scenario motivated by information-theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
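The selection criterion described in the abstract, choosing the sample and annotator whose label is most informative about the unknown true label, can be illustrated with a small sketch. This is not the paper's model, which the abstract does not specify. It assumes binary labels, a current estimate `priors[i]` of the probability that sample `i` has true label 1, and a hypothetical table `accuracies[i][t]` giving the probability that annotator `t` reports the true label on sample `i`. All function and variable names are made up for illustration.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def mutual_information(prior, accuracy):
    """I(y; z) in bits under the assumed noise model.

    z is the unknown true label with p(z = 1) = prior.
    y is the annotator's label, equal to z with probability `accuracy`
    (a symmetric flip model, assumed here for illustration).
    """
    # Marginal probability that the annotator reports label 1.
    p_y1 = prior * accuracy + (1 - prior) * (1 - accuracy)
    # I(y; z) = H(y) - H(y | z); for a symmetric flip, H(y | z) = H(accuracy).
    return binary_entropy(p_y1) - binary_entropy(accuracy)


def select_pair(priors, accuracies):
    """Return (sample_index, annotator_index, mi) with the largest I(y; z).

    priors[i] is the assumed p(z = 1 | x_i); accuracies[i][t] is the assumed
    accuracy of annotator t on sample i, so expertise may vary per sample.
    """
    best = None
    for i, prior in enumerate(priors):
        for t, acc in enumerate(accuracies[i]):
            mi = mutual_information(prior, acc)
            if best is None or mi > best[2]:
                best = (i, t, mi)
    return best


if __name__ == "__main__":
    priors = [0.5, 0.9]
    accuracies = [[0.6, 0.95], [0.99, 0.5]]
    print(select_pair(priors, accuracies))
```

Under this toy model an annotator who always flips the label (`accuracy = 0.0`) is as informative as a perfect one, while an annotator at chance (`accuracy = 0.5`) provides no information. This is one way to see why a consistently adversarial annotator can still be useful once their behavior is modeled.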

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-yan12,
  title     = {Active Learning from Multiple Knowledge Sources},
  author    = {Yan Yan and Romer Rosales and Glenn Fung and Faisal Farooq and Bharat Rao and Jennifer Dy},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {1350--1357},
  year      = {2012},
  editor    = {Neil D. Lawrence and Mark Girolami},
  volume    = {22},
  series    = {Proceedings of Machine Learning Research},
  address   = {La Palma, Canary Islands},
  month     = {21--23 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v22/yan12/yan12.pdf},
  url       = {http://proceedings.mlr.press/v22/yan12.html}
}
Endnote
%0 Conference Paper
%T Active Learning from Multiple Knowledge Sources
%A Yan Yan
%A Romer Rosales
%A Glenn Fung
%A Faisal Farooq
%A Bharat Rao
%A Jennifer Dy
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-yan12
%I PMLR
%J Proceedings of Machine Learning Research
%P 1350--1357
%U http://proceedings.mlr.press
%V 22
%W PMLR
RIS
TY - CPAPER
TI - Active Learning from Multiple Knowledge Sources
AU - Yan Yan
AU - Romer Rosales
AU - Glenn Fung
AU - Faisal Farooq
AU - Bharat Rao
AU - Jennifer Dy
BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
PY - 2012/03/21
DA - 2012/03/21
ED - Neil D. Lawrence
ED - Mark Girolami
ID - pmlr-v22-yan12
PB - PMLR
SP - 1350
DP - PMLR
EP - 1357
L1 - http://proceedings.mlr.press/v22/yan12/yan12.pdf
UR - http://proceedings.mlr.press/v22/yan12.html
ER -
APA
Yan, Y., Rosales, R., Fung, G., Farooq, F., Rao, B., & Dy, J. (2012). Active Learning from Multiple Knowledge Sources. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in PMLR 22:1350-1357.
