Active Learning from Multiple Knowledge Sources

Yan Yan, Romer Rosales, Glenn Fung, Faisal Farooq, Bharat Rao, Jennifer Dy
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1350-1357, 2012.

Abstract

Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
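The selection idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It assumes binary labels, a current model estimate of p(z=1 | x) for each sample, and per-sample annotator error rates that are already known. In practice these would have to be learned. The names `mutual_information`, `select_query`, `priors`, and `error_rates` are hypothetical and do not come from the paper. The sketch scores each (sample, annotator) pair by the mutual information between the unknown true label z and the annotator's label y, then picks the highest-scoring pair.

```python
import math


def mutual_information(prior_pos, flip_if_pos, flip_if_neg):
    """Return I(z; y) in bits for a binary true label z and annotator label y.

    prior_pos:   assumed model estimate of p(z=1 | x)
    flip_if_pos: assumed p(y=0 | z=1, x), the annotator's error when z is 1
    flip_if_neg: assumed p(y=1 | z=0, x), the annotator's error when z is 0
    """
    p_z = [1.0 - prior_pos, prior_pos]
    # p_y_given_z[z][y] is the annotator's label distribution given the truth.
    p_y_given_z = [
        [1.0 - flip_if_neg, flip_if_neg],
        [flip_if_pos, 1.0 - flip_if_pos],
    ]
    # Marginal distribution of the annotator's label.
    p_y = [sum(p_z[z] * p_y_given_z[z][y] for z in (0, 1)) for y in (0, 1)]

    mi = 0.0
    for z in (0, 1):
        for y in (0, 1):
            joint = p_z[z] * p_y_given_z[z][y]
            if joint > 0.0:  # zero-probability terms contribute nothing
                mi += joint * math.log2(joint / (p_z[z] * p_y[y]))
    return mi


def select_query(priors, error_rates):
    """Pick the (sample index, annotator index) pair with the largest I(z; y).

    priors[i]:         assumed p(z=1 | x_i) for sample i
    error_rates[i][t]: (flip_if_pos, flip_if_neg) for annotator t on sample i
    """
    best = None
    for i, prior in enumerate(priors):
        for t, (flip_pos, flip_neg) in enumerate(error_rates[i]):
            score = mutual_information(prior, flip_pos, flip_neg)
            if best is None or score > best[0]:
                best = (score, i, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Two samples, two annotators; annotator 1 is assumed more reliable.
    priors = [0.5, 0.9]
    error_rates = [
        [(0.4, 0.4), (0.1, 0.1)],
        [(0.4, 0.4), (0.1, 0.1)],
    ]
    print(select_query(priors, error_rates))
```

Under this measure, an annotator whose labels are independent of the truth (error rate 0.5 both ways) scores zero, while one who always flips the label scores as high as a perfect annotator, because the flip is predictable. Whether the paper treats adversarial annotators this way is not stated in the abstract.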

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-yan12,
  title = {Active Learning from Multiple Knowledge Sources},
  author = {Yan, Yan and Rosales, Romer and Fung, Glenn and Farooq, Faisal and Rao, Bharat and Dy, Jennifer},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = {1350--1357},
  year = {2012},
  editor = {Lawrence, Neil D. and Girolami, Mark},
  volume = {22},
  series = {Proceedings of Machine Learning Research},
  address = {La Palma, Canary Islands},
  month = {21--23 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v22/yan12/yan12.pdf},
  url = {https://proceedings.mlr.press/v22/yan12.html},
  abstract = {Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.}
}
Endnote
%0 Conference Paper
%T Active Learning from Multiple Knowledge Sources
%A Yan Yan
%A Romer Rosales
%A Glenn Fung
%A Faisal Farooq
%A Bharat Rao
%A Jennifer Dy
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-yan12
%I PMLR
%P 1350--1357
%U https://proceedings.mlr.press/v22/yan12.html
%V 22
%X Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
RIS
TY - CPAPER
TI - Active Learning from Multiple Knowledge Sources
AU - Yan Yan
AU - Romer Rosales
AU - Glenn Fung
AU - Faisal Farooq
AU - Bharat Rao
AU - Jennifer Dy
BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA - 2012/04/21
ED - Neil D. Lawrence
ED - Mark Girolami
ID - pmlr-v22-yan12
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 22
SP - 1350
EP - 1357
L1 - http://proceedings.mlr.press/v22/yan12/yan12.pdf
UR - https://proceedings.mlr.press/v22/yan12.html
AB - Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
ER -
APA
Yan, Y., Rosales, R., Fung, G., Farooq, F., Rao, B. & Dy, J. (2012). Active Learning from Multiple Knowledge Sources. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1350-1357. Available from https://proceedings.mlr.press/v22/yan12.html.
