Learning from Weak Teachers

Ruth Urner; Shai Ben David; Ohad Shamir

Learning from Weak Teachers

Ruth Urner, Shai Ben David, Ohad Shamir

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1252-1260, 2012.

Abstract

This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-urner12,
  title = 	 {Learning from Weak Teachers},
  author = 	 {Urner, Ruth and David, Shai Ben and Shamir, Ohad},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1252--1260},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/urner12/urner12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/urner12.html},
  abstract = 	 {This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.}
}

Endnote

%0 Conference Paper
%T Learning from Weak Teachers
%A Ruth Urner
%A Shai Ben David
%A Ohad Shamir
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami	
%F pmlr-v22-urner12
%I PMLR
%P 1252--1260
%U https://proceedings.mlr.press/v22/urner12.html
%V 22
%X This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.

RIS


TY  - CPAPER
TI  - Learning from Weak Teachers
AU  - Ruth Urner
AU  - Shai Ben David
AU  - Ohad Shamir
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami	
ID  - pmlr-v22-urner12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1252
EP  - 1260
L1  - http://proceedings.mlr.press/v22/urner12/urner12.pdf
UR  - https://proceedings.mlr.press/v22/urner12.html
AB  - This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.
ER  -

APA


Urner, R., David, S.B. & Shamir, O.. (2012). Learning from Weak Teachers. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1252-1260 Available from https://proceedings.mlr.press/v22/urner12.html.

Related Material

Download PDF