Learning from Weak Teachers

Ruth Urner, Shai Ben David, Ohad Shamir
; Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1252-1260, 2012.

Abstract

This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-urner12, title = {Learning from Weak Teachers}, author = {Ruth Urner and Shai Ben David and Ohad Shamir}, booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics}, pages = {1252--1260}, year = {2012}, editor = {Neil D. Lawrence and Mark Girolami}, volume = {22}, series = {Proceedings of Machine Learning Research}, address = {La Palma, Canary Islands}, month = {21--23 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v22/urner12/urner12.pdf}, url = {http://proceedings.mlr.press/v22/urner12.html}, abstract = {This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.} }
Endnote
%0 Conference Paper %T Learning from Weak Teachers %A Ruth Urner %A Shai Ben David %A Ohad Shamir %B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2012 %E Neil D. Lawrence %E Mark Girolami %F pmlr-v22-urner12 %I PMLR %J Proceedings of Machine Learning Research %P 1252--1260 %U http://proceedings.mlr.press %V 22 %W PMLR %X This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications.
RIS
TY - CPAPER TI - Learning from Weak Teachers AU - Ruth Urner AU - Shai Ben David AU - Ohad Shamir BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics PY - 2012/03/21 DA - 2012/03/21 ED - Neil D. Lawrence ED - Mark Girolami ID - pmlr-v22-urner12 PB - PMLR SP - 1252 DP - PMLR EP - 1260 L1 - http://proceedings.mlr.press/v22/urner12/urner12.pdf UR - http://proceedings.mlr.press/v22/urner12.html AB - This paper addresses the problem of learning when high-quality labeled examples are an expensive resource, while samples with error-prone labeling (for example generated by crowdsourcing) are readily available. We introduce a formal framework for such learning scenarios with label sources of varying quality, and we propose a parametric model for such label sources (“weak teachers”), reflecting the intuition that their labeling is likely to be correct in label-homogeneous regions but may deteriorate near classification boundaries. We consider learning when the learner has access to weakly labeled random samples and, on top of that, can actively query the correct labels of sample points of its choice. We propose a learning algorithm for this scenario, analyze its sample complexity and prove that, under certain conditions on the underlying data distribution, our learner can utilize the weak labels to reduce the number of expert labels it requires. We view this paper as a first step towards the development of a theory of learning from labels generated by teachers of varying accuracy, a scenario that is relevant in various practical applications. ER -
APA
Urner, R., David, S.B. & Shamir, O.. (2012). Learning from Weak Teachers. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in PMLR 22:1252-1260

Related Material