Modeling annotator expertise: Learning when everybody knows a bit of something

Yan Yan; Romer Rosales; Glenn Fung; Mark Schmidt; Gerardo Hermosillo; Luca Bogoni; Linda Moy; Jennifer Dy

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:932-939, 2010.

Abstract

Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an annotator may not be consistently accurate (or inaccurate) across the task domain. The presented approach produces classification and annotator models that allow us to provide estimates of the true labels and annotator variable expertise. We provide an analysis of the proposed model under various scenarios and show experimentally that annotator expertise can indeed vary in real tasks and that the presented approach provides clear advantages over previously introduced multi-annotator methods, which only consider general annotator characteristics.

Cite this Paper

BibTeX

@InProceedings{pmlr-v9-yan10a,
  title = {Modeling annotator expertise: Learning when everybody knows a bit of something},
  author = {Yan, Yan and Rosales, Romer and Fung, Glenn and Schmidt, Mark and Hermosillo, Gerardo and Bogoni, Luca and Moy, Linda and Dy, Jennifer},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = {932--939},
  year = {2010},
  editor = {Teh, Yee Whye and Titterington, Mike},
  volume = {9},
  series = {Proceedings of Machine Learning Research},
  address = {Chia Laguna Resort, Sardinia, Italy},
  month = {13--15 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v9/yan10a/yan10a.pdf},
  url = {https://proceedings.mlr.press/v9/yan10a.html},
  abstract = {Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an annotator may not be consistently accurate (or inaccurate) across the task domain. The presented approach produces classification and annotator models that allow us to provide estimates of the true labels and annotator variable expertise. We provide an analysis of the proposed model under various scenarios and show experimentally that annotator expertise can indeed vary in real tasks and that the presented approach provides clear advantages over previously introduced multi-annotator methods, which only consider general annotator characteristics.}
}

Endnote

%0 Conference Paper
%T Modeling annotator expertise: Learning when everybody knows a bit of something
%A Yan Yan
%A Romer Rosales
%A Glenn Fung
%A Mark Schmidt
%A Gerardo Hermosillo
%A Luca Bogoni
%A Linda Moy
%A Jennifer Dy
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-yan10a
%I PMLR
%P 932--939
%U https://proceedings.mlr.press/v9/yan10a.html
%V 9
%X Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an annotator may not be consistently accurate (or inaccurate) across the task domain. The presented approach produces classification and annotator models that allow us to provide estimates of the true labels and annotator variable expertise. We provide an analysis of the proposed model under various scenarios and show experimentally that annotator expertise can indeed vary in real tasks and that the presented approach provides clear advantages over previously introduced multi-annotator methods, which only consider general annotator characteristics.

RIS

TY  - CPAPER
TI  - Modeling annotator expertise: Learning when everybody knows a bit of something
AU  - Yan Yan
AU  - Romer Rosales
AU  - Glenn Fung
AU  - Mark Schmidt
AU  - Gerardo Hermosillo
AU  - Luca Bogoni
AU  - Linda Moy
AU  - Jennifer Dy
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-yan10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 9
SP  - 932
EP  - 939
L1  - http://proceedings.mlr.press/v9/yan10a/yan10a.pdf
UR  - https://proceedings.mlr.press/v9/yan10a.html
AB  - Supervised learning from multiple labeling sources is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem when annotators may be unreliable (labels are noisy), but also their expertise varies depending on the data they observe (annotators may have knowledge about different parts of the input space). That is, an annotator may not be consistently accurate (or inaccurate) across the task domain. The presented approach produces classification and annotator models that allow us to provide estimates of the true labels and annotator variable expertise. We provide an analysis of the proposed model under various scenarios and show experimentally that annotator expertise can indeed vary in real tasks and that the presented approach provides clear advantages over previously introduced multi-annotator methods, which only consider general annotator characteristics.
ER  -

APA

Yan, Y., Rosales, R., Fung, G., Schmidt, M., Hermosillo, G., Bogoni, L., Moy, L., & Dy, J. (2010). Modeling annotator expertise: Learning when everybody knows a bit of something. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:932-939. Available from https://proceedings.mlr.press/v9/yan10a.html.
