Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process

Vikas Raykar; Priyanka Agrawal

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:832-840, 2014.

Abstract

Crowdsourcing marketplaces are widely used for curating large annotated datasets by collecting labels from multiple annotators. In such scenarios one has to balance the tradeoff between the accuracy of the collected labels, the cost of acquiring these labels, and the time taken to finish the labeling task. With the goal of reducing the labeling cost, we introduce the notion of sequential crowdsourced labeling, where instead of asking for all the labels in one shot we acquire labels from annotators sequentially, one at a time. We model it as an epsilon-greedy exploration in a Markov Decision Process with a Bayesian decision theoretic utility function that incorporates accuracy, cost and time. Experimental results confirm that the proposed sequential labeling procedure can achieve similar accuracy at roughly half the labeling cost, and that at any stage in the labeling process the algorithm achieves a higher accuracy than randomly asking for the next label.
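
To make the idea of acquiring labels one at a time more concrete, here is a minimal toy sketch in Python. It is not the procedure from the paper: the known annotator accuracies, the per-label costs, the stopping threshold `confidence`, and the utility `accuracy - cost_weight * cost` are all assumptions chosen for illustration, and every function name is hypothetical. It only shows the general pattern of epsilon-greedy annotator selection combined with a Bayesian update of the belief about one binary label.

```python
import random


def posterior_update(p_pos, label, accuracy):
    """Bayes update of P(true label = 1) after one noisy binary label.

    Assumes the annotator reports the true label with probability `accuracy`.
    """
    like_pos = accuracy if label == 1 else 1.0 - accuracy
    like_neg = 1.0 - accuracy if label == 1 else accuracy
    num = like_pos * p_pos
    return num / (num + like_neg * (1.0 - p_pos))


def pick_annotator(utilities, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(utilities))  # explore: uniform random annotator
    return max(range(len(utilities)), key=utilities.__getitem__)  # exploit


def label_one_item(true_label, accuracies, costs, epsilon=0.1,
                   confidence=0.95, max_labels=10, cost_weight=0.5, seed=0):
    """Simulate sequential labeling of a single binary item.

    Labels are requested one at a time until the posterior is confident
    enough or `max_labels` have been collected. The utility below is a
    hypothetical stand-in, not the utility function defined in the paper.
    """
    rng = random.Random(seed)
    utilities = [a - cost_weight * c for a, c in zip(accuracies, costs)]
    p_pos, spent, asked = 0.5, 0.0, 0
    while asked < max_labels and max(p_pos, 1.0 - p_pos) < confidence:
        j = pick_annotator(utilities, epsilon, rng)
        # Simulated annotator: correct with probability accuracies[j].
        label = true_label if rng.random() < accuracies[j] else 1 - true_label
        p_pos = posterior_update(p_pos, label, accuracies[j])
        spent += costs[j]
        asked += 1
    return p_pos, spent, asked


if __name__ == "__main__":
    p, spent, asked = label_one_item(1, [0.9, 0.7, 0.6], [1.0, 0.4, 0.2])
    print(f"P(label=1)={p:.3f} after {asked} labels, total cost {spent:.1f}")
```

In this sketch, stopping as soon as the posterior is confident is what saves labels compared with requesting a fixed number up front; the small exploration rate keeps the selection from always querying the same annotator.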

Cite this Paper

BibTeX

@InProceedings{pmlr-v33-raykar14,
  title = 	 {{Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process}},
  author = 	 {Raykar, Vikas and Agrawal, Priyanka},
  booktitle = 	 {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {832--840},
  year = 	 {2014},
  editor = 	 {Kaski, Samuel and Corander, Jukka},
  volume = 	 {33},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Reykjavik, Iceland},
  month = 	 {22--25 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v33/raykar14.pdf},
  url = 	 {https://proceedings.mlr.press/v33/raykar14.html},
  abstract = 	 {Crowdsourcing marketplaces are widely used for curating large annotated datasets by collecting labels from multiple annotators. In such scenarios one has to balance the tradeoff between the accuracy of the collected labels, the cost of acquiring these labels, and the time taken to finish the labeling task. With the goal of reducing the labeling cost, we introduce the notion of sequential crowdsourced labeling, where instead of asking for all the labels in one shot we acquire labels from annotators sequentially one at a time. We model it as an epsilon-greedy exploration in a Markov Decision Process with a Bayesian decision theoretic utility function that incorporates accuracy, cost and time. Experimental results confirm that the proposed sequential labeling procedure can achieve similar accuracy at roughly half the labeling cost and at any stage in the labeling process the algorithm achieves a higher accuracy compared to randomly asking for the next label.}
}

Endnote

%0 Conference Paper
%T Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process
%A Vikas Raykar
%A Priyanka Agrawal
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-raykar14
%I PMLR
%P 832--840
%U https://proceedings.mlr.press/v33/raykar14.html
%V 33
%X Crowdsourcing marketplaces are widely used for curating large annotated datasets by collecting labels from multiple annotators. In such scenarios one has to balance the tradeoff between the accuracy of the collected labels, the cost of acquiring these labels, and the time taken to finish the labeling task. With the goal of reducing the labeling cost, we introduce the notion of sequential crowdsourced labeling, where instead of asking for all the labels in one shot we acquire labels from annotators sequentially one at a time. We model it as an epsilon-greedy exploration in a Markov Decision Process with a Bayesian decision theoretic utility function that incorporates accuracy, cost and time. Experimental results confirm that the proposed sequential labeling procedure can achieve similar accuracy at roughly half the labeling cost and at any stage in the labeling process the algorithm achieves a higher accuracy compared to randomly asking for the next label.

RIS

TY  - CPAPER
TI  - Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process
AU  - Vikas Raykar
AU  - Priyanka Agrawal
BT  - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA  - 2014/04/02
ED  - Samuel Kaski
ED  - Jukka Corander
ID  - pmlr-v33-raykar14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 33
SP  - 832
EP  - 840
L1  - http://proceedings.mlr.press/v33/raykar14.pdf
UR  - https://proceedings.mlr.press/v33/raykar14.html
AB  - Crowdsourcing marketplaces are widely used for curating large annotated datasets by collecting labels from multiple annotators. In such scenarios one has to balance the tradeoff between the accuracy of the collected labels, the cost of acquiring these labels, and the time taken to finish the labeling task. With the goal of reducing the labeling cost, we introduce the notion of sequential crowdsourced labeling, where instead of asking for all the labels in one shot we acquire labels from annotators sequentially one at a time. We model it as an epsilon-greedy exploration in a Markov Decision Process with a Bayesian decision theoretic utility function that incorporates accuracy, cost and time. Experimental results confirm that the proposed sequential labeling procedure can achieve similar accuracy at roughly half the labeling cost and at any stage in the labeling process the algorithm achieves a higher accuracy compared to randomly asking for the next label.
ER  -

APA

Raykar, V. & Agrawal, P. (2014). Sequential crowdsourced labeling as an epsilon-greedy exploration in a Markov Decision Process. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:832-840. Available from https://proceedings.mlr.press/v33/raykar14.html.
