Optimality of Belief Propagation for Crowdsourced Classification

Jungseul Ok; Sewoong Oh; Jinwoo Shin; Yung Yi

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:535-544, 2016.

Abstract

Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid (or even non-paid) workers. We study the problem of recovering the true labels from noisy crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap under a simple but canonical scenario where each worker is assigned at most two tasks. In particular, we introduce a tighter lower bound on the fundamental limit and prove that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. In the general setting, when more than two tasks are assigned to each worker, we establish the dominance result on BP that it outperforms other existing algorithms with known provable guarantees. Experimental results suggest that BP is close to optimal for all regimes considered, while existing state-of-the-art algorithms exhibit suboptimal performances.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-ok16,
  title = 	 {Optimality of Belief Propagation for Crowdsourced Classification},
  author = 	 {Ok, Jungseul and Oh, Sewoong and Shin, Jinwoo and Yi, Yung},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {535--544},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/ok16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/ok16.html},
  abstract = 	 {Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid (or even non-paid) workers. We study the problem of recovering the true labels from noisy crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap under a simple but canonical scenario where each worker is assigned at most two tasks. In particular, we introduce a tighter lower bound on the fundamental limit and prove that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. In the general setting, when more than two tasks are assigned to each worker, we establish the dominance result on BP that it outperforms other existing algorithms with known provable guarantees. Experimental results suggest that BP is close to optimal for all regimes considered, while existing state-of-the-art algorithms exhibit suboptimal performances.}
}

Endnote

%0 Conference Paper
%T Optimality of Belief Propagation for Crowdsourced Classification
%A Jungseul Ok
%A Sewoong Oh
%A Jinwoo Shin
%A Yung Yi
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-ok16
%I PMLR
%P 535--544
%U https://proceedings.mlr.press/v48/ok16.html
%V 48
%X Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid (or even non-paid) workers. We study the problem of recovering the true labels from noisy crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap under a simple but canonical scenario where each worker is assigned at most two tasks. In particular, we introduce a tighter lower bound on the fundamental limit and prove that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. In the general setting, when more than two tasks are assigned to each worker, we establish the dominance result on BP that it outperforms other existing algorithms with known provable guarantees. Experimental results suggest that BP is close to optimal for all regimes considered, while existing state-of-the-art algorithms exhibit suboptimal performances.

RIS


TY  - CPAPER
TI  - Optimality of Belief Propagation for Crowdsourced Classification
AU  - Jungseul Ok
AU  - Sewoong Oh
AU  - Jinwoo Shin
AU  - Yung Yi
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-ok16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 535
EP  - 544
L1  - http://proceedings.mlr.press/v48/ok16.pdf
UR  - https://proceedings.mlr.press/v48/ok16.html
AB  - Crowdsourcing systems are popular for solving large-scale labelling tasks with low-paid (or even non-paid) workers. We study the problem of recovering the true labels from noisy crowdsourced labels under the popular Dawid-Skene model. To address this inference problem, several algorithms have recently been proposed, but the best known guarantee is still significantly larger than the fundamental limit. We close this gap under a simple but canonical scenario where each worker is assigned at most two tasks. In particular, we introduce a tighter lower bound on the fundamental limit and prove that Belief Propagation (BP) exactly matches this lower bound. The guaranteed optimality of BP is the strongest in the sense that it is information-theoretically impossible for any other algorithm to correctly label a larger fraction of the tasks. In the general setting, when more than two tasks are assigned to each worker, we establish the dominance result on BP that it outperforms other existing algorithms with known provable guarantees. Experimental results suggest that BP is close to optimal for all regimes considered, while existing state-of-the-art algorithms exhibit suboptimal performances.
ER  -

APA


Ok, J., Oh, S., Shin, J. & Yi, Y. (2016). Optimality of Belief Propagation for Crowdsourced Classification. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:535-544. Available from https://proceedings.mlr.press/v48/ok16.html.
