No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing

Nihar Shah; Dengyong Zhou

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1-10, 2016.

Abstract

Crowdsourcing is a very popular means of obtaining the large amounts of labeled data that modern machine learning methods require. Although cheap and fast to obtain, crowdsourced labels suffer from significant amounts of error, thereby degrading the performance of downstream machine learning tasks. With the goal of improving the quality of the labeled data, we seek to mitigate the many errors that occur due to silly mistakes or inadvertent errors by crowdsourcing workers. We propose a two-stage setting for crowdsourcing where the worker first answers the questions, and is then allowed to change her answers after looking at a (noisy) reference answer. We mathematically formulate this process and develop mechanisms to incentivize workers to act appropriately. Our mathematical guarantees show that our mechanism incentivizes the workers to answer honestly in both stages, and refrain from answering randomly in the first stage or simply copying in the second. Numerical experiments reveal a significant boost in performance that such "self-correction" can provide when using crowdsourcing to train machine learning algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-shaha16,
  title = 	 {No Oops, You Won't Do It Again: Mechanisms for Self-correction in Crowdsourcing},
  author = 	 {Shah, Nihar and Zhou, Dengyong},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1--10},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/shaha16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/shaha16.html},
  abstract = 	 {Crowdsourcing is a very popular means of obtaining the large amounts of labeled data that modern machine learning methods require. Although cheap and fast to obtain, crowdsourced labels suffer from significant amounts of error, thereby degrading the performance of downstream machine learning tasks. With the goal of improving the quality of the labeled data, we seek to mitigate the many errors that occur due to silly mistakes or inadvertent errors by crowdsourcing workers. We propose a two-stage setting for crowdsourcing where the worker first answers the questions, and is then allowed to change her answers after looking at a (noisy) reference answer. We mathematically formulate this process and develop mechanisms to incentivize workers to act appropriately. Our mathematical guarantees show that our mechanism incentivizes the workers to answer honestly in both stages, and refrain from answering randomly in the first stage or simply copying in the second. Numerical experiments reveal a significant boost in performance that such "self-correction" can provide when using crowdsourcing to train machine learning algorithms.}
}

Endnote

%0 Conference Paper
%T No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing
%A Nihar Shah
%A Dengyong Zhou
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-shaha16
%I PMLR
%P 1--10
%U https://proceedings.mlr.press/v48/shaha16.html
%V 48
%X Crowdsourcing is a very popular means of obtaining the large amounts of labeled data that modern machine learning methods require. Although cheap and fast to obtain, crowdsourced labels suffer from significant amounts of error, thereby degrading the performance of downstream machine learning tasks. With the goal of improving the quality of the labeled data, we seek to mitigate the many errors that occur due to silly mistakes or inadvertent errors by crowdsourcing workers. We propose a two-stage setting for crowdsourcing where the worker first answers the questions, and is then allowed to change her answers after looking at a (noisy) reference answer. We mathematically formulate this process and develop mechanisms to incentivize workers to act appropriately. Our mathematical guarantees show that our mechanism incentivizes the workers to answer honestly in both stages, and refrain from answering randomly in the first stage or simply copying in the second. Numerical experiments reveal a significant boost in performance that such "self-correction" can provide when using crowdsourcing to train machine learning algorithms.

RIS


TY  - CPAPER
TI  - No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing
AU  - Nihar Shah
AU  - Dengyong Zhou
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-shaha16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1
EP  - 10
L1  - http://proceedings.mlr.press/v48/shaha16.pdf
UR  - https://proceedings.mlr.press/v48/shaha16.html
AB  - Crowdsourcing is a very popular means of obtaining the large amounts of labeled data that modern machine learning methods require. Although cheap and fast to obtain, crowdsourced labels suffer from significant amounts of error, thereby degrading the performance of downstream machine learning tasks. With the goal of improving the quality of the labeled data, we seek to mitigate the many errors that occur due to silly mistakes or inadvertent errors by crowdsourcing workers. We propose a two-stage setting for crowdsourcing where the worker first answers the questions, and is then allowed to change her answers after looking at a (noisy) reference answer. We mathematically formulate this process and develop mechanisms to incentivize workers to act appropriately. Our mathematical guarantees show that our mechanism incentivizes the workers to answer honestly in both stages, and refrain from answering randomly in the first stage or simply copying in the second. Numerical experiments reveal a significant boost in performance that such "self-correction" can provide when using crowdsourcing to train machine learning algorithms.
ER  -

APA


Shah, N. & Zhou, D. (2016). No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1-10. Available from https://proceedings.mlr.press/v48/shaha16.html.
