Testing conventional wisdom (of the crowd)
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:237-248, 2023.
Abstract
Do common assumptions about the way that crowd workers make mistakes in microtask (labeling) applications manifest in real crowdsourcing data? Prior work addresses this question only indirectly. Instead, it primarily focuses on designing new label aggregation algorithms, seeming to imply that better performance justifies any additional assumptions. However, in past instances, empirical evidence has raised significant challenges to common assumptions. We continue this line of work, using crowdsourcing data itself as directly as possible to interrogate several basic assumptions about workers and tasks. We find strong evidence that the assumption that workers respond correctly to each task with a constant probability, which is common in theoretical work, is implausible in real data. We also illustrate how heterogeneity among tasks and workers can take different forms, each with different implications for the design and evaluation of label aggregation algorithms.
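To make the constant-probability assumption concrete, the following sketch (not the paper's method; all parameter values and helper names are illustrative) simulates a worker whose accuracy is a single fixed probability and one whose accuracy drifts across batches of tasks, then applies a simple chi-square dispersion test. Under the constant-probability assumption, per-batch correct counts should look binomial rather than overdispersed.

# Minimal sketch, assuming synthetic data: probing whether a worker's
# correctness is consistent with one constant probability across tasks.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_worker(n_batches=50, batch_size=20, constant_p=True):
    """Return per-batch counts of correct answers for a synthetic worker."""
    if constant_p:
        p = np.full(n_batches, 0.75)        # one fixed accuracy for every task
    else:
        p = rng.beta(6, 2, size=n_batches)  # accuracy varies from batch to batch
    return rng.binomial(batch_size, p), batch_size

def dispersion_test(correct_counts, batch_size):
    """Chi-square test of a single binomial success probability across batches."""
    p_hat = correct_counts.sum() / (batch_size * len(correct_counts))
    expected_var = batch_size * p_hat * (1 - p_hat)
    chi2 = ((correct_counts - batch_size * p_hat) ** 2 / expected_var).sum()
    df = len(correct_counts) - 1
    return chi2, stats.chi2.sf(chi2, df)

for constant in (True, False):
    counts, m = simulate_worker(constant_p=constant)
    chi2, pval = dispersion_test(counts, m)
    print(f"constant_p={constant}: chi2={chi2:.1f}, p-value={pval:.3g}")

A small p-value signals overdispersion, i.e., more batch-to-batch variation in accuracy than a single constant probability can explain; this is only an illustration of the kind of check the assumption invites, not the analysis performed in the paper.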