Label Consistency-based Worker Filtering for Crowdsourcing

Jiao Li, Liangxiao Jiang, Chaoqun Li, Wenjun Zhang
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2226-2237, 2024.

Abstract

In crowdsourcing scenarios, we can obtain multiple noisy labels from different crowd workers on the Internet for each instance and then infer its unknown true label via a label integration method. However, noisy labels often have a serious negative impact on label integration. In this case, most existing works always focus on designing more complex label integration methods to infer unknown true labels more accurately from multiple noisy labels, but little attention has been paid to another perspective, i.e., purifying noisy labels before label integration. In this paper, we aim to purify noisy labels for existing label integration methods and propose a label consistency-based worker filtering (LCWF) algorithm. In LCWF, we consider that if all low-quality workers are filtered out and only high-quality workers remain, the label consistency should be high. Therefore, we utilize label consistency to filter out low-quality workers. Firstly, we directly transform the worker filtering problem into a discrete optimization problem and utilize label consistency to define the fitness function for this problem. Then, we search for the optimal solution to this problem by a genetic algorithm. Finally, we filter out all labels from low-quality workers according to the optimal solution we obtained. Experimental results on simulated and real-world datasets demonstrate that LCWF can effectively purify noisy labels and improve the integration accuracy of existing label integration methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v244-li24a, title = {Label Consistency-based Worker Filtering for Crowdsourcing}, author = {Li, Jiao and Jiang, Liangxiao and Li, Chaoqun and Zhang, Wenjun}, booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence}, pages = {2226--2237}, year = {2024}, editor = {Kiyavash, Negar and Mooij, Joris M.}, volume = {244}, series = {Proceedings of Machine Learning Research}, month = {15--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/li24a/li24a.pdf}, url = {https://proceedings.mlr.press/v244/li24a.html}, abstract = {In crowdsourcing scenarios, we can obtain multiple noisy labels from different crowd workers on the Internet for each instance and then infer its unknown true label via a label integration method. However, noisy labels often have a serious negative impact on label integration. In this case, most existing works always focus on designing more complex label integration methods to infer unknown true labels more accurately from multiple noisy labels, but little attention has been paid to another perspective, i.e., purifying noisy labels before label integration. In this paper, we aim to purify noisy labels for existing label integration methods and propose a label consistency-based worker filtering (LCWF) algorithm. In LCWF, we consider that if all low-quality workers are filtered out and only high-quality workers remain, the label consistency should be high. Therefore, we utilize label consistency to filter out low-quality workers. Firstly, we directly transform the worker filtering problem into a discrete optimization problem and utilize label consistency to define the fitness function for this problem. Then, we search for the optimal solution to this problem by a genetic algorithm. Finally, we filter out all labels from low-quality workers according to the optimal solution we obtained. Experimental results on simulated and real-world datasets demonstrate that LCWF can effectively purify noisy labels and improve the integration accuracy of existing label integration methods.} }
Endnote
%0 Conference Paper %T Label Consistency-based Worker Filtering for Crowdsourcing %A Jiao Li %A Liangxiao Jiang %A Chaoqun Li %A Wenjun Zhang %B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2024 %E Negar Kiyavash %E Joris M. Mooij %F pmlr-v244-li24a %I PMLR %P 2226--2237 %U https://proceedings.mlr.press/v244/li24a.html %V 244 %X In crowdsourcing scenarios, we can obtain multiple noisy labels from different crowd workers on the Internet for each instance and then infer its unknown true label via a label integration method. However, noisy labels often have a serious negative impact on label integration. In this case, most existing works always focus on designing more complex label integration methods to infer unknown true labels more accurately from multiple noisy labels, but little attention has been paid to another perspective, i.e., purifying noisy labels before label integration. In this paper, we aim to purify noisy labels for existing label integration methods and propose a label consistency-based worker filtering (LCWF) algorithm. In LCWF, we consider that if all low-quality workers are filtered out and only high-quality workers remain, the label consistency should be high. Therefore, we utilize label consistency to filter out low-quality workers. Firstly, we directly transform the worker filtering problem into a discrete optimization problem and utilize label consistency to define the fitness function for this problem. Then, we search for the optimal solution to this problem by a genetic algorithm. Finally, we filter out all labels from low-quality workers according to the optimal solution we obtained. Experimental results on simulated and real-world datasets demonstrate that LCWF can effectively purify noisy labels and improve the integration accuracy of existing label integration methods.
APA
Li, J., Jiang, L., Li, C. & Zhang, W.. (2024). Label Consistency-based Worker Filtering for Crowdsourcing. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:2226-2237 Available from https://proceedings.mlr.press/v244/li24a.html.

Related Material