[edit]
Label Consistency-based Worker Filtering for Crowdsourcing
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2226-2237, 2024.
Abstract
In crowdsourcing scenarios, we can obtain multiple noisy labels from different crowd workers on the Internet for each instance and then infer its unknown true label via a label integration method. However, noisy labels often have a serious negative impact on label integration. In this case, most existing works always focus on designing more complex label integration methods to infer unknown true labels more accurately from multiple noisy labels, but little attention has been paid to another perspective, i.e., purifying noisy labels before label integration. In this paper, we aim to purify noisy labels for existing label integration methods and propose a label consistency-based worker filtering (LCWF) algorithm. In LCWF, we consider that if all low-quality workers are filtered out and only high-quality workers remain, the label consistency should be high. Therefore, we utilize label consistency to filter out low-quality workers. Firstly, we directly transform the worker filtering problem into a discrete optimization problem and utilize label consistency to define the fitness function for this problem. Then, we search for the optimal solution to this problem by a genetic algorithm. Finally, we filter out all labels from low-quality workers according to the optimal solution we obtained. Experimental results on simulated and real-world datasets demonstrate that LCWF can effectively purify noisy labels and improve the integration accuracy of existing label integration methods.