Learning from Crowds with Dual-View K-Nearest Neighbor
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2238-2249, 2024.
Abstract
In crowdsourcing scenarios, we can obtain multiple noisy labels from different crowd workers for each instance and then infer its integrated label via label integration. To achieve better performance, some recently published label integration methods have attempted to exploit the multiple noisy labels of inferred instances’ nearest neighbors via the K-nearest neighbor (KNN) algorithm. However, the KNN algorithm they use searches for an inferred instance’s nearest neighbors relying only on distance functions defined in the original attribute view, ignoring the valuable information hidden in the multiple noisy labels, which limits their performance. Motivated by multi-view learning, we treat the multiple noisy labels as another label view of instances and propose searching for an inferred instance’s nearest neighbors using joint information from both the original attribute view and the noisy label view. To this end, we propose a novel label integration method called dual-view K-nearest neighbor (DVKNN). In DVKNN, we first define a new distance function to search for the K nearest neighbors of an inferred instance. Then, we define a fine-grained weight for each noisy label from each neighbor. Finally, we perform weighted majority voting (WMV) on all these noisy labels to obtain the integrated label of the inferred instance. Extensive experiments validate the effectiveness and rationality of DVKNN.
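The abstract does not specify the exact dual-view distance function or the fine-grained label weights, so the following is only a minimal illustrative sketch of the overall pipeline it describes: it assumes the dual-view distance is a convex combination (with a hypothetical trade-off parameter alpha) of an attribute-view Euclidean distance and a label-view distance between per-instance noisy-label distributions, and it uses a simple similarity-based weight in place of the paper's fine-grained weight before weighted majority voting.

```python
import numpy as np

def dvknn_integrate(X, noisy_labels, i, K=5, alpha=0.5, n_classes=2):
    """Sketch of a dual-view KNN label-integration step for instance i.

    X            : (n, d) attribute matrix (attribute view).
    noisy_labels : (n, m) integer matrix of worker labels; -1 marks a
                   missing label from a worker (label view).
    alpha        : assumed trade-off between the two views (not given
                   in the abstract).
    """
    n = X.shape[0]

    # Label-view representation: each instance's class distribution
    # over its noisy labels (uniform if it has none).
    def label_dist(row):
        votes = row[row >= 0]
        if votes.size == 0:
            return np.full(n_classes, 1.0 / n_classes)
        dist = np.bincount(votes, minlength=n_classes).astype(float)
        return dist / dist.sum()

    L = np.array([label_dist(noisy_labels[j]) for j in range(n)])

    # Dual-view distance: convex combination of normalized attribute-view
    # and label-view distances (placeholder choices, not the paper's).
    d_attr = np.linalg.norm(X - X[i], axis=1)
    d_attr = d_attr / (d_attr.max() + 1e-12)
    d_label = np.linalg.norm(L - L[i], axis=1)
    d_label = d_label / (d_label.max() + 1e-12)
    dist = alpha * d_attr + (1 - alpha) * d_label

    # K nearest neighbors, excluding the inferred instance itself.
    neighbors = np.argsort(dist)[1:K + 1]

    # Weighted majority voting over the neighbors' noisy labels; here each
    # label is weighted by its neighbor's similarity, standing in for the
    # fine-grained weight defined in the paper.
    scores = np.zeros(n_classes)
    for j in neighbors:
        w = 1.0 - dist[j]
        for lab in noisy_labels[j]:
            if lab >= 0:
                scores[lab] += w
    return int(np.argmax(scores))
```

For example, with a small simulated crowd, `dvknn_integrate(X, noisy_labels, i=0, K=5, alpha=0.5, n_classes=2)` returns the integrated label for instance 0; the point of the sketch is only that neighbor search uses both views jointly before WMV is applied.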