Understanding Instance-Level Label Noise: Disparate Impacts and Treatments

Yang Liu
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6725-6735, 2021.

Abstract

This paper aims to provide an understanding of the effect of an over-parameterized model, e.g., a deep neural network, memorizing instance-dependent noisy labels. We first quantify the harm caused by memorizing noisy instances, and show the disparate impacts of noisy labels on sample instances with different representation frequencies. We then analyze how several popular solutions for learning with noisy labels mitigate this harm at the instance level. Our analysis reveals that existing approaches lead to disparate treatments when handling noisy instances: while higher-frequency instances often enjoy a high probability of improvement from applying these solutions, lower-frequency instances do not. The analysis offers new understanding of when these approaches work and provides theoretical justification for previously reported empirical observations. These findings require us to rethink the distribution of label noise across instances and call for different treatments for instances in different regimes.
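To make the setting concrete, below is a minimal, hypothetical sketch (not code from the paper) of the kind of experiment the abstract describes: labels are corrupted with an instance-dependent flip probability, an over-parameterized network is fit to the noisy labels, and clean-label accuracy is compared between high-frequency and low-frequency instance groups. All cluster locations, sample sizes, noise rates, and the model choice are illustrative assumptions; the sketch uses numpy and scikit-learn.

# Hypothetical illustration of instance-dependent label noise with
# high- vs. low-frequency instance groups. Not the paper's code.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def sample_group(center, label, n):
    # Gaussian cluster around `center` with clean label `label`.
    return rng.normal(loc=center, scale=0.8, size=(n, 2)), np.full(n, label)

# Two classes, each with a high-frequency region (many samples) and a
# low-frequency region (few samples). Sizes are arbitrary assumptions.
groups = [((-2.0, 0.0), 0, 2000, "high"), ((+2.0, 0.0), 1, 2000, "high"),
          ((-2.0, 6.0), 0, 60, "low"),   ((+2.0, 6.0), 1, 60, "low")]

def build(sizes):
    Xs, ys, tags = [], [], []
    for (center, label, _, tag), m in zip(groups, sizes):
        X, y = sample_group(np.array(center), label, m)
        Xs.append(X); ys.append(y); tags += [tag] * m
    return np.vstack(Xs), np.concatenate(ys), np.array(tags)

X_tr, y_tr_clean, tag_tr = build([g[2] for g in groups])
X_te, y_te_clean, tag_te = build([500, 500, 500, 500])  # clean test points

# Instance-dependent noise: each training instance's flip probability depends
# on how close it lies to the class boundary (x[0] = 0), capped at 0.45.
flip_prob = np.clip(0.45 - 0.1 * np.abs(X_tr[:, 0]), 0.05, 0.45)
flips = rng.random(len(y_tr_clean)) < flip_prob
y_tr_noisy = np.where(flips, 1 - y_tr_clean, y_tr_clean)

# Over-parameterized network trained long enough to fit most noisy labels.
model = MLPClassifier(hidden_layer_sizes=(512, 512), alpha=0.0,
                      max_iter=3000, random_state=0)
model.fit(X_tr, y_tr_noisy)

# Disparate impact: clean-label test accuracy, split by representation frequency.
pred = model.predict(X_te)
for tag in ("high", "low"):
    mask = tag_te == tag
    acc = (pred[mask] == y_te_clean[mask]).mean()
    print(f"{tag}-frequency test accuracy (clean labels): {acc:.3f}")

In this kind of setup one would typically expect memorized noisy labels to distort predictions more in the sparsely represented regions, since a few memorized points dominate their local neighborhood; the exact gap will vary with the assumed noise rates and model capacity.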

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-liu21a,
  title     = {Understanding Instance-Level Label Noise: Disparate Impacts and Treatments},
  author    = {Liu, Yang},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {6725--6735},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/liu21a/liu21a.pdf},
  url       = {https://proceedings.mlr.press/v139/liu21a.html}
}
Endnote
%0 Conference Paper
%T Understanding Instance-Level Label Noise: Disparate Impacts and Treatments
%A Yang Liu
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-liu21a
%I PMLR
%P 6725--6735
%U https://proceedings.mlr.press/v139/liu21a.html
%V 139
APA
Liu, Y. (2021). Understanding Instance-Level Label Noise: Disparate Impacts and Treatments. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:6725-6735. Available from https://proceedings.mlr.press/v139/liu21a.html.