Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

Clayton Scott; Gilles Blanchard; Gregory Handy

Proceedings of the 26th Annual Conference on Learning Theory, PMLR 30:489-511, 2013.

Abstract

In many real-world classification problems, the labels of training examples are randomly corrupted. Thus, the set of training examples for each class is contaminated by examples of the other class. Previous theoretical work on this problem assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. We introduce a general framework for classification with label noise that eliminates these assumptions. Instead, we give assumptions ensuring identifiability and the existence of a consistent estimator of the optimal risk, with associated estimation strategies. For any arbitrary pair of contaminated distributions, there is a unique pair of non-contaminated distributions satisfying the proposed assumptions, and we argue that this solution corresponds in a certain sense to maximal denoising. In particular, we find that learning in the presence of label noise is possible even when the class-conditional distributions overlap and the label noise is not symmetric. A key to our approach is a universally consistent estimator of the maximal proportion of one distribution that is present in another, a problem we refer to as “mixture proportion estimation.” This work is motivated by a problem in nuclear particle classification.

Cite this Paper

BibTeX


@InProceedings{pmlr-v30-Scott13,
  title = 	 {Classification with Asymmetric Label Noise: Consistency and Maximal Denoising},
  author = 	 {Scott, Clayton and Blanchard, Gilles and Handy, Gregory},
  booktitle = 	 {Proceedings of the 26th Annual Conference on Learning Theory},
  pages = 	 {489--511},
  year = 	 {2013},
  editor = 	 {Shalev-Shwartz, Shai and Steinwart, Ingo},
  volume = 	 {30},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Princeton, NJ, USA},
  month = 	 {12--14 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v30/Scott13.pdf},
  url = 	 {https://proceedings.mlr.press/v30/Scott13.html},
  abstract = 	 {In many real-world classification problems, the labels of training examples are randomly corrupted. Thus, the set of training examples for each class is contaminated by examples of the other class. Previous theoretical work on this problem assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. We introduce a general framework for classification with label noise that eliminates these assumptions. Instead, we give assumptions ensuring identifiability and the existence of a consistent estimator of the optimal risk, with associated estimation strategies. For any arbitrary pair of contaminated distributions, there is a unique pair of non-contaminated distributions satisfying the proposed assumptions, and we argue that this solution corresponds in a certain sense to maximal denoising. In particular, we find that learning in the presence of label noise is possible even when the class-conditional distributions overlap and the label noise is not symmetric. A key to our approach is a universally consistent estimator of the maximal proportion of one distribution that is present in another, a problem we refer to as ``mixture proportion estimation.'' This work is motivated by a problem in nuclear particle classification.}
}

Endnote

%0 Conference Paper
%T Classification with Asymmetric Label Noise: Consistency and Maximal Denoising
%A Clayton Scott
%A Gilles Blanchard
%A Gregory Handy
%B Proceedings of the 26th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2013
%E Shai Shalev-Shwartz
%E Ingo Steinwart
%F pmlr-v30-Scott13
%I PMLR
%P 489--511
%U https://proceedings.mlr.press/v30/Scott13.html
%V 30
%X In many real-world classification problems, the labels of training examples are randomly corrupted. Thus, the set of training examples for each class is contaminated by examples of the other class. Previous theoretical work on this problem assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. We introduce a general framework for classification with label noise that eliminates these assumptions. Instead, we give assumptions ensuring identifiability and the existence of a consistent estimator of the optimal risk, with associated estimation strategies. For any arbitrary pair of contaminated distributions, there is a unique pair of non-contaminated distributions satisfying the proposed assumptions, and we argue that this solution corresponds in a certain sense to maximal denoising. In particular, we find that learning in the presence of label noise is possible even when the class-conditional distributions overlap and the label noise is not symmetric. A key to our approach is a universally consistent estimator of the maximal proportion of one distribution that is present in another, a problem we refer to as “mixture proportion estimation.” This work is motivated by a problem in nuclear particle classification.

RIS


TY  - CPAPER
TI  - Classification with Asymmetric Label Noise: Consistency and Maximal Denoising
AU  - Clayton Scott
AU  - Gilles Blanchard
AU  - Gregory Handy
BT  - Proceedings of the 26th Annual Conference on Learning Theory
DA  - 2013/06/13
ED  - Shai Shalev-Shwartz
ED  - Ingo Steinwart
ID  - pmlr-v30-Scott13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 30
SP  - 489
EP  - 511
L1  - http://proceedings.mlr.press/v30/Scott13.pdf
UR  - https://proceedings.mlr.press/v30/Scott13.html
AB  - In many real-world classification problems, the labels of training examples are randomly corrupted. Thus, the set of training examples for each class is contaminated by examples of the other class. Previous theoretical work on this problem assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. We introduce a general framework for classification with label noise that eliminates these assumptions. Instead, we give assumptions ensuring identifiability and the existence of a consistent estimator of the optimal risk, with associated estimation strategies. For any arbitrary pair of contaminated distributions, there is a unique pair of non-contaminated distributions satisfying the proposed assumptions, and we argue that this solution corresponds in a certain sense to maximal denoising. In particular, we find that learning in the presence of label noise is possible even when the class-conditional distributions overlap and the label noise is not symmetric. A key to our approach is a universally consistent estimator of the maximal proportion of one distribution that is present in another, a problem we refer to as “mixture proportion estimation.” This work is motivated by a problem in nuclear particle classification.
ER  -

APA


Scott, C., Blanchard, G. & Handy, G. (2013). Classification with Asymmetric Label Noise: Consistency and Maximal Denoising. Proceedings of the 26th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 30:489-511. Available from https://proceedings.mlr.press/v30/Scott13.html.
