Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels

Pengfei Chen; Ben Ben Liao; Guangyong Chen; Shengyu Zhang

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1062-1070, 2019.

Abstract

Noisy labels are ubiquitous in real-world datasets, which makes it hard to train deep neural networks (DNNs) robustly, because DNNs usually have enough capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio of the dataset. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains previously published experimental findings. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most samples that have correct labels. We then adopt the Co-teaching strategy, which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with a wide range of state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both synthetic and real-world training noise.
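
The abstract does not spell out the quadratic relation, so the following is only a minimal sketch of how such a relation can arise, under assumptions that are not taken from the text above: accuracy is measured against noisy test labels, and the trained network's prediction is modeled as an independent draw from the same symmetric-noise distribution around the true class. The function names, the noise model, and the simulation are all illustrative.

```python
import random


def expected_noisy_test_accuracy(noise_ratio: float, num_classes: int) -> float:
    """Probability that a prediction agrees with a noisy test label.

    Assumption (hypothetical model, not from the abstract): prediction and
    test label are independent draws from the same symmetric-noise
    distribution centered on the true class.
    """
    keep = 1.0 - noise_ratio                  # probability of the true class
    flip = noise_ratio / (num_classes - 1)    # probability of each wrong class
    # Both agree on the true class, or both land on the same wrong class.
    # Expands to (1 - e)^2 + e^2 / (c - 1), which is quadratic in e.
    return keep ** 2 + (num_classes - 1) * flip ** 2


def corrupt(label: int, noise_ratio: float, num_classes: int,
            rng: random.Random) -> int:
    """Symmetric noise: with probability noise_ratio, flip the label
    to a different class chosen uniformly at random."""
    if rng.random() < noise_ratio:
        return rng.choice([c for c in range(num_classes) if c != label])
    return label


def simulate(noise_ratio: float, num_classes: int,
             n: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the agreement rate under the same model."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        true = rng.randrange(num_classes)
        pred = corrupt(true, noise_ratio, num_classes, rng)
        noisy = corrupt(true, noise_ratio, num_classes, rng)
        agree += pred == noisy
    return agree / n


if __name__ == "__main__":
    for eps in (0.0, 0.2, 0.4, 0.6):
        print(eps, expected_noisy_test_accuracy(eps, 10), simulate(eps, 10))
```

Under this model the closed form and the simulated agreement rate should track each other closely for any noise ratio; whether it matches the exact expression derived in the paper should be checked against the paper itself.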

Cite this Paper

BibTeX

@InProceedings{pmlr-v97-chen19g,
  title = 	 {Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels},
  author =       {Chen, Pengfei and Liao, Ben Ben and Chen, Guangyong and Zhang, Shengyu},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {1062--1070},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/chen19g/chen19g.pdf},
  url = 	 {https://proceedings.mlr.press/v97/chen19g.html},
  abstract = 	 {Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks (DNNs) as DNNs usually have the high capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains the experimental findings previously published. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most samples that have correct labels. Then we adopt the Co-teaching strategy which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with extensive state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both synthetic and real-world training noise.}
}

Endnote

%0 Conference Paper
%T Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
%A Pengfei Chen
%A Ben Ben Liao
%A Guangyong Chen
%A Shengyu Zhang
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-chen19g
%I PMLR
%P 1062--1070
%U https://proceedings.mlr.press/v97/chen19g.html
%V 97
%X Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks (DNNs) as DNNs usually have the high capacity to memorize the noisy labels. In this paper, we find that the test accuracy can be quantitatively characterized in terms of the noise ratio in datasets. In particular, the test accuracy is a quadratic function of the noise ratio in the case of symmetric noise, which explains the experimental findings previously published. Based on our analysis, we apply cross-validation to randomly split noisy datasets, which identifies most samples that have correct labels. Then we adopt the Co-teaching strategy which takes full advantage of the identified samples to train DNNs robustly against noisy labels. Compared with extensive state-of-the-art methods, our strategy consistently improves the generalization performance of DNNs under both synthetic and real-world training noise.

APA

Chen, P., Liao, B.B., Chen, G. & Zhang, S. (2019). Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:1062-1070. Available from https://proceedings.mlr.press/v97/chen19g.html.
