Learning from Similarity-Confidence Data

Yuzhou Cao; Lei Feng; Yitian Xu; Bo An; Gang Niu; Masashi Sugiyama

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1272-1282, 2021.

Abstract

Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class) are needed for training a discriminative binary classifier. We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further employ a risk correction scheme on the proposed risk estimator. Experimental results demonstrate the effectiveness of the proposed methods.
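The idea of estimating the classification risk from Sconf data alone can be illustrated with a short sketch. It is not the authors' implementation; it rests on assumptions not stated on this page: the two examples in each pair are drawn independently, the positive class prior `prior_pos` is known and differs from 0.5, and the logistic loss is used. Under those assumptions, averaging the confidence s(x, x') over x' gives prior_neg + p(y=+1|x)(prior_pos - prior_neg), so the class posterior can be recovered from the confidences and substituted into the ordinary risk. The names `sconf_risk`, `logistic_loss`, and `prior_pos` are hypothetical.

```python
import numpy as np


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), computed stably."""
    return np.logaddexp(0.0, -margin)


def sconf_risk(f_x, f_xp, s, prior_pos):
    """Empirical classification risk computed from similarity-confidence pairs.

    f_x, f_xp : classifier outputs on the first and second example of each pair
    s         : confidence in [0, 1] that the two examples share a class
    prior_pos : class prior p(y = +1); assumed known and different from 0.5
    """
    f_x = np.asarray(f_x, dtype=float)
    f_xp = np.asarray(f_xp, dtype=float)
    s = np.asarray(s, dtype=float)

    prior_neg = 1.0 - prior_pos
    gap = prior_pos - prior_neg
    if abs(gap) < 1e-12:
        # With balanced classes the weights below are undefined.
        raise ValueError("prior_pos must differ from 0.5")

    # Per-pair weights; they sum to 1 but either one may be negative.
    w_pos = (s - prior_neg) / gap
    w_neg = (prior_pos - s) / gap

    # Average loss over both members of each pair, for each candidate label.
    loss_pos = 0.5 * (logistic_loss(f_x) + logistic_loss(f_xp))
    loss_neg = 0.5 * (logistic_loss(-f_x) + logistic_loss(-f_xp))

    return float(np.mean(w_pos * loss_pos + w_neg * loss_neg))
```

Because the per-pair weights can be negative, the empirical value can drop below zero when a flexible model is trained on it, which is the kind of overfitting a risk correction scheme is meant to counter.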

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-cao21b,
  title = 	 {Learning from Similarity-Confidence Data},
  author =       {Cao, Yuzhou and Feng, Lei and Xu, Yitian and An, Bo and Niu, Gang and Sugiyama, Masashi},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {1272--1282},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/cao21b/cao21b.pdf},
  url = 	 {https://proceedings.mlr.press/v139/cao21b.html},
  abstract = 	 {Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class) are needed for training a discriminative binary classifier. We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further employ a risk correction scheme on the proposed risk estimator. Experimental results demonstrate the effectiveness of the proposed methods.}
}

Endnote

%0 Conference Paper
%T Learning from Similarity-Confidence Data
%A Yuzhou Cao
%A Lei Feng
%A Yitian Xu
%A Bo An
%A Gang Niu
%A Masashi Sugiyama
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-cao21b
%I PMLR
%P 1272--1282
%U https://proceedings.mlr.press/v139/cao21b.html
%V 139
%X Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class) are needed for training a discriminative binary classifier. We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further employ a risk correction scheme on the proposed risk estimator. Experimental results demonstrate the effectiveness of the proposed methods.

APA


Cao, Y., Feng, L., Xu, Y., An, B., Niu, G. & Sugiyama, M. (2021). Learning from Similarity-Confidence Data. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1272-1282. Available from https://proceedings.mlr.press/v139/cao21b.html.
