A Rate of Convergence for Mixture Proportion Estimation, with Application to Learning from Noisy Labels


Clayton Scott
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:838-846, 2015.


Mixture proportion estimation (MPE) is a fundamental tool for solving a number of weakly supervised learning problems – supervised learning problems where label information is noisy or missing. Previous work on MPE has established a universally consistent estimator. In this work we establish a rate of convergence for mixture proportion estimation under an appropriate distributional assumption, and argue that this rate of convergence is useful for analyzing weakly supervised learning algorithms that build on MPE. To illustrate this idea, we examine an algorithm for classification in the presence of noisy labels based on surrogate risk minimization, and show that the rate of convergence for MPE enables proof of the algorithm’s consistency. Finally, we provide a practical implementation of mixture proportion estimation and demonstrate its efficacy in classification with noisy labels.
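For readers unfamiliar with the setting, the MPE problem can be sketched as follows. This is the standard formulation from the MPE literature rather than something stated in the abstract itself, and the symbols F, G, H, and κ are notational assumptions chosen for illustration.

```latex
% Setup (notation assumed for illustration): we observe samples from
% a mixture F and from one of its components H; the other component G
% and the proportion \kappa are unknown.
\[
  F = (1-\kappa)\,G + \kappa\,H, \qquad 0 \le \kappa \le 1 .
\]
% Without further assumptions \kappa is not identifiable, so the usual
% target is the largest proportion of H that can be present in F:
\[
  \kappa^{*}(F \mid H)
  \;=\; \sup\bigl\{\kappa \in [0,1] : F = (1-\kappa)\,G + \kappa\,H
        \text{ for some distribution } G \bigr\}
  \;=\; \inf_{S \,:\, H(S) > 0} \frac{F(S)}{H(S)} .
\]
```

An estimator of this quantity replaces F and H with empirical counterparts; a rate of convergence then quantifies how quickly such an estimate approaches the target as the sample sizes grow.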
