Learning with Bad Training Data via Iterative Trimmed Loss Minimization
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5739-5748, 2019.
In this paper, we study a simple and generic framework to tackle the problem of learning model parameters when a fraction of the training samples are corrupted. Our approach is motivated by a simple observation: in a variety of such settings, the evolution of training accuracy (as a function of training epochs) is different for clean samples and bad samples. We propose to iteratively minimize the trimmed loss, by alternating between (a) selecting samples with lowest current loss, and (b) retraining a model on only these samples. Analytically, we characterize the statistical performance and convergence rate of the algorithm for simple and natural linear and non-linear models. Experimentally, we demonstrate its effectiveness in three settings: (a) deep image classifiers with errors only in labels, (b) generative adversarial networks with bad training images, and (c) deep image classifiers with adversarial (image, label) pairs (i.e., backdoor attacks). For the well-studied setting of random label noise, our algorithm achieves state-of-the-art performance without having access to any a-priori guaranteed clean samples.