Ignoring Is a Bliss: Learning with Large Noise Through Reweighting-Minimization
Proceedings of the 2017 Conference on Learning Theory, PMLR 65:1849-1881, 2017.
We consider learning in the presence of arbitrary noise that can overwhelm the signal in magnitude on a fraction of the observed data points (i.e., outliers). Standard approaches based on minimizing empirical loss can fail miserably and lead to arbitrarily bad solutions in this setting. We propose an approach that iterates between finding a solution with minimal empirical loss and re-weighting the data, reinforcing data points where the previous solution works well. We show that our approach can handle arbitrarily large noise, is robust in the sense of having a non-trivial breakdown point, and converges linearly under certain conditions. The intuitive idea of our approach is to automatically exclude “difficult” data points from model fitting. More importantly (and perhaps surprisingly), we validate this intuition by establishing guarantees for generalization and iteration complexity that essentially ignore the presence of outliers.
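The alternation between loss minimization and re-weighting can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes squared loss on a one-dimensional linear model and a hard 0/1 re-weighting rule that keeps the fraction of points with the smallest residuals (in the spirit of trimmed least squares). The function names and the `keep_fraction` and `n_iters` parameters are hypothetical.

```python
def weighted_line_fit(xs, ys, ws):
    """Minimize sum_i w_i * (a*x_i + b - y_i)**2 in closed form."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def reweighting_minimization(xs, ys, keep_fraction=0.7, n_iters=20):
    """Alternate between weighted loss minimization and re-weighting.

    Assumption (not from the paper): weights are 0/1, and the points
    with the smallest current residuals receive weight 1.
    """
    n = len(xs)
    k = max(2, int(keep_fraction * n))  # number of points to keep
    ws = [1.0] * n  # first pass is ordinary empirical loss minimization
    for _ in range(n_iters):
        # Step 1: minimize the weighted empirical loss.
        a, b = weighted_line_fit(xs, ys, ws)
        # Step 2: reinforce the points the current solution fits well.
        residuals = [abs(a * x + b - y) for x, y in zip(xs, ys)]
        keep = set(sorted(range(n), key=residuals.__getitem__)[:k])
        ws = [1.0 if i in keep else 0.0 for i in range(n)]
    return weighted_line_fit(xs, ys, ws)


# Synthetic data: y = 2x + 1 with small perturbations; every fifth
# point is replaced by a large outlier.
xs = [i / 10 for i in range(100)]
ys = [1000.0 if i % 5 == 0 else 2 * x + 1 + 0.01 * ((i * 7) % 5 - 2)
      for i, x in enumerate(xs)]

plain = weighted_line_fit(xs, ys, [1.0] * len(xs))
robust = reweighting_minimization(xs, ys)
print("plain least squares:", plain)
print("reweighting-minimization sketch:", robust)
```

On this synthetic data the plain least-squares fit is pulled far from the underlying line by the outliers, while the iterated fit ends up with zero weight on them. The hard-thresholding rule is only one possible choice of re-weighting; a smooth weight that decreases with the residual would fit the same loop.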