High Dimensional Robust Sparse Regression
[edit]
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:411421, 2020.
Abstract
We provide a novel – and to the best of our knowledge, the first – algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters with sublinear sample complexity,in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators:when the covariance matrix in sparse regression is identity, our error guarantee is near informationtheoretically optimal. We then deal with robust sparse regression with unknown structured covariance matrix. We propose a filtering algorithm whichconsists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: the filtering algorithm is flexible enough to deal with unknown covariance.Also, it is orderwise more efficient computationally than the ellipsoid algorithm.Using sublinear sample complexity, our algorithm achieves the best known (and first) error guarantee. We demonstrate the effectiveness on largescale sparse regression problems with arbitrary corruptions.
Related Material


