High Dimensional Robust Sparse Regression

Liu Liu; Yanyao Shen; Tianyang Li; Constantine Caramanis

High Dimensional Robust Sparse Regression

Liu Liu, Yanyao Shen, Tianyang Li, Constantine Caramanis

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:411-421, 2020.

Abstract

We provide a novel – and to the best of our knowledge, the first – algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters with sub-linear sample complexity,in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators:when the covariance matrix in sparse regression is identity, our error guarantee is near information-theoretically optimal. We then deal with robust sparse regression with unknown structured covariance matrix. We propose a filtering algorithm whichconsists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: the filtering algorithm is flexible enough to deal with unknown covariance.Also, it is orderwise more efficient computationally than the ellipsoid algorithm.Using sub-linear sample complexity, our algorithm achieves the best known (and first) error guarantee. We demonstrate the effectiveness on large-scale sparse regression problems with arbitrary corruptions.

Cite this Paper

BibTeX

@InProceedings{pmlr-v108-liu20b,
  title = 	 {High Dimensional Robust Sparse Regression},
  author =       {Liu, Liu and Shen, Yanyao and Li, Tianyang and Caramanis, Constantine},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {411--421},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/liu20b/liu20b.pdf},
  url = 	 {https://proceedings.mlr.press/v108/liu20b.html},
  abstract = 	 {We provide a novel – and to the best of our knowledge, the first – algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters with sub-linear sample complexity,in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators:when the covariance matrix in sparse regression is identity,  our error guarantee is near information-theoretically optimal. We then deal with robust sparse regression with unknown structured covariance matrix. We propose a filtering algorithm whichconsists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: the filtering algorithm is flexible enough to deal with unknown covariance.Also, it is orderwise more efficient computationally than the ellipsoid algorithm.Using sub-linear sample complexity, our algorithm achieves the best known (and first) error guarantee. We demonstrate the effectiveness on large-scale sparse regression problems with arbitrary corruptions.}
}

Endnote

%0 Conference Paper
%T High Dimensional Robust Sparse Regression
%A Liu Liu
%A Yanyao Shen
%A Tianyang Li
%A Constantine Caramanis
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra	
%F pmlr-v108-liu20b
%I PMLR
%P 411--421
%U https://proceedings.mlr.press/v108/liu20b.html
%V 108
%X We provide a novel – and to the best of our knowledge, the first – algorithm for high dimensional sparse regression with constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters with sub-linear sample complexity,in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators:when the covariance matrix in sparse regression is identity,  our error guarantee is near information-theoretically optimal. We then deal with robust sparse regression with unknown structured covariance matrix. We propose a filtering algorithm whichconsists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: the filtering algorithm is flexible enough to deal with unknown covariance.Also, it is orderwise more efficient computationally than the ellipsoid algorithm.Using sub-linear sample complexity, our algorithm achieves the best known (and first) error guarantee. We demonstrate the effectiveness on large-scale sparse regression problems with arbitrary corruptions.

APA

Liu, L., Shen, Y., Li, T. & Caramanis, C.. (2020). High Dimensional Robust Sparse Regression. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:411-421 Available from https://proceedings.mlr.press/v108/liu20b.html.

High Dimensional Robust Sparse Regression

Abstract

Cite this Paper

Related Material