Robust Sparse Regression under Adversarial Corruption

Yudong Chen; Constantine Caramanis; Shie Mannor

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):774-782, 2013.

Abstract

We consider high-dimensional sparse regression with arbitrary – possibly severe or coordinated – errors in the covariate matrix. We are interested in understanding how many corruptions we can tolerate while still identifying the correct support. To the best of our knowledge, neither standard outlier rejection techniques, nor recently developed robust regression algorithms (which focus only on corrupted response variables), nor recent algorithms for dealing with stochastic noise or erasures can provide guarantees on support recovery. As we show, neither can the natural brute-force algorithm that takes exponential time to find the subset of data and support columns that yields the smallest regression error. We explore the power of a simple idea: replace the essential linear algebraic calculation – the inner product – with a robust counterpart that cannot be greatly affected by a controlled number of arbitrarily corrupted points: the trimmed inner product. We consider three algorithms that are popular in the uncorrupted setting (Thresholding Regression, Lasso, and the Dantzig selector) and show that their counterparts obtained using the trimmed inner product are provably robust.
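The abstract names the trimmed inner product but does not spell it out, so the following is only a minimal sketch under stated assumptions: it assumes that trimming means discarding the `n_trim` elementwise products of largest magnitude before summing, and that a robust thresholding step selects the `k` columns with the largest absolute trimmed correlation with the response. The function names `trimmed_inner_product` and `robust_threshold_support` and the parameters `n_trim` and `k` are hypothetical, chosen for illustration; consult the paper for the authors' exact definitions and tuning.

```python
import numpy as np


def trimmed_inner_product(a, b, n_trim):
    """Sum the elementwise products of a and b after dropping the
    n_trim products of largest magnitude (assumed trimming rule)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D arrays of equal length")
    if not 0 <= n_trim <= a.size:
        raise ValueError("n_trim must be between 0 and len(a)")
    q = a * b
    # Indices of the (len(q) - n_trim) products smallest in magnitude.
    keep = np.argsort(np.abs(q))[: q.size - n_trim]
    return float(q[keep].sum())


def robust_threshold_support(X, y, k, n_trim):
    """Illustrative thresholding step: return the sorted indices of the
    k columns of X with the largest absolute trimmed correlation with y."""
    X = np.asarray(X, dtype=float)
    scores = np.array(
        [abs(trimmed_inner_product(X[:, j], y, n_trim)) for j in range(X.shape[1])]
    )
    top_k = np.argsort(scores)[::-1][:k]
    return sorted(int(j) for j in top_k)


# Toy data: y depends only on column 0; one entry of column 1 is corrupted.
X = np.array(
    [
        [1.0, 1000.0, 0.1],
        [-1.0, 0.1, -0.1],
        [1.0, -0.1, 0.1],
        [-1.0, 0.1, 0.1],
        [1.0, -0.1, -0.1],
        [-1.0, 0.1, -0.1],
    ]
)
y = 2.0 * X[:, 0]

plain_choice = int(np.argmax(np.abs(X.T @ y)))                 # misled by the outlier
robust_choice = robust_threshold_support(X, y, k=1, n_trim=1)  # ignores it
```

In this toy example the single corrupted entry dominates the ordinary inner product, so plain correlation screening picks column 1, while the trimmed version recovers column 0. In practice `n_trim` would have to be at least the number of corrupted samples, which is itself an assumption about the data.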

Cite this Paper

BibTeX

@InProceedings{pmlr-v28-chen13h,
  title = 	 {Robust Sparse Regression under Adversarial Corruption},
  author = 	 {Chen, Yudong and Caramanis, Constantine and Mannor, Shie},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {774--782},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/chen13h.pdf},
  url = 	 {https://proceedings.mlr.press/v28/chen13h.html},
  abstract = 	 {We consider high dimensional sparse regression with arbitrary – possibly, severe or coordinated – errors in the covariates matrix. We are interested in understanding how many corruptions we can tolerate, while identifying the correct support. To the best of our knowledge, neither standard outlier rejection techniques, nor recently developed robust regression algorithms (that focus only on corrupted response variables), nor recent algorithms for dealing with stochastic noise or erasures, can provide guarantees on support recovery. As we show, neither can the natural brute force algorithm that takes exponential time to find the subset of data and support columns, that yields the smallest regression error.     We explore the power of a simple idea: replace the essential linear algebraic calculation – the inner product – with a robust counterpart that cannot be greatly affected by a controlled number of arbitrarily corrupted points: the trimmed inner product. We consider three popular algorithms in the uncorrupted setting: Thresholding Regression, Lasso, and the Dantzig selector, and show that the counterparts obtained using the trimmed inner product are provably robust.}
}

Endnote

%0 Conference Paper
%T Robust Sparse Regression under Adversarial Corruption
%A Yudong Chen
%A Constantine Caramanis
%A Shie Mannor
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-chen13h
%I PMLR
%P 774--782
%U https://proceedings.mlr.press/v28/chen13h.html
%V 28
%N 3
%X We consider high dimensional sparse regression with arbitrary – possibly, severe or coordinated – errors in the covariates matrix. We are interested in understanding how many corruptions we can tolerate, while identifying the correct support. To the best of our knowledge, neither standard outlier rejection techniques, nor recently developed robust regression algorithms (that focus only on corrupted response variables), nor recent algorithms for dealing with stochastic noise or erasures, can provide guarantees on support recovery. As we show, neither can the natural brute force algorithm that takes exponential time to find the subset of data and support columns, that yields the smallest regression error.     We explore the power of a simple idea: replace the essential linear algebraic calculation – the inner product – with a robust counterpart that cannot be greatly affected by a controlled number of arbitrarily corrupted points: the trimmed inner product. We consider three popular algorithms in the uncorrupted setting: Thresholding Regression, Lasso, and the Dantzig selector, and show that the counterparts obtained using the trimmed inner product are provably robust.

RIS

TY  - CPAPER
TI  - Robust Sparse Regression under Adversarial Corruption
AU  - Yudong Chen
AU  - Constantine Caramanis
AU  - Shie Mannor
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-chen13h
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 774
EP  - 782
L1  - http://proceedings.mlr.press/v28/chen13h.pdf
UR  - https://proceedings.mlr.press/v28/chen13h.html
AB  - We consider high dimensional sparse regression with arbitrary – possibly, severe or coordinated – errors in the covariates matrix. We are interested in understanding how many corruptions we can tolerate, while identifying the correct support. To the best of our knowledge, neither standard outlier rejection techniques, nor recently developed robust regression algorithms (that focus only on corrupted response variables), nor recent algorithms for dealing with stochastic noise or erasures, can provide guarantees on support recovery. As we show, neither can the natural brute force algorithm that takes exponential time to find the subset of data and support columns, that yields the smallest regression error.     We explore the power of a simple idea: replace the essential linear algebraic calculation – the inner product – with a robust counterpart that cannot be greatly affected by a controlled number of arbitrarily corrupted points: the trimmed inner product. We consider three popular algorithms in the uncorrupted setting: Thresholding Regression, Lasso, and the Dantzig selector, and show that the counterparts obtained using the trimmed inner product are provably robust.
ER  -

APA

Chen, Y., Caramanis, C. & Mannor, S. (2013). Robust Sparse Regression under Adversarial Corruption. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):774-782. Available from https://proceedings.mlr.press/v28/chen13h.html.
