List Decodable Subspace Recovery

Prasad Raghavendra, Morris Yau
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:3206-3226, 2020.

Abstract

Learning from data in the presence of outliers is a fundamental problem in statistics. In this work, we study robust statistics in the presence of overwhelming outliers for the fundamental problem of subspace recovery. Given a dataset where an α fraction (less than half) of the data is distributed uniformly in an unknown k-dimensional subspace in d dimensions, and with no additional assumptions on the remaining data, the goal is to recover a succinct list of O(1/α) subspaces, one of which is nontrivially correlated with the planted subspace. We provide the first polynomial-time algorithm for the "list decodable subspace recovery" problem, and subsume it under a more general framework of list decoding over distributions that are "certifiably resilient", capturing state-of-the-art results for list decodable mean estimation and regression.
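The data model described in the abstract can be sketched numerically: an α fraction of points lies in a random k-dimensional subspace of R^d, and the rest is arbitrary. This is an illustrative sketch only (the sizes n, d, k, α below are placeholders, not from the paper), showing the generative setup the algorithm must handle, not the recovery algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, alpha = 1000, 20, 3, 0.3  # illustrative sizes, not from the paper

# Inliers: an alpha fraction drawn from an unknown k-dimensional subspace of R^d.
basis, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal basis, shape (d, k)
n_in = int(alpha * n)
inliers = rng.standard_normal((n_in, k)) @ basis.T

# Outliers: the remaining (1 - alpha) fraction is arbitrary -- here, random noise,
# but the model places no assumptions on these points at all.
outliers = rng.standard_normal((n - n_in, d))

data = np.vstack([inliers, outliers])
rng.shuffle(data)  # the algorithm does not know which points are inliers

# Sanity check: inliers lie exactly in the planted subspace, i.e. their
# component orthogonal to the subspace is zero.
residual = inliers - inliers @ basis @ basis.T
assert np.allclose(residual, 0.0, atol=1e-10)
print(data.shape)
```

Since fewer than half the points are inliers, no single subspace estimate can be guaranteed correct; hence the list-decoding goal of outputting O(1/α) candidate subspaces, at least one of which correlates with the planted one.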

Cite this Paper


BibTeX
@InProceedings{pmlr-v125-raghavendra20a,
  title     = {List Decodable Subspace Recovery},
  author    = {Raghavendra, Prasad and Yau, Morris},
  booktitle = {Proceedings of Thirty Third Conference on Learning Theory},
  pages     = {3206--3226},
  year      = {2020},
  editor    = {Abernethy, Jacob and Agarwal, Shivani},
  volume    = {125},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v125/raghavendra20a/raghavendra20a.pdf},
  url       = {https://proceedings.mlr.press/v125/raghavendra20a.html},
  abstract  = {Learning from data in the presence of outliers is a fundamental problem in statistics. In this work, we study robust statistics in the presence of overwhelming outliers for the fundamental problem of subspace recovery. Given a dataset where an $\alpha$ fraction (less than half) of the data is distributed uniformly in an unknown $k$ dimensional subspace in $d$ dimensions, and with no additional assumptions on the remaining data, the goal is to recover a succinct list of $O(\frac{1}{\alpha})$ subspaces one of which is nontrivially correlated with the planted subspace. We provide the first polynomial time algorithm for the "list decodable subspace recovery" problem, and subsume it under a more general framework of list decoding over distributions that are "certifiably resilient" capturing state of the art results for list decodable mean estimation and regression.}
}
Endnote
%0 Conference Paper
%T List Decodable Subspace Recovery
%A Prasad Raghavendra
%A Morris Yau
%B Proceedings of Thirty Third Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Jacob Abernethy
%E Shivani Agarwal
%F pmlr-v125-raghavendra20a
%I PMLR
%P 3206--3226
%U https://proceedings.mlr.press/v125/raghavendra20a.html
%V 125
%X Learning from data in the presence of outliers is a fundamental problem in statistics. In this work, we study robust statistics in the presence of overwhelming outliers for the fundamental problem of subspace recovery. Given a dataset where an $\alpha$ fraction (less than half) of the data is distributed uniformly in an unknown $k$ dimensional subspace in $d$ dimensions, and with no additional assumptions on the remaining data, the goal is to recover a succinct list of $O(\frac{1}{\alpha})$ subspaces one of which is nontrivially correlated with the planted subspace. We provide the first polynomial time algorithm for the "list decodable subspace recovery" problem, and subsume it under a more general framework of list decoding over distributions that are "certifiably resilient" capturing state of the art results for list decodable mean estimation and regression.
APA
Raghavendra, P. & Yau, M. (2020). List Decodable Subspace Recovery. Proceedings of Thirty Third Conference on Learning Theory, in Proceedings of Machine Learning Research 125:3206-3226. Available from https://proceedings.mlr.press/v125/raghavendra20a.html.