Random Consensus Robust PCA

Daniel Pimentel-Alarcon, Robert Nowak
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:344-352, 2017.

Abstract

This paper presents R2PCA, a random consensus method for robust principal component analysis. R2PCA takes RANSAC’s principle of using as little data as possible one step further: it iteratively selects small subsets of the data to identify pieces of the principal components, and then stitches the pieces together. We show that if the principal components are in general position and the errors are sufficiently sparse, R2PCA will exactly recover the principal components with probability 1, without assumptions on coherence or on the distribution of the sparse errors, and even under adversarial settings. R2PCA enjoys many advantages: it works well under noise, its computational complexity scales linearly in the ambient dimension, it is easily parallelizable, and due to its low sample complexity, it can be used in settings where the data is so large it cannot even be stored in memory. We complement our theoretical findings with synthetic and real data experiments showing that R2PCA outperforms state-of-the-art methods in a broad range of settings.
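R2PCA itself works on small (r+1)×(r+1) blocks of the data and stitches the resulting pieces together; the details are in the paper. Purely to illustrate the underlying random-consensus (RANSAC) principle the abstract refers to, here is a minimal NumPy sketch that fits candidate r-dimensional subspaces to random column subsets and keeps the candidate with the largest consensus set. All names, parameters, and tolerances are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a d x n matrix of rank r, with a few grossly corrupted columns.
d, n, r = 50, 200, 3
L = rng.standard_normal((d, r)) @ rng.standard_normal((r, n))
X = L.copy()
outliers = rng.choice(n, size=20, replace=False)
X[:, outliers] += 10.0 * rng.standard_normal((d, outliers.size))


def ransac_subspace(X, r, trials=200, tol=1e-6):
    """RANSAC-style search (illustrative, not R2PCA): fit an r-dimensional
    subspace to random r-column subsets and keep the largest consensus set."""
    d, n = X.shape
    best_inliers = np.zeros(n, dtype=bool)
    best_U = None
    col_norms = np.linalg.norm(X, axis=0).clip(min=1.0)
    for _ in range(trials):
        cols = rng.choice(n, size=r, replace=False)
        U, _, _ = np.linalg.svd(X[:, cols], full_matrices=False)
        U = U[:, :r]
        # Columns whose residual after projection is tiny are "inliers".
        resid = np.linalg.norm(X - U @ (U.T @ X), axis=0)
        inliers = resid < tol * col_norms
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_U = inliers, U
    # Refit the principal components on the full consensus set.
    U, _, _ = np.linalg.svd(X[:, best_inliers], full_matrices=False)
    return U[:, :r], best_inliers


U_hat, inliers = ransac_subspace(X, r)
# The recovered subspace should capture span(L) up to numerical precision.
rel_err = np.linalg.norm(L - U_hat @ (U_hat.T @ L)) / np.linalg.norm(L)
print(rel_err, inliers.sum())
```

With an r-subset of uncorrupted columns, all other clean columns lie (numerically) in its span, so the consensus set singles out the clean data; the corrupted columns are excluded and the subspace is refit on the inliers. R2PCA's use of even smaller blocks and its stitching step are what give the guarantees and linear scaling described in the abstract.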

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-pimentel-alarcon17a,
  title     = {{Random Consensus Robust PCA}},
  author    = {Pimentel-Alarcon, Daniel and Nowak, Robert},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {344--352},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/pimentel-alarcon17a/pimentel-alarcon17a.pdf},
  url       = {https://proceedings.mlr.press/v54/pimentel-alarcon17a.html},
  abstract  = {This paper presents R2PCA, a random consensus method for robust principal component analysis. R2PCA takes RANSAC’s principle of using as little data as possible one step further. It iteratively selects small subsets of the data to identify pieces of the principal components, to then stitch them together. We show that if the principal components are in general position and the errors are sufficiently sparse, R2PCA will exactly recover the principal components with probability 1, in lieu of assumptions on coherence or the distribution of the sparse errors, and even under adversarial settings. R2PCA enjoys many advantages: it works well under noise, its computational complexity scales linearly in the ambient dimension, it is easily parallelizable, and due to its low sample complexity, it can be used in settings where data is so large it cannot even be stored in memory. We complement our theoretical findings with synthetic and real data experiments showing that r2pca outperforms state-of-the-art methods in a broad range of settings.}
}
Endnote
%0 Conference Paper
%T Random Consensus Robust PCA
%A Daniel Pimentel-Alarcon
%A Robert Nowak
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-pimentel-alarcon17a
%I PMLR
%P 344--352
%U https://proceedings.mlr.press/v54/pimentel-alarcon17a.html
%V 54
%X This paper presents R2PCA, a random consensus method for robust principal component analysis. R2PCA takes RANSAC’s principle of using as little data as possible one step further. It iteratively selects small subsets of the data to identify pieces of the principal components, to then stitch them together. We show that if the principal components are in general position and the errors are sufficiently sparse, R2PCA will exactly recover the principal components with probability 1, in lieu of assumptions on coherence or the distribution of the sparse errors, and even under adversarial settings. R2PCA enjoys many advantages: it works well under noise, its computational complexity scales linearly in the ambient dimension, it is easily parallelizable, and due to its low sample complexity, it can be used in settings where data is so large it cannot even be stored in memory. We complement our theoretical findings with synthetic and real data experiments showing that r2pca outperforms state-of-the-art methods in a broad range of settings.
APA
Pimentel-Alarcon, D. & Nowak, R. (2017). Random Consensus Robust PCA. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:344-352. Available from https://proceedings.mlr.press/v54/pimentel-alarcon17a.html.
