Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2864-2873, 2017.
This paper presents a remarkably simple, yet powerful, algorithm for robust Principal Component Analysis (PCA). In the proposed approach, an outlier is set apart from an inlier by comparing their coherence with the rest of the data points. As inliers lie on a low-dimensional subspace, they are likely to have strong mutual coherence provided there are enough inliers. By contrast, outliers do not typically admit low-dimensional structures, so an outlier is unlikely to bear strong resemblance to a large number of data points. The mutual coherences are computed by forming the Gram matrix of the normalized data points. Subsequently, the subspace is recovered from the span of a small subset of the data points that exhibit strong coherence with the rest of the data. As coherence pursuit involves only one simple matrix multiplication, it is significantly faster than the state-of-the-art robust PCA algorithms. We provide a mathematical analysis of the proposed algorithm under a random model for the distribution of the inliers and outliers. It is shown that the proposed method can recover the correct subspace even if the data is predominantly outliers. To the best of our knowledge, this is the first provable robust PCA algorithm that is simultaneously non-iterative, tolerant of a large number of outliers, and robust to linearly dependent outliers.
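The procedure described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and it rests on several assumptions the abstract does not specify: each point is scored by the sum of its squared coherences with all other points, the rank of the subspace is known in advance, and the basis is built by Gram-Schmidt over the highest-scoring points. The names `coherence_scores`, `recover_basis`, and `synthetic_data` are hypothetical and exist only for this illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def coherence_scores(points):
    """Score each point by its squared coherence with all other points.

    Assumption: the squared (l2-type) score is one possible choice; the
    abstract only says that mutual coherences come from the Gram matrix.
    """
    unit = [normalize(p) for p in points]
    # Gram matrix of the normalized points; with a numerical library this
    # would be a single matrix product.
    gram = [[dot(a, b) for b in unit] for a in unit]
    # Skip the diagonal: a point's coherence with itself is uninformative.
    return [
        sum(g * g for j, g in enumerate(row) if j != i)
        for i, row in enumerate(gram)
    ]


def recover_basis(points, rank, tol=1e-6):
    """Orthonormal basis spanned by the most coherent points.

    Assumption: `rank` is known, and Gram-Schmidt over points in order of
    decreasing score stands in for "the span of a small subset".
    """
    scores = coherence_scores(points)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    basis = []
    for i in order:
        residual = normalize(points[i])
        for b in basis:
            c = dot(residual, b)
            residual = [x - c * y for x, y in zip(residual, b)]
        # Keep the point only if it adds a new direction to the span.
        if math.sqrt(dot(residual, residual)) > tol:
            basis.append(normalize(residual))
        if len(basis) == rank:
            break
    return basis


def synthetic_data(dim, rank, n_inliers, n_outliers, seed=0):
    """Hypothetical test data: inliers live in the span of the first
    `rank` coordinate axes; outliers are Gaussian in all `dim` coordinates."""
    rng = random.Random(seed)
    inliers = [
        [rng.gauss(0, 1) for _ in range(rank)] + [0.0] * (dim - rank)
        for _ in range(n_inliers)
    ]
    outliers = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_outliers)]
    points = inliers + outliers
    rng.shuffle(points)
    return points
```

A possible usage, with outliers forming the majority of the data: `recover_basis(synthetic_data(dim=30, rank=2, n_inliers=40, n_outliers=60), rank=2)` should return two orthonormal vectors supported on the first two coordinates. Inliers score far higher than outliers in this setting because each inlier is coherent with every other inlier, whereas a random outlier is only weakly coherent with any other point.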