Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method


David Anderson, Simon Du, Michael Mahoney, Christopher Melgaard, Kunming Wu, Ming Gu;
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:19-27, 2015.

Abstract

The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a most common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is ℓ = k + O(1). We demonstrate our analysis on a novel deterministic algorithm, StableCUR, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms.
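To make the setting concrete, the following is a minimal NumPy sketch of a generic CUR approximation, not the paper's StableCUR algorithm: select ℓ columns and rows of A, then solve for the small linking matrix U so that A ≈ CUR. The uniform random selection used here is a stand-in assumption; the paper instead implements its algorithm with a column selection scheme carrying strong singular value bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a matrix with rapidly decaying spectrum, the regime the bounds target.
# Rank parameter k with constant oversampling: ell = k + O(1).
m, n, k, ell = 100, 80, 5, 8
A = (rng.standard_normal((m, k)) * 2.0 ** -np.arange(k)) @ rng.standard_normal((k, n))

# Naive selection for illustration: uniformly random column/row indices.
cols = rng.choice(n, size=ell, replace=False)
rows = rng.choice(m, size=ell, replace=False)
C, R = A[:, cols], A[rows, :]

# U = C^+ A R^+ minimizes ||A - C U R||_F for the chosen C and R.
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)

rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(f"relative Frobenius error: {rel_err:.3e}")
```

Because A here has exact rank k ≤ ℓ, the selected columns and rows generically span its column and row spaces, so the approximation is accurate; on noisy real data the selection scheme and the spectrum decay govern the error, which is what the spectral gap bounds quantify.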

Related Material