Revisiting the Nystrom method for improved large-scale machine learning

Alex Gittens; Michael Mahoney

Revisiting the Nystrom method for improved large-scale machine learning

Alex Gittens, Michael Mahoney

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):567-575, 2013.

Abstract

We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods, and they point to differences between uniform and nonuniform sampling methods based on leverage scores. We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds— e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-gittens13,
  title = 	 {Revisiting the Nystrom method for improved large-scale machine learning},
  author = 	 {Gittens, Alex and Mahoney, Michael},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {567--575},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/gittens13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/gittens13.html},
  abstract = 	 {We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications.    Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods, and they point to differences between uniform and nonuniform sampling methods based on leverage scores.    We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds— e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error.}
}

Endnote

%0 Conference Paper
%T Revisiting the Nystrom method for improved large-scale machine learning
%A Alex Gittens
%A Michael Mahoney
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-gittens13
%I PMLR
%P 567--575
%U https://proceedings.mlr.press/v28/gittens13.html
%V 28
%N 3
%X We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications.    Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods, and they point to differences between uniform and nonuniform sampling methods based on leverage scores.    We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds— e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error.

RIS


TY  - CPAPER
TI  - Revisiting the Nystrom method for improved large-scale machine learning
AU  - Alex Gittens
AU  - Michael Mahoney
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-gittens13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 567
EP  - 575
L1  - http://proceedings.mlr.press/v28/gittens13.pdf
UR  - https://proceedings.mlr.press/v28/gittens13.html
AB  - We reconsider randomized algorithms for the low-rank approximation of SPSD matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications.    Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods, and they point to differences between uniform and nonuniform sampling methods based on leverage scores.    We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projection methods. These bounds are qualitatively superior to existing bounds— e.g., improved additive-error bounds for spectral and Frobenius norm error and relative-error bounds for trace norm error.
ER  -

APA


Gittens, A. & Mahoney, M.. (2013). Revisiting the Nystrom method for improved large-scale machine learning. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):567-575 Available from https://proceedings.mlr.press/v28/gittens13.html.

Related Material

Download PDF