Spectral Clustering via the Power Method - Provably

Christos Boutsidis; Prabhanjan Kambadur; Alex Gittens

Spectral Clustering via the Power Method - Provably

Christos Boutsidis, Prabhanjan Kambadur, Alex Gittens

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:40-48, 2015.

Abstract

Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the “power method” from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the k-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the k-means problem on the optimal eigenvectors.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-boutsidis15,
  title = 	 {Spectral Clustering via the Power Method - Provably},
  author = 	 {Boutsidis, Christos and Kambadur, Prabhanjan and Gittens, Alex},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {40--48},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/boutsidis15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/boutsidis15.html},
  abstract = 	 {Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the “power method” from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the k-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the k-means problem on the optimal eigenvectors.}
}

Endnote

%0 Conference Paper
%T Spectral Clustering via the Power Method - Provably
%A Christos Boutsidis
%A Prabhanjan Kambadur
%A Alex Gittens
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei	
%F pmlr-v37-boutsidis15
%I PMLR
%P 40--48
%U https://proceedings.mlr.press/v37/boutsidis15.html
%V 37
%X Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the “power method” from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the k-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the k-means problem on the optimal eigenvectors.

RIS


TY  - CPAPER
TI  - Spectral Clustering via the Power Method - Provably
AU  - Christos Boutsidis
AU  - Prabhanjan Kambadur
AU  - Alex Gittens
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei	
ID  - pmlr-v37-boutsidis15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 40
EP  - 48
L1  - http://proceedings.mlr.press/v37/boutsidis15.pdf
UR  - https://proceedings.mlr.press/v37/boutsidis15.html
AB  - Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the “power method” from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the k-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the k-means problem on the optimal eigenvectors.
ER  -

APA


Boutsidis, C., Kambadur, P. & Gittens, A.. (2015). Spectral Clustering via the Power Method - Provably. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:40-48 Available from https://proceedings.mlr.press/v37/boutsidis15.html.

Related Material

Download PDF