How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2228-2237, 2017.
Abstract
How many samples are sufficient to guarantee that the eigenvectors of the sample covariance matrix are close to those of the actual covariance matrix? For a wide family of distributions, including distributions with finite second moment and sub-gaussian distributions supported in a centered Euclidean ball, we prove that the inner product between eigenvectors of the sample and actual covariance matrices decreases proportionally to the respective eigenvalue distance and the number of samples. Our findings imply non-asymptotic concentration bounds for eigenvectors and eigenvalues and carry strong consequences for the non-asymptotic analysis of PCA and its applications. For instance, they provide conditions for separating components estimated from $O(1)$ samples and show that even a few samples can be sufficient to perform dimensionality reduction, especially for low-rank covariances.
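As a rough empirical illustration of the quantity the abstract refers to (not code from the paper), the following NumPy sketch draws samples from a Gaussian with a known diagonal covariance and tracks the inner products between eigenvectors of the sample covariance and the true eigenvectors as the number of samples grows. The dimension, eigenvalues, and sample sizes are arbitrary choices made for the demonstration.

```python
# Illustrative sketch (assumptions: diagonal true covariance, Gaussian data,
# arbitrary dimension and sample sizes). It checks that cross inner products
# |<v_i, vhat_j>|, i != j, shrink as the sample size n increases.
import numpy as np

rng = np.random.default_rng(0)
d = 5
true_eigvals = np.array([10.0, 5.0, 2.0, 1.0, 0.5])  # well-separated spectrum
C = np.diag(true_eigvals)
V = np.eye(d)  # true eigenvectors: standard basis, ordered by eigenvalue

for n in [50, 500, 5000]:
    X = rng.multivariate_normal(np.zeros(d), C, size=n)
    C_hat = (X.T @ X) / n                        # sample covariance
    eigvals_hat, V_hat = np.linalg.eigh(C_hat)   # ascending order
    V_hat = V_hat[:, ::-1]                       # reorder to descending

    # Alignment of corresponding eigenvectors and largest cross term
    # between distinct eigenvectors (absolute values handle sign flips).
    overlaps = np.abs(V.T @ V_hat)
    aligned = np.diag(overlaps).copy()
    cross = overlaps.copy()
    np.fill_diagonal(cross, 0.0)
    print(f"n={n:5d}  min |<v_i, vhat_i>| = {aligned.min():.3f}  "
          f"max |<v_i, vhat_j>|, i!=j = {cross.max():.3f}")
```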