An Improved Gap-Dependency Analysis of the Noisy Power Method

Maria-Florina Balcan, Simon Shaolei Du, Yining Wang, Adams Wei Yu
29th Annual Conference on Learning Theory, PMLR 49:284-309, 2016.

Abstract

We consider the noisy power method algorithm, which has wide applications in machine learning and statistics, especially those related to principal component analysis (PCA) under resource (communication, memory, or privacy) constraints. Existing analyses of the noisy power method show an unsatisfactory dependency on the "consecutive" spectral gap (\sigma_k - \sigma_{k+1}) of the input data matrix, which can be very small and hence limits the algorithm's applicability. In this paper, we present a new analysis of the noisy power method that achieves improved gap dependency for both sample complexity and noise tolerance bounds. More specifically, we improve the dependency on (\sigma_k - \sigma_{k+1}) to a dependency on (\sigma_k - \sigma_{q+1}), where q is an intermediate algorithm parameter that can be much larger than the target rank k. Our proofs are built upon a novel characterization of proximity between two subspaces that differs from the canonical-angle characterizations analyzed in previous works. Finally, we apply our improved bounds to distributed private PCA and memory-efficient streaming PCA, obtaining bounds that are superior to existing results in the literature.
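For context, the noisy power method analyzed in the paper is block subspace iteration in which each matrix product is perturbed by a noise term. The sketch below illustrates the setup only; the function name, `noise_scale` parameter, and random-Gaussian noise model are illustrative assumptions, not the paper's construction. It follows the paper's notation: k is the target rank and q >= k is the iteration rank, with the improved bounds depending on \sigma_k - \sigma_{q+1}.

```python
import numpy as np

def noisy_power_method(A, k, q, iters, noise_scale=0.0, seed=0):
    """Sketch of noisy subspace (power) iteration on a symmetric matrix A.

    k:  target rank (top-k eigenspace we ultimately care about)
    q:  iteration rank, q >= k; running with q > k is what enables the
        improved (sigma_k - sigma_{q+1}) gap dependency in the paper.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    # Random orthonormal start of width q.
    X, _ = np.linalg.qr(rng.standard_normal((d, q)))
    for _ in range(iters):
        # Each iteration's product is corrupted by a noise matrix G_t
        # (here illustrated as scaled Gaussian noise).
        G = noise_scale * rng.standard_normal((d, q))
        Y = A @ X + G
        # Re-orthonormalize to keep the iterate well-conditioned.
        X, _ = np.linalg.qr(Y)
    return X  # d x q orthonormal basis; its span approximates the top-q eigenspace
```

With `noise_scale=0` this reduces to exact subspace iteration, and the top-k eigenvectors are quickly captured by the returned span.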

Cite this Paper


BibTeX
@InProceedings{pmlr-v49-balcan16a,
  title     = {An Improved Gap-Dependency Analysis of the Noisy Power Method},
  author    = {Balcan, Maria-Florina and Du, Simon Shaolei and Wang, Yining and Yu, Adams Wei},
  booktitle = {29th Annual Conference on Learning Theory},
  pages     = {284--309},
  year      = {2016},
  editor    = {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume    = {49},
  series    = {Proceedings of Machine Learning Research},
  address   = {Columbia University, New York, New York, USA},
  month     = {23--26 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v49/balcan16a.pdf},
  url       = {https://proceedings.mlr.press/v49/balcan16a.html},
  abstract  = {We consider the \emph{noisy power method} algorithm, which has wide applications in machine learning and statistics, especially those related to principal component analysis (PCA) under resource (communication, memory, or privacy) constraints. Existing analyses of the noisy power method show an unsatisfactory dependency on the ``consecutive'' spectral gap ($\sigma_k - \sigma_{k+1}$) of the input data matrix, which can be very small and hence limits the algorithm's applicability. In this paper, we present a new analysis of the noisy power method that achieves improved gap dependency for both sample complexity and noise tolerance bounds. More specifically, we improve the dependency on ($\sigma_k - \sigma_{k+1}$) to a dependency on ($\sigma_k - \sigma_{q+1}$), where $q$ is an intermediate algorithm parameter that can be much larger than the target rank $k$. Our proofs are built upon a novel characterization of proximity between two subspaces that differs from the canonical-angle characterizations analyzed in previous works. Finally, we apply our improved bounds to distributed private PCA and memory-efficient streaming PCA, obtaining bounds that are superior to existing results in the literature.}
}
APA
Balcan, M., Du, S. S., Wang, Y. & Yu, A. W. (2016). An Improved Gap-Dependency Analysis of the Noisy Power Method. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:284-309. Available from https://proceedings.mlr.press/v49/balcan16a.html.
