Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension

Vladimir Braverman, Robert Krauthgamer, Aditya Krishnan, Roi Sinoff
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1100-1110, 2020.

Abstract

Spectral functions of large matrices contains important structural information about the underlying data, and is thus becoming increasingly important. Many times, large matrices representing real-world data are sparse or doubly sparse (i.e., sparse in both rows and columns), and are accessed as a stream of updates, typically organized in row-order. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is independent of the matrix dimension, assuming the matrix is doubly-sparse and presented in row-order. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index. We validate these theoretical performance bounds by numerical experiments on real-world matrices representing social networks. We further prove that multiple passes are unavoidable in this setting, and show extensions of our primary technique, including a trade-off between space requirements and number of passes.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-braverman20b, title = {Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension}, author = {Braverman, Vladimir and Krauthgamer, Robert and Krishnan, Aditya and Sinoff, Roi}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {1100--1110}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/braverman20b/braverman20b.pdf}, url = {https://proceedings.mlr.press/v119/braverman20b.html}, abstract = {Spectral functions of large matrices contains important structural information about the underlying data, and is thus becoming increasingly important. Many times, large matrices representing real-world data are sparse or doubly sparse (i.e., sparse in both rows and columns), and are accessed as a stream of updates, typically organized in row-order. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is independent of the matrix dimension, assuming the matrix is doubly-sparse and presented in row-order. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index. We validate these theoretical performance bounds by numerical experiments on real-world matrices representing social networks. We further prove that multiple passes are unavoidable in this setting, and show extensions of our primary technique, including a trade-off between space requirements and number of passes.} }
Endnote
%0 Conference Paper %T Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension %A Vladimir Braverman %A Robert Krauthgamer %A Aditya Krishnan %A Roi Sinoff %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-braverman20b %I PMLR %P 1100--1110 %U https://proceedings.mlr.press/v119/braverman20b.html %V 119 %X Spectral functions of large matrices contains important structural information about the underlying data, and is thus becoming increasingly important. Many times, large matrices representing real-world data are sparse or doubly sparse (i.e., sparse in both rows and columns), and are accessed as a stream of updates, typically organized in row-order. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is independent of the matrix dimension, assuming the matrix is doubly-sparse and presented in row-order. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index. We validate these theoretical performance bounds by numerical experiments on real-world matrices representing social networks. We further prove that multiple passes are unavoidable in this setting, and show extensions of our primary technique, including a trade-off between space requirements and number of passes.
APA
Braverman, V., Krauthgamer, R., Krishnan, A. & Sinoff, R.. (2020). Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1100-1110 Available from https://proceedings.mlr.press/v119/braverman20b.html.

Related Material