Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5690-5700, 2019.
Abstract
What learning algorithms can be run directly on compressively-sensed data? In this work, we consider the question of accurately and efficiently computing low-rank matrix or tensor factorizations given data compressed via random projections. We examine the approach of first performing factorization in the compressed domain, and then reconstructing the original high-dimensional factors from the recovered (compressed) factors. In both the matrix and tensor settings, we establish conditions under which this natural approach will provably recover the original factors. While it is well-known that random projections preserve a number of geometric properties of a dataset, our work can be viewed as showing that they can also preserve certain solutions of non-convex, NP-hard problems like nonnegative matrix factorization. We support these theoretical results with experiments on synthetic data and demonstrate the practical applicability of compressed factorization on real-world gene expression and EEG time series datasets.
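The reconstruction step described above can be sketched in a few lines of NumPy. The sketch below is illustrative, not the paper's algorithm: it assumes the compressed-domain factorization has already returned the compressed factors (here supplied by an oracle as `H @ P.T`), and it uses orthogonal matching pursuit as a stand-in sparse-recovery routine to lift each compressed factor back to the original high-dimensional space. All dimensions, names, and the choice of a Gaussian projection are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, m, k = 400, 3, 150, 8   # ambient dim, rank, sketch dim, factor sparsity

# Hypothetical sparse nonnegative ground-truth factor matrix H (r x d)
H = np.zeros((r, d))
for i in range(r):
    support = rng.choice(d, size=k, replace=False)
    H[i, support] = np.abs(rng.normal(size=k)) + 0.5

# Gaussian random projection compressing d -> m
P = rng.normal(size=(m, d)) / np.sqrt(m)

# Oracle stand-in for the compressed-domain factorization:
# assume it recovered the compressed factors H P^T exactly
H_c = H @ P.T

def omp(P, y, k):
    """Orthogonal matching pursuit: greedy k-sparse recovery of h with P h ~ y."""
    support, residual = [], y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(P.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit restricted to the current support
        coef, *_ = np.linalg.lstsq(P[:, support], y, rcond=None)
        residual = y - P[:, support] @ coef
    h = np.zeros(P.shape[1])
    h[support] = coef
    return h

# Lift each compressed factor back to the high-dimensional space
H_hat = np.vstack([omp(P, H_c[i], k) for i in range(r)])
err = np.linalg.norm(H_hat - H) / np.linalg.norm(H)
print(f"relative recovery error: {err:.2e}")
```

With the sketch dimension comfortably above the sparse-recovery threshold (here m = 150 measurements for 8-sparse factors in 400 dimensions), the recovered factors match the originals up to numerical precision, which is the regime the paper's guarantees concern.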