Dictionary Learning for Massive Matrix Factorization

Arthur Mensch, Julien Mairal, Bertrand Thirion, Gael Varoquaux
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1737-1746, 2016.

Abstract

Sparse matrix factorization is a popular tool for obtaining interpretable data decompositions, which are also effective for data completion and denoising. Its applicability to large datasets has been addressed with online and randomized methods that reduce the complexity in one of the matrix dimensions, but not in both. In this paper, we tackle matrices that are very large in both dimensions. We propose a new factorization method that scales gracefully to terabyte-scale datasets, which previous algorithms could not process in a reasonable amount of time. We demonstrate the efficiency of our approach on massive functional Magnetic Resonance Imaging (fMRI) data and on matrix completion problems for recommender systems, where we obtain significant speed-ups over state-of-the-art coordinate descent methods.
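
For context, the sparse matrix factorization the abstract refers to is the usual dictionary learning objective, minimizing sum_i 1/2 ||x_i - D a_i||^2 + lambda ||a_i||_1 over the dictionary D and sparse codes a_i. The sketch below (not the paper's own algorithm) illustrates the online, mini-batch setting that this work extends, using scikit-learn's MiniBatchDictionaryLearning; that estimator streams over samples only, whereas the paper's contribution is to additionally subsample the feature dimension (voxels or items) at each update.

```python
# Minimal sketch of online (mini-batch) sparse dictionary learning.
# Illustrative only: the paper's method further subsamples the feature
# dimension at each step, which this scikit-learn estimator does not do.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
X = rng.randn(10_000, 500)        # n_samples x n_features (e.g. fMRI time points x voxels)

dico = MiniBatchDictionaryLearning(
    n_components=50,              # number of dictionary atoms
    alpha=1.0,                    # l1 sparsity penalty on the codes
    batch_size=256,               # samples drawn per online update
    random_state=0,
)
codes = dico.fit_transform(X)     # sparse codes, shape (n_samples, n_components)
D = dico.components_              # learned dictionary, shape (n_components, n_features)

print(codes.shape, D.shape)
```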

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-mensch16,
  title     = {Dictionary Learning for Massive Matrix Factorization},
  author    = {Mensch, Arthur and Mairal, Julien and Thirion, Bertrand and Varoquaux, Gael},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {1737--1746},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/mensch16.pdf},
  url       = {https://proceedings.mlr.press/v48/mensch16.html},
  abstract  = {Sparse matrix factorization is a popular tool to obtain interpretable data decompositions, which are also effective to perform data completion or denoising. Its applicability to large datasets has been addressed with online and randomized methods, that reduce the complexity in one of the matrix dimension, but not in both of them. In this paper, we tackle very large matrices in both dimensions. We propose a new factorization method that scales gracefully to terabyte-scale datasets. Those could not be processed by previous algorithms in a reasonable amount of time. We demonstrate the efficiency of our approach on massive functional Magnetic Resonance Imaging (fMRI) data, and on matrix completion problems for recommender systems, where we obtain significant speed-ups compared to state-of-the art coordinate descent methods.}
}
Endnote
%0 Conference Paper
%T Dictionary Learning for Massive Matrix Factorization
%A Arthur Mensch
%A Julien Mairal
%A Bertrand Thirion
%A Gael Varoquaux
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-mensch16
%I PMLR
%P 1737--1746
%U https://proceedings.mlr.press/v48/mensch16.html
%V 48
%X Sparse matrix factorization is a popular tool to obtain interpretable data decompositions, which are also effective to perform data completion or denoising. Its applicability to large datasets has been addressed with online and randomized methods, that reduce the complexity in one of the matrix dimension, but not in both of them. In this paper, we tackle very large matrices in both dimensions. We propose a new factorization method that scales gracefully to terabyte-scale datasets. Those could not be processed by previous algorithms in a reasonable amount of time. We demonstrate the efficiency of our approach on massive functional Magnetic Resonance Imaging (fMRI) data, and on matrix completion problems for recommender systems, where we obtain significant speed-ups compared to state-of-the art coordinate descent methods.
RIS
TY - CPAPER
TI - Dictionary Learning for Massive Matrix Factorization
AU - Arthur Mensch
AU - Julien Mairal
AU - Bertrand Thirion
AU - Gael Varoquaux
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-mensch16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 1737
EP - 1746
L1 - http://proceedings.mlr.press/v48/mensch16.pdf
UR - https://proceedings.mlr.press/v48/mensch16.html
AB - Sparse matrix factorization is a popular tool to obtain interpretable data decompositions, which are also effective to perform data completion or denoising. Its applicability to large datasets has been addressed with online and randomized methods, that reduce the complexity in one of the matrix dimension, but not in both of them. In this paper, we tackle very large matrices in both dimensions. We propose a new factorization method that scales gracefully to terabyte-scale datasets. Those could not be processed by previous algorithms in a reasonable amount of time. We demonstrate the efficiency of our approach on massive functional Magnetic Resonance Imaging (fMRI) data, and on matrix completion problems for recommender systems, where we obtain significant speed-ups compared to state-of-the art coordinate descent methods.
ER -
APA
Mensch, A., Mairal, J., Thirion, B., & Varoquaux, G. (2016). Dictionary Learning for Massive Matrix Factorization. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1737-1746. Available from https://proceedings.mlr.press/v48/mensch16.html.
