Statistical Optimization of Non-Negative Matrix Factorization

Anoop Korattikara Balan, Levi Boyles, Max Welling, Jingu Kim, Haesun Park
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:128-136, 2011.

Abstract

Non-Negative Matrix Factorization (NMF) is a dimensionality reduction method that has proven very useful for a variety of tasks in machine learning and data mining. One of the fastest algorithms for NMF is the Block Principal Pivoting (BPP) method of Kim and Park (2008b), which follows a block coordinate descent approach. The optimization in each iteration involves solving a large number of expensive least squares problems. Taking the view that the design matrix was generated by a stochastic process, and using the asymptotic normality of the least squares estimator, we propose a method for improving the performance of BPP. Our method starts with a small subset of the columns and rows of the original matrix and uses frequentist hypothesis tests to adaptively increase the size of the problem. This achieves two objectives: 1) during the initial phase of the algorithm we solve far fewer, much smaller least squares problems, and 2) the failure of all hypothesis tests while using the full data provides a principled, automatic stopping criterion. Our experiments on three real-world datasets show that our algorithm significantly improves the performance of the original BPP algorithm.
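The abstract compresses several moving parts: alternating least squares updates over columns and rows, a growing subsample of the data, and a test-based stopping rule. The following is a minimal Python sketch of that control flow only, under loud assumptions: scipy.optimize.nnls stands in for the BPP subproblem solver, and a simple relative-change heuristic replaces the paper's asymptotic-normality hypothesis tests. The function name and all parameters are illustrative, not taken from the paper.

```python
# Illustrative sketch of subsampled alternating nonnegative least squares.
# NOT the authors' algorithm: scipy's nnls replaces the BPP solver, and the
# growth/stopping test is a crude heuristic, not their hypothesis test.
import numpy as np
from scipy.optimize import nnls

def subsampled_nmf(A, k, n0=100, grow=2.0, tol=1e-3, max_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    rows, cols = rng.permutation(m), rng.permutation(n)
    r, c = min(n0, m), min(n0, n)            # current subsample sizes
    W, H = rng.random((m, k)), rng.random((k, n))
    prev = np.inf
    for _ in range(max_iter):
        R, C = rows[:r], cols[:c]
        # Update sampled columns of H: min_h ||W[R] h - A[R, j]||, h >= 0.
        for j in C:
            H[:, j], _ = nnls(W[R], A[R, j])
        # Update sampled rows of W symmetrically against H's sampled columns.
        for i in R:
            W[i], _ = nnls(H[:, C].T, A[i, C])
        err = np.linalg.norm(A[np.ix_(R, C)] - W[R] @ H[:, C])
        if abs(prev - err) < tol * max(prev, 1e-12):
            if r == m and c == n:            # no progress on the full data:
                break                        # the principled stopping point
            # Otherwise enlarge the subproblem and keep going.
            r, c = min(int(grow * r), m), min(int(grow * c), n)
        prev = err
    return W, H
```

The point of the structure is visible even in this toy version: early iterations touch only an r-by-c subproblem, so each least squares solve is cheap, and the full-size problem is only reached once the small problems stop yielding progress.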

Cite this Paper


BibTeX
@InProceedings{pmlr-v15-balan11a,
  title     = {Statistical Optimization of Non-Negative Matrix Factorization},
  author    = {Balan, Anoop Korattikara and Boyles, Levi and Welling, Max and Kim, Jingu and Park, Haesun},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {128--136},
  year      = {2011},
  editor    = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  volume    = {15},
  series    = {Proceedings of Machine Learning Research},
  address   = {Fort Lauderdale, FL, USA},
  month     = {11--13 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v15/balan11a/balan11a.pdf},
  url       = {https://proceedings.mlr.press/v15/balan11a.html}
}
Endnote
%0 Conference Paper
%T Statistical Optimization of Non-Negative Matrix Factorization
%A Anoop Korattikara Balan
%A Levi Boyles
%A Max Welling
%A Jingu Kim
%A Haesun Park
%B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2011
%E Geoffrey Gordon
%E David Dunson
%E Miroslav Dudík
%F pmlr-v15-balan11a
%I PMLR
%P 128--136
%U https://proceedings.mlr.press/v15/balan11a.html
%V 15
RIS
TY - CPAPER
TI - Statistical Optimization of Non-Negative Matrix Factorization
AU - Anoop Korattikara Balan
AU - Levi Boyles
AU - Max Welling
AU - Jingu Kim
AU - Haesun Park
BT - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
DA - 2011/06/14
ED - Geoffrey Gordon
ED - David Dunson
ED - Miroslav Dudík
ID - pmlr-v15-balan11a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 15
SP - 128
EP - 136
L1 - http://proceedings.mlr.press/v15/balan11a/balan11a.pdf
UR - https://proceedings.mlr.press/v15/balan11a.html
ER -
APA
Balan, A.K., Boyles, L., Welling, M., Kim, J. & Park, H. (2011). Statistical Optimization of Non-Negative Matrix Factorization. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:128-136. Available from https://proceedings.mlr.press/v15/balan11a.html.