A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions

Prabhanjan Kambadur, Aurelie Lozano
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:351-359, 2013.

Abstract

Discovering the graph structure of a Gaussian Markov Random Field is an important problem in application areas such as computational biology and atmospheric sciences. This task, which translates to estimating the sparsity pattern of the inverse covariance matrix, has been extensively studied in the literature. However, the existing approaches are unable to handle ultra-high dimensional datasets and there is a crucial need to develop methods that are both highly scalable and memory-efficient. In this paper, we present GINCO, a blocked greedy method for sparse inverse covariance matrix estimation. We also present detailed description of a highly-scalable and memory-efficient implementation of GINCO, which is able to operate on both shared- and distributed-memory architectures. Our implementation is able recover the sparsity pattern of 25,000 vertex random and chain graphs with 87% and 84% accuracy in \le 5 minutes using \le 10GB of memory on a single 8-core machine. Furthermore, our method is statistically consistent in recovering the sparsity pattern of the inverse covariance matrix, which we demonstrate through extensive empirical studies.

Cite this Paper


BibTeX
@InProceedings{pmlr-v31-kambadur13a, title = {A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions}, author = {Prabhanjan Kambadur and Aurelie Lozano}, booktitle = {Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics}, pages = {351--359}, year = {2013}, editor = {Carlos M. Carvalho and Pradeep Ravikumar}, volume = {31}, series = {Proceedings of Machine Learning Research}, address = {Scottsdale, Arizona, USA}, month = {29 Apr--01 May}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v31/kambadur13a.pdf}, url = {http://proceedings.mlr.press/v31/kambadur13a.html}, abstract = {Discovering the graph structure of a Gaussian Markov Random Field is an important problem in application areas such as computational biology and atmospheric sciences. This task, which translates to estimating the sparsity pattern of the inverse covariance matrix, has been extensively studied in the literature. However, the existing approaches are unable to handle ultra-high dimensional datasets and there is a crucial need to develop methods that are both highly scalable and memory-efficient. In this paper, we present GINCO, a blocked greedy method for sparse inverse covariance matrix estimation. We also present detailed description of a highly-scalable and memory-efficient implementation of GINCO, which is able to operate on both shared- and distributed-memory architectures. Our implementation is able recover the sparsity pattern of 25,000 vertex random and chain graphs with 87% and 84% accuracy in \le 5 minutes using \le 10GB of memory on a single 8-core machine. Furthermore, our method is statistically consistent in recovering the sparsity pattern of the inverse covariance matrix, which we demonstrate through extensive empirical studies.} }
Endnote
%0 Conference Paper %T A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions %A Prabhanjan Kambadur %A Aurelie Lozano %B Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2013 %E Carlos M. Carvalho %E Pradeep Ravikumar %F pmlr-v31-kambadur13a %I PMLR %P 351--359 %U http://proceedings.mlr.press/v31/kambadur13a.html %V 31 %X Discovering the graph structure of a Gaussian Markov Random Field is an important problem in application areas such as computational biology and atmospheric sciences. This task, which translates to estimating the sparsity pattern of the inverse covariance matrix, has been extensively studied in the literature. However, the existing approaches are unable to handle ultra-high dimensional datasets and there is a crucial need to develop methods that are both highly scalable and memory-efficient. In this paper, we present GINCO, a blocked greedy method for sparse inverse covariance matrix estimation. We also present detailed description of a highly-scalable and memory-efficient implementation of GINCO, which is able to operate on both shared- and distributed-memory architectures. Our implementation is able recover the sparsity pattern of 25,000 vertex random and chain graphs with 87% and 84% accuracy in \le 5 minutes using \le 10GB of memory on a single 8-core machine. Furthermore, our method is statistically consistent in recovering the sparsity pattern of the inverse covariance matrix, which we demonstrate through extensive empirical studies.
RIS
TY - CPAPER TI - A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions AU - Prabhanjan Kambadur AU - Aurelie Lozano BT - Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics DA - 2013/04/29 ED - Carlos M. Carvalho ED - Pradeep Ravikumar ID - pmlr-v31-kambadur13a PB - PMLR DP - Proceedings of Machine Learning Research VL - 31 SP - 351 EP - 359 L1 - http://proceedings.mlr.press/v31/kambadur13a.pdf UR - http://proceedings.mlr.press/v31/kambadur13a.html AB - Discovering the graph structure of a Gaussian Markov Random Field is an important problem in application areas such as computational biology and atmospheric sciences. This task, which translates to estimating the sparsity pattern of the inverse covariance matrix, has been extensively studied in the literature. However, the existing approaches are unable to handle ultra-high dimensional datasets and there is a crucial need to develop methods that are both highly scalable and memory-efficient. In this paper, we present GINCO, a blocked greedy method for sparse inverse covariance matrix estimation. We also present detailed description of a highly-scalable and memory-efficient implementation of GINCO, which is able to operate on both shared- and distributed-memory architectures. Our implementation is able recover the sparsity pattern of 25,000 vertex random and chain graphs with 87% and 84% accuracy in \le 5 minutes using \le 10GB of memory on a single 8-core machine. Furthermore, our method is statistically consistent in recovering the sparsity pattern of the inverse covariance matrix, which we demonstrate through extensive empirical studies. ER -
APA
Kambadur, P. & Lozano, A.. (2013). A Parallel, Block Greedy Method for Sparse Inverse Covariance Estimation for Ultra-high Dimensions. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 31:351-359 Available from http://proceedings.mlr.press/v31/kambadur13a.html.

Related Material