Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion

Richard Zhang; Salar Fattahi; Somayeh Sojoudi

Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion

Richard Zhang, Salar Fattahi, Somayeh Sojoudi

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5766-5775, 2018.

Abstract

The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as “graphical lasso”, but its computational cost becomes prohibitive for large data sets. A recently line of results showed{–}under mild assumptions{–}that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.

Cite this Paper

BibTeX

@InProceedings{pmlr-v80-zhang18c,
  title = 	 {Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion},
  author =       {Zhang, Richard and Fattahi, Salar and Sojoudi, Somayeh},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {5766--5775},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/zhang18c/zhang18c.pdf},
  url = 	 {https://proceedings.mlr.press/v80/zhang18c.html},
  abstract = 	 {The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as “graphical lasso”, but its computational cost becomes prohibitive for large data sets. A recently line of results showed{–}under mild assumptions{–}that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.}
}

Endnote

%0 Conference Paper
%T Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion
%A Richard Zhang
%A Salar Fattahi
%A Somayeh Sojoudi
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause	
%F pmlr-v80-zhang18c
%I PMLR
%P 5766--5775
%U https://proceedings.mlr.press/v80/zhang18c.html
%V 80
%X The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as “graphical lasso”, but its computational cost becomes prohibitive for large data sets. A recently line of results showed{–}under mild assumptions{–}that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.

APA

Zhang, R., Fattahi, S. & Sojoudi, S.. (2018). Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5766-5775 Available from https://proceedings.mlr.press/v80/zhang18c.html.

Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion

Abstract

Cite this Paper

Related Material