High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm

Rongda Zhu, Lingxiao Wang, Chengxiang Zhai, Quanquan Gu
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:4180-4188, 2017.

Abstract

We propose a generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models. At the core of our algorithm is a novel semi-stochastic variance-reduced gradient designed for the $Q$-function in the EM algorithm. Under a mild condition on the initialization, our algorithm is guaranteed to attain a linear convergence rate to the unknown parameter of the latent variable model, and achieve an optimal statistical rate up to a logarithmic factor for parameter estimation. Compared with existing high-dimensional EM algorithms, our algorithm enjoys a better computational complexity and is therefore more efficient. We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments. We believe that the proposed semi-stochastic gradient is of independent interest for general nonconvex optimization problems with bivariate structures.
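For intuition, the following is a minimal sketch of what a variance-reduced stochastic gradient EM update of this flavor can look like for the symmetric two-component Gaussian mixture model. It is an assumption-based illustration, not the paper's exact algorithm: the model choice, the hard-thresholding step for high-dimensional sparsity, and all function names (e_step_weights, q_gradients, hard_threshold, vr_sgem) are hypothetical.

```python
import numpy as np

def e_step_weights(beta, Y, sigma2=1.0):
    """E-step for a symmetric two-component Gaussian mixture with means
    +beta / -beta: w_i = P(z_i = +1 | y_i; beta)."""
    return 1.0 / (1.0 + np.exp(-2.0 * (Y @ beta) / sigma2))

def q_gradients(beta_eval, weights, Y):
    """Per-sample gradient of the negative Q-function at beta_eval, using
    E-step weights computed at a (possibly different) parameter."""
    return beta_eval[None, :] - (2.0 * weights - 1.0)[:, None] * Y

def hard_threshold(beta, s):
    """Keep only the s largest-magnitude coordinates (sparsity step)."""
    out = np.zeros_like(beta)
    idx = np.argsort(np.abs(beta))[-s:]
    out[idx] = beta[idx]
    return out

def vr_sgem(Y, beta0, s, eta=0.1, n_outer=30, n_inner=None, sigma2=1.0, rng=None):
    """SVRG-style variance-reduced stochastic gradient EM sketch."""
    rng = np.random.default_rng(rng)
    n = Y.shape[0]
    n_inner = n_inner or n
    beta_snap = beta0.copy()
    for _ in range(n_outer):
        # Full batch gradient at the snapshot parameter.
        w_snap = e_step_weights(beta_snap, Y, sigma2)
        full_grad = q_gradients(beta_snap, w_snap, Y).mean(axis=0)
        beta = beta_snap.copy()
        for _ in range(n_inner):
            i = rng.integers(n)
            yi = Y[i:i+1]
            # Semi-stochastic gradient: the sampled gradient at the current
            # iterate, corrected by the same sample's gradient at the
            # snapshot plus the full batch gradient.
            w_cur = e_step_weights(beta, yi, sigma2)
            w_old = e_step_weights(beta_snap, yi, sigma2)
            g_cur = q_gradients(beta, w_cur, yi)[0]
            g_old = q_gradients(beta_snap, w_old, yi)[0]
            v = g_cur - g_old + full_grad
            beta = hard_threshold(beta - eta * v, s)
        beta_snap = beta
    return beta_snap
```

The inner update mirrors SVRG: correcting the per-sample gradient with the snapshot gradient and the full batch gradient shrinks the variance of the stochastic direction as the iterate approaches the snapshot, while the hard-thresholding step keeps the iterate s-sparse, the usual device in high-dimensional EM analyses.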

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-zhu17a,
  title     = {High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm},
  author    = {Rongda Zhu and Lingxiao Wang and Chengxiang Zhai and Quanquan Gu},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {4180--4188},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/zhu17a/zhu17a.pdf},
  url       = {https://proceedings.mlr.press/v70/zhu17a.html},
  abstract  = {We propose a generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models. At the core of our algorithm is a novel semi-stochastic variance-reduced gradient designed for the $Q$-function in the EM algorithm. Under a mild condition on the initialization, our algorithm is guaranteed to attain a linear convergence rate to the unknown parameter of the latent variable model, and achieve an optimal statistical rate up to a logarithmic factor for parameter estimation. Compared with existing high-dimensional EM algorithms, our algorithm enjoys a better computational complexity and is therefore more efficient. We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments. We believe that the proposed semi-stochastic gradient is of independent interest for general nonconvex optimization problems with bivariate structures.}
}
Endnote
%0 Conference Paper
%T High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm
%A Rongda Zhu
%A Lingxiao Wang
%A Chengxiang Zhai
%A Quanquan Gu
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-zhu17a
%I PMLR
%P 4180--4188
%U https://proceedings.mlr.press/v70/zhu17a.html
%V 70
%X We propose a generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models. At the core of our algorithm is a novel semi-stochastic variance-reduced gradient designed for the $Q$-function in the EM algorithm. Under a mild condition on the initialization, our algorithm is guaranteed to attain a linear convergence rate to the unknown parameter of the latent variable model, and achieve an optimal statistical rate up to a logarithmic factor for parameter estimation. Compared with existing high-dimensional EM algorithms, our algorithm enjoys a better computational complexity and is therefore more efficient. We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments. We believe that the proposed semi-stochastic gradient is of independent interest for general nonconvex optimization problems with bivariate structures.
APA
Zhu, R., Wang, L., Zhai, C. & Gu, Q. (2017). High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:4180-4188. Available from https://proceedings.mlr.press/v70/zhu17a.html.