Extreme Stochastic Variational Inference: Distributed Inference for Large Scale Mixture Models

Jiong Zhang; Parameswaran Raman; Shihao Ji; Hsiang-Fu Yu; S.V.N. Vishwanathan; Inderjit Dhillon

Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:935-943, 2019.

Abstract

Mixtures of exponential family models are among the most fundamental and widely used statistical models. Stochastic variational inference (SVI), the state-of-the-art algorithm for parameter estimation in such models, is inherently serial. Moreover, it requires the parameters to fit in the memory of a single processor; this poses serious limitations on scalability when the number of parameters is in the billions. In this paper, we present extreme stochastic variational inference (ESVI), a distributed, asynchronous, and lock-free algorithm to perform variational inference for mixture models on massive real-world datasets. ESVI overcomes the limitations of SVI by requiring that each processor access only a subset of the data and a subset of the parameters, thus providing data and model parallelism simultaneously. Our empirical study demonstrates that ESVI not only outperforms VI and SVI in wall-clock time, but also achieves a better-quality solution. To further speed up computation and save memory when fitting a large number of topics, we propose a variant, ESVI-TOPK, which maintains only the top-k important topics. Empirically, we found that using the top 25% of topics suffices to achieve the same accuracy as storing all the topics.
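
The top-k idea mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the ESVI-TOPK update from the paper; it only shows, under stated assumptions, what keeping the k largest topic weights and renormalizing them might look like. The function name `keep_top_k` and the example proportions are hypothetical.

```python
import heapq


def keep_top_k(weights, k):
    """Keep the k largest entries of a weight vector and renormalize them.

    Returns a dict {topic_index: weight}; every other topic is treated as zero.
    Hypothetical helper for illustration, not the paper's algorithm.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Indices of the k largest weights.
    top = heapq.nlargest(k, range(len(weights)), key=weights.__getitem__)
    total = sum(weights[i] for i in top)
    if total == 0:
        raise ValueError("selected weights sum to zero")
    # Renormalize so the retained weights sum to one.
    return {i: weights[i] / total for i in top}


# Hypothetical topic proportions for one document over 8 topics.
proportions = [0.40, 0.02, 0.30, 0.03, 0.05, 0.10, 0.06, 0.04]
sparse = keep_top_k(proportions, k=2)  # k = 2 is the top 25% of 8 topics
```

Storing only the retained entries means k values per document are kept instead of one per topic, which is where the memory saving in a scheme like this would come from.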

Cite this Paper

BibTeX


@InProceedings{pmlr-v89-zhang19c,
  title     = {Extreme Stochastic Variational Inference: Distributed Inference for Large Scale Mixture Models},
  author    = {Zhang, Jiong and Raman, Parameswaran and Ji, Shihao and Yu, Hsiang-Fu and Vishwanathan, S.V.N. and Dhillon, Inderjit},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {935--943},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/zhang19c/zhang19c.pdf},
  url       = {https://proceedings.mlr.press/v89/zhang19c.html},
  abstract  = {Mixtures of exponential family models are among the most fundamental and widely used statistical models. Stochastic variational inference (SVI), the state-of-the-art algorithm for parameter estimation in such models, is inherently serial. Moreover, it requires the parameters to fit in the memory of a single processor; this poses serious limitations on scalability when the number of parameters is in the billions. In this paper, we present extreme stochastic variational inference (ESVI), a distributed, asynchronous, and lock-free algorithm to perform variational inference for mixture models on massive real-world datasets. ESVI overcomes the limitations of SVI by requiring that each processor access only a subset of the data and a subset of the parameters, thus providing data and model parallelism simultaneously. Our empirical study demonstrates that ESVI not only outperforms VI and SVI in wall-clock time, but also achieves a better-quality solution. To further speed up computation and save memory when fitting a large number of topics, we propose a variant, ESVI-TOPK, which maintains only the top-k important topics. Empirically, we found that using the top 25\% of topics suffices to achieve the same accuracy as storing all the topics.}
}

Endnote

%0 Conference Paper
%T Extreme Stochastic Variational Inference: Distributed Inference for Large Scale Mixture Models
%A Jiong Zhang
%A Parameswaran Raman
%A Shihao Ji
%A Hsiang-Fu Yu
%A S.V.N. Vishwanathan
%A Inderjit Dhillon
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-zhang19c
%I PMLR
%P 935--943
%U https://proceedings.mlr.press/v89/zhang19c.html
%V 89
%X Mixtures of exponential family models are among the most fundamental and widely used statistical models. Stochastic variational inference (SVI), the state-of-the-art algorithm for parameter estimation in such models, is inherently serial. Moreover, it requires the parameters to fit in the memory of a single processor; this poses serious limitations on scalability when the number of parameters is in the billions. In this paper, we present extreme stochastic variational inference (ESVI), a distributed, asynchronous, and lock-free algorithm to perform variational inference for mixture models on massive real-world datasets. ESVI overcomes the limitations of SVI by requiring that each processor access only a subset of the data and a subset of the parameters, thus providing data and model parallelism simultaneously. Our empirical study demonstrates that ESVI not only outperforms VI and SVI in wall-clock time, but also achieves a better-quality solution. To further speed up computation and save memory when fitting a large number of topics, we propose a variant, ESVI-TOPK, which maintains only the top-k important topics. Empirically, we found that using the top 25% of topics suffices to achieve the same accuracy as storing all the topics.

APA


Zhang, J., Raman, P., Ji, S., Yu, H.-F., Vishwanathan, S. V. N., & Dhillon, I. (2019). Extreme Stochastic Variational Inference: Distributed Inference for Large Scale Mixture Models. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:935-943. Available from https://proceedings.mlr.press/v89/zhang19c.html.
