Distributed Stochastic Gradient MCMC

Sungjin Ahn; Babak Shahbaba; Max Welling

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1044-1052, 2014.

Abstract

Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw minibatches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains. This greatly reduces communication overhead and allows adaptive load balancing. Our experiments for LDA on Wikipedia and Pubmed show that relative to the state of the art in distributed MCMC we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.
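The communication pattern described in the abstract (a chain samples minibatches from one local pool of data for a while, then jumps elsewhere) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-dimensional Gaussian mean model instead of LDA, a single chain simulated in one process, equal-sized shards chosen uniformly at random, and a plain stochastic gradient Langevin update with the minibatch gradient rescaled by N/n. All function and parameter names are hypothetical.

```python
import numpy as np

def sgld_local_trajectory(theta, shard, n_total, rng, steps, batch_size,
                          step_size, prior_var=100.0):
    """Run `steps` Langevin updates using minibatches from one shard only."""
    for _ in range(steps):
        batch = rng.choice(shard, size=batch_size, replace=False)
        # Gradient of the log prior N(0, prior_var) plus the minibatch
        # likelihood gradient for x ~ N(theta, 1), rescaled to the full data size.
        grad = -theta / prior_var + (n_total / batch_size) * np.sum(batch - theta)
        # Langevin step: half-step along the gradient plus Gaussian noise
        # with variance equal to the step size.
        theta = theta + 0.5 * step_size * grad + rng.normal(0.0, np.sqrt(step_size))
    return theta

def run_chain(shards, rounds, rng, steps=50, batch_size=100, step_size=1e-4):
    """Alternate local trajectories with jumps to a randomly chosen shard."""
    n_total = sum(len(s) for s in shards)
    theta = 0.0
    samples = []
    for _ in range(rounds):
        s = rng.integers(len(shards))  # assumption: uniform jump between shards
        theta = sgld_local_trajectory(theta, shards[s], n_total, rng,
                                      steps, batch_size, step_size)
        samples.append(theta)  # keep one sample per local trajectory
    return np.array(samples)

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=4000)   # synthetic data with true mean 2.0
shards = np.array_split(data, 4)         # four "workers", each with a local pool
samples = run_chain(shards, rounds=400, rng=rng)
posterior_mean_estimate = samples[100:].mean()  # discard early rounds as burn-in
```

In this sketch the only "communication" is the jump at the end of each local trajectory, so lengthening `steps` trades fewer jumps against more time spent on a single shard's data.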

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-ahn14,
  title = 	 {Distributed Stochastic Gradient MCMC},
  author = 	 {Ahn, Sungjin and Shahbaba, Babak and Welling, Max},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1044--1052},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/ahn14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/ahn14.html},
  abstract = 	 {Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw minibatches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains. This greatly reduces communication overhead and allows adaptive load balancing. Our experiments for LDA on Wikipedia and Pubmed show that relative to the state of the art in distributed MCMC we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.}
}

Endnote

%0 Conference Paper
%T Distributed Stochastic Gradient MCMC
%A Sungjin Ahn
%A Babak Shahbaba
%A Max Welling
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-ahn14
%I PMLR
%P 1044--1052
%U https://proceedings.mlr.press/v32/ahn14.html
%V 32
%N 2
%X Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw minibatches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains. This greatly reduces communication overhead and allows adaptive load balancing. Our experiments for LDA on Wikipedia and Pubmed show that relative to the state of the art in distributed MCMC we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.

RIS


TY  - CPAPER
TI  - Distributed Stochastic Gradient MCMC
AU  - Sungjin Ahn
AU  - Babak Shahbaba
AU  - Max Welling
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-ahn14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1044
EP  - 1052
L1  - http://proceedings.mlr.press/v32/ahn14.pdf
UR  - https://proceedings.mlr.press/v32/ahn14.html
AB  - Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw minibatches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains. This greatly reduces communication overhead and allows adaptive load balancing. Our experiments for LDA on Wikipedia and Pubmed show that relative to the state of the art in distributed MCMC we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.
ER  -

APA


Ahn, S., Shahbaba, B., & Welling, M. (2014). Distributed Stochastic Gradient MCMC. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1044-1052. Available from https://proceedings.mlr.press/v32/ahn14.html.
