Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets

Rob Cornish, Paul Vanetti, Alexandre Bouchard-Cote, George Deligiannidis, Arnaud Doucet
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1351-1360, 2019.

Abstract

Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods such as Metropolis-Hastings is too computationally intensive to handle large datasets, since the cost per step usually scales like $O(n)$ in the number of data points $n$. We propose the Scalable Metropolis-Hastings (SMH) kernel that only requires processing on average $O(1)$ or even $O(1/\sqrt{n})$ data points per step. This scheme is based on a combination of factorized acceptance probabilities, procedures for fast simulation of Bernoulli processes, and control variate ideas. Contrary to many MCMC subsampling schemes such as fixed step-size Stochastic Gradient Langevin Dynamics, our approach is exact insofar as the invariant distribution is the true posterior and not an approximation to it. We characterise the performance of our algorithm theoretically, and give realistic and verifiable conditions under which it is geometrically ergodic. This theory is borne out by empirical results that demonstrate overall performance benefits over standard Metropolis-Hastings and various subsampling algorithms.
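The abstract's main ingredients can be made concrete with a small sketch. For a posterior $\pi(\theta) \propto p(\theta) \prod_{i=1}^{n} L_i(\theta)$ and a symmetric proposal, standard Metropolis-Hastings accepts $\theta'$ with probability $1 \wedge \big[\frac{p(\theta')}{p(\theta)} \prod_{i=1}^{n} \frac{L_i(\theta')}{L_i(\theta)}\big]$, which touches all $n$ likelihood terms. A factorized kernel instead accepts only if independent Bernoulli trials with success probabilities $1 \wedge \frac{p(\theta')}{p(\theta)}$ and $1 \wedge \frac{L_i(\theta')}{L_i(\theta)}$, $i = 1, \dots, n$, all succeed. Both kernels leave $\pi$ invariant, but the factorized test can be evaluated lazily and abandoned at the first failed trial. The Python sketch below illustrates that early-rejection mechanism only; it is a toy under assumed choices (a 1-D Gaussian model, a flat prior, invented function names), not the authors' SMH implementation, and it omits the control-variate bounds and fast Bernoulli-process simulation that give SMH its $O(1)$ expected cost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration, not taken from the paper's
# experiments): 1-D Gaussian likelihood with unit variance, flat
# prior, Gaussian random-walk proposal.
n = 100
x = rng.normal(loc=1.0, scale=1.0, size=n)

def log_factor(i, theta):
    """Log of the likelihood factor L_i for data point i."""
    return -0.5 * (x[i] - theta) ** 2

def factorized_mh_step(theta, step=0.02):
    """One factorized-MH step: the proposal is accepted iff every
    per-datum Bernoulli trial, with success probability
    min(1, L_i(theta') / L_i(theta)), succeeds. The trials are run
    lazily, so a rejection can happen after touching only a few of
    the n data points."""
    theta_prop = theta + step * rng.normal()
    for count, i in enumerate(rng.permutation(n), start=1):
        log_ratio = log_factor(i, theta_prop) - log_factor(i, theta)
        if np.log(rng.uniform()) >= log_ratio:  # this trial failed
            return theta, count                 # reject early
    return theta_prop, n                        # every trial succeeded

theta, trace = 0.0, []
for _ in range(5_000):
    theta, _ = factorized_mh_step(theta)
    trace.append(theta)
print(np.mean(trace[1_000:]))  # close to the posterior mean, ~ np.mean(x)
```

Stopping at the first failed trial changes only the cost, not the accept/reject distribution, so lazy evaluation is exact. Without further structure, however, the product of per-datum acceptance probabilities shrinks as $n$ grows; the paper's control variates keep each factor close to one, which is what makes the $O(1)$ and $O(1/\sqrt{n})$ per-step costs quoted in the abstract attainable.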

Cite this Paper

BibTeX
@InProceedings{pmlr-v97-cornish19a,
  title     = {Scalable {M}etropolis-{H}astings for Exact {B}ayesian Inference with Large Datasets},
  author    = {Cornish, Rob and Vanetti, Paul and Bouchard-Cote, Alexandre and Deligiannidis, George and Doucet, Arnaud},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {1351--1360},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/cornish19a/cornish19a.pdf},
  url       = {https://proceedings.mlr.press/v97/cornish19a.html}
}
Endnote
%0 Conference Paper
%T Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
%A Rob Cornish
%A Paul Vanetti
%A Alexandre Bouchard-Cote
%A George Deligiannidis
%A Arnaud Doucet
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-cornish19a
%I PMLR
%P 1351--1360
%U https://proceedings.mlr.press/v97/cornish19a.html
%V 97
APA
Cornish, R., Vanetti, P., Bouchard-Cote, A., Deligiannidis, G. & Doucet, A. (2019). Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:1351-1360. Available from https://www.proceedings.mlr.press/v97/cornish19a.html.