WASP: Scalable Bayes via barycenters of subset posteriors

Sanvesh Srivastava; Volkan Cevher; Quoc Dinh; David Dunson

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:912-920, 2015.

Abstract

The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency. Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.
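As a rough illustration of the combining step, the sketch below handles only the simplest special case: a scalar parameter, with the same number of equally weighted draws from every subset posterior. In one dimension the 2-Wasserstein barycenter has a closed form (average the quantile functions), so no linear program is needed; this is not the paper's general algorithm. The function name `barycenter_1d` and the input layout are assumptions made for this example.

```python
import numpy as np


def barycenter_1d(subset_draws):
    """Atoms of the 1-D 2-Wasserstein barycenter of empirical measures.

    Assumption (not from the paper): `subset_draws` is a 2-D array-like
    whose row j holds the posterior draws from subset j, every row has the
    same length, and all draws and all subsets are equally weighted.
    """
    draws = np.asarray(subset_draws, dtype=float)
    if draws.ndim != 2:
        raise ValueError("expected one row of draws per subset")
    # Sorting each row gives that subset's empirical quantile function.
    sorted_draws = np.sort(draws, axis=1)
    # Averaging quantile functions across subsets yields the barycenter;
    # the result is atomic, with one equally weighted atom per column.
    return sorted_draws.mean(axis=0)


# Two hypothetical subset posteriors, three draws each.
atoms = barycenter_1d([[2.0, 0.0, 1.0], [12.0, 10.0, 11.0]])
posterior_mean = atoms.mean()
```

Because the output is a set of equally weighted atoms, posterior summaries such as means or quantiles can be computed from `atoms` directly, which mirrors the atomic form described in the abstract.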

Cite this Paper

BibTeX


@InProceedings{pmlr-v38-srivastava15,
  title = 	 {{WASP: Scalable Bayes via barycenters of subset posteriors}},
  author = 	 {Srivastava, Sanvesh and Cevher, Volkan and Dinh, Quoc and Dunson, David},
  booktitle = 	 {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {912--920},
  year = 	 {2015},
  editor = 	 {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = 	 {38},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {San Diego, California, USA},
  month = 	 {09--12 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v38/srivastava15.pdf},
  url = 	 {https://proceedings.mlr.press/v38/srivastava15.html},
  abstract = 	 {The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency.  Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.}
}

Endnote

%0 Conference Paper
%T WASP: Scalable Bayes via barycenters of subset posteriors
%A Sanvesh Srivastava
%A Volkan Cevher
%A Quoc Dinh
%A David Dunson
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-srivastava15
%I PMLR
%P 912--920
%U https://proceedings.mlr.press/v38/srivastava15.html
%V 38
%X The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency.  Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.

RIS


TY  - CPAPER
TI  - WASP: Scalable Bayes via barycenters of subset posteriors
AU  - Sanvesh Srivastava
AU  - Volkan Cevher
AU  - Quoc Dinh
AU  - David Dunson
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-srivastava15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 912
EP  - 920
L1  - http://proceedings.mlr.press/v38/srivastava15.pdf
UR  - https://proceedings.mlr.press/v38/srivastava15.html
AB  - The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency.  Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.
ER  -

APA


Srivastava, S., Cevher, V., Dinh, Q. & Dunson, D. (2015). WASP: Scalable Bayes via barycenters of subset posteriors. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:912-920. Available from https://proceedings.mlr.press/v38/srivastava15.html.
