Scalable and Robust Bayesian Inference via the Median Posterior

Stanislav Minsker, Sanvesh Srivastava, Lizhen Lin, David Dunson
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1656-1664, 2014.

Abstract

Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. The main novelty is the proposed aggregation step which is based on finding the geometric median of posterior distributions. We present both theoretical and numerical results illustrating the advantages of our approach.
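The recipe in the abstract — split the data into disjoint shards, compute a posterior summary per shard, then aggregate with the geometric median — can be sketched in a few lines. This is a deliberate simplification: the paper takes the geometric median of full subset posterior *distributions* in an RKHS, whereas the hypothetical snippet below applies the geometric median (computed with the classical Weiszfeld iteration) to per-shard point summaries; the data, shard count, and function names are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def geometric_median(points, tol=1e-8, max_iter=200):
    """Weiszfeld iteration: find y minimizing sum_i ||points[i] - y||."""
    y = points.mean(axis=0)  # initialize at the centroid
    for _ in range(max_iter):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, tol)           # guard against division by zero
        w = 1.0 / d                      # inverse-distance weights
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

# Illustrative setup: data with gross corruption concentrated in one region.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=1000)
data[:50] = 100.0                        # corrupt 5% of the observations

# Split into disjoint shards and summarize each shard
# (a stand-in for each subset posterior).
shards = np.array_split(data, 10)
subset_means = np.array([[s.mean()] for s in shards])

# Aggregate the shard summaries via the geometric median.
robust_estimate = geometric_median(subset_means)
```

Because the corruption lands in only one shard, that shard's summary is an outlier among the ten, and the geometric median stays near the true location (about 2.0) while a naive average of shard summaries is pulled far away — a toy version of the robustness the aggregation step is designed to provide.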

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-minsker14,
  title     = {Scalable and Robust Bayesian Inference via the Median Posterior},
  author    = {Minsker, Stanislav and Srivastava, Sanvesh and Lin, Lizhen and Dunson, David},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages     = {1656--1664},
  year      = {2014},
  editor    = {Xing, Eric P. and Jebara, Tony},
  volume    = {32},
  number    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {Beijing, China},
  month     = {22--24 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v32/minsker14.pdf},
  url       = {https://proceedings.mlr.press/v32/minsker14.html},
  abstract  = {Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. The main novelty is the proposed aggregation step which is based on finding the geometric median of posterior distributions. We present both theoretical and numerical results illustrating the advantages of our approach.}
}
Endnote
%0 Conference Paper
%T Scalable and Robust Bayesian Inference via the Median Posterior
%A Stanislav Minsker
%A Sanvesh Srivastava
%A Lizhen Lin
%A David Dunson
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-minsker14
%I PMLR
%P 1656--1664
%U https://proceedings.mlr.press/v32/minsker14.html
%V 32
%N 2
%X Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. The main novelty is the proposed aggregation step which is based on finding the geometric median of posterior distributions. We present both theoretical and numerical results illustrating the advantages of our approach.
RIS
TY  - CPAPER
TI  - Scalable and Robust Bayesian Inference via the Median Posterior
AU  - Stanislav Minsker
AU  - Sanvesh Srivastava
AU  - Lizhen Lin
AU  - David Dunson
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-minsker14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1656
EP  - 1664
L1  - http://proceedings.mlr.press/v32/minsker14.pdf
UR  - https://proceedings.mlr.press/v32/minsker14.html
AB  - Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. The main novelty is the proposed aggregation step which is based on finding the geometric median of posterior distributions. We present both theoretical and numerical results illustrating the advantages of our approach.
ER  -
APA
Minsker, S., Srivastava, S., Lin, L. & Dunson, D. (2014). Scalable and Robust Bayesian Inference via the Median Posterior. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1656-1664. Available from https://proceedings.mlr.press/v32/minsker14.html.