d-VMP: Distributed Variational Message Passing

Andrés R. Masegosa, Ana M. Martínez, Helge Langseth, Thomas D. Nielsen, Antonio Salmerón, Darío Ramos-López, Anders L. Madsen
Proceedings of the Eighth International Conference on Probabilistic Graphical Models, PMLR 52:321-332, 2016.

Abstract

Motivated by a real-world financial dataset, we propose a distributed variational message passing scheme for learning conjugate exponential models. We show that the method can be seen as a projected natural gradient ascent algorithm, and it therefore has good convergence properties. This is supported experimentally, where we show that the approach is robust wrt. common problems like imbalanced data, heavy-tailed empirical distributions, and a high degree of missing values. The scheme is based on map-reduce operations, and utilizes the memory management of modern big data frameworks like Apache Flink to obtain a time-efficient and scalable implementation. The proposed algorithm compares favourably to stochastic variational inference both in terms of speed and quality of the learned models. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes (and approx. 75% latent variables) using a computer cluster with 128 processing units.
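
As a rough illustration of the map-reduce pattern described in the abstract, the sketch below runs a toy conjugate-exponential update: each data partition computes its expected sufficient statistics (map), the statistics are summed (reduce), and the global variational parameters take a natural-gradient step of size rho toward the prior plus the aggregated statistics. The model (a Gaussian mean with known variance), the function names, and the step size are illustrative assumptions only; this is not the paper's d-VMP implementation nor the Apache Flink code used in its experiments.

import numpy as np

# Toy conjugate-exponential model: Normal prior on a Gaussian mean with known noise variance.
# In natural-parameter form the exact posterior is eta_post = eta_prior + sum_i t(x_i),
# and a step of size rho moves the current parameters a fraction of the way toward that target.

def map_local_stats(partition, noise_var=1.0):
    """Map step: sum of expected sufficient statistics t(x) = (x/var, -1/(2*var)) over one partition."""
    x = np.asarray(partition)
    return np.array([x.sum() / noise_var, -0.5 * len(x) / noise_var])

def reduce_stats(stats_list):
    """Reduce step: aggregate the per-partition statistics."""
    return np.sum(stats_list, axis=0)

def natural_gradient_step(eta, eta_prior, total_stats, rho):
    """Move the global natural parameters toward prior + aggregated statistics."""
    return (1.0 - rho) * eta + rho * (eta_prior + total_stats)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=3.0, scale=1.0, size=10_000)
    partitions = np.array_split(data, 8)            # stand-in for distributed data partitions

    eta_prior = np.array([0.0, -0.5])               # Normal(0, 1) prior in natural parameters
    eta = eta_prior.copy()
    for _ in range(20):                             # repeated damped steps converge to the target
        stats = [map_local_stats(p) for p in partitions]                            # map
        eta = natural_gradient_step(eta, eta_prior, reduce_stats(stats), rho=0.5)   # reduce + update

    mean = -eta[0] / (2.0 * eta[1])                 # recover the posterior mean from natural parameters
    print(f"posterior mean of the latent Gaussian mean: {mean:.3f}")

In a distributed setting the map step would run on the workers holding each partition and only the low-dimensional statistics would be shipped back, which is what makes the scheme a natural fit for map-reduce frameworks.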

Cite this Paper


BibTeX
@InProceedings{pmlr-v52-masegosa16,
  title     = {d-{VMP}: Distributed Variational Message Passing},
  author    = {Andrés R. Masegosa and Ana M. Martínez and Helge Langseth and Thomas D. Nielsen and Antonio Salmerón and Darío Ramos-López and Anders L. Madsen},
  booktitle = {Proceedings of the Eighth International Conference on Probabilistic Graphical Models},
  pages     = {321--332},
  year      = {2016},
  editor    = {Alessandro Antonucci and Giorgio Corani and Cassio Polpo Campos},
  volume    = {52},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lugano, Switzerland},
  month     = {06--09 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v52/masegosa16.pdf},
  url       = {http://proceedings.mlr.press/v52/masegosa16.html},
  abstract  = {Motivated by a real-world financial dataset, we propose a distributed variational message passing scheme for learning conjugate exponential models. We show that the method can be seen as a projected natural gradient ascent algorithm, and it therefore has good convergence properties. This is supported experimentally, where we show that the approach is robust wrt. common problems like imbalanced data, heavy-tailed empirical distributions, and a high degree of missing values. The scheme is based on map-reduce operations, and utilizes the memory management of modern big data frameworks like Apache Flink to obtain a time-efficient and scalable implementation. The proposed algorithm compares favourably to stochastic variational inference both in terms of speed and quality of the learned models. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes (and approx. 75% latent variables) using a computer cluster with 128 processing units.}
}
Endnote
%0 Conference Paper
%T d-VMP: Distributed Variational Message Passing
%A Andrés R. Masegosa
%A Ana M. Martínez
%A Helge Langseth
%A Thomas D. Nielsen
%A Antonio Salmerón
%A Darío Ramos-López
%A Anders L. Madsen
%B Proceedings of the Eighth International Conference on Probabilistic Graphical Models
%C Proceedings of Machine Learning Research
%D 2016
%E Alessandro Antonucci
%E Giorgio Corani
%E Cassio Polpo Campos
%F pmlr-v52-masegosa16
%I PMLR
%J Proceedings of Machine Learning Research
%P 321--332
%U http://proceedings.mlr.press
%V 52
%W PMLR
%X Motivated by a real-world financial dataset, we propose a distributed variational message passing scheme for learning conjugate exponential models. We show that the method can be seen as a projected natural gradient ascent algorithm, and it therefore has good convergence properties. This is supported experimentally, where we show that the approach is robust wrt. common problems like imbalanced data, heavy-tailed empirical distributions, and a high degree of missing values. The scheme is based on map-reduce operations, and utilizes the memory management of modern big data frameworks like Apache Flink to obtain a time-efficient and scalable implementation. The proposed algorithm compares favourably to stochastic variational inference both in terms of speed and quality of the learned models. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes (and approx. 75% latent variables) using a computer cluster with 128 processing units.
RIS
TY - CPAPER
TI - d-VMP: Distributed Variational Message Passing
AU - Andrés R. Masegosa
AU - Ana M. Martínez
AU - Helge Langseth
AU - Thomas D. Nielsen
AU - Antonio Salmerón
AU - Darío Ramos-López
AU - Anders L. Madsen
BT - Proceedings of the Eighth International Conference on Probabilistic Graphical Models
PY - 2016/08/15
DA - 2016/08/15
ED - Alessandro Antonucci
ED - Giorgio Corani
ED - Cassio Polpo Campos
ID - pmlr-v52-masegosa16
PB - PMLR
SP - 321
DP - PMLR
EP - 332
L1 - http://proceedings.mlr.press/v52/masegosa16.pdf
UR - http://proceedings.mlr.press/v52/masegosa16.html
AB - Motivated by a real-world financial dataset, we propose a distributed variational message passing scheme for learning conjugate exponential models. We show that the method can be seen as a projected natural gradient ascent algorithm, and it therefore has good convergence properties. This is supported experimentally, where we show that the approach is robust wrt. common problems like imbalanced data, heavy-tailed empirical distributions, and a high degree of missing values. The scheme is based on map-reduce operations, and utilizes the memory management of modern big data frameworks like Apache Flink to obtain a time-efficient and scalable implementation. The proposed algorithm compares favourably to stochastic variational inference both in terms of speed and quality of the learned models. For the scalability analysis, we evaluate our approach over a network with more than one billion nodes (and approx. 75% latent variables) using a computer cluster with 128 processing units.
ER -
APA
Masegosa, A.R., Martínez, A.M., Langseth, H., Nielsen, T.D., Salmerón, A., Ramos-López, D. & Madsen, A.L. (2016). d-VMP: Distributed Variational Message Passing. Proceedings of the Eighth International Conference on Probabilistic Graphical Models, in PMLR 52:321-332.
