Variational Training for Large-Scale Noisy-OR Bayesian Networks

Geng Ji, Dehua Cheng, Huazhong Ning, Changhe Yuan, Hanning Zhou, Liang Xiong, Erik B. Sudderth
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:873-882, 2020.

Abstract

We propose a stochastic variational inference algorithm for training large-scale Bayesian networks, where noisy-OR conditional distributions are used to capture higher-order relationships. One application is to the learning of hierarchical topic models for text data. While previous work has focused on two-layer networks popular in applications like medical diagnosis, we develop scalable algorithms for deep networks that capture a multi-level hierarchy of interactions. Our key innovation is a family of constrained variational bounds that only explicitly optimize posterior probabilities for the sub-graph of topics most related to the sparse observations in a given document. These constrained bounds have comparable accuracy but dramatically reduced computational cost. Using stochastic gradient updates based on our variational bounds, we learn noisy-OR Bayesian networks orders of magnitude faster than was possible with prior Monte Carlo learning algorithms, and provide a new tool for understanding large-scale binary data.
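The noisy-OR conditional distribution mentioned in the abstract has a standard closed form: an active parent j independently fails to turn the child on with probability (1 - w_j), and a leak term covers activation with no active parents. The short Python sketch below is our own illustration of that standard formulation for a single child node, not the authors' released code; the function name and arguments are ours.

import numpy as np

def noisy_or_prob(x_pa: np.ndarray, w: np.ndarray, leak: float) -> float:
    """Noisy-OR conditional for one binary child.

    x_pa : binary activations of the parent nodes.
    w    : per-parent probability that an active parent alone activates the child.
    leak : probability the child activates with no active parents.
    """
    # P(child = 1 | parents) = 1 - (1 - leak) * prod_j (1 - w[j]) ** x_pa[j]
    return 1.0 - (1.0 - leak) * np.prod((1.0 - w) ** x_pa)

# Example: two parents, only the first active.
p = noisy_or_prob(np.array([1, 0]), np.array([0.8, 0.5]), leak=0.01)
print(p)  # 1 - 0.99 * 0.2 = 0.802

In the deep networks studied in the paper, each topic or observed word is such a child whose parents are higher-level topics, and the constrained variational bounds restrict posterior inference to the sub-graph of parents relevant to a document's sparse observations.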

Cite this Paper


BibTeX
@InProceedings{pmlr-v115-ji20a,
  title     = {Variational Training for Large-Scale Noisy-OR Bayesian Networks},
  author    = {Ji, Geng and Cheng, Dehua and Ning, Huazhong and Yuan, Changhe and Zhou, Hanning and Xiong, Liang and Sudderth, Erik B.},
  booktitle = {Proceedings of The 35th Uncertainty in Artificial Intelligence Conference},
  pages     = {873--882},
  year      = {2020},
  editor    = {Adams, Ryan P. and Gogate, Vibhav},
  volume    = {115},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--25 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v115/ji20a/ji20a.pdf},
  url       = {https://proceedings.mlr.press/v115/ji20a.html}
}
Endnote
%0 Conference Paper
%T Variational Training for Large-Scale Noisy-OR Bayesian Networks
%A Geng Ji
%A Dehua Cheng
%A Huazhong Ning
%A Changhe Yuan
%A Hanning Zhou
%A Liang Xiong
%A Erik B. Sudderth
%B Proceedings of The 35th Uncertainty in Artificial Intelligence Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Ryan P. Adams
%E Vibhav Gogate
%F pmlr-v115-ji20a
%I PMLR
%P 873--882
%U https://proceedings.mlr.press/v115/ji20a.html
%V 115
APA
Ji, G., Cheng, D., Ning, H., Yuan, C., Zhou, H., Xiong, L. & Sudderth, E. B. (2020). Variational Training for Large-Scale Noisy-OR Bayesian Networks. Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:873-882. Available from https://proceedings.mlr.press/v115/ji20a.html.
