Structure Learning from Related Data Sets with a Hierarchical Bayesian Score

Laura Azzimonti, Giorgio Corani, Marco Scutari
Proceedings of the 10th International Conference on Probabilistic Graphical Models, PMLR 138:5-16, 2020.

Abstract

Score functions for learning the structure of Bayesian networks in the literature assume that data are a homogeneous set of observations; whereas it is often the case that they comprise different related, but not homogeneous, data sets collected in different ways. In this paper we propose a new Bayesian Dirichlet score, which we call Bayesian Hierarchical Dirichlet (BHD). The proposed score is based on a hierarchical model that pools information across data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. We derive a closed-form expression for BHD using a variational approximation of the marginal likelihood and we study its performance using simulated data. We find that, when data comprise multiple related data sets, BHD outperforms the Bayesian Dirichlet equivalent uniform (BDeu) score in terms of reconstruction accuracy as measured by the Structural Hamming distance, and that it is as accurate as BDeu when data are homogeneous. Moreover, the estimated networks are sparser and therefore more interpretable than those obtained with BDeu, thanks to a lower number of false positive arcs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v138-azzimonti20a,
  title = {{Structure Learning from Related Data Sets with a Hierarchical Bayesian Score}},
  author = {Azzimonti, Laura and Corani, Giorgio and Scutari, Marco},
  booktitle = {Proceedings of the 10th International Conference on Probabilistic Graphical Models},
  pages = {5--16},
  year = {2020},
  editor = {Manfred Jaeger and Thomas Dyhre Nielsen},
  volume = {138},
  series = {Proceedings of Machine Learning Research},
  month = {23--25 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v138/azzimonti20a/azzimonti20a.pdf},
  url = {http://proceedings.mlr.press/v138/azzimonti20a.html},
  abstract = {Score functions for learning the structure of Bayesian networks in the literature assume that data are a homogeneous set of observations; whereas it is often the case that they comprise different related, but not homogeneous, data sets collected in different ways. In this paper we propose a new Bayesian Dirichlet score, which we call Bayesian Hierarchical Dirichlet (BHD). The proposed score is based on a hierarchical model that pools information across data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. We derive a closed-form expression for BHD using a variational approximation of the marginal likelihood and we study its performance using simulated data. We find that, when data comprise multiple related data sets, BHD outperforms the Bayesian Dirichlet equivalent uniform (BDeu) score in terms of reconstruction accuracy as measured by the Structural Hamming distance, and that it is as accurate as BDeu when data are homogeneous. Moreover, the estimated networks are sparser and therefore more interpretable than those obtained with BDeu, thanks to a lower number of false positive arcs.}
}
Endnote
%0 Conference Paper
%T Structure Learning from Related Data Sets with a Hierarchical Bayesian Score
%A Laura Azzimonti
%A Giorgio Corani
%A Marco Scutari
%B Proceedings of the 10th International Conference on Probabilistic Graphical Models
%C Proceedings of Machine Learning Research
%D 2020
%E Manfred Jaeger
%E Thomas Dyhre Nielsen
%F pmlr-v138-azzimonti20a
%I PMLR
%P 5--16
%U http://proceedings.mlr.press/v138/azzimonti20a.html
%V 138
%X Score functions for learning the structure of Bayesian networks in the literature assume that data are a homogeneous set of observations; whereas it is often the case that they comprise different related, but not homogeneous, data sets collected in different ways. In this paper we propose a new Bayesian Dirichlet score, which we call Bayesian Hierarchical Dirichlet (BHD). The proposed score is based on a hierarchical model that pools information across data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. We derive a closed-form expression for BHD using a variational approximation of the marginal likelihood and we study its performance using simulated data. We find that, when data comprise multiple related data sets, BHD outperforms the Bayesian Dirichlet equivalent uniform (BDeu) score in terms of reconstruction accuracy as measured by the Structural Hamming distance, and that it is as accurate as BDeu when data are homogeneous. Moreover, the estimated networks are sparser and therefore more interpretable than those obtained with BDeu, thanks to a lower number of false positive arcs.
APA
Azzimonti, L., Corani, G. & Scutari, M. (2020). Structure Learning from Related Data Sets with a Hierarchical Bayesian Score. Proceedings of the 10th International Conference on Probabilistic Graphical Models, in Proceedings of Machine Learning Research 138:5-16. Available from http://proceedings.mlr.press/v138/azzimonti20a.html
