Using Mixed-Effects Models to Learn Bayesian Networks from Related Data Sets

Marco Scutari, Christopher Marquis, Laura Azzimonti
Proceedings of The 11th International Conference on Probabilistic Graphical Models, PMLR 186:73-84, 2022.

Abstract

We commonly assume that data are a homogeneous set of observations when learning the structure of Bayesian networks. However, they often comprise different data sets that are related but not homogeneous because they have been collected in different ways or from different populations. In a previous work, we proposed a closed-form Bayesian Hierarchical Dirichlet score for discrete data that pools information across related data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. In this paper, we provide an analogous solution for learning a Bayesian network from continuous data using mixed-effects models to pool information across the related data sets. We study its structural, parametric, predictive and classification accuracy and we show that it outperforms both conditional Gaussian Bayesian networks (that do not perform any pooling) and classical Gaussian Bayesian networks (that disregard the heterogeneous nature of the data). The improvement is marked for low sample sizes and for unbalanced data sets.
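The pooling idea described in the abstract can be illustrated, outside the paper's own method, with a simple random-intercept mixed-effects model in Python's statsmodels. The data, group labels, and variable names below are hypothetical: three related data sets share one slope but differ in their intercepts, and the mixed-effects fit pools information across them to estimate the shared slope.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical example: three related data sets that share the same
# slope (y = 2*x + ...) but have data-set-specific intercepts.
rng = np.random.default_rng(42)
frames = []
for k, shift in enumerate([-1.0, 0.0, 1.0]):
    x = rng.normal(size=60)
    y = 2.0 * x + shift + rng.normal(scale=0.5, size=60)
    frames.append(pd.DataFrame({"x": x, "y": y, "dataset": k}))
data = pd.concat(frames, ignore_index=True)

# A random-intercept model pools the slope across the data sets while
# letting each data set keep its own intercept (the random effect) --
# the same pooling-with-heterogeneity idea the abstract describes.
model = smf.mixedlm("y ~ x", data, groups=data["dataset"])
fit = model.fit()
pooled_slope = fit.params["x"]
print(pooled_slope)
```

A classical (fully pooled) regression would ignore the data-set structure and absorb the intercept differences into the noise, while fitting each data set separately would discard the shared information; the mixed-effects model sits between the two, which is the trade-off the paper exploits for Bayesian network learning.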

Cite this Paper


BibTeX
@InProceedings{pmlr-v186-scutari22a,
  title     = {Using Mixed-Effects Models to Learn Bayesian Networks from Related Data Sets},
  author    = {Scutari, Marco and Marquis, Christopher and Azzimonti, Laura},
  booktitle = {Proceedings of The 11th International Conference on Probabilistic Graphical Models},
  pages     = {73--84},
  year      = {2022},
  editor    = {Salmerón, Antonio and Rumí, Rafael},
  volume    = {186},
  series    = {Proceedings of Machine Learning Research},
  month     = {05--07 Oct},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v186/scutari22a/scutari22a.pdf},
  url       = {https://proceedings.mlr.press/v186/scutari22a.html},
  abstract  = {We commonly assume that data are a homogeneous set of observations when learning the structure of Bayesian networks. However, they often comprise different data sets that are related but not homogeneous because they have been collected in different ways or from different populations. In a previous work, we proposed a closed-form Bayesian Hierarchical Dirichlet score for discrete data that pools information across related data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. In this paper, we provide an analogous solution for learning a Bayesian network from continuous data using mixed-effects models to pool information across the related data sets. We study its structural, parametric, predictive and classification accuracy and we show that it outperforms both conditional Gaussian Bayesian networks (that do not perform any pooling) and classical Gaussian Bayesian networks (that disregard the heterogeneous nature of the data). The improvement is marked for low sample sizes and for unbalanced data sets.}
}
Endnote
%0 Conference Paper
%T Using Mixed-Effects Models to Learn Bayesian Networks from Related Data Sets
%A Marco Scutari
%A Christopher Marquis
%A Laura Azzimonti
%B Proceedings of The 11th International Conference on Probabilistic Graphical Models
%C Proceedings of Machine Learning Research
%D 2022
%E Antonio Salmerón
%E Rafael Rumí
%F pmlr-v186-scutari22a
%I PMLR
%P 73--84
%U https://proceedings.mlr.press/v186/scutari22a.html
%V 186
%X We commonly assume that data are a homogeneous set of observations when learning the structure of Bayesian networks. However, they often comprise different data sets that are related but not homogeneous because they have been collected in different ways or from different populations. In a previous work, we proposed a closed-form Bayesian Hierarchical Dirichlet score for discrete data that pools information across related data sets to learn a single encompassing network structure, while taking into account the differences in their probabilistic structures. In this paper, we provide an analogous solution for learning a Bayesian network from continuous data using mixed-effects models to pool information across the related data sets. We study its structural, parametric, predictive and classification accuracy and we show that it outperforms both conditional Gaussian Bayesian networks (that do not perform any pooling) and classical Gaussian Bayesian networks (that disregard the heterogeneous nature of the data). The improvement is marked for low sample sizes and for unbalanced data sets.
APA
Scutari, M., Marquis, C. & Azzimonti, L. (2022). Using Mixed-Effects Models to Learn Bayesian Networks from Related Data Sets. Proceedings of The 11th International Conference on Probabilistic Graphical Models, in Proceedings of Machine Learning Research 186:73-84. Available from https://proceedings.mlr.press/v186/scutari22a.html.