Learning the Bayesian network structure: Dirichlet prior versus data

Harald Steck
Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, PMLR R6:511-518, 2008.

Abstract

In the Bayesian approach to structure learning of graphical models, the equivalent sample size (ESS) in the Dirichlet prior over the model parameters was recently shown to have an important effect on the maximum-a-posteriori estimate of the Bayesian network structure. In our first contribution, we theoretically analyze the case of large ESS-values, which complements previous work: among other results, we find that the presence of an edge in a Bayesian network is favoured over its absence even if both the Dirichlet prior and the data imply independence, as long as the conditional empirical distribution is notably different from uniform. In our second contribution, we focus on realistic ESS-values, and provide an analytical approximation to the ‘optimal’ ESS-value in a predictive sense (its accuracy is also validated experimentally): this approximation provides an understanding as to which properties of the data have the main effect in determining the ‘optimal’ ESS-value.
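To make the role of the ESS concrete, the following is a minimal sketch of how one might compare the presence and absence of a single edge for different ESS-values. It assumes the standard BDeu marginal likelihood as the score; the abstract does not name the score, so that choice is an assumption. The function names `bdeu_local_score` and `edge_log_bayes_factor` and the count table are hypothetical and chosen only for illustration. The sketch makes no claim about which sign the difference takes at any particular ESS-value.

```python
from math import lgamma


def bdeu_local_score(counts, ess):
    """Log BDeu marginal likelihood of one discrete node (assumed score).

    counts[j][k] is the number of cases with parent configuration j
    and child state k; ess is the equivalent sample size.
    """
    q = len(counts)        # number of parent configurations
    r = len(counts[0])     # number of child states
    a_j = ess / q          # Dirichlet mass per parent configuration
    a_jk = ess / (q * r)   # Dirichlet mass per (configuration, state) cell
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score


def edge_log_bayes_factor(counts, ess):
    """Local score with the parent minus local score without it.

    A positive value means the edge is preferred under this score.
    """
    # Dropping the parent collapses all rows into one marginal row.
    marginal = [[sum(col) for col in zip(*counts)]]
    return bdeu_local_score(counts, ess) - bdeu_local_score(marginal, ess)


# Hypothetical counts: the child distribution is identical for both
# parent states (empirical independence) but far from uniform.
example_counts = [[90, 10], [90, 10]]
for ess in (1.0, 10.0, 100.0, 1000.0):
    print(ess, edge_log_bayes_factor(example_counts, ess))
```

Running the loop over a wider grid of ESS-values, or over count tables that are closer to uniform, is one way to explore numerically how the preferred structure depends on the ESS.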

Cite this Paper


BibTeX
@InProceedings{pmlr-vR6-steck08a,
  title = {Learning the Bayesian network structure: Dirichlet prior versus data},
  author = {Steck, Harald},
  booktitle = {Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence},
  pages = {511--518},
  year = {2008},
  editor = {McAllester, David A. and Myllymäki, Petri},
  volume = {R6},
  series = {Proceedings of Machine Learning Research},
  month = {09--12 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/r6/main/assets/steck08a/steck08a.pdf},
  url = {https://proceedings.mlr.press/r6/steck08a.html},
  abstract = {In the Bayesian approach to structure learning of graphical models, the equivalent sample size (ESS) in the Dirichlet prior over the model parameters was recently shown to have an important effect on the maximum-a-posteriori estimate of the Bayesian network structure. In our first contribution, we theoretically analyze the case of large ESS-values, which complements previous work: among other results, we find that the presence of an edge in a Bayesian network is favoured over its absence even if both the Dirichlet prior and the data imply independence, as long as the conditional empirical distribution is notably different from uniform. In our second contribution, we focus on realistic ESS-values, and provide an analytical approximation to the ‘optimal’ ESS-value in a predictive sense (its accuracy is also validated experimentally): this approximation provides an understanding as to which properties of the data have the main effect in determining the ‘optimal’ ESS-value.},
  note = {Reissued by PMLR on 09 October 2024.}
}
Endnote
%0 Conference Paper
%T Learning the Bayesian network structure: Dirichlet prior versus data
%A Harald Steck
%B Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2008
%E David A. McAllester
%E Petri Myllymäki
%F pmlr-vR6-steck08a
%I PMLR
%P 511--518
%U https://proceedings.mlr.press/r6/steck08a.html
%V R6
%X In the Bayesian approach to structure learning of graphical models, the equivalent sample size (ESS) in the Dirichlet prior over the model parameters was recently shown to have an important effect on the maximum-a-posteriori estimate of the Bayesian network structure. In our first contribution, we theoretically analyze the case of large ESS-values, which complements previous work: among other results, we find that the presence of an edge in a Bayesian network is favoured over its absence even if both the Dirichlet prior and the data imply independence, as long as the conditional empirical distribution is notably different from uniform. In our second contribution, we focus on realistic ESS-values, and provide an analytical approximation to the ‘optimal’ ESS-value in a predictive sense (its accuracy is also validated experimentally): this approximation provides an understanding as to which properties of the data have the main effect in determining the ‘optimal’ ESS-value.
%Z Reissued by PMLR on 09 October 2024.
APA
Steck, H. (2008). Learning the Bayesian network structure: Dirichlet prior versus data. Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research R6:511-518. Available from https://proceedings.mlr.press/r6/steck08a.html. Reissued by PMLR on 09 October 2024.
