Dirichlet Bayesian Network Scores and the Maximum Entropy Principle

Marco Scutari
Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, PMLR 73:8-20, 2017.

Abstract

A classic approach for learning Bayesian networks from data is to select the \emph{maximum a posteriori} (MAP) network. In the case of discrete Bayesian networks, the MAP network is selected by maximising one of several possible Bayesian Dirichlet (BD) scores; the most famous is the \emph{Bayesian Dirichlet equivalent uniform} (BDeu) score from Heckerman \emph{et al.} (1995). The key properties of BDeu arise from its underlying uniform prior, which makes structure learning computationally efficient; does not require the elicitation of prior knowledge from experts; and satisfies score equivalence. In this paper we will discuss the impact of this uniform prior on structure learning from an information theoretic perspective, showing how BDeu may violate the maximum entropy principle when applied to sparse data and how it may also be problematic from a Bayesian model selection perspective. On the other hand, the BDs score proposed in Scutari (2016) arises from a piecewise prior and it does not appear to violate the maximum entropy principle, even though it is asymptotically equivalent to BDeu.
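For concreteness, below is a minimal sketch (not the paper's code) of how the log-BDeu local score of a single node can be computed from its table of counts n_ijk, assuming the standard BDeu parameterisation in which the imaginary sample size alpha is spread uniformly over the cells as alpha_ijk = alpha / (r_i q_i); the function name and interface are illustrative only.

    import numpy as np
    from scipy.special import gammaln

    def bdeu_node_score(counts, iss=1.0):
        # counts: (q, r) array of n_ijk, one row per parent configuration j,
        #         one column per state k of the node (hypothetical helper,
        #         not taken from the paper).
        # iss:    imaginary sample size alpha; BDeu spreads it uniformly,
        #         alpha_ijk = alpha / (r * q).
        counts = np.asarray(counts, dtype=float)
        q, r = counts.shape
        a_jk = iss / (r * q)              # prior pseudo-count per cell
        a_j = iss / q                     # prior mass per parent configuration
        n_j = counts.sum(axis=1)          # n_ij, sample size per configuration
        log_score = np.sum(gammaln(a_j) - gammaln(a_j + n_j))
        log_score += np.sum(gammaln(a_jk + counts) - gammaln(a_jk))
        return log_score

    # Example: a binary node with one binary parent.
    # print(bdeu_node_score(np.array([[10, 2], [1, 7]]), iss=1.0))

The BDs score of Scutari (2016), by contrast, divides alpha only among the parent configurations actually observed in the data, so empty rows of the count table receive no prior mass; this is the piecewise prior mentioned in the abstract.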

Cite this Paper


BibTeX
@InProceedings{pmlr-v73-scutari17a,
  title     = {Dirichlet Bayesian Network Scores and the Maximum Entropy Principle},
  author    = {Scutari, Marco},
  booktitle = {Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks},
  pages     = {8--20},
  year      = {2017},
  editor    = {Hyttinen, Antti and Suzuki, Joe and Malone, Brandon},
  volume    = {73},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v73/scutari17a/scutari17a.pdf},
  url       = {https://proceedings.mlr.press/v73/scutari17a.html}
}
APA
Scutari, M. (2017). Dirichlet Bayesian Network Scores and the Maximum Entropy Principle. Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, in Proceedings of Machine Learning Research 73:8-20. Available from https://proceedings.mlr.press/v73/scutari17a.html.
