Dirichlet Bayesian Network Scores and the Maximum Entropy Principle
Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, PMLR 73:8-20, 2017.
Abstract
A classic approach for learning Bayesian networks from data is to select the
\emph{maximum a posteriori} (MAP) network. In the case of discrete Bayesian
networks, the MAP network is selected by maximising one of several possible
Bayesian Dirichlet (BD) scores; the most famous is the \emph{Bayesian
Dirichlet equivalent uniform} (BDeu) score from Heckerman \emph{et al.} (1995). The key
properties of BDeu arise from its underlying uniform prior, which makes
structure learning computationally efficient; does not require the elicitation
of prior knowledge from experts; and satisfies score equivalence.
In this paper we discuss the impact of this uniform prior on structure
learning from an information-theoretic perspective, showing how BDeu may
violate the maximum entropy principle when applied to sparse data and how it
may also be problematic from a Bayesian model selection perspective. In
contrast, the BDs score proposed in Scutari (2016) arises from a piecewise
prior and does not appear to violate the maximum entropy principle, even
though it is asymptotically equivalent to BDeu.
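
To make the contrast between the two scores concrete, the following is a minimal sketch of the log marginal likelihood of a single node under BDeu and under BDs. It assumes the usual Dirichlet-multinomial closed form, with BDeu spreading the imaginary sample size evenly over all parent configurations and BDs spreading it only over the configurations actually observed in the data, as described in Scutari (2016). The function name `log_bd_local`, its arguments and the layout of the count table are illustrative assumptions, not taken from the paper.

```python
from math import lgamma, log


def log_bd_local(counts, alpha=1.0, sparse=False):
    """Log marginal likelihood of one node given its parents (illustrative).

    counts: list of rows, one per parent configuration; each row holds the
            counts n_ijk of the node's r states under that configuration.
    alpha:  imaginary sample size.
    sparse: False -> BDeu, alpha is split over all q parent configurations.
            True  -> BDs, alpha is split only over configurations with
                     at least one observation (assumed reading of BDs).
    """
    r = len(counts[0])
    if sparse:
        q_eff = sum(1 for row in counts if sum(row) > 0)
    else:
        q_eff = len(counts)
    if q_eff == 0:
        return 0.0  # no data at all: the likelihood term is empty

    a_ij = alpha / q_eff          # prior mass per parent configuration
    a_ijk = alpha / (r * q_eff)   # prior mass per cell

    score = 0.0
    for row in counts:
        n_ij = sum(row)
        if n_ij == 0:
            continue  # unobserved configurations contribute a factor of 1
        score += lgamma(a_ij) - lgamma(a_ij + n_ij)
        for n_ijk in row:
            score += lgamma(a_ijk + n_ijk) - lgamma(a_ijk)
    return score


# Sparse data: only one of the two parent configurations is observed.
sparse_counts = [[3, 1], [0, 0]]
bdeu = log_bd_local(sparse_counts, alpha=1.0, sparse=False)
bds = log_bd_local(sparse_counts, alpha=1.0, sparse=True)
```

When every parent configuration is observed, the two calls return the same value; they differ only on sparse tables such as `sparse_counts`, where BDeu leaves part of the imaginary sample size on configurations that never appear in the data.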