Dirichlet Bayesian Network Scores and the Maximum Entropy Principle

Marco Scutari

Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, PMLR 73:8-20, 2017.

Abstract

A classic approach for learning Bayesian networks from data is to select the \emph{maximum a posteriori} (MAP) network. In the case of discrete Bayesian networks, the MAP network is selected by maximising one of several possible Bayesian Dirichlet (BD) scores; the most famous is the \emph{Bayesian Dirichlet equivalent uniform} (BDeu) score from Heckerman \emph{et al.} (1995). The key properties of BDeu arise from its underlying uniform prior, which makes structure learning computationally efficient; does not require the elicitation of prior knowledge from experts; and satisfies score equivalence. In this paper we will discuss the impact of this uniform prior on structure learning from an information theoretic perspective, showing how BDeu may violate the maximum entropy principle when applied to sparse data and how it may also be problematic from a Bayesian model selection perspective. On the other hand, the BDs score proposed in Scutari (2016) arises from a piecewise prior and it does not appear to violate the maximum entropy principle, even though it is asymptotically equivalent to BDeu.
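
As a rough illustration of the scores discussed above, the following Python sketch computes the local log-score of a single node from its table of counts. It assumes the standard BDeu hyperparameters alpha / (r * q), with r node states, q parent configurations and imaginary sample size alpha. For BDs it assumes the piecewise form in which alpha is spread only over the parent configurations actually observed in the data. The function name, the argument names and the `sparse` flag are hypothetical and are not taken from the paper.

```python
from math import lgamma


def bd_local_log_score(counts, alpha=1.0, sparse=False):
    """Log marginal likelihood of one node given its parents (a sketch).

    counts[j][k] is the number of samples with the parents in
    configuration j and the node in state k.
    sparse=False: BDeu-style prior, alpha / (r * q) for every cell.
    sparse=True:  BDs-style piecewise prior (assumed form), alpha / (r * q_obs)
                  for observed parent configurations and zero elsewhere.
    """
    r = len(counts[0])
    if sparse:
        # Keep only the parent configurations that appear in the data.
        rows = [row for row in counts if sum(row) > 0]
    else:
        rows = list(counts)
    q_eff = len(rows)
    if q_eff == 0:
        return 0.0  # no data and no prior mass: nothing to score

    a_j = alpha / q_eff         # prior mass per parent configuration
    a_jk = alpha / (r * q_eff)  # prior mass per cell of the table

    score = 0.0
    for row in rows:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score


# Hypothetical counts: binary node, two parent configurations,
# the second of which never occurs in the data.
counts = [[2, 1], [0, 0]]
bdeu = bd_local_log_score(counts, alpha=1.0)
bds = bd_local_log_score(counts, alpha=1.0, sparse=True)
```

Under these assumptions the two scores coincide whenever every parent configuration is observed, and they differ only on sparse data, which is the setting the abstract refers to.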

Cite this Paper

BibTeX


@InProceedings{pmlr-v73-scutari17a,
  title = 	 {Dirichlet Bayesian Network Scores and the Maximum Entropy Principle},
  author = 	 {Scutari, Marco},
  booktitle = 	 {Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks},
  pages = 	 {8--20},
  year = 	 {2017},
  editor = 	 {Hyttinen, Antti and Suzuki, Joe and Malone, Brandon},
  volume = 	 {73},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {20--22 Sep},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v73/scutari17a/scutari17a.pdf},
  url = 	 {https://proceedings.mlr.press/v73/scutari17a.html},
  abstract = 	 {A classic approach for learning Bayesian networks from data is to select the
 \emph{maximum a posteriori} (MAP) network. In the case of discrete Bayesian
 networks, the MAP network is selected by maximising one of several possible
 Bayesian Dirichlet (BD) scores; the most famous is the \emph{Bayesian
 Dirichlet equivalent uniform} (BDeu) score from Heckerman \emph{et al.} (1995). The key
 properties of BDeu arise from its underlying uniform prior, which makes
 structure learning computationally efficient; does not require the elicitation
 of prior knowledge from experts; and satisfies score equivalence.
 In this paper we will discuss the impact of this uniform prior on structure
 learning from an information theoretic perspective, showing how BDeu may
 violate the maximum entropy principle when applied to sparse data and how it
 may also be problematic from a Bayesian model selection perspective. On the
 other hand, the BDs score proposed in Scutari (2016) arises from a piecewise
 prior and it does not appear to violate the maximum entropy principle, even
 though it is asymptotically equivalent to BDeu.}
}

Endnote

%0 Conference Paper
%T Dirichlet Bayesian Network Scores and the Maximum Entropy Principle
%A Marco Scutari
%B Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks
%C Proceedings of Machine Learning Research
%D 2017
%E Antti Hyttinen
%E Joe Suzuki
%E Brandon Malone
%F pmlr-v73-scutari17a
%I PMLR
%P 8--20
%U https://proceedings.mlr.press/v73/scutari17a.html
%V 73
%X A classic approach for learning Bayesian networks from data is to select the
 \emph{maximum a posteriori} (MAP) network. In the case of discrete Bayesian
 networks, the MAP network is selected by maximising one of several possible
 Bayesian Dirichlet (BD) scores; the most famous is the \emph{Bayesian
 Dirichlet equivalent uniform} (BDeu) score from Heckerman \emph{et al.} (1995). The key
 properties of BDeu arise from its underlying uniform prior, which makes
 structure learning computationally efficient; does not require the elicitation
 of prior knowledge from experts; and satisfies score equivalence.
 In this paper we will discuss the impact of this uniform prior on structure
 learning from an information theoretic perspective, showing how BDeu may
 violate the maximum entropy principle when applied to sparse data and how it
 may also be problematic from a Bayesian model selection perspective. On the
 other hand, the BDs score proposed in Scutari (2016) arises from a piecewise
 prior and it does not appear to violate the maximum entropy principle, even
 though it is asymptotically equivalent to BDeu.

APA


Scutari, M. (2017). Dirichlet Bayesian Network Scores and the Maximum Entropy Principle. Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, in Proceedings of Machine Learning Research 73:8-20. Available from https://proceedings.mlr.press/v73/scutari17a.html.
