On Structure Priors for Learning Bayesian Networks
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1687-1695, 2019.
To learn a Bayesian network structure from data, one popular approach is to maximize a decomposable likelihood-based score. While various scores have been proposed, they usually assume a uniform prior, or “penalty,” over the possible directed acyclic graphs (DAGs); relatively little attention has been paid to alternative priors. We empirically investigate several structure priors in combination with different scores, using benchmark data sets and data sets generated from benchmark networks. Our results suggest that, in practice, priors that strongly favor sparsity perform significantly better than the uniform prior, or even than an informed variant that is conditioned on the correct number of parents for each node. For an analytic comparison of different priors, we generalize a known recurrence equation for the number of DAGs to accommodate modular weightings of DAGs, a result of independent interest.
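The “known recurrence equation for the number of DAGs” referred to above is Robinson’s recurrence, which counts labeled DAGs by inclusion–exclusion over the set of in-degree-zero nodes. A minimal sketch of the unweighted base case (the paper’s contribution, the generalization to modular weightings, is not reproduced here):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n: int) -> int:
    """Count directed acyclic graphs on n labeled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    # Inclusion-exclusion over choices of k nodes with in-degree zero:
    # each of the remaining n - k nodes may receive edges from any subset
    # of the k source nodes, giving the 2^(k*(n-k)) factor.
    return sum(
        (-1) ** (k - 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
        for k in range(1, n + 1)
    )

print([num_dags(n) for n in range(6)])  # [1, 1, 3, 25, 543, 29281]
```

The rapid growth of these counts (super-exponential in n) is what makes a weighted analogue of this recurrence useful for comparing structure priors analytically rather than by enumeration.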