A Comparison of Scientific and Engineering Criteria for Bayesian Model Selection
Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:275-282, 1997.
Abstract
Given a set of possible model structures for variables $\mathbf{X}$ and a set of possible parameters for each structure, the Bayesian "estimate" of the probability distribution for $\mathbf{X}$ given observed data is obtained by averaging over the possible model structures and their parameters. An often-used approximation for this estimate is obtained by selecting a single model structure and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the \emph{scientific criterion} (SC) and \emph{engineering criterion} (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, differences in predictive performance between the model-averaged estimate and EC and between EC and SC can be substantial.
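For concreteness, the quantities being compared can be sketched as follows. This is a standard formulation of Bayesian model averaging and model selection; the notation ($D$ for the observed data, $m$ for a model structure, $\theta_m$ for its parameters) is ours and not quoted from the paper. The full Bayesian estimate of the distribution for $\mathbf{X}$ is the model average

\[
p(\mathbf{x} \mid D) \;=\; \sum_{m} p(m \mid D) \int p(\mathbf{x} \mid \theta_m, m)\, p(\theta_m \mid D, m)\, d\theta_m .
\]

Under SC, a single structure is selected to maximize the posterior probability $p(m \mid D) \propto p(D \mid m)\, p(m)$, and the parameter-averaged estimate $p(\mathbf{x} \mid D, m_{\mathrm{SC}})$ is used in place of the full average. Under EC, the selected structure is the one whose parameter-averaged estimate $p(\mathbf{x} \mid D, m)$ scores best as a predictor of the next observation; one common way to formalize this is to maximize the expected log predictive score, with the expectation over the next observation taken with respect to the model-averaged distribution. On this reading, the ordering stated above follows because the model average is, by construction, the predictor that maximizes expected score under the Bayesian's own beliefs, EC picks the single structure whose score comes closest to it, and SC optimizes a different quantity and so can do no better than EC in this predictive sense.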