A Comparison of Scientific and Engineering Criteria for Bayesian Model Selection

David Heckerman, David Maxwell Chickering
Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:275-282, 1997.

Abstract

Given a set of possible model structures for variables $\mathbf{X}$ and a set of possible parameters for each structure, the Bayesian "estimate" of the probability distribution for $\mathbf{X}$ given observed data is obtained by averaging over the possible model structures and their parameters. An often-used approximation for this estimate is obtained by selecting a single model structure and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the \emph{scientific criterion} (SC) and \emph{engineering criterion} (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, differences in predictive performance between the model-averaged estimate and EC, and between EC and SC, can be substantial.
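For concreteness, here is a sketch in our own notation (not necessarily the paper's) of the quantities the abstract contrasts. Writing $D$ for the observed data, $m$ for a model structure, and $\theta_m$ for its parameters, the full Bayesian estimate averages over both structures and parameters:

$$p(\mathbf{x} \mid D) \;=\; \sum_{m} p(m \mid D) \int p(\mathbf{x} \mid \theta_m, m)\, p(\theta_m \mid D, m)\, d\theta_m .$$

The single-structure approximation keeps only one term's parameter average, $p(\mathbf{x} \mid D, m) = \int p(\mathbf{x} \mid \theta_m, m)\, p(\theta_m \mid D, m)\, d\theta_m$; SC and EC differ only in which structure they keep. One common formalization, reflecting our reading of the San Martini and Spezzaferri criterion, is

$$m_{\mathrm{SC}} = \arg\max_{m}\; p(m \mid D), \qquad m_{\mathrm{EC}} = \arg\max_{m}\; \mathbb{E}_{p(\mathbf{x} \mid D)}\!\left[\log p(\mathbf{x} \mid D, m)\right].$$

Under the logarithmic score, $m_{\mathrm{EC}}$ by construction maximizes the quantity that measures predictive performance for the next observation, and by Gibbs' inequality no single-structure predictive can beat the full mixture $p(\mathbf{x} \mid D)$; this yields the ordering (model average $\geq$ EC $\geq$ SC) stated in the abstract.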

Cite this Paper


BibTeX
@InProceedings{pmlr-vR1-heckerman97a,
  title = {A Comparison of Scientific and Engineering Criteria for {B}ayesian Model Selection},
  author = {Heckerman, David and Chickering, David Maxwell},
  booktitle = {Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics},
  pages = {275--282},
  year = {1997},
  editor = {Madigan, David and Smyth, Padhraic},
  volume = {R1},
  series = {Proceedings of Machine Learning Research},
  month = {04--07 Jan},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/r1/heckerman97a/heckerman97a.pdf},
  url = {https://proceedings.mlr.press/r1/heckerman97a.html},
  abstract = {Given a set of possible model structures for variables $\mathbf{X}$ and a set of possible parameters for each structure, the Bayesian "estimate" of the probability distribution for $\mathbf{X}$ given observed data is obtained by averaging over the possible model structures and their parameters. An often-used approximation for this estimate is obtained by selecting a single model structure and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the \emph{scientific criterion} (SC) and \emph{engineering criterion} (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, differences in predictive performance between the model-averaged estimate and EC, and between EC and SC, can be substantial.},
  note = {Reissued by PMLR on 30 March 2021.}
}
Endnote
%0 Conference Paper
%T A Comparison of Scientific and Engineering Criteria for Bayesian Model Selection
%A David Heckerman
%A David Maxwell Chickering
%B Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1997
%E David Madigan
%E Padhraic Smyth
%F pmlr-vR1-heckerman97a
%I PMLR
%P 275--282
%U https://proceedings.mlr.press/r1/heckerman97a.html
%V R1
%X Given a set of possible model structures for variables $\mathbf{X}$ and a set of possible parameters for each structure, the Bayesian "estimate" of the probability distribution for $\mathbf{X}$ given observed data is obtained by averaging over the possible model structures and their parameters. An often-used approximation for this estimate is obtained by selecting a single model structure and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the \emph{scientific criterion} (SC) and \emph{engineering criterion} (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, differences in predictive performance between the model-averaged estimate and EC, and between EC and SC, can be substantial.
%Z Reissued by PMLR on 30 March 2021.
APA
Heckerman, D., & Chickering, D. M. (1997). A Comparison of Scientific and Engineering Criteria for Bayesian Model Selection. Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R1:275-282. Available from https://proceedings.mlr.press/r1/heckerman97a.html. Reissued by PMLR on 30 March 2021.