Model choice: A minimum posterior predictive loss approach

Sujit Kumar Ghosh, Alan E. Gelfand
Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, PMLR R2, 1999.

Abstract

Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly more important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used though the formal solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. We illustrate its performance with an application to a large data set involving residential property transactions.
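To make the criterion concrete: under squared-error loss, the minimized posterior predictive loss for a model m takes the form D_k(m) = P(m) + (k/(k+1)) G(m), where P(m) = sum_i Var(y_i^rep | y) penalizes predictive dispersion, G(m) = sum_i (E[y_i^rep | y] - y_i)^2 measures fidelity to the observed data, and k >= 0 sets the trade-off between the two. The sketch below is not taken from the paper; it shows how D_k might be evaluated from posterior predictive draws (e.g., MCMC output). The function name gelfand_ghosh and the inputs y_obs and y_rep are illustrative assumptions.

    # Minimal sketch (assumed interface, not from the paper): the minimum
    # posterior predictive loss criterion under squared-error loss,
    # D_k(m) = P(m) + k/(k+1) * G(m), computed from replicate draws.
    # y_obs has shape (n,); y_rep has shape (S, n) with S draws per point.
    import numpy as np

    def gelfand_ghosh(y_obs, y_rep, k=np.inf):
        """Criterion value for one model; smaller is better across models."""
        mu = y_rep.mean(axis=0)        # posterior predictive means E[y_i^rep | y]
        P = y_rep.var(axis=0).sum()    # penalty: total predictive variance
        G = ((mu - y_obs) ** 2).sum()  # goodness of fit to the observed data
        w = 1.0 if np.isinf(k) else k / (k + 1.0)
        return P + w * G

Each candidate model would be fit (e.g., by MCMC), its replicate draws passed through such a function, and the model with the smallest criterion selected; letting k grow weights fit and penalty equally.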

Cite this Paper


BibTeX
@InProceedings{pmlr-vR2-ghosh99a,
  title     = {Model choice: {A} minimum posterior predictive loss approach},
  author    = {Ghosh, Sujit Kumar and Gelfand, Alan E.},
  booktitle = {Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics},
  year      = {1999},
  editor    = {Heckerman, David and Whittaker, Joe},
  volume    = {R2},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--06 Jan},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/r2/ghosh99a/ghosh99a.pdf},
  url       = {https://proceedings.mlr.press/r2/ghosh99a.html},
  abstract  = {Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly more important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used though the formal solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. We illustrate its performance with an application to a large data set involving residential property transactions.},
  note      = {Reissued by PMLR on 20 August 2020.}
}
Endnote
%0 Conference Paper
%T Model choice: A minimum posterior predictive loss approach
%A Sujit Kumar Ghosh
%A Alan E. Gelfand
%B Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1999
%E David Heckerman
%E Joe Whittaker
%F pmlr-vR2-ghosh99a
%I PMLR
%U https://proceedings.mlr.press/r2/ghosh99a.html
%V R2
%X Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly more important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used though the formal solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. We illustrate its performance with an application to a large data set involving residential property transactions.
%Z Reissued by PMLR on 20 August 2020.
APA
Ghosh, S. K., & Gelfand, A. E. (1999). Model choice: A minimum posterior predictive loss approach. Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R2. Available from https://proceedings.mlr.press/r2/ghosh99a.html. Reissued by PMLR on 20 August 2020.
