Managing Multiple Models
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:41-48, 2001.
Recent research in model selection and adaptive modeling has produced an embarrassment of riches. Using any one of several techniques, an analyst can generate a number of models that describe the same data set well. Examples include multiple tree models generated by bootstrapping or stochastic searches, and different subsets of variables in linear regression models identified by stochastic or exhaustive searches. While model averaging can use these models to improve prediction accuracy, the resulting averaged model is difficult to interpret. We seek a compromise: we develop measures of dissimilarity between models and use them to select good models that may reveal different aspects of the data. Data on housing prices in Boston are used to illustrate this in the context of treed regression models.
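The idea of selecting good but dissimilar models can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than taken from the paper: the data are synthetic (not the Boston housing data), the candidate models are linear regressions on variable subsets (not treed regressions), dissimilarity is taken to be the root-mean-square difference between fitted values, and the function names `fit_subset_models`, `prediction_dissimilarity`, and `select_diverse`, along with the 10% tolerance, are hypothetical.

```python
import numpy as np
from itertools import combinations


def fit_subset_models(X, y, max_size=2):
    """Fit a least-squares model on every variable subset up to max_size."""
    n, p = X.shape
    models = []
    for k in range(1, max_size + 1):
        for subset in combinations(range(p), k):
            A = np.column_stack([np.ones(n), X[:, list(subset)]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            fitted = A @ coef
            rss = float(np.sum((y - fitted) ** 2))
            models.append({"subset": subset, "fitted": fitted, "rss": rss})
    return models


def prediction_dissimilarity(models):
    """Assumed measure: RMS difference between two models' fitted values."""
    F = np.array([m["fitted"] for m in models])   # shape (n_models, n_obs)
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt(np.mean(diff ** 2, axis=2))


def select_diverse(models, D, n_keep=3, tol=1.10):
    """Among models with RSS within tol of the best, greedily pick a spread-out set."""
    rss = np.array([m["rss"] for m in models])
    good = np.flatnonzero(rss <= tol * rss.min())
    chosen = [int(good[np.argmin(rss[good])])]    # start from the best-fitting model
    while len(chosen) < min(n_keep, len(good)):
        remaining = [int(i) for i in good if int(i) not in chosen]
        # Maximin rule: take the model farthest from its nearest chosen model.
        nxt = max(remaining, key=lambda i: D[i, chosen].min())
        chosen.append(nxt)
    return chosen


# Synthetic data: four noisy copies of one latent signal, so that several
# different variable subsets describe the response about equally well.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + 0.1 * rng.normal(size=n) for _ in range(4)])
y = z + 0.5 * rng.normal(size=n)

models = fit_subset_models(X, y)
D = prediction_dissimilarity(models)
chosen = select_diverse(models, D)
for i in chosen:
    print(models[i]["subset"], round(models[i]["rss"], 2))
```

The tolerance filter keeps only models that fit nearly as well as the best one, and the maximin step then spreads the selection across the dissimilarity matrix, so the analyst is left with a short list of individually interpretable models rather than a single averaged one. Other dissimilarity measures, for example ones based on which variables or splits a model uses, could be substituted without changing the selection step.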