Improving Model Selection by Employing the Test Data

Max Westphal, Werner Brannath
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6747-6756, 2019.

Abstract

Model selection and evaluation are usually strictly separated by means of data splitting to enable an unbiased estimation and a simple statistical inference for the unknown generalization performance of the final prediction model. We investigate the properties of novel evaluation strategies, namely when the final model is selected based on empirical performances on the test data. To guard against selection induced overoptimism, we employ a parametric multiple test correction based on the approximate multivariate distribution of performance estimates. Our numerical experiments involve training common machine learning algorithms (EN, CART, SVM, XGB) on various artificial classification tasks. At its core, our proposed approach improves model selection in terms of the expected final model performance without introducing overoptimism. We furthermore observed a higher probability for a successful evaluation study, making it easier in practice to empirically demonstrate a sufficiently high predictive performance.
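The procedure described above can be illustrated with a minimal sketch (not the authors' code): several candidate models are scored on the same test set, the empirically best one is selected, and a multiplicity-adjusted lower confidence bound guards against selection-induced overoptimism. The adjustment here is a generic maxT-type correction under a multivariate normal approximation of the accuracy estimates, with the critical value obtained by Monte Carlo; all function and variable names are illustrative assumptions.

```python
# Hedged sketch of test-data model selection with a parametric multiple-test
# correction (maxT-type), assuming accuracy as the performance measure.
import numpy as np

rng = np.random.default_rng(0)

def select_and_evaluate(correct, alpha=0.05, n_mc=100_000):
    """correct: (n_models, n_test) boolean matrix of per-example correctness.

    Returns the index of the model selected on the test data and a
    multiplicity-adjusted lower confidence bound for its true accuracy.
    """
    n_models, n_test = correct.shape
    acc = correct.mean(axis=1)                       # empirical test accuracies
    # Covariance of the accuracy estimates; the models share one test set,
    # so the estimates are correlated.
    cov = np.cov(correct.astype(float)) / n_test
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)
    # maxT critical value: (1 - alpha) quantile of the maximum of correlated
    # standard normals, approximated by Monte Carlo simulation.
    z = rng.multivariate_normal(np.zeros(n_models), corr, size=n_mc)
    c = np.quantile(z.max(axis=1), 1 - alpha)
    best = int(np.argmax(acc))                       # select on the test data
    lower = acc[best] - c * sd[best]                 # simultaneous lower bound
    return best, lower
```

Because the critical value accounts for the joint distribution across all candidates, the lower bound for the selected model remains valid despite the selection step; with a single candidate it reduces to an ordinary one-sided normal bound.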

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-westphal19a,
  title     = {Improving Model Selection by Employing the Test Data},
  author    = {Westphal, Max and Brannath, Werner},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {6747--6756},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/westphal19a/westphal19a.pdf},
  url       = {https://proceedings.mlr.press/v97/westphal19a.html},
  abstract  = {Model selection and evaluation are usually strictly separated by means of data splitting to enable an unbiased estimation and a simple statistical inference for the unknown generalization performance of the final prediction model. We investigate the properties of novel evaluation strategies, namely when the final model is selected based on empirical performances on the test data. To guard against selection induced overoptimism, we employ a parametric multiple test correction based on the approximate multivariate distribution of performance estimates. Our numerical experiments involve training common machine learning algorithms (EN, CART, SVM, XGB) on various artificial classification tasks. At its core, our proposed approach improves model selection in terms of the expected final model performance without introducing overoptimism. We furthermore observed a higher probability for a successful evaluation study, making it easier in practice to empirically demonstrate a sufficiently high predictive performance.}
}
Endnote
%0 Conference Paper
%T Improving Model Selection by Employing the Test Data
%A Max Westphal
%A Werner Brannath
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-westphal19a
%I PMLR
%P 6747--6756
%U https://proceedings.mlr.press/v97/westphal19a.html
%V 97
%X Model selection and evaluation are usually strictly separated by means of data splitting to enable an unbiased estimation and a simple statistical inference for the unknown generalization performance of the final prediction model. We investigate the properties of novel evaluation strategies, namely when the final model is selected based on empirical performances on the test data. To guard against selection induced overoptimism, we employ a parametric multiple test correction based on the approximate multivariate distribution of performance estimates. Our numerical experiments involve training common machine learning algorithms (EN, CART, SVM, XGB) on various artificial classification tasks. At its core, our proposed approach improves model selection in terms of the expected final model performance without introducing overoptimism. We furthermore observed a higher probability for a successful evaluation study, making it easier in practice to empirically demonstrate a sufficiently high predictive performance.
APA
Westphal, M. & Brannath, W. (2019). Improving Model Selection by Employing the Test Data. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6747-6756. Available from https://proceedings.mlr.press/v97/westphal19a.html.