Bayesian leave-one-out cross-validation for large data

Måns Magnusson, Michael Andersen, Johan Jonasson, Aki Vehtari
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4244-4253, 2019.

Abstract

Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of using approximate inference techniques and probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation for large datasets. We provide both theoretical and empirical results showing good properties for large data.
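The core estimator combines a cheap per-observation approximation with probability-proportional-to-size (PPS) subsampling, so that the expensive exact LOO terms are evaluated only for a small subsample. A minimal numerical sketch (the proxy construction, sizes, and variable names here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: per-observation LOO log predictive densities (elpd_i).
# In practice these are expensive to compute exactly, so we evaluate
# them only for a small subsample of observations.
n = 100_000
elpd_exact = rng.normal(-1.5, 0.5, size=n)  # hypothetical exact values

# Cheap proxy from an approximate posterior (e.g. Laplace or VI):
# correlated with the exact values but far cheaper to obtain.
elpd_approx = elpd_exact + rng.normal(0.0, 0.1, size=n)

# PPS draw: sample observations with probability proportional to the
# magnitude of the approximate elpd values.
m = 1_000
p = np.abs(elpd_approx) / np.abs(elpd_approx).sum()
idx = rng.choice(n, size=m, replace=True, p=p)

# Hansen-Hurwitz-style estimator of the total elpd_loo:
# average elpd_i / p_i over the subsample.
elpd_loo_hat = np.mean(elpd_exact[idx] / p[idx])

print(f"estimate: {elpd_loo_hat:.1f}, exact total: {elpd_exact.sum():.1f}")
```

Because the sampling probabilities track the magnitudes of the terms being summed, the ratio `elpd_i / p_i` is nearly constant across the subsample, which is what keeps the estimator's variance small even when `m` is a tiny fraction of `n`.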

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-magnusson19a,
  title = {{B}ayesian leave-one-out cross-validation for large data},
  author = {Magnusson, M{\aa}ns and Andersen, Michael and Jonasson, Johan and Vehtari, Aki},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {4244--4253},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/magnusson19a/magnusson19a.pdf},
  url = {https://proceedings.mlr.press/v97/magnusson19a.html},
  abstract = {Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of using approximate inference techniques and probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation for large datasets. We provide both theoretical and empirical results showing good properties for large data.}
}
Endnote
%0 Conference Paper
%T Bayesian leave-one-out cross-validation for large data
%A Måns Magnusson
%A Michael Andersen
%A Johan Jonasson
%A Aki Vehtari
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-magnusson19a
%I PMLR
%P 4244--4253
%U https://proceedings.mlr.press/v97/magnusson19a.html
%V 97
%X Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of using approximate inference techniques and probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation for large datasets. We provide both theoretical and empirical results showing good properties for large data.
APA
Magnusson, M., Andersen, M., Jonasson, J. & Vehtari, A. (2019). Bayesian leave-one-out cross-validation for large data. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:4244-4253. Available from https://proceedings.mlr.press/v97/magnusson19a.html.