Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes

Hyunjik Kim, Yee Whye Teh
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:575-584, 2018.

Abstract

Automating statistical modelling is a challenging problem in artificial intelligence. The Automatic Statistician employs a kernel search algorithm using Gaussian processes (GPs) to provide interpretable statistical models for regression problems. However, this does not scale, owing to the O(N^3) running time of its model selection. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm that extends the Automatic Statistician to bigger data sets. In doing so, we derive a cheap upper bound on the GP marginal likelihood that is used in SKC together with the variational lower bound to sandwich the marginal likelihood. We show that the upper bound is significantly tighter than the lower bound and useful for model selection.

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-kim18a,
  title     = {Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes},
  author    = {Kim, Hyunjik and Teh, Yee Whye},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {575--584},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/kim18a/kim18a.pdf},
  url       = {https://proceedings.mlr.press/v84/kim18a.html},
  abstract  = {Automating statistical modelling is a challenging problem in artificial intelligence. The Automatic Statistician employs a kernel search algorithm using Gaussian Processes (GP) to provide interpretable statistical models for regression problems. However this does not scale due to its O(N^3) running time for the model selection. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm that extends the Automatic Statistician to bigger data sets. In doing so, we derive a cheap upper bound on the GP marginal likelihood that is used in SKC with the variational lower bound to sandwich the marginal likelihood. We show that the upper bound is significantly tighter than the lower bound and useful for model selection.}
}
Endnote
%0 Conference Paper
%T Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes
%A Hyunjik Kim
%A Yee Whye Teh
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-kim18a
%I PMLR
%P 575--584
%U https://proceedings.mlr.press/v84/kim18a.html
%V 84
%X Automating statistical modelling is a challenging problem in artificial intelligence. The Automatic Statistician employs a kernel search algorithm using Gaussian Processes (GP) to provide interpretable statistical models for regression problems. However this does not scale due to its O(N^3) running time for the model selection. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm that extends the Automatic Statistician to bigger data sets. In doing so, we derive a cheap upper bound on the GP marginal likelihood that is used in SKC with the variational lower bound to sandwich the marginal likelihood. We show that the upper bound is significantly tighter than the lower bound and useful for model selection.
APA
Kim, H. & Teh, Y.W. (2018). Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:575-584. Available from https://proceedings.mlr.press/v84/kim18a.html.
