Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)

Trefor Evans, Prasanth Nair
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1417-1426, 2018.

Abstract

We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nyström approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, the computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or type-II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two million training points and 10^33 inducing points.
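The key algebraic fact behind grid-structured inducing points is that, for a product kernel, the covariance matrix over a Cartesian grid is a Kronecker product of small per-dimension Gram matrices, so its eigendecomposition can be assembled from the factors rather than computed on the full matrix. The following sketch (an illustration of that Kronecker identity, not the paper's implementation; kernel and grid sizes are arbitrary choices) shows the idea with NumPy:

```python
import numpy as np

def rbf_gram(x, lengthscale=1.0):
    """Squared-exponential Gram matrix for 1-D inputs."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Per-dimension grids: the full Cartesian grid has m1 * m2 inducing points,
# but we only ever form the small per-dimension factors.
x1 = np.linspace(0.0, 1.0, 4)
x2 = np.linspace(0.0, 1.0, 5)
K1, K2 = rbf_gram(x1), rbf_gram(x2)

# Eigendecompose only the factors: cost O(m1^3 + m2^3), not O((m1*m2)^3).
w1, Q1 = np.linalg.eigh(K1)
w2, Q2 = np.linalg.eigh(K2)

# Kronecker identity: the eigenvalues/eigenvectors of K1 (x) K2 are the
# Kronecker products of the factor eigenvalues/eigenvectors.
w_kron = np.kron(w1, w2)
Q_kron = np.kron(Q1, Q2)

# Sanity check against the direct 20 x 20 grid covariance.
K_full = np.kron(K1, K2)
assert np.allclose(Q_kron @ np.diag(w_kron) @ Q_kron.T, K_full)
```

With d dimensions of m points each, the grid holds m^d inducing points while the per-dimension work stays at d eigendecompositions of m x m matrices, which is why inducing-point counts like 10^33 become feasible.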

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-evans18a,
  title     = {Scalable {G}aussian Processes with Grid-Structured Eigenfunctions ({GP}-{GRIEF})},
  author    = {Evans, Trefor and Nair, Prasanth},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {1417--1426},
  year      = {2018},
  editor    = {Jennifer Dy and Andreas Krause},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  address   = {Stockholmsmässan, Stockholm, Sweden},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/evans18a/evans18a.pdf},
  url       = {http://proceedings.mlr.press/v80/evans18a.html},
  abstract  = {We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nyström approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, the computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or type-II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two million training points and 10^33 inducing points.}
}
Endnote
%0 Conference Paper
%T Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)
%A Trefor Evans
%A Prasanth Nair
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-evans18a
%I PMLR
%J Proceedings of Machine Learning Research
%P 1417--1426
%U http://proceedings.mlr.press
%V 80
%W PMLR
%X We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nyström approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, the computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or type-II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two million training points and 10^33 inducing points.
APA
Evans, T. & Nair, P. (2018). Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF). Proceedings of the 35th International Conference on Machine Learning, in PMLR 80:1417-1426.