Faster Kernel Interpolation for Gaussian Processes

Mohit Yadav; Daniel Sheldon; Cameron Musco

Faster Kernel Interpolation for Gaussian Processes

Mohit Yadav, Daniel Sheldon, Cameron Musco

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2971-2979, 2021.

Abstract

A key challenge in scaling Gaussian Process (GP) regression to massive datasets is that exact inference requires computation with a dense n × n kernel matrix, where n is the number of data points. Significant work focuses on approximating the kernel matrix via interpolation using a smaller set of m “inducing points”. Structured kernel interpolation (SKI) is among the most scalable methods: by placing inducing points on a dense grid and using structured matrix algebra, SKI achieves per-iteration time of O(n + m log m) for approximate inference. This linear scaling in n enables approximate inference for very large data sets; however, the cost is per-iteration, which remains a limitation for extremely large n. We show that the SKI per-iteration time can be reduced to O(m log m) after a single O(n) time precomputation step by reframing SKI as solving a natural Bayesian linear regression problem with a fixed set of m compact basis functions. For a fixed grid, our new method scales to truly massive data sets: after the initial linear time pass, all subsequent computations are independent of n. We demonstrate speedups in practice for a wide range of m and n and for all the main GP inference tasks. With per-iteration complexity independent of the dataset size n for a fixed grid, our method scales to truly massive data sets. We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points.

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-yadav21a,
  title = 	 { Faster Kernel Interpolation for Gaussian Processes },
  author =       {Yadav, Mohit and Sheldon, Daniel and Musco, Cameron},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2971--2979},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/yadav21a/yadav21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/yadav21a.html},
  abstract = 	 { A key challenge in scaling Gaussian Process (GP) regression to massive datasets is that exact inference requires computation with a dense n × n kernel matrix, where n is the number of data points. Significant work focuses on approximating the kernel matrix via interpolation using a smaller set of m “inducing points”. Structured kernel interpolation (SKI) is among the most scalable methods: by placing inducing points on a dense grid and using structured matrix algebra, SKI achieves per-iteration time of O(n + m log m) for approximate inference. This linear scaling in n enables approximate inference for very large data sets; however, the cost is per-iteration, which remains a limitation for extremely large n. We show that the SKI per-iteration time can be reduced to O(m log m) after a single O(n) time precomputation step by reframing SKI as solving a natural Bayesian linear regression problem with a fixed set of m compact basis functions. For a fixed grid, our new method scales to truly massive data sets: after the initial linear time pass, all subsequent computations are independent of n. We demonstrate speedups in practice for a wide range of m and n and for all the main GP inference tasks. With per-iteration complexity independent of the dataset size n for a fixed grid, our method scales to truly massive data sets. We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points. }
}

Endnote

%0 Conference Paper
%T  Faster Kernel Interpolation for Gaussian Processes 
%A Mohit Yadav
%A Daniel Sheldon
%A Cameron Musco
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu	
%F pmlr-v130-yadav21a
%I PMLR
%P 2971--2979
%U https://proceedings.mlr.press/v130/yadav21a.html
%V 130
%X  A key challenge in scaling Gaussian Process (GP) regression to massive datasets is that exact inference requires computation with a dense n × n kernel matrix, where n is the number of data points. Significant work focuses on approximating the kernel matrix via interpolation using a smaller set of m “inducing points”. Structured kernel interpolation (SKI) is among the most scalable methods: by placing inducing points on a dense grid and using structured matrix algebra, SKI achieves per-iteration time of O(n + m log m) for approximate inference. This linear scaling in n enables approximate inference for very large data sets; however, the cost is per-iteration, which remains a limitation for extremely large n. We show that the SKI per-iteration time can be reduced to O(m log m) after a single O(n) time precomputation step by reframing SKI as solving a natural Bayesian linear regression problem with a fixed set of m compact basis functions. For a fixed grid, our new method scales to truly massive data sets: after the initial linear time pass, all subsequent computations are independent of n. We demonstrate speedups in practice for a wide range of m and n and for all the main GP inference tasks. With per-iteration complexity independent of the dataset size n for a fixed grid, our method scales to truly massive data sets. We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points.

APA


Yadav, M., Sheldon, D. & Musco, C.. (2021).  Faster Kernel Interpolation for Gaussian Processes . Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2971-2979 Available from https://proceedings.mlr.press/v130/yadav21a.html.

Faster Kernel Interpolation for Gaussian Processes

Abstract

Cite this Paper

Related Material