Error bounds for any regression model using Gaussian processes with gradient information

Rafael Savvides, Hoang Phuc Hau Luu, Kai Puolamäki
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:397-405, 2024.

Abstract

We provide an upper bound for the expected quadratic loss on new data for any regression model. We derive the bound by modelling the underlying function by a Gaussian process (GP). Instead of a single kernel or family of kernels of the same form, we consider all GPs with translation-invariant and continuously twice differentiable kernels having a bounded signal variance and prior covariance of the gradient. To obtain a bound for the expected posterior loss, we present bounds for the posterior variance and squared bias. The squared bias bound depends on the regression model used, which can be arbitrary and not based on GPs. The bounds scale well with data size, in contrast to computing the GP posterior by a Cholesky factorisation of a large matrix. More importantly, our bounds do not require strong prior knowledge as we do not specify the exact kernel form. We validate our theoretical findings by numerical experiments and show that the bounds have applications in uncertainty estimation and concept drift detection.
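
The bound described above splits the expected posterior loss into a posterior-variance term and a squared-bias term. As a minimal sketch of that structure (not the paper's actual bound), suppose the unknown function $f$ is modelled by a GP with posterior mean $\mu(x)$ and posterior variance $\sigma^2(x)$ given data $\mathcal{D}$, and let $m$ be an arbitrary regression model. The standard second-moment identity then gives

$$\mathbb{E}\big[(f(x) - m(x))^2 \mid \mathcal{D}\big] = \sigma^2(x) + \big(\mu(x) - m(x)\big)^2,$$

so an upper bound on the posterior variance $\sigma^2(x)$ (independent of the regression model) combined with an upper bound on the squared bias $(\mu(x) - m(x))^2$ (dependent on the model $m$) yields a bound on the expected quadratic loss. The symbols $\mu$, $\sigma^2$, $m$, and $\mathcal{D}$ are notation introduced here for illustration only.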

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-savvides24a,
  title     = {Error bounds for any regression model using {G}aussian processes with gradient information},
  author    = {Savvides, Rafael and Phuc Hau Luu, Hoang and Puolam\"{a}ki, Kai},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {397--405},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/savvides24a/savvides24a.pdf},
  url       = {https://proceedings.mlr.press/v238/savvides24a.html},
  abstract  = {We provide an upper bound for the expected quadratic loss on new data for any regression model. We derive the bound by modelling the underlying function by a Gaussian process (GP). Instead of a single kernel or family of kernels of the same form, we consider all GPs with translation-invariant and continuously twice differentiable kernels having a bounded signal variance and prior covariance of the gradient. To obtain a bound for the expected posterior loss, we present bounds for the posterior variance and squared bias. The squared bias bound depends on the regression model used, which can be arbitrary and not based on GPs. The bounds scale well with data size, in contrast to computing the GP posterior by a Cholesky factorisation of a large matrix. More importantly, our bounds do not require strong prior knowledge as we do not specify the exact kernel form. We validate our theoretical findings by numerical experiments and show that the bounds have applications in uncertainty estimation and concept drift detection.}
}
Endnote
%0 Conference Paper
%T Error bounds for any regression model using Gaussian processes with gradient information
%A Rafael Savvides
%A Hoang Phuc Hau Luu
%A Kai Puolamäki
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-savvides24a
%I PMLR
%P 397--405
%U https://proceedings.mlr.press/v238/savvides24a.html
%V 238
%X We provide an upper bound for the expected quadratic loss on new data for any regression model. We derive the bound by modelling the underlying function by a Gaussian process (GP). Instead of a single kernel or family of kernels of the same form, we consider all GPs with translation-invariant and continuously twice differentiable kernels having a bounded signal variance and prior covariance of the gradient. To obtain a bound for the expected posterior loss, we present bounds for the posterior variance and squared bias. The squared bias bound depends on the regression model used, which can be arbitrary and not based on GPs. The bounds scale well with data size, in contrast to computing the GP posterior by a Cholesky factorisation of a large matrix. More importantly, our bounds do not require strong prior knowledge as we do not specify the exact kernel form. We validate our theoretical findings by numerical experiments and show that the bounds have applications in uncertainty estimation and concept drift detection.
APA
Savvides, R., Phuc Hau Luu, H. & Puolamäki, K. (2024). Error bounds for any regression model using Gaussian processes with gradient information. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:397-405. Available from https://proceedings.mlr.press/v238/savvides24a.html.
