Accurate and Scalable Stochastic Gaussian Process Regression via Learnable Coreset-based Variational Inference

Mert Ketenci, Adler J Perotte, Noémie Elhadad, Iñigo Urteaga
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:2101-2142, 2025.

Abstract

We introduce a novel stochastic variational inference method for Gaussian process ($\mathcal{GP}$) regression, by deriving a posterior over a learnable coreset: i.e., over weighted pseudo-input/output pairs. Unlike previous free-form variational families for stochastic inference, our coreset-based variational $\mathcal{GP}$ (CVGP) is defined in terms of the $\mathcal{GP}$ prior and the (weighted) data likelihood. This formulation naturally incorporates the inductive biases of the prior, and ensures that its kernel and likelihood dependencies are shared with the posterior. We derive a variational lower-bound on the log-marginal likelihood by marginalizing over the latent $\mathcal{GP}$ coreset variables, and show that CVGP’s lower-bound is amenable to stochastic optimization. CVGP reduces the dimensionality of the variational parameter search space to linear $\mathcal{O}(M)$ complexity, while ensuring numerical stability at $\mathcal{O}(M^3)$ time complexity and $\mathcal{O}(M^2)$ space complexity. Evaluations on real-world and simulated regression problems demonstrate that CVGP achieves superior inference and predictive performance compared to state-of-the-art stochastic sparse $\mathcal{GP}$ approximation methods.
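
The abstract describes the method only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of one plausible reading of a coreset-weighted variational family, $q(\mathbf{u}) \propto p(\mathbf{u}) \prod_{m=1}^{M} \mathcal{N}(\tilde{y}_m \mid u_m, \sigma^2 / w_m)$: the $M$ pseudo-inputs, pseudo-outputs, and weights are the $\mathcal{O}(M)$ learnable variational parameters; the conjugate Gaussian update yields $q(\mathbf{u})$ in closed form at $\mathcal{O}(M^3)$ cost; and the expected log-likelihood decomposes over data points, so the bound can be optimized on minibatches. The weighted-likelihood form and all names below (rbf, Xc, yc, log_w) are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (NOT the paper's exact algorithm) of a coreset-weighted
# variational GP ELBO, assuming q(u) ∝ p(u) · ∏_m N(ỹ_m | u_m, σ² / w_m),
# where (x̃_m, ỹ_m, w_m), m = 1..M, are the learnable coreset triples.
import torch

torch.manual_seed(0)

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between inputs a (N,1) and b (M,1)."""
    d2 = (a - b.T) ** 2
    return variance * torch.exp(-0.5 * d2 / lengthscale**2)

N, M = 1000, 20                                 # data size, coreset size
X = torch.linspace(-3, 3, N)[:, None]
y = torch.sin(2 * X[:, 0]) + 0.1 * torch.randn(N)

# O(M) learnable variational parameters: pseudo-inputs, pseudo-outputs, weights.
Xc = torch.randn(M, 1, requires_grad=True)
yc = torch.randn(M, requires_grad=True)
log_w = torch.zeros(M, requires_grad=True)
log_noise = torch.tensor(-2.0, requires_grad=True)

def elbo(idx):
    """Minibatch evidence lower bound; Cholesky factorizations cost O(M^3)."""
    sigma2 = torch.exp(log_noise)
    w = torch.exp(log_w)
    Kmm = rbf(Xc, Xc) + 1e-6 * torch.eye(M)
    # Coreset posterior q(u) = N(m_u, S_u): conjugate Gaussian update of the
    # prior N(0, Kmm) with the weighted pseudo-likelihood N(ỹ | u, σ²/w).
    A = Kmm + torch.diag(sigma2 / w)
    L = torch.linalg.cholesky(A)
    m_u = Kmm @ torch.cholesky_solve(yc[:, None], L)    # (M, 1)
    S_u = Kmm - Kmm @ torch.cholesky_solve(Kmm, L)      # (M, M)
    S_u = 0.5 * (S_u + S_u.T)                           # enforce symmetry

    xb, yb = X[idx], y[idx]                             # minibatch
    Knm = rbf(xb, Xc)                                   # (B, M)
    Lk = torch.linalg.cholesky(Kmm)
    Ainv_Knm = torch.cholesky_solve(Knm.T, Lk)          # Kmm^{-1} Knm^T
    f_mean = (Ainv_Knm.T @ m_u)[:, 0]
    f_var = (rbf(xb, xb).diag()
             - (Knm * Ainv_Knm.T).sum(-1)
             + ((Ainv_Knm.T @ S_u) * Ainv_Knm.T).sum(-1))
    # Expected Gaussian log-likelihood, rescaled from batch size to N.
    exp_ll = (N / len(idx)) * (
        -0.5 * ((yb - f_mean) ** 2 + f_var) / sigma2
        - 0.5 * torch.log(2 * torch.pi * sigma2)).sum()
    # KL( q(u) = N(m_u, S_u) || p(u) = N(0, Kmm) ) in closed form.
    Su_L = torch.linalg.cholesky(S_u + 1e-6 * torch.eye(M))
    kl = 0.5 * (torch.cholesky_solve(S_u, Lk).trace()
                + (m_u.T @ torch.cholesky_solve(m_u, Lk)).squeeze()
                - M
                + 2 * Lk.diagonal().log().sum()
                - 2 * Su_L.diagonal().log().sum())
    return exp_ll - kl

opt = torch.optim.Adam([Xc, yc, log_w, log_noise], lr=1e-2)
for step in range(200):
    idx = torch.randint(0, N, (64,))      # stochastic minibatch
    loss = -elbo(idx)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that this family exposes only $3M + 1$ free scalars (plus kernel hyperparameters), in contrast to the $\mathcal{O}(M^2)$ parameters of a free-form Gaussian covariance, which matches the linear $\mathcal{O}(M)$ search space highlighted in the abstract.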

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-ketenci25a,
  title     = {Accurate and Scalable Stochastic Gaussian Process Regression via Learnable Coreset-based Variational Inference},
  author    = {Ketenci, Mert and Perotte, Adler J and Elhadad, No\'{e}mie and Urteaga, I\~{n}igo},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {2101--2142},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/ketenci25a/ketenci25a.pdf},
  url       = {https://proceedings.mlr.press/v286/ketenci25a.html},
  abstract  = {We introduce a novel stochastic variational inference method for Gaussian process ($\mathcal{GP}$) regression, by deriving a posterior over a learnable coreset: i.e., over weighted pseudo-input/output pairs. Unlike previous free-form variational families for stochastic inference, our coreset-based variational $\mathcal{GP}$ (CVGP) is defined in terms of the $\mathcal{GP}$ prior and the (weighted) data likelihood. This formulation naturally incorporates the inductive biases of the prior, and ensures that its kernel and likelihood dependencies are shared with the posterior. We derive a variational lower-bound on the log-marginal likelihood by marginalizing over the latent $\mathcal{GP}$ coreset variables, and show that CVGP’s lower-bound is amenable to stochastic optimization. CVGP reduces the dimensionality of the variational parameter search space to linear $\mathcal{O}(M)$ complexity, while ensuring numerical stability at $\mathcal{O}(M^3)$ time complexity and $\mathcal{O}(M^2)$ space complexity. Evaluations on real-world and simulated regression problems demonstrate that CVGP achieves superior inference and predictive performance compared to state-of-the-art stochastic sparse $\mathcal{GP}$ approximation methods.}
}
Endnote
%0 Conference Paper
%T Accurate and Scalable Stochastic Gaussian Process Regression via Learnable Coreset-based Variational Inference
%A Mert Ketenci
%A Adler J Perotte
%A Noémie Elhadad
%A Iñigo Urteaga
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-ketenci25a
%I PMLR
%P 2101--2142
%U https://proceedings.mlr.press/v286/ketenci25a.html
%V 286
%X We introduce a novel stochastic variational inference method for Gaussian process ($\mathcal{GP}$) regression, by deriving a posterior over a learnable coreset: i.e., over weighted pseudo-input/output pairs. Unlike previous free-form variational families for stochastic inference, our coreset-based variational $\mathcal{GP}$ (CVGP) is defined in terms of the $\mathcal{GP}$ prior and the (weighted) data likelihood. This formulation naturally incorporates the inductive biases of the prior, and ensures that its kernel and likelihood dependencies are shared with the posterior. We derive a variational lower-bound on the log-marginal likelihood by marginalizing over the latent $\mathcal{GP}$ coreset variables, and show that CVGP’s lower-bound is amenable to stochastic optimization. CVGP reduces the dimensionality of the variational parameter search space to linear $\mathcal{O}(M)$ complexity, while ensuring numerical stability at $\mathcal{O}(M^3)$ time complexity and $\mathcal{O}(M^2)$ space complexity. Evaluations on real-world and simulated regression problems demonstrate that CVGP achieves superior inference and predictive performance compared to state-of-the-art stochastic sparse $\mathcal{GP}$ approximation methods.
APA
Ketenci, M., Perotte, A.J., Elhadad, N. & Urteaga, I. (2025). Accurate and Scalable Stochastic Gaussian Process Regression via Learnable Coreset-based Variational Inference. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:2101-2142. Available from https://proceedings.mlr.press/v286/ketenci25a.html.
