Parametric Gaussian Process Regressors

Martin Jankowiak, Geoff Pleiss, Jacob Gardner
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4702-4712, 2020.

Abstract

The combination of inducing point methods with stochastic variational inference has enabled approximate Gaussian Process (GP) inference on large datasets. Unfortunately, the resulting predictive distributions often exhibit substantially underestimated uncertainties. Notably, in the regression case the predictive variance is typically dominated by observation noise, yielding uncertainty estimates that make little use of the input-dependent function uncertainty that makes GP priors attractive. In this work we propose two simple methods for scalable GP regression that address this issue and thus yield substantially improved predictive uncertainties. The first applies variational inference to FITC (Fully Independent Training Conditional; Snelson et al., 2006). The second bypasses posterior approximations and instead directly targets the posterior predictive distribution. In an extensive empirical comparison with a number of alternative methods for scalable GP regression, we find that the resulting predictive distributions exhibit significantly better calibrated uncertainties and higher log likelihoods, often by as much as half a nat per datapoint.
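To make the contrast concrete, the sketch below (not taken from the paper's code release; it only uses standard Gaussian identities) compares, for a single training point, the per-datapoint term of the usual SVGP ELBO with the log predictive density that an objective directly targeting the posterior predictive distribution would use. With an approximate posterior marginal q(f) = N(mu, s^2) and observation noise sigma^2, the ELBO term penalizes the latent variance s^2 outright, while the predictive term folds s^2 into the predictive variance and lets it help explain the data. The paper's full objectives include further ingredients (e.g. KL regularization over inducing variables) that are omitted here.

# Minimal numerical sketch under the assumptions stated above.
import math

def gaussian_logpdf(y, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)

def elbo_term(y, mu, s2, sigma2):
    # E_{q(f)}[log N(y | f, sigma^2)] = log N(y | mu, sigma^2) - s^2 / (2 sigma^2)
    return gaussian_logpdf(y, mu, sigma2) - s2 / (2 * sigma2)

def predictive_term(y, mu, s2, sigma2):
    # log E_{q(f)}[N(y | f, sigma^2)] = log N(y | mu, sigma^2 + s^2)
    return gaussian_logpdf(y, mu, sigma2 + s2)

# Illustrative values only: latent variance s2 large relative to the noise sigma2.
y, mu, s2, sigma2 = 1.0, 0.0, 1.0, 0.1
print(elbo_term(y, mu, s2, sigma2))        # latent variance enters as a penalty -s2/(2*sigma2)
print(predictive_term(y, mu, s2, sigma2))  # latent variance widens the predictive distribution

Training on the ELBO term therefore pushes s^2 toward zero and leaves the predictive variance dominated by sigma^2, which is the behaviour the abstract criticizes; the predictive-targeting term instead rewards input-dependent function uncertainty that is consistent with the observed data.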

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-jankowiak20a,
  title     = {Parametric {G}aussian Process Regressors},
  author    = {Jankowiak, Martin and Pleiss, Geoff and Gardner, Jacob},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {4702--4712},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/jankowiak20a/jankowiak20a.pdf},
  url       = {https://proceedings.mlr.press/v119/jankowiak20a.html}
}
Endnote
%0 Conference Paper
%T Parametric Gaussian Process Regressors
%A Martin Jankowiak
%A Geoff Pleiss
%A Jacob Gardner
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-jankowiak20a
%I PMLR
%P 4702--4712
%U https://proceedings.mlr.press/v119/jankowiak20a.html
%V 119
APA
Jankowiak, M., Pleiss, G. & Gardner, J. (2020). Parametric Gaussian Process Regressors. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4702-4712. Available from https://proceedings.mlr.press/v119/jankowiak20a.html.
