Amortized Variational Deep Kernel Learning

Alan L. S. Matias, César Lincoln Mattos, João Paulo Pordeus Gomes, Diego Mesquita
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:35063-35078, 2024.

Abstract

Deep kernel learning (DKL) marries the uncertainty quantification of Gaussian processes (GPs) and the representational power of deep neural networks. However, training DKL is challenging and often leads to overfitting. Most notably, DKL often learns “non-local” kernels — incurring spurious correlations. To remedy this issue, we propose using amortized inducing points and a parameter-sharing scheme, which ties together the amortization and DKL networks. This design imposes an explicit dependency between the ELBO’s model fit and capacity terms. In turn, this prevents the former from dominating the optimization procedure and incurring the aforementioned spurious correlations. Extensive experiments show that our resulting method, amortized variational DKL (AVDKL), i) consistently outperforms DKL and standard GPs for tabular data; ii) achieves significantly higher accuracy than DKL in node classification tasks; and iii) leads to substantially better accuracy and negative log-likelihood than DKL on CIFAR100.
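For readers who want to see the core idea concretely, below is a minimal, self-contained PyTorch sketch of the mechanism the abstract describes: a single trunk network both computes the DKL feature map and amortizes the inducing points, so the ELBO's model-fit and capacity (KL) terms depend on shared parameters. All names, layer sizes, and the batch-mean amortization head are illustrative assumptions, not the authors' implementation.

# Minimal sketch of amortized variational DKL (illustrative assumptions
# throughout; see the paper for the actual AVDKL architecture).
import torch
import torch.nn as nn

class AmortizedDKLSketch(nn.Module):
    def __init__(self, in_dim, feat_dim=8, n_inducing=16):
        super().__init__()
        # Shared trunk: used for BOTH the DKL feature map and the
        # amortization of inducing points (the parameter-sharing scheme).
        self.trunk = nn.Sequential(nn.Linear(in_dim, 32), nn.Tanh(),
                                   nn.Linear(32, feat_dim))
        # Hypothetical amortization head: maps a batch summary of features
        # to inducing locations in feature space.
        self.amortize = nn.Linear(feat_dim, n_inducing * feat_dim)
        self.n_inducing, self.feat_dim = n_inducing, feat_dim
        # Variational parameters of q(u) = N(m, S), with S = L L^T.
        self.m = nn.Parameter(torch.zeros(n_inducing))
        self.L = nn.Parameter(torch.eye(n_inducing) * 0.1)
        self.log_noise = nn.Parameter(torch.zeros(()))

    def kernel(self, A, B):
        # RBF kernel on learned features: the "deep kernel".
        return torch.exp(-0.5 * torch.cdist(A, B).pow(2))

    def elbo(self, x, y):
        h = self.trunk(x)                           # features, (n, d)
        # Amortized inducing points from the batch mean of features.
        z = self.amortize(h.mean(0)).view(self.n_inducing, self.feat_dim)
        I = torch.eye(self.n_inducing)
        Kzz = self.kernel(z, z) + 1e-4 * I          # jitter for stability
        Kxz = self.kernel(h, z)
        # SVGP-style predictive mean/variance under q(u).
        A = torch.linalg.solve(Kzz, Kxz.T).T        # Kxz Kzz^{-1}, (n, m)
        mean = A @ self.m
        S = self.L @ self.L.T
        var = (1.0 - (A * Kxz).sum(-1)              # k(x, x) = 1 for RBF
               + (A @ S * A).sum(-1)).clamp_min(1e-6)
        noise = self.log_noise.exp()
        # Model-fit term: expected Gaussian log-likelihood.
        fit = (-0.5 * torch.log(2 * torch.pi * noise)
               - 0.5 * ((y - mean) ** 2 + var) / noise).sum()
        # Capacity term: KL(q(u) || p(u)), with prior p(u) = N(0, Kzz).
        kl = 0.5 * (torch.trace(torch.linalg.solve(Kzz, S))
                    + self.m @ torch.linalg.solve(Kzz, self.m)
                    - self.n_inducing
                    + torch.logdet(Kzz) - torch.logdet(S))
        return fit - kl  # the shared trunk ties both terms together

# Tiny usage example on synthetic data.
torch.manual_seed(0)
x = torch.randn(64, 5)
y = torch.sin(x.sum(-1))
model = AmortizedDKLSketch(in_dim=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(100):
    opt.zero_grad()
    loss = -model.elbo(x, y)
    loss.backward()
    opt.step()

Note how the design creates the dependency the abstract mentions: because the inducing points are produced by the same trunk as the features, the KL term's gradient flows back into the feature map, so the model-fit term cannot be optimized in isolation.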

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-matias24a,
  title     = {Amortized Variational Deep Kernel Learning},
  author    = {Matias, Alan L. S. and Mattos, C\'{e}sar Lincoln and Gomes, Jo\~{a}o Paulo Pordeus and Mesquita, Diego},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {35063--35078},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/matias24a/matias24a.pdf},
  url       = {https://proceedings.mlr.press/v235/matias24a.html},
  abstract  = {Deep kernel learning (DKL) marries the uncertainty quantification of Gaussian processes (GPs) and the representational power of deep neural networks. However, training DKL is challenging and often leads to overfitting. Most notably, DKL often learns “non-local” kernels — incurring spurious correlations. To remedy this issue, we propose using amortized inducing points and a parameter-sharing scheme, which ties together the amortization and DKL networks. This design imposes an explicit dependency between the ELBO’s model fit and capacity terms. In turn, this prevents the former from dominating the optimization procedure and incurring the aforementioned spurious correlations. Extensive experiments show that our resulting method, amortized variational DKL (AVDKL), i) consistently outperforms DKL and standard GPs for tabular data; ii) achieves significantly higher accuracy than DKL in node classification tasks; and iii) leads to substantially better accuracy and negative log-likelihood than DKL on CIFAR100.}
}
Endnote
%0 Conference Paper
%T Amortized Variational Deep Kernel Learning
%A Alan L. S. Matias
%A César Lincoln Mattos
%A João Paulo Pordeus Gomes
%A Diego Mesquita
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-matias24a
%I PMLR
%P 35063--35078
%U https://proceedings.mlr.press/v235/matias24a.html
%V 235
%X Deep kernel learning (DKL) marries the uncertainty quantification of Gaussian processes (GPs) and the representational power of deep neural networks. However, training DKL is challenging and often leads to overfitting. Most notably, DKL often learns “non-local” kernels — incurring spurious correlations. To remedy this issue, we propose using amortized inducing points and a parameter-sharing scheme, which ties together the amortization and DKL networks. This design imposes an explicit dependency between the ELBO’s model fit and capacity terms. In turn, this prevents the former from dominating the optimization procedure and incurring the aforementioned spurious correlations. Extensive experiments show that our resulting method, amortized variational DKL (AVDKL), i) consistently outperforms DKL and standard GPs for tabular data; ii) achieves significantly higher accuracy than DKL in node classification tasks; and iii) leads to substantially better accuracy and negative log-likelihood than DKL on CIFAR100.
APA
Matias, A.L.S., Mattos, C.L., Gomes, J.P.P. & Mesquita, D. (2024). Amortized Variational Deep Kernel Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:35063-35078. Available from https://proceedings.mlr.press/v235/matias24a.html.
