Deep kernels with probabilistic embeddings for small-data learning

Ankur Mallick, Chaitanya Dwivedi, Bhavya Kailkhura, Gauri Joshi, T. Yong-Jin Han
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:918-928, 2021.

Abstract

Gaussian Processes (GPs) are known to provide accurate predictions and uncertainty estimates even with small amounts of labeled data by capturing similarity between data points through their kernel function. However, traditional GP kernels are not very effective at capturing similarity between high-dimensional data points. Neural networks can be used to learn good representations that encode intricate structures in high-dimensional data, and these representations can be used as inputs to the GP kernel. However, the huge data requirement of neural networks makes this approach ineffective in small-data settings. To solve the conflicting problems of representation learning and data efficiency, we propose to learn deep kernels on probabilistic embeddings by using a probabilistic neural network. Our approach maps high-dimensional data to a probability distribution in a low-dimensional subspace and then computes a kernel between these distributions to capture similarity. To enable end-to-end learning, we derive a functional gradient descent procedure for training the model. Experiments on a variety of datasets show that our approach outperforms the state-of-the-art in GP kernel learning in both supervised and semi-supervised settings. We also extend our approach to other small-data paradigms such as few-shot classification, where it outperforms previous approaches on the mini-ImageNet and CUB datasets.
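To make the distribution-level kernel idea concrete, the short Python/NumPy sketch below computes the closed-form expected RBF kernel between two Gaussian embeddings with diagonal covariance, one standard way to measure similarity between probabilistic embeddings. This is an illustrative sketch under assumptions, not the paper's method: the function name expected_rbf_kernel, the toy embeddings, and the diagonal-Gaussian form are all ours; the paper's actual kernel and functional gradient descent training are defined in the full text.

import numpy as np

def expected_rbf_kernel(mu1, var1, mu2, var2, bandwidth2=1.0):
    # Closed-form E[k(x, y)] for x ~ N(mu1, diag(var1)), y ~ N(mu2, diag(var2))
    # under the RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth2)).
    # Since x - y ~ N(mu1 - mu2, diag(var1 + var2)), the expectation factorizes
    # over dimensions into one-dimensional Gaussian integrals with known form.
    s = bandwidth2 + var1 + var2                    # per-dimension total variance
    scale = np.sqrt(bandwidth2 / s)                 # per-dimension normalization
    gauss = np.exp(-((mu1 - mu2) ** 2) / (2.0 * s)) # mean-difference term
    return np.prod(scale * gauss)

# Toy stand-in for a probabilistic encoder: two inputs mapped to Gaussian
# embeddings (mean and per-dimension variance) in a 2-D latent space.
mu_a, var_a = np.array([0.0, 1.0]), np.array([0.1, 0.2])
mu_b, var_b = np.array([0.5, 0.8]), np.array([0.3, 0.1])
print(expected_rbf_kernel(mu_a, var_a, mu_b, var_b))  # similarity of the two distributions

The resulting similarity matrix over such embeddings could then serve as a GP kernel, with the encoder's parameters trained end to end; in this sketch the embeddings are fixed rather than produced by a learned network.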

Cite this Paper


BibTeX
@InProceedings{pmlr-v161-mallick21a,
  title     = {Deep kernels with probabilistic embeddings for small-data learning},
  author    = {Mallick, Ankur and Dwivedi, Chaitanya and Kailkhura, Bhavya and Joshi, Gauri and Han, T. Yong-Jin},
  booktitle = {Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence},
  pages     = {918--928},
  year      = {2021},
  editor    = {de Campos, Cassio and Maathuis, Marloes H.},
  volume    = {161},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v161/mallick21a/mallick21a.pdf},
  url       = {https://proceedings.mlr.press/v161/mallick21a.html},
  abstract  = {Gaussian Processes (GPs) are known to provide accurate predictions and uncertainty estimates even with small amounts of labeled data by capturing similarity between data points through their kernel function. However, traditional GP kernels are not very effective at capturing similarity between high-dimensional data points. Neural networks can be used to learn good representations that encode intricate structures in high-dimensional data, and these representations can be used as inputs to the GP kernel. However, the huge data requirement of neural networks makes this approach ineffective in small-data settings. To solve the conflicting problems of representation learning and data efficiency, we propose to learn deep kernels on probabilistic embeddings by using a probabilistic neural network. Our approach maps high-dimensional data to a probability distribution in a low-dimensional subspace and then computes a kernel between these distributions to capture similarity. To enable end-to-end learning, we derive a functional gradient descent procedure for training the model. Experiments on a variety of datasets show that our approach outperforms the state-of-the-art in GP kernel learning in both supervised and semi-supervised settings. We also extend our approach to other small-data paradigms such as few-shot classification, where it outperforms previous approaches on the mini-ImageNet and CUB datasets.}
}
Endnote
%0 Conference Paper
%T Deep kernels with probabilistic embeddings for small-data learning
%A Ankur Mallick
%A Chaitanya Dwivedi
%A Bhavya Kailkhura
%A Gauri Joshi
%A T. Yong-Jin Han
%B Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2021
%E Cassio de Campos
%E Marloes H. Maathuis
%F pmlr-v161-mallick21a
%I PMLR
%P 918--928
%U https://proceedings.mlr.press/v161/mallick21a.html
%V 161
%X Gaussian Processes (GPs) are known to provide accurate predictions and uncertainty estimates even with small amounts of labeled data by capturing similarity between data points through their kernel function. However, traditional GP kernels are not very effective at capturing similarity between high-dimensional data points. Neural networks can be used to learn good representations that encode intricate structures in high-dimensional data, and these representations can be used as inputs to the GP kernel. However, the huge data requirement of neural networks makes this approach ineffective in small-data settings. To solve the conflicting problems of representation learning and data efficiency, we propose to learn deep kernels on probabilistic embeddings by using a probabilistic neural network. Our approach maps high-dimensional data to a probability distribution in a low-dimensional subspace and then computes a kernel between these distributions to capture similarity. To enable end-to-end learning, we derive a functional gradient descent procedure for training the model. Experiments on a variety of datasets show that our approach outperforms the state-of-the-art in GP kernel learning in both supervised and semi-supervised settings. We also extend our approach to other small-data paradigms such as few-shot classification, where it outperforms previous approaches on the mini-ImageNet and CUB datasets.
APA
Mallick, A., Dwivedi, C., Kailkhura, B., Joshi, G. & Han, T.Y. (2021). Deep kernels with probabilistic embeddings for small-data learning. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 161:918-928. Available from https://proceedings.mlr.press/v161/mallick21a.html.