Differentiable Compositional Kernel Learning for Gaussian Processes

Shengyang Sun, Guodong Zhang, Chaoqi Wang, Wenyuan Zeng, Jiaman Li, Roger Grosse
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4828-4837, 2018.

Abstract

The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. The NKN’s architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel. It can compactly approximate compositional kernel structures such as those used by the Automatic Statistician (Lloyd et al., 2014), but because the architecture is differentiable, it is end-to-end trainable with gradient-based optimization. We show that the NKN is universal for the class of stationary kernels. Empirically we demonstrate NKN’s pattern discovery and extrapolation abilities on several tasks that depend crucially on identifying the underlying structure, including time series and texture extrapolation, as well as Bayesian optimization.
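
As a rough illustration of the composition rules the abstract refers to, the sketch below relies on two standard closure properties: a nonnegative weighted sum of kernels is a kernel, and a product of kernels is a kernel. It is not the authors' implementation. The function names, the choice of RBF and periodic primitives, the softplus reparameterization, and all numeric values are assumptions made for illustration only.

```python
import numpy as np

def rbf(x, y, lengthscale=1.0):
    # Squared-exponential kernel on 1-D inputs.
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def periodic(x, y, period=1.0, lengthscale=1.0):
    # Standard periodic kernel on 1-D inputs.
    d = np.abs(x[:, None] - y[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)

def softplus(a):
    # Maps unconstrained parameters to positive weights.
    return np.logaddexp(0.0, a)

def composed_kernel(x, y, raw_weights):
    # Hypothetical two-stage composition: two nonnegative weighted sums
    # of primitive kernels, followed by an elementwise product.
    w = softplus(np.asarray(raw_weights, dtype=float))
    k1 = rbf(x, y)
    k2 = periodic(x, y)
    sum_unit_a = w[0] * k1 + w[1] * k2  # nonnegative sum: still a kernel
    sum_unit_b = w[2] * k1 + w[3] * k2  # nonnegative sum: still a kernel
    return sum_unit_a * sum_unit_b      # product of kernels: still a kernel

x = np.linspace(0.0, 3.0, 20)
K = composed_kernel(x, x, raw_weights=[0.1, -0.3, 0.5, 0.2])

# The Gram matrix should be symmetric and positive semidefinite
# up to floating-point error.
min_eig = np.linalg.eigvalsh(K).min()
```

Because every operation here is smooth in `raw_weights`, the weights could in principle be fitted with gradient-based optimization, which is the property the abstract emphasizes.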

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-sun18e,
  title = {Differentiable Compositional Kernel Learning for {G}aussian Processes},
  author = {Sun, Shengyang and Zhang, Guodong and Wang, Chaoqi and Zeng, Wenyuan and Li, Jiaman and Grosse, Roger},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {4828--4837},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/sun18e/sun18e.pdf},
  url = {http://proceedings.mlr.press/v80/sun18e.html},
  abstract = {The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. The NKN’s architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel. It can compactly approximate compositional kernel structures such as those used by the Automatic Statistician (Lloyd et al., 2014), but because the architecture is differentiable, it is end-to-end trainable with gradient-based optimization. We show that the NKN is universal for the class of stationary kernels. Empirically we demonstrate NKN’s pattern discovery and extrapolation abilities on several tasks that depend crucially on identifying the underlying structure, including time series and texture extrapolation, as well as Bayesian optimization.}
}
Endnote
%0 Conference Paper
%T Differentiable Compositional Kernel Learning for Gaussian Processes
%A Shengyang Sun
%A Guodong Zhang
%A Chaoqi Wang
%A Wenyuan Zeng
%A Jiaman Li
%A Roger Grosse
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-sun18e
%I PMLR
%P 4828--4837
%U http://proceedings.mlr.press/v80/sun18e.html
%V 80
%X The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. The NKN’s architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel. It can compactly approximate compositional kernel structures such as those used by the Automatic Statistician (Lloyd et al., 2014), but because the architecture is differentiable, it is end-to-end trainable with gradient-based optimization. We show that the NKN is universal for the class of stationary kernels. Empirically we demonstrate NKN’s pattern discovery and extrapolation abilities on several tasks that depend crucially on identifying the underlying structure, including time series and texture extrapolation, as well as Bayesian optimization.
APA
Sun, S., Zhang, G., Wang, C., Zeng, W., Li, J., & Grosse, R. (2018). Differentiable Compositional Kernel Learning for Gaussian Processes. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4828-4837. Available from http://proceedings.mlr.press/v80/sun18e.html.
