Deep Kernel Learning

Andrew Gordon Wilson; Zhiting Hu; Ruslan Salakhutdinov; Eric P. Xing

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:370-378, 2016.

Abstract

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation. These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability. We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process. Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point. On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.
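The core construction in the abstract, a base kernel applied to inputs transformed by a deep architecture and trained through the Gaussian process marginal likelihood, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not the authors' implementation: the feature map `mlp_features` is a hypothetical two-layer tanh network, an RBF kernel stands in for the spectral mixture base kernel, and the marginal likelihood is computed exactly with a Cholesky factorization at O(n^3) cost, so the local kernel interpolation, inducing points, and Kronecker/Toeplitz algebra that give the paper its O(n) scaling are not shown.

```python
import numpy as np

def mlp_features(X, weights):
    """Hypothetical feature map g(x; w): tanh hidden layers, linear output layer."""
    H = X
    for W, b in weights[:-1]:
        H = np.tanh(H @ W + b)
    W, b = weights[-1]
    return H @ W + b

def rbf_kernel(A, B, lengthscale=1.0):
    """RBF base kernel (an assumption; the paper uses a spectral mixture base kernel)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def deep_kernel(X1, X2, weights, lengthscale=1.0):
    """Deep kernel: the base kernel evaluated on network-transformed inputs."""
    return rbf_kernel(mlp_features(X1, weights), mlp_features(X2, weights), lengthscale)

def neg_log_marginal_likelihood(X, y, weights, lengthscale=1.0, noise=0.1):
    """Exact GP negative log marginal likelihood, O(n^3); no scalable approximation."""
    n = X.shape[0]
    K = deep_kernel(X, X, weights, lengthscale) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * log|K| equals the sum of the log of the Cholesky diagonal.
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2.0 * np.pi)

# Toy data and randomly initialized network weights (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.sin(X[:, 0])
weights = [
    (rng.normal(size=(3, 8)), np.zeros(8)),
    (rng.normal(size=(8, 2)), np.zeros(2)),
]
K = deep_kernel(X, X, weights)
nll = neg_log_marginal_likelihood(X, y, weights)
```

In a full treatment, the network weights, the base kernel hyperparameters, and the noise level would all be optimized jointly by minimizing this objective with a gradient-based method; the sketch only evaluates it once.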

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-wilson16,
  title = 	 {Deep Kernel Learning},
  author = 	 {Wilson, Andrew Gordon and Hu, Zhiting and Salakhutdinov, Ruslan and Xing, Eric P.},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {370--378},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/wilson16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/wilson16.html},
  abstract = 	 {We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods.  Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation.  These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability.  We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process.  Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point.  On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.}
}

Endnote

%0 Conference Paper
%T Deep Kernel Learning
%A Andrew Gordon Wilson
%A Zhiting Hu
%A Ruslan Salakhutdinov
%A Eric P. Xing
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-wilson16
%I PMLR
%P 370--378
%U https://proceedings.mlr.press/v51/wilson16.html
%V 51
%X We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods.  Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation.  These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability.  We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process.  Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point.  On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.

RIS


TY  - CPAPER
TI  - Deep Kernel Learning
AU  - Andrew Gordon Wilson
AU  - Zhiting Hu
AU  - Ruslan Salakhutdinov
AU  - Eric P. Xing
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-wilson16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 370
EP  - 378
L1  - http://proceedings.mlr.press/v51/wilson16.pdf
UR  - https://proceedings.mlr.press/v51/wilson16.html
AB  - We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods.  Specifically, we transform the inputs of a spectral mixture base kernel with a deep architecture, using local kernel interpolation, inducing points, and structure exploiting (Kronecker and Toeplitz) algebra for a scalable kernel representation.  These closed-form kernels can be used as drop-in replacements for standard kernels, with benefits in expressive power and scalability.  We jointly learn the properties of these kernels through the marginal likelihood of a Gaussian process.  Inference and learning cost O(n) for n training points, and predictions cost O(1) per test point.  On a large and diverse collection of applications, including a dataset with 2 million examples, we show improved performance over scalable Gaussian processes with flexible kernel learning models, and stand-alone deep architectures.
ER  -

APA


Wilson, A.G., Hu, Z., Salakhutdinov, R. & Xing, E.P. (2016). Deep Kernel Learning. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:370-378. Available from https://proceedings.mlr.press/v51/wilson16.html.
