Relative Error Embeddings of the Gaussian Kernel Distance

Di Chen, Jeff M. Phillips
Proceedings of the 28th International Conference on Algorithmic Learning Theory, PMLR 76:560-576, 2017.

Abstract

A reproducing kernel defines an embedding of a data point into an infinite dimensional reproducing kernel Hilbert space (RKHS). The norm in this space describes a distance, which we call the kernel distance. The random Fourier features (of Rahimi and Recht) describe an oblivious approximate mapping into finite dimensional Euclidean space that behaves similarly to the RKHS. We show in this paper that for the Gaussian kernel the Euclidean norm between the mapped features has $(1+\varepsilon)$-relative error with respect to the kernel distance. When there are $n$ data points, we show that $O((1/\varepsilon^2) \log n)$ dimensions of the approximate feature space are sufficient and necessary. Without a bound on $n$, but when the original points lie in $\mathbb{R}^d$ and have diameter bounded by $\mathcal{M}$, then we show that $O((d/\varepsilon^2) \log \mathcal{M})$ dimensions are sufficient, and that this many are required, up to $\log(1/\varepsilon)$ factors. We empirically confirm that relative error is indeed preserved for kernel PCA using these approximate feature maps.
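
For a positive definite kernel $K$, the kernel distance referenced above is $D_K(p,q) = \sqrt{K(p,p) + K(q,q) - 2K(p,q)}$, which for the (normalized) Gaussian kernel reduces to $\sqrt{2 - 2K(p,q)}$. The following is a minimal sketch, not the paper's exact construction or parameter choices, of how a Rahimi-Recht style random Fourier feature map can approximate this distance in ordinary Euclidean space; the feature dimension D, the bandwidth sigma, and the paired cosine/sine variant used here are illustrative assumptions.

# A minimal sketch (assumed variant, not the paper's exact construction) of
# random Fourier features for the Gaussian kernel
# K(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
import numpy as np

def rff_map(X, D=2000, sigma=1.0, seed=None):
    """Map rows of X (n x d) to 2*D-dimensional features z(x) such that
    <z(x), z(y)> approximates K(x, y) in expectation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Frequencies drawn from the Fourier transform of the Gaussian kernel.
    W = rng.normal(scale=1.0 / sigma, size=(d, D))
    P = X @ W  # n x D projections
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(D)

def gaussian_kernel_distance(x, y, sigma=1.0):
    """Exact kernel distance D_K(x, y) = sqrt(2 - 2 K(x, y))."""
    k = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
    return np.sqrt(2 - 2 * k)

# Usage: compare the Euclidean distance between mapped features with the
# exact kernel distance; their ratio should concentrate near 1 as D grows,
# which is the relative-error behavior discussed in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))
Z = rff_map(X, D=4000, sigma=1.0, seed=1)
approx = np.linalg.norm(Z[0] - Z[1])
exact = gaussian_kernel_distance(X[0], X[1], sigma=1.0)
print(approx / exact)  # close to 1, i.e. small relative error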

Cite this Paper


BibTeX
@InProceedings{pmlr-v76-chen17a,
  title = {Relative Error Embeddings of the Gaussian Kernel Distance},
  author = {Chen, Di and Phillips, Jeff M.},
  booktitle = {Proceedings of the 28th International Conference on Algorithmic Learning Theory},
  pages = {560--576},
  year = {2017},
  editor = {Hanneke, Steve and Reyzin, Lev},
  volume = {76},
  series = {Proceedings of Machine Learning Research},
  month = {15--17 Oct},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v76/chen17a/chen17a.pdf},
  url = {https://proceedings.mlr.press/v76/chen17a.html}
}
Endnote
%0 Conference Paper
%T Relative Error Embeddings of the Gaussian Kernel Distance
%A Di Chen
%A Jeff M. Phillips
%B Proceedings of the 28th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2017
%E Steve Hanneke
%E Lev Reyzin
%F pmlr-v76-chen17a
%I PMLR
%P 560--576
%U https://proceedings.mlr.press/v76/chen17a.html
%V 76
APA
Chen, D. & Phillips, J.M. (2017). Relative Error Embeddings of the Gaussian Kernel Distance. Proceedings of the 28th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 76:560-576. Available from https://proceedings.mlr.press/v76/chen17a.html.