How rotational invariance of common kernels prevents generalization in high dimensions

Konstantin Donhauser, Mingqi Wu, Fanny Yang
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2804-2814, 2021.

Abstract

Kernel ridge regression is well known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for high-dimensional kernel regression under a number of specific assumptions on the data distribution. In this paper, we show that in high dimensions, the rotational invariance property of commonly studied kernels (such as RBF, inner-product kernels, and the fully-connected NTK of any depth) leads to inconsistent estimation unless the ground truth is a low-degree polynomial. Our lower bound on the generalization error holds for a wide range of distributions and kernels with different eigenvalue decays. This lower bound suggests that consistency results for kernel ridge regression in high dimensions generally require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.
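
The rotational invariance at issue is easy to state concretely: a kernel k is rotationally invariant if k(Ux, Ux') = k(x, x') for every orthogonal matrix U, which holds for any kernel that depends on its inputs only through ||x||, ||x'||, and <x, x'> (the RBF and inner-product kernels are the standard examples). Below is a minimal numerical check of this property for the RBF kernel; the rbf_kernel helper and its bandwidth parameter are illustrative choices for this sketch, not code from the paper.

import numpy as np

def rbf_kernel(x, xp, bandwidth=1.0):
    """RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 * bandwidth^2))."""
    return np.exp(-np.sum((x - xp) ** 2) / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
d = 50  # moderately high dimension, for illustration
x, xp = rng.standard_normal(d), rng.standard_normal(d)

# Draw a random orthogonal matrix U via QR decomposition of a Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Rotating both inputs by U leaves ||x - x'|| unchanged, hence the kernel value too.
print(rbf_kernel(x, xp))          # tiny in d = 50, since ||x - x'||^2 concentrates around 2d
print(rbf_kernel(U @ x, U @ xp))  # identical up to floating-point error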

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-donhauser21a,
  title     = {How rotational invariance of common kernels prevents generalization in high dimensions},
  author    = {Donhauser, Konstantin and Wu, Mingqi and Yang, Fanny},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2804--2814},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/donhauser21a/donhauser21a.pdf},
  url       = {https://proceedings.mlr.press/v139/donhauser21a.html}
}
Endnote
%0 Conference Paper
%T How rotational invariance of common kernels prevents generalization in high dimensions
%A Konstantin Donhauser
%A Mingqi Wu
%A Fanny Yang
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-donhauser21a
%I PMLR
%P 2804--2814
%U https://proceedings.mlr.press/v139/donhauser21a.html
%V 139
APA
Donhauser, K., Wu, M., & Yang, F. (2021). How rotational invariance of common kernels prevents generalization in high dimensions. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2804-2814. Available from https://proceedings.mlr.press/v139/donhauser21a.html.
