Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data

Umberto M Tomasini, Antonio Sclocchi, Matthieu Wyart
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:21548-21583, 2022.


Recently, several theories including the replica method made predictions for the generalization error of Kernel Ridge Regression. In some regimes, they predict that the method has a ‘spectral bias’: decomposing the true function $f^*$ on the eigenbasis of the kernel, it fits well the coefficients associated with the $O(P)$ largest eigenvalues, where $P$ is the size of the training set. This prediction works very well on benchmark data sets such as images, yet the assumptions these approaches make on the data are never satisfied in practice. To clarify when the spectral bias prediction holds, we first focus on a one-dimensional model where rigorous results are obtained and then use scaling arguments to generalize and test our findings in higher dimensions. Our predictions include the classification case $f(x)=\mathrm{sign}(x_1)$ with a data distribution that vanishes at the decision boundary $p(x)\sim x_1^{\chi}$. For $\chi>0$ and a Laplace kernel, we find that (i) there exists a cross-over ridge $\lambda^*_{d,\chi}(P)\sim P^{-\frac{1}{d+\chi}}$ such that for $\lambda\gg\lambda^*_{d,\chi}(P)$, the replica method applies, but not for $\lambda\ll\lambda^*_{d,\chi}(P)$, (ii) in the ridge-less case, spectral bias predicts the correct training curve exponent only in the limit $d\rightarrow\infty$.
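The one-dimensional setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The interval $[-1, 1]$, the kernel bandwidth `sigma`, the way the ridge enters the linear system, and all function names are assumptions made for illustration; the paper may normalize the ridge differently.

```python
import numpy as np


def sample_inputs(n, chi, rng):
    """Draw n points on [-1, 1] with density proportional to |x|**chi.

    Inverse-CDF sampling: |x| has CDF r**(chi + 1) on [0, 1].
    The interval [-1, 1] is an assumption of this sketch.
    """
    radius = rng.uniform(size=n) ** (1.0 / (chi + 1.0))
    sign = rng.choice([-1.0, 1.0], size=n)
    return sign * radius


def laplace_kernel(a, b, sigma=1.0):
    """Laplace kernel exp(-|a - b| / sigma) between two 1D point sets."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / sigma)


def krr_test_error(P, chi, ridge, rng, n_test=2000):
    """Mean squared test error of Laplace KRR on the target sign(x).

    The ridge is added directly to the Gram matrix (an assumed convention).
    """
    x_train = sample_inputs(P, chi, rng)
    x_test = sample_inputs(n_test, chi, rng)
    y_train = np.sign(x_train)
    y_test = np.sign(x_test)
    gram = laplace_kernel(x_train, x_train)
    coef = np.linalg.solve(gram + ridge * np.eye(P), y_train)
    pred = laplace_kernel(x_test, x_train) @ coef
    return float(np.mean((pred - y_test) ** 2))


def crossover_ridge_exponent(d, chi):
    """Exponent of P in the cross-over ridge scaling P**(-1 / (d + chi))."""
    return -1.0 / (d + chi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for P in (50, 100, 200, 400):
        err = krr_test_error(P, chi=1.0, ridge=1e-3, rng=rng)
        print(P, err)
    print("cross-over exponent for d=1, chi=1:", crossover_ridge_exponent(1, 1.0))
```

Sweeping `ridge` at fixed `P` on either side of `P ** crossover_ridge_exponent(d, chi)` is one way to probe the two regimes the abstract distinguishes. A single run like the one above is noisy, so averaging over several seeds would be needed before reading off any exponent.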

Cite this Paper

@InProceedings{pmlr-v162-tomasini22a,
  title = {Failure and success of the spectral bias prediction for {L}aplace Kernel Ridge Regression: the case of low-dimensional data},
  author = {Tomasini, Umberto M and Sclocchi, Antonio and Wyart, Matthieu},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {21548--21583},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {},
  url = {},
  abstract = {Recently, several theories including the replica method made predictions for the generalization error of Kernel Ridge Regression. In some regimes, they predict that the method has a ‘spectral bias’: decomposing the true function $f^*$ on the eigenbasis of the kernel, it fits well the coefficients associated with the O(P) largest eigenvalues, where $P$ is the size of the training set. This prediction works very well on benchmark data sets such as images, yet the assumptions these approaches make on the data are never satisfied in practice. To clarify when the spectral bias prediction holds, we first focus on a one-dimensional model where rigorous results are obtained and then use scaling arguments to generalize and test our findings in higher dimensions. Our predictions include the classification case $f(x)=$sign$(x_1)$ with a data distribution that vanishes at the decision boundary $p(x)\sim x_1^{\chi}$. For $\chi>0$ and a Laplace kernel, we find that (i) there exists a cross-over ridge $\lambda^*_{d,\chi}(P)\sim P^{-\frac{1}{d+\chi}}$ such that for $\lambda\gg \lambda^*_{d,\chi}(P)$, the replica method applies, but not for $\lambda\ll\lambda^*_{d,\chi}(P)$, (ii) in the ridge-less case, spectral bias predicts the correct training curve exponent only in the limit $d\rightarrow\infty$.}
}
%0 Conference Paper
%T Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data
%A Umberto M Tomasini
%A Antonio Sclocchi
%A Matthieu Wyart
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-tomasini22a
%I PMLR
%P 21548--21583
%U
%V 162
%X Recently, several theories including the replica method made predictions for the generalization error of Kernel Ridge Regression. In some regimes, they predict that the method has a ‘spectral bias’: decomposing the true function $f^*$ on the eigenbasis of the kernel, it fits well the coefficients associated with the O(P) largest eigenvalues, where $P$ is the size of the training set. This prediction works very well on benchmark data sets such as images, yet the assumptions these approaches make on the data are never satisfied in practice. To clarify when the spectral bias prediction holds, we first focus on a one-dimensional model where rigorous results are obtained and then use scaling arguments to generalize and test our findings in higher dimensions. Our predictions include the classification case $f(x)=$sign$(x_1)$ with a data distribution that vanishes at the decision boundary $p(x)\sim x_1^{\chi}$. For $\chi>0$ and a Laplace kernel, we find that (i) there exists a cross-over ridge $\lambda^*_{d,\chi}(P)\sim P^{-\frac{1}{d+\chi}}$ such that for $\lambda\gg \lambda^*_{d,\chi}(P)$, the replica method applies, but not for $\lambda\ll\lambda^*_{d,\chi}(P)$, (ii) in the ridge-less case, spectral bias predicts the correct training curve exponent only in the limit $d\rightarrow\infty$.
Tomasini, U.M., Sclocchi, A. & Wyart, M. (2022). Failure and success of the spectral bias prediction for Laplace Kernel Ridge Regression: the case of low-dimensional data. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:21548-21583. Available from
