Generalization in Kernel Regression Under Realistic Assumptions

Daniel Barzilai, Ohad Shamir
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:3096-3132, 2024.

Abstract

It is by now well-established that modern over-parameterized models seem to elude the bias-variance tradeoff and generalize well despite overfitting noise. Many recent works attempt to analyze this phenomenon in the relatively tractable setting of kernel regression. However, as we argue in detail, most past works on this topic either make unrealistic assumptions, or focus on a narrow problem setup. This work aims to provide a unified theory to upper bound the excess risk of kernel regression for nearly all common and realistic settings. When applied to common kernels, our results imply benign overfitting in high input dimensions, nearly tempered overfitting in fixed dimensions, and explicit convergence rates for regularized regression. As a by-product, we obtain time-dependent bounds for neural networks trained in the kernel regime. Our results rely on new relative perturbation bounds for the eigenvalues of kernel matrices, which may be of independent interest. These reveal a self-regularization phenomenon, whereby a heavy tail in the eigendecomposition of the kernel implicitly leads to good generalization.
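To make the setting concrete, below is a minimal, self-contained sketch (illustrative only, not taken from the paper) of near-interpolating kernel regression: it fits noisy training labels almost exactly, evaluates test error against clean targets, and prints the eigenvalue spectrum of the empirical kernel matrix to show the kind of heavy-tailed eigendecomposition the abstract refers to. The RBF kernel, bandwidth, dimensions, sample sizes, and noise level are all assumptions chosen for the demo.

# Illustrative sketch (not the paper's method): near-interpolating kernel
# regression with an RBF kernel, plus a look at the kernel matrix spectrum.
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, Z, bandwidth=1.0):
    # K[i, j] = exp(-||x_i - z_j||^2 / (2 * bandwidth^2))
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

n, d, noise = 200, 10, 0.5
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sin(X @ np.ones(d)) + noise * rng.standard_normal(n)   # noisy labels

K = rbf_kernel(X, X)
ridge = 1e-8                          # essentially no explicit regularization
alpha = np.linalg.solve(K + ridge * np.eye(n), y)

X_test = rng.standard_normal((1000, d)) / np.sqrt(d)
y_test = np.sin(X_test @ np.ones(d))                          # clean targets
pred = rbf_kernel(X_test, X) @ alpha
print("train MSE:", np.mean((K @ alpha - y) ** 2))            # ~0: noise is (over)fit
print("test  MSE:", np.mean((pred - y_test) ** 2))

# Spectrum of the empirical kernel matrix: a slowly decaying tail of
# eigenvalues behaves roughly like an added ridge term, which is the flavor of
# the self-regularization effect the abstract alludes to.
eigvals = np.linalg.eigvalsh(K)[::-1]                         # descending order
print("top eigenvalues:", np.round(eigvals[:5], 3))
print("tail eigenvalue mass:", eigvals[5:].sum())

The specific numbers above have no particular meaning; the point of the sketch is only that the training error can be driven to (near) zero on noisy labels while the test error is governed by how the kernel's eigenvalues decay.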

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-barzilai24a,
  title     = {Generalization in Kernel Regression Under Realistic Assumptions},
  author    = {Barzilai, Daniel and Shamir, Ohad},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {3096--3132},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/barzilai24a/barzilai24a.pdf},
  url       = {https://proceedings.mlr.press/v235/barzilai24a.html}
}
Endnote
%0 Conference Paper
%T Generalization in Kernel Regression Under Realistic Assumptions
%A Daniel Barzilai
%A Ohad Shamir
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-barzilai24a
%I PMLR
%P 3096--3132
%U https://proceedings.mlr.press/v235/barzilai24a.html
%V 235
APA
Barzilai, D. & Shamir, O. (2024). Generalization in Kernel Regression Under Realistic Assumptions. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:3096-3132. Available from https://proceedings.mlr.press/v235/barzilai24a.html.