Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4483-4491, 2024.
Abstract
We study the generalization capability of nearly-interpolating linear regressors: β’s whose training error τ is positive but small, i.e., below the noise floor. Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix Σ, we demonstrate that any near-interpolator exhibits rapid norm growth: for τ fixed, β has squared ℓ2-norm E[‖β‖₂²] = Ω(n^α), where n is the number of samples and α > 1 is the exponent of the eigendecay, i.e., λ_i(Σ) ∼ i^{−α}. This implies that existing data-independent norm-based bounds are necessarily loose. On the other hand, in the same regime we precisely characterize the asymptotic trade-off between interpolation and generalization. Our characterization reveals that larger norm scaling exponents α correspond to worse trade-offs between interpolation and generalization. We verify empirically that a similar phenomenon holds for nearly-interpolating shallow neural networks.
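The norm-growth claim can be illustrated with a small simulation. The sketch below is not the paper's experimental setup; the choices of α, τ, the noise level, the pure-noise labels (β* = 0), the overparameterization ratio, and the ridge-parameter grid are all illustrative assumptions. It draws data whose covariance has power-law eigenvalues λ_i ∼ i^{−α}, finds the minimum-ℓ2-norm regressor whose training error is pinned near a fixed τ below the noise floor (such a β lies on the ridge path), and tracks how ‖β‖₂² grows with n.

```python
# Minimal sketch (not the paper's setup): track the squared norm of a near-interpolator
# as n grows, under power-law eigendecay lambda_i(Sigma) ~ i^{-alpha}.
# All choices below (alpha, tau, noise level, p = 5n, the lambda grid) are illustrative.
import numpy as np

rng = np.random.default_rng(0)
alpha = 1.5          # eigendecay exponent (alpha > 1)
sigma_noise = 1.0    # noise standard deviation; noise floor = sigma_noise**2 = 1.0
tau = 0.5            # fixed training-error target, below the noise floor

def near_interpolator_sq_norm(n, p_factor=5, n_lambdas=60):
    """Squared l2-norm of the min-norm beta whose training MSE is (approximately) tau."""
    p = p_factor * n                                       # overparameterized regime
    eigs = np.arange(1, p + 1, dtype=float) ** (-alpha)    # lambda_i(Sigma) ~ i^{-alpha}
    X = rng.standard_normal((n, p)) * np.sqrt(eigs)        # rows ~ N(0, Sigma)
    y = sigma_noise * rng.standard_normal(n)               # pure-noise labels (beta* = 0)

    # The minimum-norm beta with training MSE <= tau lies on the ridge path, so sweep
    # the ridge parameter and keep the largest one that still meets the constraint.
    G = X @ X.T                                            # n x n Gram matrix
    best_beta = None
    for lam in np.logspace(-8, 2, n_lambdas):
        dual = np.linalg.solve(G + lam * np.eye(n), y)     # ridge solution, kernel form
        beta = X.T @ dual
        if np.mean((X @ beta - y) ** 2) <= tau:
            best_beta = beta                               # larger lam => smaller norm
    return np.sum(best_beta ** 2)

# The lower bound E[‖beta‖_2^2] = Omega(n^alpha) predicts rapid growth of these values.
for n in [50, 100, 200, 400]:
    print(n, near_interpolator_sq_norm(n))
```

Under these assumptions, plotting log ‖β‖₂² against log n should give a slope of roughly α or more, consistent with the Ω(n^α) lower bound stated in the abstract.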