Is Memorization Helpful or Harmful? Prior Information Sets the Threshold

Chen Cheng, Rina Foygel Barber
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:1399-1433, 2026.

Abstract

We examine the connection between training error and generalization error for arbitrary estimating procedures, working in an overparameterized linear model under general priors in a Bayesian setup. We find determining factors inherent to the prior distribution $\pi$, giving explicit conditions under which optimal generalization necessitates that the training error be (i) near interpolating relative to the noise size (i.e., memorization is necessary), or (ii) close to the noise level (i.e., overfitting is harmful). Remarkably, these phenomena occur when the noise reaches thresholds determined by the Fisher information and the variance parameters of the prior $\pi$.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-cheng26a,
  title = {Is Memorization Helpful or Harmful? Prior Information Sets the Threshold},
  author = {Cheng, Chen and Barber, Rina Foygel},
  booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = {1399--1433},
  year = {2026},
  editor = {Hanneke, Steve and Lattimore, Tor},
  volume = {336},
  series = {Proceedings of Machine Learning Research},
  month = {29 Jun--03 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/cheng26a/cheng26a.pdf},
  url = {https://proceedings.mlr.press/v336/cheng26a.html},
  abstract = {We examine the connection between training error and generalization error for arbitrary estimating procedures, working in an overparameterized linear model under general priors in a Bayesian setup. We find determining factors inherent to the prior distribution $\pi$, giving explicit conditions under which optimal generalization necessitates that the training error be (i) near interpolating relative to the noise size (i.e., memorization is necessary), or (ii) close to the noise level (i.e., overfitting is harmful). Remarkably, these phenomena occur when the noise reaches thresholds determined by the Fisher information and the variance parameters of the prior $\pi$.}
}
Endnote
%0 Conference Paper
%T Is Memorization Helpful or Harmful? Prior Information Sets the Threshold
%A Chen Cheng
%A Rina Foygel Barber
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-cheng26a
%I PMLR
%P 1399--1433
%U https://proceedings.mlr.press/v336/cheng26a.html
%V 336
%X We examine the connection between training error and generalization error for arbitrary estimating procedures, working in an overparameterized linear model under general priors in a Bayesian setup. We find determining factors inherent to the prior distribution $\pi$, giving explicit conditions under which optimal generalization necessitates that the training error be (i) near interpolating relative to the noise size (i.e., memorization is necessary), or (ii) close to the noise level (i.e., overfitting is harmful). Remarkably, these phenomena occur when the noise reaches thresholds determined by the Fisher information and the variance parameters of the prior $\pi$.
APA
Cheng, C. & Barber, R.F. (2026). Is Memorization Helpful or Harmful? Prior Information Sets the Threshold. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:1399-1433. Available from https://proceedings.mlr.press/v336/cheng26a.html.
