Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:2170-2178, 2025.

Abstract

Accuracy-on-the-line is a widely observed phenomenon in machine learning, where a model’s accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behaviour and may even exacerbate it. We formally prove a lower bound on OOD error in a linear classification model, characterising the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
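The abstract's central claim — that label noise combined with a large nuisance feature space can reverse the usual positive ID/OOD accuracy relationship — can be illustrated with a small synthetic sketch. This is not the paper's actual experimental setup; all dimensions, noise rates, and shift magnitudes below are illustrative choices. A minimum-norm linear interpolator is fit to data with one signal-carrying core feature, many pure-noise nuisance features, and a fraction of flipped training labels. Interpolating the flipped labels forces weight onto the nuisance coordinates, which an OOD shift that inflates nuisance variance then penalises:

```python
import random

random.seed(0)

D_NUIS = 100   # nuisance (pure-noise) feature dimensions
N_TRAIN = 50   # overparameterised regime: fewer samples than features
FLIP = 0.2     # label-noise rate on the training set

def sample(n, nuis_std):
    """One core feature carrying the signal plus D_NUIS noise features."""
    data = []
    for _ in range(n):
        y = random.choice([-1, 1])
        core = 2.0 * y + random.gauss(0, 1)
        nuis = [random.gauss(0, nuis_std) for _ in range(D_NUIS)]
        data.append(([core] + nuis, y))
    return data

train = sample(N_TRAIN, 1.0)
X = [x for x, _ in train]
# inject label noise: flip a fraction of the training labels
y = [-yi if random.random() < FLIP else yi for _, yi in train]

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

# Minimum-norm interpolator w = X^T (X X^T)^{-1} y: it fits every
# (noisy) training label exactly, and must use the nuisance
# coordinates to absorb the flipped labels.
G = [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]
alpha = solve(G, y)
w = [sum(alpha[i] * X[i][j] for i in range(N_TRAIN))
     for j in range(1 + D_NUIS)]

def accuracy(data):
    return sum((sum(wi * xi for wi, xi in zip(w, x)) > 0) == (yi > 0)
               for x, yi in data) / len(data)

id_acc = accuracy(sample(4000, 1.0))   # in-distribution test set
ood_acc = accuracy(sample(4000, 3.0))  # OOD: nuisance variance inflated
print(f"ID accuracy {id_acc:.3f}  vs  OOD accuracy {ood_acc:.3f}")
```

Because the nuisance weights carry no signal, their contribution at a fresh point is pure noise; tripling the nuisance standard deviation out of distribution amplifies that noise ninefold in variance, so OOD accuracy falls well below ID accuracy — a toy instance of the degradation the abstract describes.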

Cite this Paper
BibTeX
@InProceedings{pmlr-v258-sanyal25a,
  title     = {Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation},
  author    = {Sanyal, Amartya and Hu, Yaxi and Yu, Yaodong and Ma, Yian and Wang, Yixin and Sch{\"o}lkopf, Bernhard},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {2170--2178},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/sanyal25a/sanyal25a.pdf},
  url       = {https://proceedings.mlr.press/v258/sanyal25a.html},
  abstract  = {Accuracy-on-the-line is a widely observed phenomenon in machine learning, where a model’s accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behaviour and may even exacerbate it. We formally prove a lower bound on OOD error in a linear classification model, characterising the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.}
}
Endnote
%0 Conference Paper
%T Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
%A Amartya Sanyal
%A Yaxi Hu
%A Yaodong Yu
%A Yian Ma
%A Yixin Wang
%A Bernhard Schölkopf
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-sanyal25a
%I PMLR
%P 2170--2178
%U https://proceedings.mlr.press/v258/sanyal25a.html
%V 258
%X Accuracy-on-the-line is a widely observed phenomenon in machine learning, where a model’s accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behaviour and may even exacerbate it. We formally prove a lower bound on OOD error in a linear classification model, characterising the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
APA
Sanyal, A., Hu, Y., Yu, Y., Ma, Y., Wang, Y. & Schölkopf, B. (2025). Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:2170-2178. Available from https://proceedings.mlr.press/v258/sanyal25a.html.