Statistical Bias Leads to Overestimated OOD Generalization in Algorithmic Tasks for Seq2Seq Transformer Models
Proceedings of The 4th Conference on Lifelong Learning Agents, PMLR 330:204-221, 2026.
Abstract
This study examines how statistical bias affects a model’s ability to generalize to in-distribution and out-of-distribution (OOD) data on algorithmic tasks. Prior research indicates that transformers may inadvertently learn to rely on spurious correlations present in the training data, leading to an overestimation of their generalization capabilities. To investigate this, we evaluate seq2seq transformer models on several synthetic algorithmic tasks, systematically introducing and varying the presence of such biases. We also analyze how different architectural design choices of the transformer models affect their generalization. Our findings suggest that the presence of statistical biases can degrade model performance on out-of-distribution data, leading to an overestimation of generalization capabilities. The models rely heavily on these spurious correlations for inference, as indicated by their performance on tasks that include such biases.
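To make the setup concrete, here is a minimal sketch (not taken from the paper) of how a spurious correlation might be injected into a synthetic algorithmic task. The task, dataset sizes, and the specific cue (forcing the first input token to equal the first output token of a sequence-reversal task) are illustrative assumptions, not the authors' actual protocol.

```python
import random

def make_example(length, biased, rng):
    # Task: map an input sequence of digits to its reversal.
    seq = [rng.randint(0, 9) for _ in range(length)]
    if biased:
        # Hypothetical spurious cue: force the first input token to
        # equal the last one, so the first *output* token can be read
        # off the first input token without learning to reverse.
        seq[0] = seq[-1]
    return seq, list(reversed(seq))

def make_dataset(n, length, biased, seed=0):
    rng = random.Random(seed)
    return [make_example(length, biased, rng) for _ in range(n)]

# Biased (in-distribution) training split vs. unbiased OOD split.
biased_train = make_dataset(1000, 8, biased=True)
unbiased_ood = make_dataset(1000, 8, biased=False)

def cue_rate(ds):
    # Fraction of examples where the shortcut "copy the first input
    # token" yields the correct first output token.
    return sum(x[0] == y[0] for x, y in ds) / len(ds)
```

In the biased split the shortcut succeeds on every example, while in the unbiased split it succeeds only by chance (about 10% with ten digit values), so a model that latched onto the cue would look strong in-distribution yet fail OOD.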