How Do Transformers Learn Variable Binding in Symbolic Programs?

Yiwei Wu, Atticus Geiger, Raphaël Millière
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:67284-67299, 2025.

Abstract

Variable binding—the ability to associate variables with values—is fundamental to symbolic computation and cognition. Although classical architectures typically implement variable binding via addressable memory, it is not well understood how modern neural networks lacking built-in binding operations may acquire this capacity. We investigate this by training a Transformer to dereference queried variables in symbolic programs where variables are assigned either numerical constants or other variables. Each program requires following chains of variable assignments up to four steps deep to find the queried value, and also contains irrelevant chains of assignments acting as distractors. Our analysis reveals a developmental trajectory with three distinct phases during training: (1) random prediction of numerical constants, (2) a shallow heuristic prioritizing early variable assignments, and (3) the emergence of a systematic mechanism for dereferencing assignment chains. Using causal interventions, we find that the model learns to exploit the residual stream as an addressable memory space, with specialized attention heads routing information across token positions. This mechanism allows the model to dynamically track variable bindings across layers, resulting in accurate dereferencing. Our results show how Transformer models can learn to implement systematic variable binding without explicit architectural support, bridging connectionist and symbolic approaches.
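
The exact program format is not reproduced on this page, but a minimal Python sketch of the task described in the abstract might look like the following. The variable names, the `#x:` query syntax, and the use of single constant assignments (rather than full chains) for the distractors are assumptions made purely for illustration:

import random
import string

def make_program(chain_depth=4, n_distractors=3, seed=0):
    """Toy generator for the task described in the abstract: a program of
    variable assignments with one chain leading to the queried value, plus
    irrelevant distractor assignments. Exact formatting is an assumption."""
    rng = random.Random(seed)
    names = rng.sample(string.ascii_lowercase, chain_depth + n_distractors)
    chain, distractors = names[:chain_depth], names[chain_depth:]

    value = rng.randint(0, 9)
    lines = [f"{chain[0]} = {value}"]           # root of the relevant chain
    for prev, cur in zip(chain, chain[1:]):     # e.g. b = a, c = b, d = c
        lines.append(f"{cur} = {prev}")
    for name in distractors:                    # irrelevant assignments
        lines.append(f"{name} = {rng.randint(0, 9)}")

    rng.shuffle(lines)
    program = "\n".join(lines) + f"\n#{chain[-1]}:"   # query the chain's end
    return program, value

def dereference(program):
    """Reference solver: follow assignments from the queried variable."""
    *assignments, query = program.splitlines()
    env = dict(line.split(" = ") for line in assignments)
    target = query.strip("#:")
    while target in env:
        target = env[target]
    return int(target)

prog, answer = make_program()
assert dereference(prog) == answer
print(prog)
print("answer:", answer)

The model in the paper is trained to produce the queried value directly from the program text; the dereference function above only serves as a reference solver for generating supervision labels in this sketch.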

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25j,
  title     = {How Do Transformers Learn Variable Binding in Symbolic Programs?},
  author    = {Wu, Yiwei and Geiger, Atticus and Milli\`{e}re, Rapha\"{e}l},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {67284--67299},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25j/wu25j.pdf},
  url       = {https://proceedings.mlr.press/v267/wu25j.html},
  abstract  = {Variable binding—the ability to associate variables with values—is fundamental to symbolic computation and cognition. Although classical architectures typically implement variable binding via addressable memory, it is not well understood how modern neural networks lacking built-in binding operations may acquire this capacity. We investigate this by training a Transformer to dereference queried variables in symbolic programs where variables are assigned either numerical constants or other variables. Each program requires following chains of variable assignments up to four steps deep to find the queried value, and also contains irrelevant chains of assignments acting as distractors. Our analysis reveals a developmental trajectory with three distinct phases during training: (1) random prediction of numerical constants, (2) a shallow heuristic prioritizing early variable assignments, and (3) the emergence of a systematic mechanism for dereferencing assignment chains. Using causal interventions, we find that the model learns to exploit the residual stream as an addressable memory space, with specialized attention heads routing information across token positions. This mechanism allows the model to dynamically track variable bindings across layers, resulting in accurate dereferencing. Our results show how Transformer models can learn to implement systematic variable binding without explicit architectural support, bridging connectionist and symbolic approaches.}
}
Endnote
%0 Conference Paper
%T How Do Transformers Learn Variable Binding in Symbolic Programs?
%A Yiwei Wu
%A Atticus Geiger
%A Raphaël Millière
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25j
%I PMLR
%P 67284--67299
%U https://proceedings.mlr.press/v267/wu25j.html
%V 267
%X Variable binding—the ability to associate variables with values—is fundamental to symbolic computation and cognition. Although classical architectures typically implement variable binding via addressable memory, it is not well understood how modern neural networks lacking built-in binding operations may acquire this capacity. We investigate this by training a Transformer to dereference queried variables in symbolic programs where variables are assigned either numerical constants or other variables. Each program requires following chains of variable assignments up to four steps deep to find the queried value, and also contains irrelevant chains of assignments acting as distractors. Our analysis reveals a developmental trajectory with three distinct phases during training: (1) random prediction of numerical constants, (2) a shallow heuristic prioritizing early variable assignments, and (3) the emergence of a systematic mechanism for dereferencing assignment chains. Using causal interventions, we find that the model learns to exploit the residual stream as an addressable memory space, with specialized attention heads routing information across token positions. This mechanism allows the model to dynamically track variable bindings across layers, resulting in accurate dereferencing. Our results show how Transformer models can learn to implement systematic variable binding without explicit architectural support, bridging connectionist and symbolic approaches.
APA
Wu, Y., Geiger, A. & Millière, R. (2025). How Do Transformers Learn Variable Binding in Symbolic Programs? Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:67284-67299. Available from https://proceedings.mlr.press/v267/wu25j.html.
