Simulation of Graph Algorithms with Looped Transformers

Artur Back De Luca, Kimon Fountoulakis
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:2319-2363, 2024.

Abstract

The execution of graph algorithms using neural networks has recently attracted significant interest due to promising empirical progress. This motivates further understanding of how neural networks can replicate reasoning steps with relational data. In this work, we study the ability of transformer networks to simulate algorithms on graphs from a theoretical perspective. The architecture we use is a looped transformer with extra attention heads that interact with the graph. We prove by construction that this architecture can simulate individual algorithms such as Dijkstra’s shortest path, Breadth- and Depth-First Search, and Kosaraju’s strongly connected components, as well as multiple algorithms simultaneously. The number of parameters in the networks does not increase with the input graph size, which implies that the networks can simulate the above algorithms for any graph. Despite this property, we show a limit to simulation in our solution due to finite precision. Finally, we show a Turing Completeness result with constant width when the extra attention heads are utilized.
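As a rough illustration of the setup the abstract describes, the sketch below loops a single transformer block whose extra head reads an adjacency matrix. This is not the paper's actual construction: every function name, weight shape, and the choice of NumPy are illustrative assumptions.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(X, Wq, Wk, Wv):
        # standard scaled dot-product self-attention over node states
        scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
        return softmax(scores) @ (X @ Wv)

    def looped_transformer_step(X, A, p):
        # one standard head over the node states X
        H_attn = attention(X, p["Wq"], p["Wk"], p["Wv"])
        # extra "graph" head (illustrative): aggregation steered by the
        # adjacency matrix A rather than by learned attention scores
        H_graph = A @ (X @ p["Wg"])
        X = X + H_attn + H_graph                        # residual connection
        X = X + np.maximum(0.0, X @ p["W1"]) @ p["W2"]  # ReLU feed-forward
        return X

    def simulate(X, A, p, T):
        # the same block is applied T times; the weights are reused, so the
        # parameter count is fixed and independent of the number of nodes n
        for _ in range(T):
            X = looped_transformer_step(X, A, p)
        return X

    rng = np.random.default_rng(0)
    n, d = 5, 8                                   # n nodes, constant width d
    A = (rng.random((n, n)) < 0.4).astype(float)  # random adjacency matrix
    X = rng.standard_normal((n, d))               # one state vector per node
    p = {k: 0.1 * rng.standard_normal(s) for k, s in
         [("Wq", (d, d)), ("Wk", (d, d)), ("Wv", (d, d)),
          ("Wg", (d, d)), ("W1", (d, 2 * d)), ("W2", (2 * d, d))]}
    print(simulate(X, A, p, T=10).shape)          # (5, 8)

Because the loop reuses one block, the parameter count stays constant as the graph grows, which is the property the abstract highlights; the finite-precision limitation the paper proves is not modeled in this toy sketch.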

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-back-de-luca24a,
  title     = {Simulation of Graph Algorithms with Looped Transformers},
  author    = {Back De Luca, Artur and Fountoulakis, Kimon},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {2319--2363},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/back-de-luca24a/back-de-luca24a.pdf},
  url       = {https://proceedings.mlr.press/v235/back-de-luca24a.html},
  abstract  = {The execution of graph algorithms using neural networks has recently attracted significant interest due to promising empirical progress. This motivates further understanding of how neural networks can replicate reasoning steps with relational data. In this work, we study the ability of transformer networks to simulate algorithms on graphs from a theoretical perspective. The architecture we use is a looped transformer with extra attention heads that interact with the graph. We prove by construction that this architecture can simulate individual algorithms such as Dijkstra’s shortest path, Breadth- and Depth-First Search, and Kosaraju’s strongly connected components, as well as multiple algorithms simultaneously. The number of parameters in the networks does not increase with the input graph size, which implies that the networks can simulate the above algorithms for any graph. Despite this property, we show a limit to simulation in our solution due to finite precision. Finally, we show a Turing Completeness result with constant width when the extra attention heads are utilized.}
}
Endnote
%0 Conference Paper
%T Simulation of Graph Algorithms with Looped Transformers
%A Artur Back De Luca
%A Kimon Fountoulakis
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-back-de-luca24a
%I PMLR
%P 2319--2363
%U https://proceedings.mlr.press/v235/back-de-luca24a.html
%V 235
%X The execution of graph algorithms using neural networks has recently attracted significant interest due to promising empirical progress. This motivates further understanding of how neural networks can replicate reasoning steps with relational data. In this work, we study the ability of transformer networks to simulate algorithms on graphs from a theoretical perspective. The architecture we use is a looped transformer with extra attention heads that interact with the graph. We prove by construction that this architecture can simulate individual algorithms such as Dijkstra’s shortest path, Breadth- and Depth-First Search, and Kosaraju’s strongly connected components, as well as multiple algorithms simultaneously. The number of parameters in the networks does not increase with the input graph size, which implies that the networks can simulate the above algorithms for any graph. Despite this property, we show a limit to simulation in our solution due to finite precision. Finally, we show a Turing Completeness result with constant width when the extra attention heads are utilized.
APA
Back De Luca, A. & Fountoulakis, K. (2024). Simulation of Graph Algorithms with Looped Transformers. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:2319-2363. Available from https://proceedings.mlr.press/v235/back-de-luca24a.html.