Comparing Graph Transformers via Positional Encodings

Mitchell Black, Zhengchao Wan, Gal Mishne, Amir Nayyeri, Yusu Wang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:4103-4139, 2024.

Abstract

The distinguishing power of graph transformers is tied to the choice of positional encoding: features used to augment the base transformer with information about the graph. There are two primary types of positional encoding: absolute positional encodings (APEs) and relative positional encodings (RPEs). APEs assign features to each node and are given as input to the transformer. RPEs instead assign a feature to each pair of nodes, e.g., shortest-path distance, and are used to augment the attention block. A priori, it is unclear which method is better for maximizing the power of the resulting graph transformer. In this paper, we aim to understand the relationship between these different types of positional encodings. Interestingly, we show that graph transformers using APEs and RPEs are equivalent in their ability to distinguish non-isomorphic graphs. In particular, we demonstrate how to interchange APEs and RPEs while maintaining their distinguishing power in terms of graph transformers. However, in the case of graphs with node features, we show that RPEs may have an advantage over APEs. Based on our theoretical results, we provide a study of different APEs and RPEs—including the shortest-path and resistance distance and the recently introduced stable and expressive positional encoding (SPE)—and compare their distinguishing power in terms of transformers. We believe our work will help navigate the vast number of positional encoding choices and provide guidance on the future design of positional encodings for graph transformers.
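As a concrete illustration of the two mechanisms described in the abstract, the sketch below shows one common way each encoding enters a single attention layer: an APE is concatenated to the node features before they reach the transformer, while an RPE additively biases the attention logits (the Graphormer-style formulation). This is a minimal NumPy sketch under those assumptions; the function names and the additive-bias form are illustrative, not the paper's exact constructions.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention_with_ape(X, ape, Wq, Wk, Wv):
        # APE: per-node positional features (e.g., Laplacian eigenvectors) are
        # concatenated to the node features before the attention layer sees them.
        H = np.concatenate([X, ape], axis=-1)                    # (n, d + d_pe)
        A = softmax(H @ Wq @ (H @ Wk).T / np.sqrt(Wk.shape[-1]))
        return A @ (H @ Wv)

    def attention_with_rpe(X, rpe, Wq, Wk, Wv):
        # RPE: a pairwise feature (e.g., shortest-path distance) enters as an
        # additive bias on the attention logits; node inputs are unchanged.
        A = softmax(X @ Wq @ (X @ Wk).T / np.sqrt(Wk.shape[-1]) + rpe)  # rpe: (n, n)
        return A @ (X @ Wv)

    # Hypothetical shapes and random stand-ins for real encodings:
    n, d, d_pe, d_h = 5, 8, 4, 16
    rng = np.random.default_rng(0)
    X = rng.normal(size=(n, d))
    ape = rng.normal(size=(n, d_pe))   # stand-in per-node encoding
    rpe = rng.normal(size=(n, n))      # stand-in pairwise encoding
    Wq, Wk, Wv = (rng.normal(size=(d + d_pe, d_h)) for _ in range(3))
    y_ape = attention_with_ape(X, ape, Wq, Wk, Wv)
    Wq, Wk, Wv = (rng.normal(size=(d, d_h)) for _ in range(3))
    y_rpe = attention_with_rpe(X, rpe, Wq, Wk, Wv)

In the paper's terms, the equivalence result says that, on graphs without node features, either hook can be converted into the other without losing distinguishing power, so the choice between these two entry points does not by itself determine expressivity.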

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-black24b,
  title     = {Comparing Graph Transformers via Positional Encodings},
  author    = {Black, Mitchell and Wan, Zhengchao and Mishne, Gal and Nayyeri, Amir and Wang, Yusu},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {4103--4139},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/black24b/black24b.pdf},
  url       = {https://proceedings.mlr.press/v235/black24b.html}
}
APA
Black, M., Wan, Z., Mishne, G., Nayyeri, A., & Wang, Y. (2024). Comparing Graph Transformers via Positional Encodings. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:4103-4139. Available from https://proceedings.mlr.press/v235/black24b.html.