Supercharging Graph Transformers with Advective Diffusion

Qitian Wu, Chenxiao Yang, Kaipeng Zeng, Michael M. Bronstein
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:67959-67985, 2025.

Abstract

The capability of generalization is a cornerstone for the success of modern learning systems. For non-Euclidean data such as graphs, which inherently involve topological structures, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes AdvDIFFormer, a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations, which describe a class of continuous message passing processes over observed and latent topological structures. We show that AdvDIFFormer has a provable capability for controlling generalization error under topological shifts, which in contrast cannot be guaranteed by graph diffusion models, i.e., the generalization of common graph neural networks to continuous space. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening, and protein interactions.
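For context, the classical advection-diffusion equation from physics (the standard textbook form, not necessarily the paper's exact formulation) combines a diffusion term with a directed transport (advection) term; a graph analogue would replace the spatial operators with graph-structured ones. A minimal sketch, where $\mathbf{Z}(t)$ denotes node representations, $\mathbf{L}$ a graph Laplacian capturing latent structure, and $\mathbf{V}$ a velocity (transport) operator tied to the observed topology, all notation here being illustrative assumptions:

$$
\frac{\partial u}{\partial t} = \nabla \cdot (D \nabla u) - \nabla \cdot (\mathbf{v}\, u)
\quad\longrightarrow\quad
\frac{\partial \mathbf{Z}(t)}{\partial t} = -\,\mathbf{L}\,\mathbf{Z}(t) + \mathbf{V}\,\mathbf{Z}(t)
$$

The diffusion term smooths representations over the structure (as in graph diffusion models), while the advection term injects structure-dependent directed transport; the paper's contribution lies in how the latter is parameterized, for which readers should consult the PDF linked above.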

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25an,
  title     = {Supercharging Graph Transformers with Advective Diffusion},
  author    = {Wu, Qitian and Yang, Chenxiao and Zeng, Kaipeng and Bronstein, Michael M.},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {67959--67985},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25an/wu25an.pdf},
  url       = {https://proceedings.mlr.press/v267/wu25an.html},
  abstract  = {The capability of generalization is a cornerstone for the success of modern learning systems. For non-Euclidean data, e.g., graphs, that particularly involves topological structures, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes AdvDIFFormer, a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations which describe a class of continuous message passing process with observed and latent topological structures. We show that AdvDIFFormer has provable capability for controlling generalization error with topological shifts, which in contrast cannot be guaranteed by graph diffusion models, i.e., the generalization of common graph neural networks in continuous space. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening and protein interactions}
}
Endnote
%0 Conference Paper
%T Supercharging Graph Transformers with Advective Diffusion
%A Qitian Wu
%A Chenxiao Yang
%A Kaipeng Zeng
%A Michael M. Bronstein
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25an
%I PMLR
%P 67959--67985
%U https://proceedings.mlr.press/v267/wu25an.html
%V 267
%X The capability of generalization is a cornerstone for the success of modern learning systems. For non-Euclidean data, e.g., graphs, that particularly involves topological structures, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes AdvDIFFormer, a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations which describe a class of continuous message passing process with observed and latent topological structures. We show that AdvDIFFormer has provable capability for controlling generalization error with topological shifts, which in contrast cannot be guaranteed by graph diffusion models, i.e., the generalization of common graph neural networks in continuous space. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening and protein interactions
APA
Wu, Q., Yang, C., Zeng, K. & Bronstein, M.M. (2025). Supercharging Graph Transformers with Advective Diffusion. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:67959-67985. Available from https://proceedings.mlr.press/v267/wu25an.html.