PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations

Benjamin Holzschuh, Qiang Liu, Georg Kohl, Nils Thuerey
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:23562-23602, 2025.

Abstract

We introduce PDE-Transformer, an improved transformer-based architecture for surrogate modeling of physics simulations on regular grids. We combine recent architectural improvements of diffusion transformers with adjustments specific for large-scale simulations to yield a more scalable and versatile general-purpose transformer architecture, which can be used as the backbone for building large-scale foundation models in physical sciences. We demonstrate that our proposed architecture outperforms state-of-the-art transformer architectures for computer vision on a large dataset of 16 different types of PDEs. We propose to embed different physical channels individually as spatio-temporal tokens, which interact via channel-wise self-attention. This helps to maintain a consistent information density of tokens when learning multiple types of PDEs simultaneously. We demonstrate that our pre-trained models achieve improved performance on several challenging downstream tasks compared to training from scratch and also beat other foundation model architectures for physics simulations. Our source code is available at https://github.com/tum-pbs/pde-transformer.
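
The core architectural idea named in the abstract, embedding each physical channel as its own spatio-temporal tokens and letting those tokens interact through channel-wise self-attention, can be illustrated with a minimal sketch. The snippet below is not the authors' implementation (see the linked repository for that); the names patchify, channel_attention, the patch size, and d_model are illustrative assumptions, and plain NumPy is used only to keep the example self-contained.

    # Minimal sketch (illustrative, not the paper's code): each physical channel
    # of a PDE state is patchified into its own spatio-temporal tokens, and
    # tokens sharing a spatial position are mixed by channel-wise self-attention.
    import numpy as np

    rng = np.random.default_rng(0)

    def patchify(field, patch=8):
        """Split one channel of shape (T, H, W) into flattened spatio-temporal patches.

        Returns (num_tokens, patch_dim): one token per spatial patch, covering all frames.
        """
        T, H, W = field.shape
        return (
            field.reshape(T, H // patch, patch, W // patch, patch)
                 .transpose(1, 3, 0, 2, 4)          # (H/p, W/p, T, p, p)
                 .reshape(-1, T * patch * patch)    # (num_tokens, patch_dim)
        )

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def channel_attention(tokens, d_model=64):
        """Self-attention across channels at each token position.

        tokens: (C, N, D) -- C physical channels, N tokens per channel, D features.
        Attention mixes only the C tokens that share a token index, so the token
        count per channel stays fixed regardless of how many channels a PDE has.
        """
        C, N, D = tokens.shape
        Wq = rng.normal(scale=D ** -0.5, size=(D, d_model))
        Wk = rng.normal(scale=D ** -0.5, size=(D, d_model))
        Wv = rng.normal(scale=D ** -0.5, size=(D, d_model))
        q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv       # each (C, N, d_model)
        scores = np.einsum("cnd,knd->nck", q, k) / np.sqrt(d_model)  # (N, C, C)
        weights = softmax(scores, axis=-1)
        return np.einsum("nck,knd->cnd", weights, v)           # (C, N, d_model)

    # Example: a state with 3 physical channels (e.g. velocity-x, velocity-y,
    # pressure) on a 4-frame, 32x32 grid.
    state = rng.normal(size=(3, 4, 32, 32))
    tokens = np.stack([patchify(c) for c in state])   # (3, 16, 256)
    mixed = channel_attention(tokens)
    print(mixed.shape)                                 # (3, 16, 64)

Because each channel is tokenized separately, adding or removing physical channels changes only the number of channel tokens, not the information packed into each token, which is the consistency property the abstract refers to when training across many PDE types.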

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-holzschuh25a,
  title     = {{PDE}-Transformer: Efficient and Versatile Transformers for Physics Simulations},
  author    = {Holzschuh, Benjamin and Liu, Qiang and Kohl, Georg and Thuerey, Nils},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {23562--23602},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/holzschuh25a/holzschuh25a.pdf},
  url       = {https://proceedings.mlr.press/v267/holzschuh25a.html},
  abstract  = {We introduce PDE-Transformer, an improved transformer-based architecture for surrogate modeling of physics simulations on regular grids. We combine recent architectural improvements of diffusion transformers with adjustments specific for large-scale simulations to yield a more scalable and versatile general-purpose transformer architecture, which can be used as the backbone for building large-scale foundation models in physical sciences. We demonstrate that our proposed architecture outperforms state-of-the-art transformer architectures for computer vision on a large dataset of 16 different types of PDEs. We propose to embed different physical channels individually as spatio-temporal tokens, which interact via channel-wise self-attention. This helps to maintain a consistent information density of tokens when learning multiple types of PDEs simultaneously. We demonstrate that our pre-trained models achieve improved performance on several challenging downstream tasks compared to training from scratch and also beat other foundation model architectures for physics simulations. Our source code is available at https://github.com/tum-pbs/pde-transformer.}
}
Endnote
%0 Conference Paper
%T PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
%A Benjamin Holzschuh
%A Qiang Liu
%A Georg Kohl
%A Nils Thuerey
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-holzschuh25a
%I PMLR
%P 23562--23602
%U https://proceedings.mlr.press/v267/holzschuh25a.html
%V 267
%X We introduce PDE-Transformer, an improved transformer-based architecture for surrogate modeling of physics simulations on regular grids. We combine recent architectural improvements of diffusion transformers with adjustments specific for large-scale simulations to yield a more scalable and versatile general-purpose transformer architecture, which can be used as the backbone for building large-scale foundation models in physical sciences. We demonstrate that our proposed architecture outperforms state-of-the-art transformer architectures for computer vision on a large dataset of 16 different types of PDEs. We propose to embed different physical channels individually as spatio-temporal tokens, which interact via channel-wise self-attention. This helps to maintain a consistent information density of tokens when learning multiple types of PDEs simultaneously. We demonstrate that our pre-trained models achieve improved performance on several challenging downstream tasks compared to training from scratch and also beat other foundation model architectures for physics simulations. Our source code is available at https://github.com/tum-pbs/pde-transformer.
APA
Holzschuh, B., Liu, Q., Kohl, G. & Thuerey, N. (2025). PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:23562-23602. Available from https://proceedings.mlr.press/v267/holzschuh25a.html.
