Four Principles for Physically Interpretable World Models

Jordan Peper, Zhenjiang Mao, Yuang Geng, Siyuan Pan, Ivan Ruchkin
Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:66-89, 2025.

Abstract

As autonomous systems are increasingly deployed in open and uncertain settings, there is a growing need for trustworthy neuro-symbolic world models that can reliably predict future high-dimensional observations. The learned latent representations in world models lack direct mapping to meaningful physical quantities and dynamics, limiting their utility and interpretability in downstream planning, control, and safety verification. In this paper, we argue for a fundamental shift from physically informed to physically interpretable world models—and crystallize four principles that leverage symbolic knowledge to achieve these ends: (1) functionally organizing the latent space according to the physical intent, (2) learning aligned invariant and equivariant representations of the physical world, (3) integrating multiple forms and strengths of supervision into a unified training process, and (4) partitioning generative outputs to support scalability and verifiability. We experimentally demonstrate the value of each principle on two benchmarks. This paper opens several intriguing research directions to achieve and capitalize on full physical interpretability in learned world models.
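As a rough illustration only (the paper's own implementation is not reproduced here), the following minimal PyTorch sketch shows one way principle (2) could be operationalized: an encoder is penalized so that a known symbolic transformation of the observation (a horizontal flip) corresponds to a fixed linear action on the latent code (equivariance), while nuisance pixel noise leaves the code unchanged (invariance). The Encoder architecture, the latent flip action latent_flip, and the noise scale are illustrative assumptions, not details taken from the paper.

# Hypothetical sketch of principle (2): aligned invariant/equivariant latents.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy encoder mapping 1x32x32 observations to an 8-D latent code."""
    def __init__(self, latent_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 32, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def alignment_losses(enc: Encoder, x: torch.Tensor, latent_flip: torch.Tensor):
    """Equivariance: encoding a flipped image should match a fixed linear
    action (latent_flip) applied to the encoding of the original image.
    Invariance: small pixel noise should not move the latent code."""
    z = enc(x)
    z_flipped_obs = enc(torch.flip(x, dims=[-1]))   # symmetry acting on observations
    z_flipped_lat = z @ latent_flip                 # same symmetry acting on latents
    equivariance = ((z_flipped_obs - z_flipped_lat) ** 2).mean()

    z_noisy = enc(x + 0.01 * torch.randn_like(x))   # nuisance perturbation
    invariance = ((z_noisy - z) ** 2).mean()
    return equivariance, invariance

if __name__ == "__main__":
    enc = Encoder()
    x = torch.rand(16, 1, 32, 32)                   # dummy batch of observations
    # Assumed latent action of the flip: negate the first latent coordinate.
    latent_flip = torch.diag(torch.tensor([-1.0] + [1.0] * 7))
    eq, inv = alignment_losses(enc, x, latent_flip)
    print(f"equivariance loss: {eq.item():.4f}, invariance loss: {inv.item():.4f}")

In practice these two terms would be added (with weights) to the world model's reconstruction and prediction losses, so that minimizing them shapes the latent space rather than being evaluated once as above.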

Cite this Paper


BibTeX
@InProceedings{pmlr-v288-peper25a,
  title     = {Four Principles for Physically Interpretable World Models},
  author    = {Peper, Jordan and Mao, Zhenjiang and Geng, Yuang and Pan, Siyuan and Ruchkin, Ivan},
  booktitle = {Proceedings of the International Conference on Neuro-symbolic Systems},
  pages     = {66--89},
  year      = {2025},
  editor    = {Pappas, George and Ravikumar, Pradeep and Seshia, Sanjit A.},
  volume    = {288},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v288/main/assets/peper25a/peper25a.pdf},
  url       = {https://proceedings.mlr.press/v288/peper25a.html}
}
Endnote
%0 Conference Paper
%T Four Principles for Physically Interpretable World Models
%A Jordan Peper
%A Zhenjiang Mao
%A Yuang Geng
%A Siyuan Pan
%A Ivan Ruchkin
%B Proceedings of the International Conference on Neuro-symbolic Systems
%C Proceedings of Machine Learning Research
%D 2025
%E George Pappas
%E Pradeep Ravikumar
%E Sanjit A. Seshia
%F pmlr-v288-peper25a
%I PMLR
%P 66--89
%U https://proceedings.mlr.press/v288/peper25a.html
%V 288
APA
Peper, J., Mao, Z., Geng, Y., Pan, S. & Ruchkin, I. (2025). Four Principles for Physically Interpretable World Models. Proceedings of the International Conference on Neuro-symbolic Systems, in Proceedings of Machine Learning Research 288:66-89. Available from https://proceedings.mlr.press/v288/peper25a.html.
