Neurosymbolic World Models for Sequential Decision Making

Leonardo Hernandez Cano; Maxine Perroni-Scharf; Neil Dhir; Arun Ramamurthy; Armando Solar-Lezama

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:23047-23062, 2025.

Abstract

We present Structured World Modeling for Policy Optimization (SWMPO), a framework for unsupervised learning of neurosymbolic Finite State Machines (FSMs) that capture environmental structure for policy optimization. Traditional unsupervised world modeling methods rely on unstructured representations, such as neural networks, that do not explicitly represent high-level patterns within the system (e.g., patterns in the dynamics of regions such as water and land). Instead, SWMPO models the environment as an FSM in which each state corresponds to a specific region with distinct dynamics. This structured representation can then be leveraged for tasks like policy optimization. Previous work on synthesizing FSMs for this purpose has been limited to discrete rather than continuous spaces. In contrast, our proposed FSM synthesis algorithm operates in an unsupervised manner, leveraging low-level features from unprocessed, non-visual data, which makes it adaptable across various domains. The synthesized FSM models are expressive enough to be used in a model-based Reinforcement Learning scheme that leverages offline data to efficiently synthesize environment-specific world models. We demonstrate the advantages of SWMPO by benchmarking its environment modeling capabilities in simulated environments.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-hernandez-cano25a,
  title = 	 {Neurosymbolic World Models for Sequential Decision Making},
  author =       {Hernandez Cano, Leonardo and Perroni-Scharf, Maxine and Dhir, Neil and Ramamurthy, Arun and Solar-Lezama, Armando},
  booktitle = 	 {Proceedings of the 42nd International Conference on Machine Learning},
  pages = 	 {23047--23062},
  year = 	 {2025},
  editor = 	 {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = 	 {267},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v267/main/assets/hernandez-cano25a/hernandez-cano25a.pdf},
  url = 	 {https://proceedings.mlr.press/v267/hernandez-cano25a.html},
  abstract = 	 {We present Structured World Modeling for Policy Optimization (SWMPO), a framework for unsupervised learning of neurosymbolic Finite State Machines (FSM) that capture environmental structure for policy optimization. Traditional unsupervised world modeling methods rely on unstructured representations, such as neural networks, that do not explicitly represent high-level patterns within the system (e.g., patterns in the dynamics of regions such as water and land). Instead, SWMPO models the environment as a finite state machine (FSM), where each state corresponds to a specific region with distinct dynamics. This structured representation can then be leveraged for tasks like policy optimization. Previous works that synthesize FSMs for this purpose have been limited to discrete spaces, not continuous spaces. Instead, our proposed FSM synthesis algorithm operates in an unsupervised manner, leveraging low-level features from unprocessed, non-visual data, making it adaptable across various domains. The synthesized FSM models are expressive enough to be used in a model-based Reinforcement Learning scheme that leverages offline data to efficiently synthesize environment-specific world models. We demonstrate the advantages of SWMPO by benchmarking its environment modeling capabilities in simulated environments.}
}

Endnote

%0 Conference Paper
%T Neurosymbolic World Models for Sequential Decision Making
%A Leonardo Hernandez Cano
%A Maxine Perroni-Scharf
%A Neil Dhir
%A Arun Ramamurthy
%A Armando Solar-Lezama
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-hernandez-cano25a
%I PMLR
%P 23047--23062
%U https://proceedings.mlr.press/v267/hernandez-cano25a.html
%V 267
%X We present Structured World Modeling for Policy Optimization (SWMPO), a framework for unsupervised learning of neurosymbolic Finite State Machines (FSM) that capture environmental structure for policy optimization. Traditional unsupervised world modeling methods rely on unstructured representations, such as neural networks, that do not explicitly represent high-level patterns within the system (e.g., patterns in the dynamics of regions such as water and land). Instead, SWMPO models the environment as a finite state machine (FSM), where each state corresponds to a specific region with distinct dynamics. This structured representation can then be leveraged for tasks like policy optimization. Previous works that synthesize FSMs for this purpose have been limited to discrete spaces, not continuous spaces. Instead, our proposed FSM synthesis algorithm operates in an unsupervised manner, leveraging low-level features from unprocessed, non-visual data, making it adaptable across various domains. The synthesized FSM models are expressive enough to be used in a model-based Reinforcement Learning scheme that leverages offline data to efficiently synthesize environment-specific world models. We demonstrate the advantages of SWMPO by benchmarking its environment modeling capabilities in simulated environments.

APA

Hernandez Cano, L., Perroni-Scharf, M., Dhir, N., Ramamurthy, A. & Solar-Lezama, A. (2025). Neurosymbolic World Models for Sequential Decision Making. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:23047-23062. Available from https://proceedings.mlr.press/v267/hernandez-cano25a.html.
