Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time

Celeste Veronese, Daniele Meli, Alessandro Farinelli
Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning, PMLR 284:1026-1040, 2025.

Abstract

This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate persistent (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learned via Inductive Logic Programming (ILP) from a few traces of execution (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality when compared to time-independent heuristics, while also offering substantial computational efficiency improvements.

Cite this Paper


BibTeX
@InProceedings{pmlr-v284-veronese25a,
  title     = {Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time},
  author    = {Veronese, Celeste and Meli, Daniele and Farinelli, Alessandro},
  booktitle = {Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning},
  pages     = {1026--1040},
  year      = {2025},
  editor    = {Gilpin, Leilani H. and Giunchiglia, Eleonora and Hitzler, Pascal and van Krieken, Emile},
  volume    = {284},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--10 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v284/main/assets/veronese25a/veronese25a.pdf},
  url       = {https://proceedings.mlr.press/v284/veronese25a.html},
  abstract  = {This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate persistent (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learned via Inductive Logic Programming (ILP) from a few traces of execution (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality when compared to time-independent heuristics, while also offering substantial computational efficiency improvements.}
}
Endnote
%0 Conference Paper
%T Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time
%A Celeste Veronese
%A Daniele Meli
%A Alessandro Farinelli
%B Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Leilani H. Gilpin
%E Eleonora Giunchiglia
%E Pascal Hitzler
%E Emile van Krieken
%F pmlr-v284-veronese25a
%I PMLR
%P 1026--1040
%U https://proceedings.mlr.press/v284/veronese25a.html
%V 284
%X This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate persistent (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learned via Inductive Logic Programming (ILP) from a few traces of execution (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality when compared to time-independent heuristics, while also offering substantial computational efficiency improvements.
APA
Veronese, C., Meli, D., & Farinelli, A. (2025). Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time. Proceedings of The 19th International Conference on Neurosymbolic Learning and Reasoning, in Proceedings of Machine Learning Research 284:1026-1040. Available from https://proceedings.mlr.press/v284/veronese25a.html.