Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning

Beyazit Yalcinkaya, Niklas Lauffer, Marcell Vazquez-Chanlatte, Sanjit A. Seshia
Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:661-675, 2025.

Abstract

Automata-conditioned reinforcement learning (RL) has shown promising results for learning multi-task policies capable of performing temporally extended objectives given at runtime, achieved by pretraining and freezing automata embeddings prior to training the downstream policy. However, no theoretical guarantees have been given for this approach. This work provides a theoretical framework for the automata-conditioned RL problem and shows that it is probably approximately correct (PAC) learnable. We then present a technique for learning provably correct automata embeddings, guaranteeing optimal multi-task policy learning. Our experimental evaluation confirms these theoretical results.
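To make the pretrain-and-freeze setup the abstract describes concrete, below is a minimal sketch in PyTorch. The module names (AutomatonEncoder, ConditionedPolicy), the feature dimensions, and the architectures are illustrative assumptions, not the authors' implementation; the paper's actual contribution concerns how the embeddings are pretrained so that the frozen encoder provably supports optimal downstream policy learning.

# A minimal sketch (not the authors' implementation) of automata-conditioned
# RL: an automaton encoder is pretrained, then frozen, and a downstream
# policy is conditioned on the resulting task embedding. All names and
# dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

class AutomatonEncoder(nn.Module):
    """Maps a (flattened) automaton representation to a task embedding."""
    def __init__(self, dfa_feature_dim: int, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dfa_feature_dim, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, dfa_features: torch.Tensor) -> torch.Tensor:
        return self.net(dfa_features)

class ConditionedPolicy(nn.Module):
    """Action logits conditioned on environment state and task embedding."""
    def __init__(self, state_dim: int, embed_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, task_emb], dim=-1))

# 1) Pretrain the encoder (pretraining objective elided), then freeze it.
encoder = AutomatonEncoder(dfa_feature_dim=16)
# ... pretraining of `encoder` would happen here ...
for p in encoder.parameters():
    p.requires_grad = False
encoder.eval()

# 2) Train only the policy, conditioned on the frozen embeddings.
policy = ConditionedPolicy(state_dim=8, embed_dim=32, n_actions=4)
dfa_features = torch.randn(1, 16)   # placeholder automaton features
state = torch.randn(1, 8)           # placeholder environment state
with torch.no_grad():
    task_emb = encoder(dfa_features)
action_logits = policy(state, task_emb)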

Cite this Paper


BibTeX
@InProceedings{pmlr-v288-yalcinkaya25a,
  title     = {Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning},
  author    = {Yalcinkaya, Beyazit and Lauffer, Niklas and Vazquez-Chanlatte, Marcell and Seshia, Sanjit A.},
  booktitle = {Proceedings of the International Conference on Neuro-symbolic Systems},
  pages     = {661--675},
  year      = {2025},
  editor    = {Pappas, George and Ravikumar, Pradeep and Seshia, Sanjit A.},
  volume    = {288},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v288/main/assets/yalcinkaya25a/yalcinkaya25a.pdf},
  url       = {https://proceedings.mlr.press/v288/yalcinkaya25a.html}
}
Endnote
%0 Conference Paper
%T Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning
%A Beyazit Yalcinkaya
%A Niklas Lauffer
%A Marcell Vazquez-Chanlatte
%A Sanjit A. Seshia
%B Proceedings of the International Conference on Neuro-symbolic Systems
%C Proceedings of Machine Learning Research
%D 2025
%E George Pappas
%E Pradeep Ravikumar
%E Sanjit A. Seshia
%F pmlr-v288-yalcinkaya25a
%I PMLR
%P 661--675
%U https://proceedings.mlr.press/v288/yalcinkaya25a.html
%V 288
APA
Yalcinkaya, B., Lauffer, N., Vazquez-Chanlatte, M. & Seshia, S.A. (2025). Provably Correct Automata Embeddings for Optimal Automata-Conditioned Reinforcement Learning. Proceedings of the International Conference on Neuro-symbolic Systems, in Proceedings of Machine Learning Research 288:661-675. Available from https://proceedings.mlr.press/v288/yalcinkaya25a.html.
