A Pontryagin Perspective on Reinforcement Learning

Onno Eberhard, Claire Vernade, Michael Muehlebach
Proceedings of the 7th Annual Learning for Dynamics & Control Conference, PMLR 283:233-244, 2025.

Abstract

Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman’s equation from dynamic programming, our work builds on Pontryagin’s principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, significantly outperforming existing baselines.
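The abstract describes learning a fixed open-loop action sequence with gradients derived from Pontryagin's principle rather than Bellman's equation. As a rough, self-contained illustration of that idea (not the paper's algorithms), the sketch below runs gradient descent on an action sequence for a pendulum swing-up, where each gradient is computed with the backward costate (adjoint) recursion. The dynamics, cost, horizon, and step sizes are all illustrative assumptions.

# Minimal sketch of open-loop gradient descent via Pontryagin's costate recursion.
# All models and hyperparameters are illustrative assumptions, not the paper's.
import numpy as np

dt, T = 0.05, 100            # integration step and horizon (assumed)

def f(x, u):                 # pendulum dynamics, x = (angle, angular velocity)
    theta, omega = x
    return np.array([theta + dt * omega,
                     omega + dt * (np.sin(theta) + u)])

def f_x(x, u):               # Jacobian of f w.r.t. the state
    theta, _ = x
    return np.array([[1.0, dt],
                     [dt * np.cos(theta), 1.0]])

def f_u(x, u):               # Jacobian of f w.r.t. the action
    return np.array([0.0, dt])

def c(x, u):                 # running cost: distance from upright + control effort
    theta, omega = x
    return (theta - np.pi) ** 2 + 0.1 * omega ** 2 + 0.01 * u ** 2

def c_x(x, u):
    theta, omega = x
    return np.array([2.0 * (theta - np.pi), 0.2 * omega])

def c_u(x, u):
    return 0.02 * u

u_seq = np.zeros(T)          # open-loop action sequence to be learned
x0 = np.array([0.0, 0.0])    # hanging-down initial state

for it in range(500):
    # Forward pass: roll out the fixed action sequence.
    xs = [x0]
    for t in range(T):
        xs.append(f(xs[-1], u_seq[t]))

    # Backward pass: costate lambda_t = c_x + f_x^T lambda_{t+1},
    # gradient dJ/du_t = c_u + f_u^T lambda_{t+1} (no terminal cost here).
    lam = np.zeros(2)
    grad = np.zeros(T)
    for t in reversed(range(T)):
        grad[t] = c_u(xs[t], u_seq[t]) + f_u(xs[t], u_seq[t]) @ lam
        lam = c_x(xs[t], u_seq[t]) + f_x(xs[t], u_seq[t]).T @ lam

    u_seq -= 0.1 * grad      # gradient step on the action sequence

Because the learned object is the action sequence itself rather than a state-dependent policy, the rollout is executed open loop; the model-free variants mentioned in the abstract would replace the hand-coded Jacobians above with quantities estimated from sampled trajectories.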

Cite this Paper


BibTeX
@InProceedings{pmlr-v283-eberhard25a,
  title     = {A Pontryagin Perspective on Reinforcement Learning},
  author    = {Eberhard, Onno and Vernade, Claire and Muehlebach, Michael},
  booktitle = {Proceedings of the 7th Annual Learning for Dynamics \& Control Conference},
  pages     = {233--244},
  year      = {2025},
  editor    = {Ozay, Necmiye and Balzano, Laura and Panagou, Dimitra and Abate, Alessandro},
  volume    = {283},
  series    = {Proceedings of Machine Learning Research},
  month     = {04--06 Jun},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v283/main/assets/eberhard25a/eberhard25a.pdf},
  url       = {https://proceedings.mlr.press/v283/eberhard25a.html},
  abstract  = {Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman’s equation from dynamic programming, our work builds on Pontryagin’s principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, significantly outperforming existing baselines.}
}
Endnote
%0 Conference Paper
%T A Pontryagin Perspective on Reinforcement Learning
%A Onno Eberhard
%A Claire Vernade
%A Michael Muehlebach
%B Proceedings of the 7th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2025
%E Necmiye Ozay
%E Laura Balzano
%E Dimitra Panagou
%E Alessandro Abate
%F pmlr-v283-eberhard25a
%I PMLR
%P 233--244
%U https://proceedings.mlr.press/v283/eberhard25a.html
%V 283
%X Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman’s equation from dynamic programming, our work builds on Pontryagin’s principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, significantly outperforming existing baselines.
APA
Eberhard, O., Vernade, C. & Muehlebach, M. (2025). A Pontryagin Perspective on Reinforcement Learning. Proceedings of the 7th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 283:233-244. Available from https://proceedings.mlr.press/v283/eberhard25a.html.
