Continuous-time Model-based Reinforcement Learning

Cagatay Yildiz; Markus Heinonen; Harri Lähdesmäki

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:12009-12018, 2021.

Abstract

Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models whereas physical systems and the vast majority of control tasks operate in continuous-time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution differentials with Bayesian neural ordinary differential equations (ODE) to account for epistemic uncertainty. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems. Our experiments illustrate that the model is robust against irregular and noisy data, and can solve classic control problems in a sample-efficient manner.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-yildiz21a,
  title     = {Continuous-time Model-based Reinforcement Learning},
  author    = {Yildiz, Cagatay and Heinonen, Markus and L{\"a}hdesm{\"a}ki, Harri},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {12009--12018},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/yildiz21a/yildiz21a.pdf},
  url       = {https://proceedings.mlr.press/v139/yildiz21a.html},
  abstract  = {Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models whereas physical systems and the vast majority of control tasks operate in continuous-time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution differentials with Bayesian neural ordinary differential equations (ODE) to account for epistemic uncertainty. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems. Our experiments illustrate that the model is robust against irregular and noisy data, and can solve classic control problems in a sample-efficient manner.}
}

Endnote

%0 Conference Paper
%T Continuous-time Model-based Reinforcement Learning
%A Cagatay Yildiz
%A Markus Heinonen
%A Harri Lähdesmäki
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-yildiz21a
%I PMLR
%P 12009--12018
%U https://proceedings.mlr.press/v139/yildiz21a.html
%V 139
%X Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models whereas physical systems and the vast majority of control tasks operate in continuous-time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution differentials with Bayesian neural ordinary differential equations (ODE) to account for epistemic uncertainty. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems. Our experiments illustrate that the model is robust against irregular and noisy data, and can solve classic control problems in a sample-efficient manner.

APA

Yildiz, C., Heinonen, M. & Lähdesmäki, H. (2021). Continuous-time Model-based Reinforcement Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:12009-12018. Available from https://proceedings.mlr.press/v139/yildiz21a.html.
