Practical Reinforcement Learning For MPC: Learning from sparse objectives in under an hour on a real robot

Napat Karnchanachari; Miguel de la Iglesia Valls; David Hoeller; Marco Hutter

Proceedings of the 2nd Conference on Learning for Dynamics and Control, PMLR 120:211-224, 2020.

Abstract

Model Predictive Control (MPC) is a powerful control technique that handles constraints, takes the system’s dynamics into account, and is optimal with respect to a given cost function. In practice, however, it often requires an expert to craft and tune this cost function and find trade-offs between different state penalties to satisfy simple high-level objectives. In this paper, we use Reinforcement Learning, and in particular value learning, to approximate the value function given only high-level objectives, which can be sparse and binary. Building upon previous work, we present improvements that allowed us to successfully deploy the method on a real-world unmanned ground vehicle. Our experiments show that our method can learn the cost function from scratch and without human intervention, while reaching a performance level similar to that of an expert-tuned MPC. We perform a quantitative comparison of these methods with standard MPC approaches both in simulation and on the real robot.

Cite this Paper

BibTeX


@InProceedings{pmlr-v120-karnchanachari20a,
  title = 	 {Practical Reinforcement Learning For MPC: Learning from sparse objectives in under an hour on a real robot},
  author =       {Karnchanachari, Napat and de la Iglesia Valls, Miguel and Hoeller, David and Hutter, Marco},
  booktitle = 	 {Proceedings of the 2nd Conference on Learning for Dynamics and Control},
  pages = 	 {211--224},
  year = 	 {2020},
  editor = 	 {Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie},
  volume = 	 {120},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--11 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v120/karnchanachari20a/karnchanachari20a.pdf},
  url = 	 {https://proceedings.mlr.press/v120/karnchanachari20a.html},
  abstract = 	 {Model Predictive Control (MPC) is a powerful control technique that handles constraints, takes the system’s dynamics into account, and is optimal with respect to a given cost function. In practice, however, it often requires an expert to craft and tune this cost function and find trade-offs between different state penalties to satisfy simple high level objectives. In this paper, we use Reinforcement Learning and in particular value learning to approximate the value function given only high level objectives, which can be sparse and binary. Building upon previous works, we present improvements that allowed us to successfully deploy the method on a real world unmanned ground vehicle. Our experiments show that our method can learn the cost function from scratch and without human intervention, while reaching a performance level similar to that of an expert-tuned MPC. We perform a quantitative comparison of these methods with standard MPC approaches both in simulation and on the real robot.}
}

Endnote

%0 Conference Paper
%T Practical Reinforcement Learning For MPC: Learning from sparse objectives in under an hour on a real robot
%A Napat Karnchanachari
%A Miguel de la Iglesia Valls
%A David Hoeller
%A Marco Hutter
%B Proceedings of the 2nd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2020
%E Alexandre M. Bayen
%E Ali Jadbabaie
%E George Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire Tomlin
%E Melanie Zeilinger
%F pmlr-v120-karnchanachari20a
%I PMLR
%P 211--224
%U https://proceedings.mlr.press/v120/karnchanachari20a.html
%V 120
%X Model Predictive Control (MPC) is a powerful control technique that handles constraints, takes the system’s dynamics into account, and is optimal with respect to a given cost function. In practice, however, it often requires an expert to craft and tune this cost function and find trade-offs between different state penalties to satisfy simple high level objectives. In this paper, we use Reinforcement Learning and in particular value learning to approximate the value function given only high level objectives, which can be sparse and binary. Building upon previous works, we present improvements that allowed us to successfully deploy the method on a real world unmanned ground vehicle. Our experiments show that our method can learn the cost function from scratch and without human intervention, while reaching a performance level similar to that of an expert-tuned MPC. We perform a quantitative comparison of these methods with standard MPC approaches both in simulation and on the real robot.

APA


Karnchanachari, N., de la Iglesia Valls, M., Hoeller, D. & Hutter, M. (2020). Practical Reinforcement Learning For MPC: Learning from sparse objectives in under an hour on a real robot. Proceedings of the 2nd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 120:211-224. Available from https://proceedings.mlr.press/v120/karnchanachari20a.html.
