Learning Locomotion Skills from MPC in Sensor Space

Majid Khadiv, Avadesh Meduri, Huaijiang Zhu, Ludovic Righetti, Bernhard Schölkopf
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1218-1230, 2023.

Abstract

Nonlinear model predictive control (NMPC) is one of the most powerful tools for generating control policies for legged locomotion. However, the large computational load required for solving an optimal control problem at each control cycle hinders its use for embedded control of legged robots. Furthermore, the need for a high-quality state estimation module makes the application of NMPC in the real world very challenging, especially for highly agile maneuvers. In this paper, we propose to use NMPC as an expert and learn control policies from proprioceptive sensory measurements. We perform an extensive set of simulations on the quadruped robot Solo12 and show that it is possible to learn different gaits using only proprioceptive sensory information, without any camera or lidar, which are normally used to avoid drift in state estimation. Interestingly, our simulation results show that, with the same structure of the function approximators, learning the estimator and the control policy separately outperforms end-to-end learning of dynamic gaits such as jump and bound.

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-khadiv23a,
  title = {Learning Locomotion Skills from MPC in Sensor Space},
  author = {Khadiv, Majid and Meduri, Avadesh and Zhu, Huaijiang and Righetti, Ludovic and Sch\"olkopf, Bernhard},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = {1218--1230},
  year = {2023},
  editor = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = {211},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Jun},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v211/khadiv23a/khadiv23a.pdf},
  url = {https://proceedings.mlr.press/v211/khadiv23a.html},
  abstract = {Nonlinear model predictive control (NMPC) is one of the most powerful tools for generating control policies for legged locomotion. However, the large computational load required for solving an optimal control problem at each control cycle hinders its use for embedded control of legged robots. Furthermore, the need for a high-quality state estimation module makes the application of NMPC in the real world very challenging, especially for highly agile maneuvers. In this paper, we propose to use NMPC as an expert and learn control policies from proprioceptive sensory measurements. We perform an extensive set of simulations on the quadruped robot Solo12 and show that it is possible to learn different gaits using only proprioceptive sensory information, without any camera or lidar, which are normally used to avoid drift in state estimation. Interestingly, our simulation results show that, with the same structure of the function approximators, learning the estimator and the control policy separately outperforms end-to-end learning of dynamic gaits such as jump and bound.}
}
Endnote
%0 Conference Paper
%T Learning Locomotion Skills from MPC in Sensor Space
%A Majid Khadiv
%A Avadesh Meduri
%A Huaijiang Zhu
%A Ludovic Righetti
%A Bernhard Schölkopf
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-khadiv23a
%I PMLR
%P 1218--1230
%U https://proceedings.mlr.press/v211/khadiv23a.html
%V 211
%X Nonlinear model predictive control (NMPC) is one of the most powerful tools for generating control policies for legged locomotion. However, the large computational load required for solving an optimal control problem at each control cycle hinders its use for embedded control of legged robots. Furthermore, the need for a high-quality state estimation module makes the application of NMPC in the real world very challenging, especially for highly agile maneuvers. In this paper, we propose to use NMPC as an expert and learn control policies from proprioceptive sensory measurements. We perform an extensive set of simulations on the quadruped robot Solo12 and show that it is possible to learn different gaits using only proprioceptive sensory information, without any camera or lidar, which are normally used to avoid drift in state estimation. Interestingly, our simulation results show that, with the same structure of the function approximators, learning the estimator and the control policy separately outperforms end-to-end learning of dynamic gaits such as jump and bound.
APA
Khadiv, M., Meduri, A., Zhu, H., Righetti, L. & Schölkopf, B. (2023). Learning Locomotion Skills from MPC in Sensor Space. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:1218-1230. Available from https://proceedings.mlr.press/v211/khadiv23a.html.

Related Material