Reinforcement Learning Control of a Physical Robot Device for Assisted Human Walking without a Simulator

Junmin Zhong, Emiliano Quinones Yumbla, Seyed Yousef Soltanian, Ruofan Wu, Wenlong Zhang, Jennie Si
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:78575-78601, 2025.

Abstract

This study presents an innovative reinforcement learning (RL) control approach to facilitate soft exosuit-assisted human walking. Our goal is to address the ongoing challenges in developing reliable RL-based methods for controlling physical devices. To overcome key obstacles, such as limited data, the absence of a simulator for human-robot interaction during walking, the need for low computational overhead in real-time deployment, and the demand for rapid adaptation to achieve personalized control while ensuring human safety, we propose an online Adaptation from an offline Imitating Expert Policy (AIP) approach. Our offline learning mimics human expert actions from real human walking demonstrations recorded without robot assistance. The resulting policy is then used to initialize online actor-critic learning, whose goal is to optimally personalize robot assistance. In addition to being fast and robust, our online RL method also possesses important properties such as learning convergence, dynamic stability, and solution optimality. We successfully demonstrated our simple and robust framework for safe robot control on all five tested human participants, without selectively presenting results. The qualitative performance guarantees provided by our online RL, together with the consistent experimental validation of AIP control, represent the first demonstration of online adaptation for soft exosuit control personalization and serve as important evidence for the use of online RL in controlling a physical device to solve a real-life problem.
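As a rough illustration of the two-stage AIP idea described in the abstract, the sketch below first behavior-clones a policy from expert walking demonstrations and then uses it to warm-start an online actor-critic learner. This is a minimal sketch under assumptions: the network sizes, hyperparameters, placeholder data, and the generic deterministic actor-critic update are illustrative choices, not the paper's implementation.

import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=64):
    # Small two-layer network; sizes are illustrative.
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                         nn.Linear(hidden, out_dim))

state_dim, action_dim = 8, 2  # hypothetical gait-state and assistance-command dimensions

# Stage 1: offline imitation of expert actions (behavior cloning).
# Random tensors stand in for recorded human walking demonstrations.
cloned_policy = mlp(state_dim, action_dim)
bc_opt = torch.optim.Adam(cloned_policy.parameters(), lr=1e-3)
demo_states = torch.randn(256, state_dim)
demo_actions = torch.randn(256, action_dim)
for _ in range(200):
    bc_loss = nn.functional.mse_loss(cloned_policy(demo_states), demo_actions)
    bc_opt.zero_grad(); bc_loss.backward(); bc_opt.step()

# Stage 2: online actor-critic adaptation, initialized from the cloned policy.
actor = mlp(state_dim, action_dim)
actor.load_state_dict(cloned_policy.state_dict())  # warm start from imitation
critic = mlp(state_dim + action_dim, 1)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma = 0.99

def actor_critic_step(s, a, r, s_next):
    # One TD update; a real system would use measured gait states and rewards.
    with torch.no_grad():
        target = r + gamma * critic(torch.cat([s_next, actor(s_next)], dim=-1))
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=-1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

# Example call with a batch of hypothetical transitions (reward shaped (batch, 1)):
actor_critic_step(torch.randn(32, state_dim), torch.randn(32, action_dim),
                  torch.randn(32, 1), torch.randn(32, state_dim))

Warm-starting the actor from the cloned policy keeps the first online actions close to demonstrated expert behavior, which is what makes on-body adaptation safe in this setting.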

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zhong25f,
  title     = {Reinforcement Learning Control of a Physical Robot Device for Assisted Human Walking without a Simulator},
  author    = {Zhong, Junmin and Yumbla, Emiliano Quinones and Soltanian, Seyed Yousef and Wu, Ruofan and Zhang, Wenlong and Si, Jennie},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {78575--78601},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhong25f/zhong25f.pdf},
  url       = {https://proceedings.mlr.press/v267/zhong25f.html}
}
APA
Zhong, J., Yumbla, E.Q., Soltanian, S.Y., Wu, R., Zhang, W. & Si, J. (2025). Reinforcement Learning Control of a Physical Robot Device for Assisted Human Walking without a Simulator. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:78575-78601. Available from https://proceedings.mlr.press/v267/zhong25f.html.
