Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates

Zixin Zhang, James Avtges, Todd Murphey
Proceedings of The 9th Conference on Robot Learning, PMLR 305:1914-1939, 2025.

Abstract

Data-driven control methods need to be sample-efficient and lightweight, especially when data acquisition and computational resources are limited—such as during learning on hardware. Most modern data-driven methods require large datasets and struggle with real-time updates of models, limiting their performance in dynamic environments. Koopman theory formally represents nonlinear systems as linear models over observables, and Koopman representations can be determined from data in an optimization-friendly setting with potentially rapid model updates. In this paper, we present a highly sample-efficient, Koopman-based learning pipeline: Recursive Koopman Learning (RKL). We identify sufficient conditions for model convergence and provide formal algorithmic analysis supporting our claim that RKL is lightweight and fast, with complexity independent of dataset size. We validate our method on a simulated planar two-link arm and a hybrid nonlinear hardware system with soft actuators, showing that real-time recursive Koopman model updates improve the sample efficiency and stability of data-driven controller synthesis—requiring only <10% of the data compared to benchmarks. The high-performance C++ codebase will be open-sourced.
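The abstract's key computational claim is that each recursive model update has cost independent of dataset size. As a rough illustration of how that can work, the sketch below applies a standard recursive least-squares (Sherman-Morrison) update to an EDMD-style lifted linear model z' ≈ K z. This is a minimal sketch under assumptions of our own: the observables psi, the class name, and the toy dynamics are illustrative and are not the authors' RKL pipeline.

```python
# Hypothetical sketch: recursive least-squares update of a lifted (Koopman-style)
# linear model z' ~= K z. The observables psi(), the class name, and all
# parameters are illustrative assumptions, NOT the authors' RKL implementation.
import numpy as np

def psi(x):
    """Example observables: the state, elementwise squares, and a bias term."""
    return np.concatenate([x, x**2, [1.0]])

class RecursiveKoopman:
    def __init__(self, dim, reg=1e3):
        self.K = np.zeros((dim, dim))   # current lifted model estimate
        self.P = np.eye(dim) * reg      # running inverse of the regressor Gram matrix

    def update(self, z, z_next):
        """One Sherman-Morrison RLS step: O(dim^2), independent of dataset size."""
        Pz = self.P @ z
        gain = Pz / (1.0 + z @ Pz)                       # Kalman-style gain vector
        self.K += np.outer(z_next - self.K @ z, gain)    # correct model by residual
        self.P -= np.outer(gain, Pz)                     # rank-1 downdate of P

# Toy usage on an autonomous nonlinear map (control input omitted for brevity).
model = RecursiveKoopman(dim=psi(np.zeros(2)).size)
x = np.array([0.1, -0.2])
for _ in range(200):
    x_next = 0.9 * np.sin(x)                             # stand-in dynamics
    model.update(psi(x), psi(x_next))
    x = x_next
```

In a control setting, the same rank-1 update applies with the regressor stacked as [psi(x); u], so the per-step cost stays quadratic in the lifted dimension regardless of how many samples have been processed.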

Cite this Paper

BibTeX
@InProceedings{pmlr-v305-zhang25d,
  title     = {Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates},
  author    = {Zhang, Zixin and Avtges, James and Murphey, Todd},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {1914--1939},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/zhang25d/zhang25d.pdf},
  url       = {https://proceedings.mlr.press/v305/zhang25d.html},
  abstract  = {Data-driven control methods need to be sample-efficient and lightweight, especially when data acquisition and computational resources are limited—such as during learning on hardware. Most modern data-driven methods require large datasets and struggle with real-time updates of models, limiting their performance in dynamic environments. Koopman theory formally represents nonlinear systems as linear models over observables, and Koopman representations can be determined from data in an optimization-friendly setting with potentially rapid model updates. In this paper, we present a highly sample-efficient, Koopman-based learning pipeline: Recursive Koopman Learning (RKL). We identify sufficient conditions for model convergence and provide formal algorithmic analysis supporting our claim that RKL is lightweight and fast, with complexity independent of dataset size. We validate our method on a simulated planar two-link arm and a hybrid nonlinear hardware system with soft actuators, showing that real-time recursive Koopman model updates improve the sample efficiency and stability of data-driven controller synthesis—requiring only <10% of the data compared to benchmarks. The high-performance C++ codebase will be open-sourced.}
}
Endnote
%0 Conference Paper
%T Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates
%A Zixin Zhang
%A James Avtges
%A Todd Murphey
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-zhang25d
%I PMLR
%P 1914--1939
%U https://proceedings.mlr.press/v305/zhang25d.html
%V 305
%X Data-driven control methods need to be sample-efficient and lightweight, especially when data acquisition and computational resources are limited—such as during learning on hardware. Most modern data-driven methods require large datasets and struggle with real-time updates of models, limiting their performance in dynamic environments. Koopman theory formally represents nonlinear systems as linear models over observables, and Koopman representations can be determined from data in an optimization-friendly setting with potentially rapid model updates. In this paper, we present a highly sample-efficient, Koopman-based learning pipeline: Recursive Koopman Learning (RKL). We identify sufficient conditions for model convergence and provide formal algorithmic analysis supporting our claim that RKL is lightweight and fast, with complexity independent of dataset size. We validate our method on a simulated planar two-link arm and a hybrid nonlinear hardware system with soft actuators, showing that real-time recursive Koopman model updates improve the sample efficiency and stability of data-driven controller synthesis—requiring only <10% of the data compared to benchmarks. The high-performance C++ codebase will be open-sourced.
APA
Zhang, Z., Avtges, J. & Murphey, T. (2025). Sample-Efficient Online Control Policy Learning with Real-Time Recursive Model Updates. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:1914-1939. Available from https://proceedings.mlr.press/v305/zhang25d.html.
