KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation

Hongyi Chen, Abulikemu Abuduweili, Aviral Agrawal, Yunhai Han, Harish Ravichandar, Changliu Liu, Jeffrey Ichnowski
Proceedings of The 8th Conference on Robot Learning, PMLR 270:4509-4524, 2025.

Abstract

Learning dexterous manipulation skills presents significant challenges due to the complex nonlinear dynamics that underlie interactions between objects and multi-fingered hands. Koopman operators have emerged as a robust method for modeling such nonlinear dynamics within a linear framework. However, current methods rely on runtime access to ground-truth (GT) object states, making them unsuitable for practical, vision-based applications. Unlike image-to-action policies that implicitly learn visual features for control, we use a dynamics model, specifically the Koopman operator, to learn visually interpretable object features critical for robotic manipulation within a scene. We construct a Koopman operator using object features predicted by a feature extractor and use it to auto-regressively advance system states. We train the feature extractor to embed scene information into object features, thereby enabling the accurate propagation of robot trajectories. We evaluate our approach on simulated and real-world robot tasks; the results show that it outperformed the model-based imitation learning method NDP by 1.08× and the image-to-action Diffusion Policy by 1.16×. The results suggest that our method maintains task success rates with learned features and extends applicability to real-world manipulation without GT object states. Project video and code are available at: https://github.com/hychen-naza/KOROL.
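For a concrete picture of the "Koopman operator rollout" described above, the sketch below illustrates the general recipe in NumPy: lift the robot state and an object feature into a vector of observables, fit a linear Koopman matrix by least squares over demonstrated transitions, and advance the lifted state auto-regressively. This is a minimal illustration, not the authors' implementation; the function names (lift, fit_koopman, rollout), the choice of observables, and all dimensions are assumptions made for the example. In KOROL the object feature is predicted by a trained image feature extractor (rather than supplied directly, as here), and that extractor is trained so the rollout reproduces the demonstrated robot trajectories.

```python
# Minimal Koopman-rollout sketch (illustrative only, not the KOROL code).
import numpy as np

def lift(robot_state, object_feature):
    """Assumed lifting function g(.): concatenate the robot state and object
    feature, plus simple quadratic observables of the robot state."""
    x = np.concatenate([robot_state, object_feature])
    return np.concatenate([x, robot_state ** 2])

def fit_koopman(lifted_pairs):
    """Least-squares fit of K so that z_{t+1} ≈ K z_t over all demo transitions.
    lifted_pairs: list of (z_t, z_{t+1}) lifted-state pairs."""
    Z = np.stack([z for z, _ in lifted_pairs], axis=1)            # (d, N)
    Z_next = np.stack([z_next for _, z_next in lifted_pairs], axis=1)
    return Z_next @ Z.T @ np.linalg.pinv(Z @ Z.T)                 # (d, d)

def rollout(K, robot_state0, object_feature0, horizon):
    """Auto-regressively advance the lifted state and read back the robot-state
    block (the first entries of z), which a low-level controller would track."""
    z = lift(robot_state0, object_feature0)
    robot_dim = robot_state0.shape[0]
    states = []
    for _ in range(horizon):
        z = K @ z
        states.append(z[:robot_dim])
    return np.stack(states)                                       # (horizon, robot_dim)

# Toy usage with synthetic data; dimensions are placeholders.
rng = np.random.default_rng(0)
robot_dim, feat_dim, T = 4, 8, 50
demo = [(rng.normal(size=robot_dim), rng.normal(size=feat_dim)) for _ in range(T)]
pairs = [(lift(*demo[t]), lift(*demo[t + 1])) for t in range(T - 1)]
K = fit_koopman(pairs)
plan = rollout(K, demo[0][0], demo[0][1], horizon=10)
print(plan.shape)  # (10, 4): predicted robot states over the rollout horizon
```

Because the dynamics model is a single linear matrix in the lifted space, the rollout and the least-squares fit are both differentiable and cheap, which is what makes it practical to backpropagate a trajectory-prediction loss through the rollout into the feature extractor.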

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-chen25g,
  title     = {{KOROL}: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation},
  author    = {Chen, Hongyi and Abuduweili, Abulikemu and Agrawal, Aviral and Han, Yunhai and Ravichandar, Harish and Liu, Changliu and Ichnowski, Jeffrey},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {4509--4524},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/chen25g/chen25g.pdf},
  url       = {https://proceedings.mlr.press/v270/chen25g.html}
}
Endnote
%0 Conference Paper
%T KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation
%A Hongyi Chen
%A Abulikemu Abuduweili
%A Aviral Agrawal
%A Yunhai Han
%A Harish Ravichandar
%A Changliu Liu
%A Jeffrey Ichnowski
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-chen25g
%I PMLR
%P 4509--4524
%U https://proceedings.mlr.press/v270/chen25g.html
%V 270
APA
Chen, H., Abuduweili, A., Agrawal, A., Han, Y., Ravichandar, H., Liu, C. & Ichnowski, J. (2025). KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:4509-4524. Available from https://proceedings.mlr.press/v270/chen25g.html.
