Training Strategies for Efficient Embodied Reasoning

William Chen, Suneel Belkhale, Suvir Mirchandani, Karl Pertsch, Danny Driess, Oier Mees, Sergey Levine
Proceedings of The 9th Conference on Robot Learning, PMLR 305:365-391, 2025.

Abstract

Robot chain-of-thought reasoning (CoT) – wherein a model predicts helpful intermediate representations before choosing actions – provides an effective method for improving the generalization and performance of robot policies, especially vision-language-action models (VLAs). While such approaches have been shown to improve performance and generalization, they suffer from core limitations, like needing specialized robot reasoning data and slow inference speeds. To design new robot reasoning approaches that address these issues, a more complete characterization of why reasoning helps policy performance is critical. We hypothesize several mechanisms by which robot reasoning improves policies – (1) better representation learning, (2) improved learning curricularization, and (3) increased expressivity – then devise simple variants of robot CoT reasoning to isolate and test each one. We find that learning to generate reasonings does lead to better VLA representations, while attending to the reasonings aids in actually leveraging these features for improved action prediction. Our results provide us with a better understanding of why CoT reasoning helps VLAs, which we use to introduce two simple and lightweight alternative recipes for robot reasoning. Our proposed approaches achieve significant performance gains over non-reasoning policies, state-of-the-art results on the LIBERO-90 benchmark, and a 3x inference speedup compared to standard robot reasoning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-chen25a,
  title     = {Training Strategies for Efficient Embodied Reasoning},
  author    = {Chen, William and Belkhale, Suneel and Mirchandani, Suvir and Pertsch, Karl and Driess, Danny and Mees, Oier and Levine, Sergey},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {365--391},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/chen25a/chen25a.pdf},
  url       = {https://proceedings.mlr.press/v305/chen25a.html},
  abstract  = {Robot chain-of-thought reasoning (CoT) – wherein a model predicts helpful intermediate representations before choosing actions – provides an effective method for improving the generalization and performance of robot policies, especially vision-language-action models (VLAs). While such approaches have been shown to improve performance and generalization, they suffer from core limitations, like needing specialized robot reasoning data and slow inference speeds. To design new robot reasoning approaches that address these issues, a more complete characterization of why reasoning helps policy performance is critical. We hypothesize several mechanisms by which robot reasoning improves policies – (1) better representation learning, (2) improved learning curricularization, and (3) increased expressivity – then devise simple variants of robot CoT reasoning to isolate and test each one. We find that learning to generate reasonings does lead to better VLA representations, while attending to the reasonings aids in actually leveraging these features for improved action prediction. Our results provide us with a better understanding of why CoT reasoning helps VLAs, which we use to introduce two simple and lightweight alternative recipes for robot reasoning. Our proposed approaches achieve significant performance gains over non-reasoning policies, state-of-the-art results on the LIBERO-90 benchmark, and a 3x inference speedup compared to standard robot reasoning.}
}
Endnote
%0 Conference Paper
%T Training Strategies for Efficient Embodied Reasoning
%A William Chen
%A Suneel Belkhale
%A Suvir Mirchandani
%A Karl Pertsch
%A Danny Driess
%A Oier Mees
%A Sergey Levine
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-chen25a
%I PMLR
%P 365--391
%U https://proceedings.mlr.press/v305/chen25a.html
%V 305
%X Robot chain-of-thought reasoning (CoT) – wherein a model predicts helpful intermediate representations before choosing actions – provides an effective method for improving the generalization and performance of robot policies, especially vision-language-action models (VLAs). While such approaches have been shown to improve performance and generalization, they suffer from core limitations, like needing specialized robot reasoning data and slow inference speeds. To design new robot reasoning approaches that address these issues, a more complete characterization of why reasoning helps policy performance is critical. We hypothesize several mechanisms by which robot reasoning improves policies – (1) better representation learning, (2) improved learning curricularization, and (3) increased expressivity – then devise simple variants of robot CoT reasoning to isolate and test each one. We find that learning to generate reasonings does lead to better VLA representations, while attending to the reasonings aids in actually leveraging these features for improved action prediction. Our results provide us with a better understanding of why CoT reasoning helps VLAs, which we use to introduce two simple and lightweight alternative recipes for robot reasoning. Our proposed approaches achieve significant performance gains over non-reasoning policies, state-of-the-art results on the LIBERO-90 benchmark, and a 3x inference speedup compared to standard robot reasoning.
APA
Chen, W., Belkhale, S., Mirchandani, S., Pertsch, K., Driess, D., Mees, O., & Levine, S. (2025). Training Strategies for Efficient Embodied Reasoning. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:365-391. Available from https://proceedings.mlr.press/v305/chen25a.html.