Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop

Justin Kerr, Kush Hari, Ethan Weber, Chung Min Kim, Brent Yi, tyler bonnen, Ken Goldberg, Angjoo Kanazawa
Proceedings of The 9th Conference on Robot Learning, PMLR 305:3647-3664, 2025.

Abstract

Humans do not passively observe the visual world; we actively look in order to act. Motivated by this principle, we introduce EyeRobot, a robotic system with gaze behavior that emerges from the need to complete real-world tasks. We develop a mechanical eyeball that can freely rotate to observe its surroundings, and we train a gaze policy to control it using reinforcement learning. We accomplish this by introducing a BC-RL loop trained on teleoperated demonstrations recorded with a 360° camera. The resulting video enables a simulation environment that supports rendering arbitrary eyeball viewpoints, allowing reinforcement learning of gaze behavior. The hand (BC) agent is trained from rendered eye observations, and the eye (RL) agent is rewarded when the hand produces correct actions. In this way, hand-eye coordination emerges as the eye learns to look toward regions that allow the hand to complete the task. We evaluate EyeRobot on five large-workspace manipulation tasks and compare its performance to two common camera setups: wrist-mounted and external cameras. Our experiments suggest EyeRobot exhibits hand-eye coordination that effectively facilitates behaviors such as visual search and target switching, enabling manipulation across large workspaces.
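To make the co-training loop concrete, below is a minimal PyTorch sketch of this kind of BC-RL coupling. This is not the authors' implementation: the toy render_eye_view projection, the network sizes, and the REINFORCE-style gaze update are all illustrative assumptions. It only mirrors the structure the abstract describes: the hand is behavior-cloned on demonstration actions given the eye's view, and the eye's reward is the negative BC error, so gazes that make the task visible to the hand are reinforced.

import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, GAZE_DIM = 64, 7, 2  # assumed sizes (e.g. 7-DoF arm, pan/tilt gaze)

def render_eye_view(frame_360: torch.Tensor, gaze: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for rendering an eyeball viewpoint from the 360° recording:
    weight panoramic features by a soft window centered on the gaze angle."""
    centers = torch.linspace(-1.0, 1.0, frame_360.shape[-1])
    window = torch.exp(-((centers - gaze[..., :1]) ** 2) / 0.1)
    return frame_360 * window

class HandBC(nn.Module):
    """Behavior-cloned 'hand' policy: eye observation -> arm action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(), nn.Linear(128, ACT_DIM))
    def forward(self, obs):
        return self.net(obs)

class EyeRL(nn.Module):
    """Gaze policy trained with RL: panoramic features -> gaze distribution."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(), nn.Linear(128, GAZE_DIM))
        self.log_std = nn.Parameter(torch.zeros(GAZE_DIM))
    def dist(self, obs):
        return torch.distributions.Normal(self.net(obs), self.log_std.exp())

hand, eye = HandBC(), EyeRL()
opt_hand = torch.optim.Adam(hand.parameters(), lr=1e-3)
opt_eye = torch.optim.Adam(eye.parameters(), lr=1e-3)

for step in range(1000):
    # One teleoperated demo timestep: panoramic features and the expert action.
    frame_360 = torch.randn(OBS_DIM)       # placeholder for 360° frame features
    expert_action = torch.randn(ACT_DIM)   # placeholder for demonstrated action

    gaze_dist = eye.dist(frame_360)        # eye chooses where to look
    gaze = gaze_dist.sample()
    obs = render_eye_view(frame_360, gaze) # render that viewpoint

    # BC update: the hand imitates the demo action from the eye's observation.
    bc_loss = (hand(obs) - expert_action).pow(2).mean()
    opt_hand.zero_grad()
    bc_loss.backward()
    opt_hand.step()

    # RL update (REINFORCE): the eye is rewarded when the hand's action is
    # close to the expert's, i.e. when its gaze made the task visible.
    reward = -bc_loss.detach()
    eye_loss = -(gaze_dist.log_prob(gaze).sum() * reward)
    opt_eye.zero_grad()
    eye_loss.backward()
    opt_eye.step()

In the paper, the eye observations come from actual renders of the 360° recording and the reward is tied to the hand policy producing correct actions; the stand-ins above only reproduce that feedback structure, not the system itself.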

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-kerr25a,
  title     = {Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop},
  author    = {Kerr, Justin and Hari, Kush and Weber, Ethan and Kim, Chung Min and Yi, Brent and bonnen, tyler and Goldberg, Ken and Kanazawa, Angjoo},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {3647--3664},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/kerr25a/kerr25a.pdf},
  url       = {https://proceedings.mlr.press/v305/kerr25a.html}
}
Endnote
%0 Conference Paper
%T Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
%A Justin Kerr
%A Kush Hari
%A Ethan Weber
%A Chung Min Kim
%A Brent Yi
%A tyler bonnen
%A Ken Goldberg
%A Angjoo Kanazawa
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-kerr25a
%I PMLR
%P 3647--3664
%U https://proceedings.mlr.press/v305/kerr25a.html
%V 305
APA
Kerr, J., Hari, K., Weber, E., Kim, C.M., Yi, B., bonnen, t., Goldberg, K. & Kanazawa, A. (2025). Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:3647-3664. Available from https://proceedings.mlr.press/v305/kerr25a.html.