CUPID: Curating Data your Robot Loves with Influence Functions

Christopher Agia, Rohan Sinha, Jingyun Yang, Rika Antonova, Marco Pavone, Haruki Nishimura, Masha Itkina, Jeannette Bohg
Proceedings of The 9th Conference on Robot Learning, PMLR 305:2907-2932, 2025.

Abstract

In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes—such as closed-loop task success or failure—remains a persistent challenge. Inspired by the theory of influence functions, we propose CUPID. Given a set of evaluation rollouts, CUPID estimates the influence of a training demonstration on the policy’s expected return. This enables ranking and selection of demonstrations according to their impact on the policy’s closed-loop performance. We use our estimator to curate data by 1) filtering out training demonstrations that harmed the policy’s performance and 2) subselecting newly collected trajectories that will most help improve the policy. Extensive simulated and hardware experiments show that our approach consistently identifies which data drives test-time performance. For example, training with less than 33% of curated data can result in state-of-the-art diffusion policies on the simulated Robomimic benchmark, and we observe similar improvements in hardware experiments. Furthermore, our hardware experiments show that our influence-based estimator can identify robust strategies under distribution shift, isolate spurious correlations, and even enhance post-training of generalist policies.
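
To make the influence-function idea concrete, the following minimal sketch (not the authors' implementation) shows how a demonstration can be scored by how up-weighting it is predicted to change an evaluation objective, here a stand-in for the policy's expected return, using the classical influence-function formula I_i = -grad R(theta)^T H^{-1} grad L_i(theta). All gradients and the Hessian are random placeholders, and every name is illustrative rather than taken from the paper.

# Minimal sketch of influence-function-style data curation (illustrative only,
# not the CUPID implementation). Each training demonstration is scored by the
# alignment between its training-loss gradient and the gradient of an
# evaluation objective (e.g. expected return over rollouts); the highest-scoring
# demonstrations are kept. Gradients and the Hessian are random placeholders;
# in practice they would come from the trained policy.
import numpy as np

rng = np.random.default_rng(0)
num_demos, num_params = 200, 50

# Placeholder per-demonstration gradients of the imitation loss, grad L_i(theta).
demo_grads = rng.normal(size=(num_demos, num_params))
# Placeholder gradient of the evaluation objective (expected return) w.r.t. theta.
eval_grad = rng.normal(size=num_params)
# Placeholder damped Hessian of the training loss (assumed positive definite).
H = np.eye(num_params) + 0.1 * np.cov(demo_grads, rowvar=False)

# Classic influence-function score: I_i = -grad_R^T H^{-1} grad L_i.
# I_i estimates how the evaluation objective changes per unit up-weighting of demo i.
h_inv_eval = np.linalg.solve(H, eval_grad)
influence = -demo_grads @ h_inv_eval

# Curation: drop demonstrations with negative estimated influence on return,
# or keep only the top-k most beneficial ones.
keep_mask = influence > 0.0
top_k = np.argsort(influence)[::-1][: num_demos // 3]
print(f"kept {keep_mask.sum()} / {num_demos} demos; top-3 indices: {top_k[:3]}")

In a real pipeline the per-demonstration gradients would be computed from the trained policy, and the inverse-Hessian-vector product would typically be approximated rather than solved exactly.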

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-agia25a,
  title     = {CUPID: Curating Data your Robot Loves with Influence Functions},
  author    = {Agia, Christopher and Sinha, Rohan and Yang, Jingyun and Antonova, Rika and Pavone, Marco and Nishimura, Haruki and Itkina, Masha and Bohg, Jeannette},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {2907--2932},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/agia25a/agia25a.pdf},
  url       = {https://proceedings.mlr.press/v305/agia25a.html},
  abstract  = {In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes—such as closed-loop task success or failure—remains a persistent challenge. Inspired by the theory of influence functions, we propose CUPID. Given a set of evaluation rollouts, CUPID estimates the influence of a training demonstration on the policy’s expected return. This enables ranking and selection of demonstrations according to their impact on the policy’s closed-loop performance. We use our estimator to curate data by 1) filtering out training demonstrations that harmed the policy’s performance and 2) subselecting newly collected trajectories that will most help improve the policy. Extensive simulated and hardware experiments show that our approach consistently identifies which data drives test-time performance. For example, training with less than 33% of curated data can result in state-of-the-art diffusion policies on the simulated Robomimic benchmark, and we observe similar improvements in hardware experiments. Furthermore, our hardware experiments show that our influence-based estimator can identify robust strategies under distribution shift, isolate spurious correlations, and even enhance post-training of generalist policies.}
}
Endnote
%0 Conference Paper
%T CUPID: Curating Data your Robot Loves with Influence Functions
%A Christopher Agia
%A Rohan Sinha
%A Jingyun Yang
%A Rika Antonova
%A Marco Pavone
%A Haruki Nishimura
%A Masha Itkina
%A Jeannette Bohg
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-agia25a
%I PMLR
%P 2907--2932
%U https://proceedings.mlr.press/v305/agia25a.html
%V 305
%X In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes—such as closed-loop task success or failure—remains a persistent challenge. Inspired by the theory of influence functions, we propose CUPID. Given a set of evaluation rollouts, CUPID estimates the influence of a training demonstration on the policy’s expected return. This enables ranking and selection of demonstrations according to their impact on the policy’s closed-loop performance. We use our estimator to curate data by 1) filtering out training demonstrations that harmed the policy’s performance and 2) subselecting newly collected trajectories that will most help improve the policy. Extensive simulated and hardware experiments show that our approach consistently identifies which data drives test-time performance. For example, training with less than 33% of curated data can result in state-of-the-art diffusion policies on the simulated Robomimic benchmark, and we observe similar improvements in hardware experiments. Furthermore, our hardware experiments show that our influence-based estimator can identify robust strategies under distribution shift, isolate spurious correlations, and even enhance post-training of generalist policies.
APA
Agia, C., Sinha, R., Yang, J., Antonova, R., Pavone, M., Nishimura, H., Itkina, M., & Bohg, J. (2025). CUPID: Curating Data your Robot Loves with Influence Functions. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:2907-2932. Available from https://proceedings.mlr.press/v305/agia25a.html.
