ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes

Zeyuan Chen, Qiyang Yan, Yuanpei Chen, Tianhao Wu, Jiyao Zhang, Zihan Ding, Jinzhou Li, Yaodong Yang, Hao Dong
Proceedings of The 9th Conference on Robot Learning, PMLR 305:885-905, 2025.

Abstract

Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning, incorporating both a novel geometry- and spatially-embedded scene representation and a comprehensive safety curriculum, enabling general, dynamic, and safe grasping behaviors. Through imitation learning, we distill the teacher's knowledge into a student 3D diffusion policy (DP3) that operates on partial point cloud observations. To the best of our knowledge, this represents the first zero-shot sim-to-real closed-loop system for target-oriented dexterous grasping in cluttered scenes, demonstrating robust performance across diverse objects and layouts.

Cite this Paper

BibTeX
@InProceedings{pmlr-v305-chen25b,
  title     = {ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes},
  author    = {Chen, Zeyuan and Yan, Qiyang and Chen, Yuanpei and Wu, Tianhao and Zhang, Jiyao and Ding, Zihan and Li, Jinzhou and Yang, Yaodong and Dong, Hao},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {885--905},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/chen25b/chen25b.pdf},
  url       = {https://proceedings.mlr.press/v305/chen25b.html},
  abstract  = {Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning, incorporating both a novel geometry- and spatially-embedded scene representation and a comprehensive safety curriculum, enabling general, dynamic, and safe grasping behaviors. Through imitation learning, we distill the teacher's knowledge into a student 3D diffusion policy (DP3) that operates on partial point cloud observations. To the best of our knowledge, this represents the first zero-shot sim-to-real closed-loop system for target-oriented dexterous grasping in cluttered scenes, demonstrating robust performance across diverse objects and layouts.}
}
Endnote
%0 Conference Paper
%T ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes
%A Zeyuan Chen
%A Qiyang Yan
%A Yuanpei Chen
%A Tianhao Wu
%A Jiyao Zhang
%A Zihan Ding
%A Jinzhou Li
%A Yaodong Yang
%A Hao Dong
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-chen25b
%I PMLR
%P 885--905
%U https://proceedings.mlr.press/v305/chen25b.html
%V 305
%X Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning, incorporating both a novel geometry- and spatially-embedded scene representation and a comprehensive safety curriculum, enabling general, dynamic, and safe grasping behaviors. Through imitation learning, we distill the teacher's knowledge into a student 3D diffusion policy (DP3) that operates on partial point cloud observations. To the best of our knowledge, this represents the first zero-shot sim-to-real closed-loop system for target-oriented dexterous grasping in cluttered scenes, demonstrating robust performance across diverse objects and layouts.
APA
Chen, Z., Yan, Q., Chen, Y., Wu, T., Zhang, J., Ding, Z., Li, J., Yang, Y., & Dong, H. (2025). ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:885-905. Available from https://proceedings.mlr.press/v305/chen25b.html.