Diff-LfD: Contact-aware Model-based Learning from Visual Demonstration for Robotic Manipulation via Differentiable Physics-based Simulation and Rendering

Xinghao Zhu, JingHan Ke, Zhixuan Xu, Zhixin Sun, Bizhe Bai, Jun Lv, Qingtao Liu, Yuwei Zeng, Qi Ye, Cewu Lu, Masayoshi Tomizuka, Lin Shao
Proceedings of The 7th Conference on Robot Learning, PMLR 229:499-512, 2023.

Abstract

Learning from Demonstration (LfD) is an efficient technique for robots to acquire new skills through expert observation, significantly mitigating the need for laborious manual reward function design. This paper introduces a novel framework for model-based LfD in the context of robotic manipulation. Our proposed pipeline is underpinned by two primary components: self-supervised pose and shape estimation and contact sequence generation. The former utilizes differentiable rendering to estimate object poses and shapes from demonstration videos, while the latter iteratively optimizes contact points and forces using differentiable simulation, consequently effectuating object transformations. Empirical evidence demonstrates the efficacy of our LfD pipeline in acquiring manipulation actions from human demonstrations. Complementary to this, ablation studies focusing on object tracking and contact sequence inference underscore the robustness and efficiency of our approach in generating long-horizon manipulation actions, even amidst environmental noise. Validation of our results extends to real-world deployment of the proposed pipeline. Supplementary materials and videos are available on our webpage.

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-zhu23a, title = {Diff-LfD: Contact-aware Model-based Learning from Visual Demonstration for Robotic Manipulation via Differentiable Physics-based Simulation and Rendering}, author = {Zhu, Xinghao and Ke, JingHan and Xu, Zhixuan and Sun, Zhixin and Bai, Bizhe and Lv, Jun and Liu, Qingtao and Zeng, Yuwei and Ye, Qi and Lu, Cewu and Tomizuka, Masayoshi and Shao, Lin}, booktitle = {Proceedings of The 7th Conference on Robot Learning}, pages = {499--512}, year = {2023}, editor = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh}, volume = {229}, series = {Proceedings of Machine Learning Research}, month = {06--09 Nov}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v229/zhu23a/zhu23a.pdf}, url = {https://proceedings.mlr.press/v229/zhu23a.html}, abstract = {Learning from Demonstration (LfD) is an efficient technique for robots to acquire new skills through expert observation, significantly mitigating the need for laborious manual reward function design. This paper introduces a novel framework for model-based LfD in the context of robotic manipulation. Our proposed pipeline is underpinned by two primary components: self-supervised pose and shape estimation and contact sequence generation. The former utilizes differentiable rendering to estimate object poses and shapes from demonstration videos, while the latter iteratively optimizes contact points and forces using differentiable simulation, consequently effectuating object transformations. Empirical evidence demonstrates the efficacy of our LfD pipeline in acquiring manipulation actions from human demonstrations. Complementary to this, ablation studies focusing on object tracking and contact sequence inference underscore the robustness and efficiency of our approach in generating long-horizon manipulation actions, even amidst environmental noise. Validation of our results extends to real-world deployment of the proposed pipeline. Supplementary materials and videos are available on our webpage.} }
Endnote
%0 Conference Paper %T Diff-LfD: Contact-aware Model-based Learning from Visual Demonstration for Robotic Manipulation via Differentiable Physics-based Simulation and Rendering %A Xinghao Zhu %A JingHan Ke %A Zhixuan Xu %A Zhixin Sun %A Bizhe Bai %A Jun Lv %A Qingtao Liu %A Yuwei Zeng %A Qi Ye %A Cewu Lu %A Masayoshi Tomizuka %A Lin Shao %B Proceedings of The 7th Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2023 %E Jie Tan %E Marc Toussaint %E Kourosh Darvish %F pmlr-v229-zhu23a %I PMLR %P 499--512 %U https://proceedings.mlr.press/v229/zhu23a.html %V 229 %X Learning from Demonstration (LfD) is an efficient technique for robots to acquire new skills through expert observation, significantly mitigating the need for laborious manual reward function design. This paper introduces a novel framework for model-based LfD in the context of robotic manipulation. Our proposed pipeline is underpinned by two primary components: self-supervised pose and shape estimation and contact sequence generation. The former utilizes differentiable rendering to estimate object poses and shapes from demonstration videos, while the latter iteratively optimizes contact points and forces using differentiable simulation, consequently effectuating object transformations. Empirical evidence demonstrates the efficacy of our LfD pipeline in acquiring manipulation actions from human demonstrations. Complementary to this, ablation studies focusing on object tracking and contact sequence inference underscore the robustness and efficiency of our approach in generating long-horizon manipulation actions, even amidst environmental noise. Validation of our results extends to real-world deployment of the proposed pipeline. Supplementary materials and videos are available on our webpage.
APA
Zhu, X., Ke, J., Xu, Z., Sun, Z., Bai, B., Lv, J., Liu, Q., Zeng, Y., Ye, Q., Lu, C., Tomizuka, M. & Shao, L.. (2023). Diff-LfD: Contact-aware Model-based Learning from Visual Demonstration for Robotic Manipulation via Differentiable Physics-based Simulation and Rendering. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:499-512 Available from https://proceedings.mlr.press/v229/zhu23a.html.

Related Material