A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

Bernardo Aceituno, Alberto Rodriguez, Shubham Tulsiani, Abhinav Gupta, Mustafa Mukadam
Proceedings of the 5th Conference on Robot Learning, PMLR 164:137-147, 2022.

Abstract

Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills. However, reasoning over mechanics and dexterous interactions can make it challenging to scale visual learning for contact-rich manipulation. In this work, we focus on the problem of visual dexterous planar manipulation: given a video of an object in planar motion, find contact-aware robot actions that reproduce the same object motion. We propose a novel learning architecture that combines video decoding neural models with priors from contact mechanics by leveraging differentiable optimization and differentiable simulation. Through extensive simulated experiments, we investigate the interplay between traditional model-based techniques and modern deep learning approaches. We find that our modular and fully differentiable architecture outperforms learning-only methods on unseen objects and motions. https://github.com/baceituno/dlm.
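The pipeline the abstract describes (a neural video decoder feeding a contact-aware action module, trained end to end through a differentiable simulation rollout) can be sketched in a few lines of PyTorch. All module names, shapes, and the trivial rollout below are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn

class VideoDecoder(nn.Module):
    """Decode a video clip (B, T, 3, H, W) into a planar pose trajectory (B, T, 3)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 3)  # (x, y, theta) per frame

    def forward(self, video):
        b, t = video.shape[:2]
        return self.head(self.conv(video.flatten(0, 1))).view(b, t, 3)

class ContactPlanner(nn.Module):
    """Stand-in for the differentiable contact-optimization layer: maps the
    decoded object trajectory to per-finger contact points and forces."""
    def __init__(self, n_fingers=2):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, n_fingers * 4))

    def forward(self, traj):           # (B, T, 3)
        return self.mlp(traj)          # (B, T, n_fingers * 4)

def rollout(actions):
    """Placeholder differentiable 'simulator': any physics step written in
    torch ops keeps the pipeline differentiable. Here we simply project the
    actions back to a pose trajectory so gradients can flow end to end."""
    return actions[..., :3]

decoder, planner = VideoDecoder(), ContactPlanner()
opt = torch.optim.Adam(list(decoder.parameters()) + list(planner.parameters()), lr=1e-3)

video = torch.randn(4, 8, 3, 64, 64)   # toy batch: 4 clips of 8 frames
target = torch.randn(4, 8, 3)          # ground-truth object poses

traj = decoder(video)                  # video -> object motion
actions = planner(traj)                # motion -> contact-aware actions
pred = rollout(actions)                # actions -> simulated object motion
loss = nn.functional.mse_loss(pred, target)
loss.backward()                        # gradients reach the video decoder
opt.step()
```

The design point this toy example illustrates is the one the paper argues for: because every stage (decoding, contact planning, simulation) is differentiable, the training loss on the simulated object motion supervises all modules jointly rather than training each in isolation.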

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-aceituno22a,
  title     = {A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation},
  author    = {Aceituno, Bernardo and Rodriguez, Alberto and Tulsiani, Shubham and Gupta, Abhinav and Mukadam, Mustafa},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages     = {137--147},
  year      = {2022},
  editor    = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume    = {164},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--11 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v164/aceituno22a/aceituno22a.pdf},
  url       = {https://proceedings.mlr.press/v164/aceituno22a.html},
  abstract  = {Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills. However, reasoning over mechanics and dexterous interactions can make it challenging to scale visual learning for contact-rich manipulation. In this work, we focus on the problem of visual dexterous planar manipulation: given a video of an object in planar motion, find contact-aware robot actions that reproduce the same object motion. We propose a novel learning architecture that combines video decoding neural models with priors from contact mechanics by leveraging differentiable optimization and differentiable simulation. Through extensive simulated experiments, we investigate the interplay between traditional model-based techniques and modern deep learning approaches. We find that our modular and fully differentiable architecture outperforms learning-only methods on unseen objects and motions. https://github.com/baceituno/dlm.}
}
Endnote
%0 Conference Paper
%T A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation
%A Bernardo Aceituno
%A Alberto Rodriguez
%A Shubham Tulsiani
%A Abhinav Gupta
%A Mustafa Mukadam
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-aceituno22a
%I PMLR
%P 137--147
%U https://proceedings.mlr.press/v164/aceituno22a.html
%V 164
%X Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills. However, reasoning over mechanics and dexterous interactions can make it challenging to scale visual learning for contact-rich manipulation. In this work, we focus on the problem of visual dexterous planar manipulation: given a video of an object in planar motion, find contact-aware robot actions that reproduce the same object motion. We propose a novel learning architecture that combines video decoding neural models with priors from contact mechanics by leveraging differentiable optimization and differentiable simulation. Through extensive simulated experiments, we investigate the interplay between traditional model-based techniques and modern deep learning approaches. We find that our modular and fully differentiable architecture outperforms learning-only methods on unseen objects and motions. https://github.com/baceituno/dlm.
APA
Aceituno, B., Rodriguez, A., Tulsiani, S., Gupta, A. & Mukadam, M. (2022). A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:137-147. Available from https://proceedings.mlr.press/v164/aceituno22a.html.