Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences

Jeffrey Mahler; Ken Goldberg

Proceedings of the 1st Annual Conference on Robot Learning, PMLR 78:515-524, 2017.

Abstract

Recent results suggest that it is possible to grasp a variety of singulated objects with high precision using Convolutional Neural Networks (CNNs) trained on synthetic data. This paper considers the task of bin picking, where multiple objects are randomly arranged in a heap and the objective is to sequentially grasp and transport each into a packing box. We model bin picking with a discrete-time Partially Observable Markov Decision Process that specifies states of the heap, point cloud observations, and rewards. We collect synthetic demonstrations of bin picking from an algorithmic supervisor that uses full state information to optimize for the most robust collision-free grasp in a forward simulator based on pybullet to model dynamic object-object interactions and robust wrench space analysis from the Dexterity Network (Dex-Net) to model quasi-static contact between the gripper and object. We learn a policy by fine-tuning a Grasp Quality CNN on Dex-Net 2.1 to classify the supervisor’s actions from a dataset of 10,000 rollouts of the supervisor in the simulator with noise injection. In 2,192 physical trials of bin picking with an ABB YuMi on a dataset of 50 novel objects, we find that the resulting policies can achieve 94% success rate and 96% average precision (very few false positives) on heaps of 5-10 objects and can clear heaps of 10 objects in under three minutes. Datasets, experiments, and supplemental material are available at http://berkeleyautomation.github.io/dex-net.
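
The abstract reports both a success rate and an average precision but does not define either here. As a rough illustration of how the two can differ, the sketch below assumes a common non-interpolated definition of average precision: rank grasp attempts by predicted confidence, then average the precision at each rank where a grasp succeeded. The function names and the trial data are hypothetical and are not taken from the paper or from Dex-Net.

```python
from typing import List, Tuple


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of attempted grasps that succeeded."""
    return sum(outcomes) / len(outcomes)


def average_precision(trials: List[Tuple[float, bool]]) -> float:
    """Non-interpolated average precision (an assumed definition).

    Trials are ranked by descending predicted confidence; the result is
    the mean of precision@k over every rank k at which a grasp succeeded.
    """
    ranked = sorted(trials, key=lambda t: t[0], reverse=True)
    hits = 0
    precisions = []
    for k, (_, succeeded) in enumerate(ranked, start=1):
        if succeeded:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical trials: (predicted grasp quality, grasp succeeded?)
trials = [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.40, False)]

rate = success_rate([succeeded for _, succeeded in trials])
ap = average_precision(trials)
print(f"success rate = {rate:.2f}, average precision = {ap:.2f}")
```

Under this definition, a policy whose failures fall mostly among its low-confidence grasps scores a higher average precision than its raw success rate, which is consistent with the abstract's gloss of "very few false positives".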

Cite this Paper

BibTeX


@InProceedings{pmlr-v78-mahler17a,
  title = 	 {Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences},
  author = 	 {Mahler, Jeffrey and Goldberg, Ken},
  booktitle = 	 {Proceedings of the 1st Annual Conference on Robot Learning},
  pages = 	 {515--524},
  year = 	 {2017},
  editor = 	 {Levine, Sergey and Vanhoucke, Vincent and Goldberg, Ken},
  volume = 	 {78},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v78/mahler17a/mahler17a.pdf},
  url = 	 {https://proceedings.mlr.press/v78/mahler17a.html},
  abstract = 	 {Recent results suggest that it is possible to grasp a variety of singulated objects with high precision using Convolutional Neural Networks (CNNs) trained on synthetic data. This paper considers the task of bin picking, where multiple objects are randomly arranged in a heap and the objective is to sequentially grasp and transport each into a packing box. We model bin picking with a discrete-time Partially Observable Markov Decision Process that specifies states of the heap, point cloud observations, and rewards. We collect synthetic demonstrations of bin picking from an algorithmic supervisor that uses full state information to optimize for the most robust collision-free grasp in a forward simulator based on pybullet to model dynamic object-object interactions and robust wrench space analysis from the Dexterity Network (Dex-Net) to model quasi-static contact between the gripper and object. We learn a policy by fine-tuning a Grasp Quality CNN on Dex-Net 2.1 to classify the supervisor’s actions from a dataset of 10,000 rollouts of the supervisor in the simulator with noise injection. In 2,192 physical trials of bin picking with an ABB YuMi on a dataset of 50 novel objects, we find that the resulting policies can achieve 94\% success rate and 96\% average precision (very few false positives) on heaps of 5-10 objects and can clear heaps of 10 objects in under three minutes. Datasets, experiments, and supplemental material are available at \url{http://berkeleyautomation.github.io/dex-net}.}
}

Endnote

%0 Conference Paper
%T Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences
%A Jeffrey Mahler
%A Ken Goldberg
%B Proceedings of the 1st Annual Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Sergey Levine
%E Vincent Vanhoucke
%E Ken Goldberg
%F pmlr-v78-mahler17a
%I PMLR
%P 515--524
%U https://proceedings.mlr.press/v78/mahler17a.html
%V 78
%X Recent results suggest that it is possible to grasp a variety of singulated objects with high precision using Convolutional Neural Networks (CNNs) trained on synthetic data. This paper considers the task of bin picking, where multiple objects are randomly arranged in a heap and the objective is to sequentially grasp and transport each into a packing box. We model bin picking with a discrete-time Partially Observable Markov Decision Process that specifies states of the heap, point cloud observations, and rewards. We collect synthetic demonstrations of bin picking from an algorithmic supervisor that uses full state information to optimize for the most robust collision-free grasp in a forward simulator based on pybullet to model dynamic object-object interactions and robust wrench space analysis from the Dexterity Network (Dex-Net) to model quasi-static contact between the gripper and object. We learn a policy by fine-tuning a Grasp Quality CNN on Dex-Net 2.1 to classify the supervisor’s actions from a dataset of 10,000 rollouts of the supervisor in the simulator with noise injection. In 2,192 physical trials of bin picking with an ABB YuMi on a dataset of 50 novel objects, we find that the resulting policies can achieve 94% success rate and 96% average precision (very few false positives) on heaps of 5-10 objects and can clear heaps of 10 objects in under three minutes. Datasets, experiments, and supplemental material are available at http://berkeleyautomation.github.io/dex-net.

APA


Mahler, J. & Goldberg, K. (2017). Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences. Proceedings of the 1st Annual Conference on Robot Learning, in Proceedings of Machine Learning Research 78:515-524. Available from https://proceedings.mlr.press/v78/mahler17a.html.
