Learning Deep Policies for Robot Bin Picking by Simulating Robust Grasping Sequences


Jeffrey Mahler, Ken Goldberg ;
Proceedings of the 1st Annual Conference on Robot Learning, PMLR 78:515-524, 2017.


Recent results suggest that it is possible to grasp a variety of singulated objects with high precision using Convolutional Neural Networks (CNNs) trained on synthetic data. This paper considers the task of bin picking, where multiple objects are randomly arranged in a heap and the objective is to sequentially grasp and transport each into a packing box. We model bin picking with a discrete-time Partially Observable Markov Decision Process that specifies states of the heap, point cloud observations, and rewards. We collect synthetic demonstrations of bin picking from an algorithmic supervisor uses full state information to optimize for the most robust collision-free grasp in a forward simulator based on pybullet to model dynamic object-object interactions and robust wrench space analysis from the Dexterity Network (Dex-Net) to model quasi-static contact between the gripper and object. We learn a policy by fine-tuning a Grasp Quality CNN on Dex-Net 2.1 to classify the supervisor’s actions from a dataset of 10,000 rollouts of the supervisor in the simulator with noise injection. In 2,192 physical trials of bin picking with an ABB YuMi on a dataset of 50 novel objects, we find that the resulting policies can achieve 94$%$ success rate and 96$%$ average precision (very few false positives) on heaps of 5-10 objects and can clear heaps of 10 objects in under three minutes. Datasets, experiments, and supplemental material are available at \urlhttp://berkeleyautomation.github.io/dex-net.

Related Material