Diverse Plausible Shape Completions from Ambiguous Depth Images

Bradley Saund, Dmitry Berenson
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:1802-1813, 2021.

Abstract

We propose PSSNet, a network architecture for generating diverse plausible 3D reconstructions from a single 2.5D depth image. Existing methods tend to produce only small variations on a single shape, even when multiple shapes are consistent with an observation. To obtain diversity we alter a Variational Auto Encoder by providing a learned shape bounding box feature as side information during training. Since these features are known during training, we are able to add a supervised loss to the encoder and noiseless values to the decoder. To evaluate, we sample a set of completions from a network, construct a set of plausible shape matches for each test observation, and compare using our plausible diversity metric defined over sets of shapes. We perform experiments using Shapenet mugs and partially-occluded YCB objects and find that our method performs comparably in datasets with little ambiguity, and outperforms existing methods when many shapes plausibly fit an observed depth image. We demonstrate one use for PSSNet on a physical robot when grasping objects in occlusion and clutter.
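The training signal described in the abstract (a VAE objective plus a supervised loss on a bounding box feature, with the known feature passed to the decoder during training) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the voxel-occupancy reconstruction term, the squared-error box loss, and the `box_weight` parameter are all assumptions made for illustration only.

```python
import numpy as np


def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, exp(log_var)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mean ** 2 - 1.0 - log_var)


def binary_cross_entropy(pred, target, eps=1e-7):
    """Summed binary cross-entropy between predicted and true occupancy."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.sum(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))


def side_info_vae_loss(pred_occupancy, true_occupancy,
                       latent_mean, latent_log_var,
                       pred_box, true_box, box_weight=1.0):
    """Hypothetical loss: reconstruction + KL + supervised box-feature term.

    The squared-error box term and box_weight are assumptions, not taken
    from the paper.
    """
    recon = binary_cross_entropy(pred_occupancy, true_occupancy)
    kl = kl_to_standard_normal(latent_mean, latent_log_var)
    box = np.sum((pred_box - true_box) ** 2)
    return recon + kl + box_weight * box


def decoder_input(latent_sample, pred_box, true_box=None):
    """Concatenate the latent sample with a box feature.

    During training the known (noiseless) box feature is used; at test time,
    when true_box is None, the encoder's own prediction is used instead.
    """
    box = pred_box if true_box is None else true_box
    return np.concatenate([latent_sample, box])
```

In this sketch, calling `decoder_input(z, pred_box, true_box)` during training feeds the decoder the ground-truth feature, while `decoder_input(z, pred_box)` at inference falls back to the encoder's estimate.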

Cite this Paper


BibTeX
@InProceedings{pmlr-v155-saund21a,
  title = {Diverse Plausible Shape Completions from Ambiguous Depth Images},
  author = {Saund, Bradley and Berenson, Dmitry},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages = {1802--1813},
  year = {2021},
  editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = {155},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v155/saund21a/saund21a.pdf},
  url = {https://proceedings.mlr.press/v155/saund21a.html},
  abstract = {We propose PSSNet, a network architecture for generating diverse plausible 3D reconstructions from a single 2.5D depth image. Existing methods tend to produce only small variations on a single shape, even when multiple shapes are consistent with an observation. To obtain diversity we alter a Variational Auto Encoder by providing a learned shape bounding box feature as side information during training. Since these features are known during training, we are able to add a supervised loss to the encoder and noiseless values to the decoder. To evaluate, we sample a set of completions from a network, construct a set of plausible shape matches for each test observation, and compare using our plausible diversity metric defined over sets of shapes. We perform experiments using Shapenet mugs and partially-occluded YCB objects and find that our method performs comparably in datasets with little ambiguity, and outperforms existing methods when many shapes plausibly fit an observed depth image. We demonstrate one use for PSSNet on a physical robot when grasping objects in occlusion and clutter.}
}
Endnote
%0 Conference Paper
%T Diverse Plausible Shape Completions from Ambiguous Depth Images
%A Bradley Saund
%A Dmitry Berenson
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-saund21a
%I PMLR
%P 1802--1813
%U https://proceedings.mlr.press/v155/saund21a.html
%V 155
%X We propose PSSNet, a network architecture for generating diverse plausible 3D reconstructions from a single 2.5D depth image. Existing methods tend to produce only small variations on a single shape, even when multiple shapes are consistent with an observation. To obtain diversity we alter a Variational Auto Encoder by providing a learned shape bounding box feature as side information during training. Since these features are known during training, we are able to add a supervised loss to the encoder and noiseless values to the decoder. To evaluate, we sample a set of completions from a network, construct a set of plausible shape matches for each test observation, and compare using our plausible diversity metric defined over sets of shapes. We perform experiments using Shapenet mugs and partially-occluded YCB objects and find that our method performs comparably in datasets with little ambiguity, and outperforms existing methods when many shapes plausibly fit an observed depth image. We demonstrate one use for PSSNet on a physical robot when grasping objects in occlusion and clutter.
APA
Saund, B. & Berenson, D. (2021). Diverse Plausible Shape Completions from Ambiguous Depth Images. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:1802-1813. Available from https://proceedings.mlr.press/v155/saund21a.html.

Related Material