ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes

Sergey Zakharov, Rares Andrei Ambrus, Katherine Liu, Adrien Gaidon
Proceedings of The 6th Conference on Robot Learning, PMLR 205:2136-2147, 2023.

Abstract

Compact and accurate representations of 3D shapes are central to many perception and robotics tasks. State-of-the-art learning-based methods can reconstruct single objects but scale poorly to large datasets. We present a novel recursive implicit representation to efficiently and accurately encode large datasets of complex 3D shapes by recursively traversing an implicit octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%. We also propose an efficient curriculum learning scheme that naturally exploits the coarse-to-fine properties of the underlying octree spatial representation. We explore the scaling law relating latent space dimension, dataset size, and reconstruction accuracy, showing that increasing the latent space dimension is enough to scale to large shape datasets. Finally, we show that our learned latent space encodes a coarse-to-fine hierarchical structure yielding reusable latents across different levels of details, and we provide qualitative evidence of generalization to novel shapes outside the training set.
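The abstract's core idea, recursively traversing an implicit octree in latent space, can be illustrated with a minimal sketch. All names here (`decode_children`, `traverse`, the random linear maps standing in for the learned decoder) are hypothetical illustrations, not the authors' implementation: a shared decoder maps a parent latent to per-octant occupancy scores and child latents, and traversal expands only occupied octants, giving the coarse-to-fine pruning the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8

# Hypothetical stand-in for the learned shared decoder: from a parent latent,
# predict an occupancy score and a child latent for each of the 8 octants.
# Fixed random linear maps replace the trained network in this sketch.
W_occ = rng.standard_normal((LATENT_DIM, 8))
W_child = rng.standard_normal((8, LATENT_DIM, LATENT_DIM))

def decode_children(latent):
    occupancy = 1.0 / (1.0 + np.exp(-(latent @ W_occ)))          # (8,) in [0, 1]
    children = np.tanh(np.einsum('d,kde->ke', latent, W_child))  # (8, LATENT_DIM)
    return occupancy, children

def traverse(latent, center, half, depth, max_depth, leaves):
    """Recursively expand occupied octants down to max_depth; prune the rest."""
    if depth == max_depth:
        leaves.append((center, half))
        return
    occupancy, children = decode_children(latent)
    for k in range(8):
        if occupancy[k] < 0.5:  # empty octant: skip the whole subtree
            continue
        offset = np.array([(k >> i) & 1 for i in range(3)]) - 0.5
        traverse(children[k], center + offset * half, half / 2,
                 depth + 1, max_depth, leaves)

# Decode one shape from a single root latent in a unit cube centered at origin.
leaves = []
root = rng.standard_normal(LATENT_DIM)
traverse(root, np.zeros(3), 0.5, 0, 3, leaves)
print(f"occupied leaf cells at depth 3: {len(leaves)} of 512 possible")
```

Because empty octants are never expanded, decoding cost scales with occupied surface area rather than volume, which is what makes the recursive representation compact.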

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-zakharov23a,
  title     = {ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes},
  author    = {Zakharov, Sergey and Ambrus, Rares Andrei and Liu, Katherine and Gaidon, Adrien},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {2136--2147},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/zakharov23a/zakharov23a.pdf},
  url       = {https://proceedings.mlr.press/v205/zakharov23a.html},
  abstract  = {Compact and accurate representations of 3D shapes are central to many perception and robotics tasks. State-of-the-art learning-based methods can reconstruct single objects but scale poorly to large datasets. We present a novel recursive implicit representation to efficiently and accurately encode large datasets of complex 3D shapes by recursively traversing an implicit octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%. We also propose an efficient curriculum learning scheme that naturally exploits the coarse-to-fine properties of the underlying octree spatial representation. We explore the scaling law relating latent space dimension, dataset size, and reconstruction accuracy, showing that increasing the latent space dimension is enough to scale to large shape datasets. Finally, we show that our learned latent space encodes a coarse-to-fine hierarchical structure yielding reusable latents across different levels of details, and we provide qualitative evidence of generalization to novel shapes outside the training set.}
}
Endnote
%0 Conference Paper
%T ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes
%A Sergey Zakharov
%A Rares Andrei Ambrus
%A Katherine Liu
%A Adrien Gaidon
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-zakharov23a
%I PMLR
%P 2136--2147
%U https://proceedings.mlr.press/v205/zakharov23a.html
%V 205
%X Compact and accurate representations of 3D shapes are central to many perception and robotics tasks. State-of-the-art learning-based methods can reconstruct single objects but scale poorly to large datasets. We present a novel recursive implicit representation to efficiently and accurately encode large datasets of complex 3D shapes by recursively traversing an implicit octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%. We also propose an efficient curriculum learning scheme that naturally exploits the coarse-to-fine properties of the underlying octree spatial representation. We explore the scaling law relating latent space dimension, dataset size, and reconstruction accuracy, showing that increasing the latent space dimension is enough to scale to large shape datasets. Finally, we show that our learned latent space encodes a coarse-to-fine hierarchical structure yielding reusable latents across different levels of details, and we provide qualitative evidence of generalization to novel shapes outside the training set.
APA
Zakharov, S., Ambrus, R.A., Liu, K. & Gaidon, A. (2023). ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:2136-2147. Available from https://proceedings.mlr.press/v205/zakharov23a.html.