[edit]
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:20660-20681, 2024.
Abstract
The progression in large-scale 3D generative models has been impeded by significant resource requirements for training and challenges like inefficient representations. This paper introduces Make-A-Shape, a novel 3D generative model trained on a vast scale, using 10 million publicly-available shapes. We first innovate the wavelet-tree representation to encode high-resolution SDF shapes with minimal loss, leveraging our newly-proposed subband coefficient filtering scheme. We then design a subband coefficient packing scheme to facilitate diffusion-based generation and a subband adaptive training strategy for effective training on the large-scale dataset. Our generative framework is versatile, capable of conditioning on various input modalities such as images, point clouds, and voxels, enabling a variety of downstream applications, e.g., unconditional generation, completion, and conditional generation. Our approach clearly surpasses the existing baselines in delivering high-quality results and can efficiently generate shapes within two seconds for most conditions.