Point Cloud Dataset Distillation

Deyu Bo, Xinchao Wang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:4588-4601, 2025.

Abstract

This study introduces dataset distillation (DD) tailored for 3D data, particularly point clouds. DD aims to substitute large-scale real datasets with a small set of synthetic samples while preserving model performance. Existing methods mainly focus on structured data such as images. However, adapting DD for unstructured point clouds poses challenges due to their diverse orientations and resolutions in 3D space. To address these challenges, we theoretically demonstrate the importance of matching rotation-invariant features between real and synthetic data for 3D distillation. We further propose a plug-and-play point cloud rotator to align the point cloud to a canonical orientation, facilitating the learning of rotation-invariant features by all point cloud models. Furthermore, instead of optimizing fixed-size synthetic data directly, we devise a point-wise generator to produce point clouds at various resolutions based on the sampled noise amount. Compared to conventional DD methods, the proposed approach, termed DD3D, enables efficient training on low-resolution point clouds while generating high-resolution data for evaluation, thereby significantly reducing memory requirements and enhancing model scalability. Extensive experiments validate the effectiveness of DD3D in shape classification and part segmentation tasks across diverse scenarios, such as cross-architecture and cross-resolution settings.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bo25a, title = {Point Cloud Dataset Distillation}, author = {Bo, Deyu and Wang, Xinchao}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {4588--4601}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bo25a/bo25a.pdf}, url = {https://proceedings.mlr.press/v267/bo25a.html}, abstract = {This study introduces dataset distillation (DD) tailored for 3D data, particularly point clouds. DD aims to substitute large-scale real datasets with a small set of synthetic samples while preserving model performance. Existing methods mainly focus on structured data such as images. However, adapting DD for unstructured point clouds poses challenges due to their diverse orientations and resolutions in 3D space. To address these challenges, we theoretically demonstrate the importance of matching rotation-invariant features between real and synthetic data for 3D distillation. We further propose a plug-and-play point cloud rotator to align the point cloud to a canonical orientation, facilitating the learning of rotation-invariant features by all point cloud models. Furthermore, instead of optimizing fixed-size synthetic data directly, we devise a point-wise generator to produce point clouds at various resolutions based on the sampled noise amount. Compared to conventional DD methods, the proposed approach, termed DD3D, enables efficient training on low-resolution point clouds while generating high-resolution data for evaluation, thereby significantly reducing memory requirements and enhancing model scalability. Extensive experiments validate the effectiveness of DD3D in shape classification and part segmentation tasks across diverse scenarios, such as cross-architecture and cross-resolution settings.} }
Endnote
%0 Conference Paper %T Point Cloud Dataset Distillation %A Deyu Bo %A Xinchao Wang %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-bo25a %I PMLR %P 4588--4601 %U https://proceedings.mlr.press/v267/bo25a.html %V 267 %X This study introduces dataset distillation (DD) tailored for 3D data, particularly point clouds. DD aims to substitute large-scale real datasets with a small set of synthetic samples while preserving model performance. Existing methods mainly focus on structured data such as images. However, adapting DD for unstructured point clouds poses challenges due to their diverse orientations and resolutions in 3D space. To address these challenges, we theoretically demonstrate the importance of matching rotation-invariant features between real and synthetic data for 3D distillation. We further propose a plug-and-play point cloud rotator to align the point cloud to a canonical orientation, facilitating the learning of rotation-invariant features by all point cloud models. Furthermore, instead of optimizing fixed-size synthetic data directly, we devise a point-wise generator to produce point clouds at various resolutions based on the sampled noise amount. Compared to conventional DD methods, the proposed approach, termed DD3D, enables efficient training on low-resolution point clouds while generating high-resolution data for evaluation, thereby significantly reducing memory requirements and enhancing model scalability. Extensive experiments validate the effectiveness of DD3D in shape classification and part segmentation tasks across diverse scenarios, such as cross-architecture and cross-resolution settings.
APA
Bo, D. & Wang, X.. (2025). Point Cloud Dataset Distillation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:4588-4601 Available from https://proceedings.mlr.press/v267/bo25a.html.

Related Material