Towards Learning to Complete Anything in Lidar

Ayça Takmaz, Cristiano Saltori, Neehar Peri, Tim Meinhardt, Riccardo De Lutio, Laura Leal-Taixé, Aljosa Osep
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:58406-58430, 2025.

Abstract

We propose CAL (Complete Anything in Lidar) for Lidar-based shape-completion in-the-wild. This is closely related to Lidar-based semantic/panoptic scene completion. However, contemporary methods can only complete and recognize objects from a closed vocabulary labeled in existing Lidar datasets. Different to that, our zero-shot approach leverages the temporal context from multi-modal sensor sequences to mine object shapes and semantic features of observed objects. These are then distilled into a Lidar-only instance-level completion and recognition model. Although we only mine partial shape completions, we find that our distilled model learns to infer full object shapes from multiple such partial observations across the dataset. We show that our model can be prompted on standard benchmarks for Semantic and Panoptic Scene Completion, localize objects as (amodal) 3D bounding boxes, and recognize objects beyond fixed class vocabularies.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-takmaz25a,
  title = {Towards Learning to Complete Anything in Lidar},
  author = {Takmaz, Ay\c{c}a and Saltori, Cristiano and Peri, Neehar and Meinhardt, Tim and De Lutio, Riccardo and Leal-Taix\'{e}, Laura and Osep, Aljosa},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {58406--58430},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/takmaz25a/takmaz25a.pdf},
  url = {https://proceedings.mlr.press/v267/takmaz25a.html},
  abstract = {We propose CAL (Complete Anything in Lidar) for Lidar-based shape-completion in-the-wild. This is closely related to Lidar-based semantic/panoptic scene completion. However, contemporary methods can only complete and recognize objects from a closed vocabulary labeled in existing Lidar datasets. Different to that, our zero-shot approach leverages the temporal context from multi-modal sensor sequences to mine object shapes and semantic features of observed objects. These are then distilled into a Lidar-only instance-level completion and recognition model. Although we only mine partial shape completions, we find that our distilled model learns to infer full object shapes from multiple such partial observations across the dataset. We show that our model can be prompted on standard benchmarks for Semantic and Panoptic Scene Completion, localize objects as (amodal) 3D bounding boxes, and recognize objects beyond fixed class vocabularies.}
}
Endnote
%0 Conference Paper
%T Towards Learning to Complete Anything in Lidar
%A Ayça Takmaz
%A Cristiano Saltori
%A Neehar Peri
%A Tim Meinhardt
%A Riccardo De Lutio
%A Laura Leal-Taixé
%A Aljosa Osep
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-takmaz25a
%I PMLR
%P 58406--58430
%U https://proceedings.mlr.press/v267/takmaz25a.html
%V 267
%X We propose CAL (Complete Anything in Lidar) for Lidar-based shape-completion in-the-wild. This is closely related to Lidar-based semantic/panoptic scene completion. However, contemporary methods can only complete and recognize objects from a closed vocabulary labeled in existing Lidar datasets. Different to that, our zero-shot approach leverages the temporal context from multi-modal sensor sequences to mine object shapes and semantic features of observed objects. These are then distilled into a Lidar-only instance-level completion and recognition model. Although we only mine partial shape completions, we find that our distilled model learns to infer full object shapes from multiple such partial observations across the dataset. We show that our model can be prompted on standard benchmarks for Semantic and Panoptic Scene Completion, localize objects as (amodal) 3D bounding boxes, and recognize objects beyond fixed class vocabularies.
APA
Takmaz, A., Saltori, C., Peri, N., Meinhardt, T., De Lutio, R., Leal-Taixé, L., & Osep, A. (2025). Towards Learning to Complete Anything in Lidar. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:58406-58430. Available from https://proceedings.mlr.press/v267/takmaz25a.html.
