Q-SLAM: Quadric Representations for Monocular SLAM

Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan
Proceedings of The 8th Conference on Robot Learning, PMLR 270:1763-1781, 2025.

Abstract

In this paper, we reimagine volumetric representations through the lens of quadrics. We posit that rigid scene components can be effectively decomposed into quadric surfaces. Leveraging this assumption, we reshape the volumetric representations with millions of cubes by several quadric planes, which results in more accurate and efficient modeling of 3D scenes in SLAM contexts. First, we use the quadric assumption to rectify noisy depth estimations from RGB inputs. This step significantly improves depth estimation accuracy, and allows us to efficiently sample ray points around quadric planes instead of the entire volume space in previous NeRF-SLAM systems. Second, we introduce a novel quadric-decomposed transformer to aggregate information across quadrics. The quadric semantics are not only explicitly used for depth correction and scene decomposition, but also serve as an implicit supervision signal for the mapping network. Through rigorous experimental evaluation, our method exhibits superior performance over other approaches relying on estimated depth, and achieves comparable accuracy to methods utilizing ground truth depth on both synthetic and real-world datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-peng25b,
  title     = {Q-SLAM: Quadric Representations for Monocular SLAM},
  author    = {Peng, Chensheng and Xu, Chenfeng and Wang, Yue and Ding, Mingyu and Yang, Heng and Tomizuka, Masayoshi and Keutzer, Kurt and Pavone, Marco and Zhan, Wei},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {1763--1781},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/peng25b/peng25b.pdf},
  url       = {https://proceedings.mlr.press/v270/peng25b.html},
  abstract  = {In this paper, we reimagine volumetric representations through the lens of quadrics. We posit that rigid scene components can be effectively decomposed into quadric surfaces. Leveraging this assumption, we reshape the volumetric representations with millions of cubes by several quadric planes, which results in more accurate and efficient modeling of 3D scenes in SLAM contexts. First, we use the quadric assumption to rectify noisy depth estimations from RGB inputs. This step significantly improves depth estimation accuracy, and allows us to efficiently sample ray points around quadric planes instead of the entire volume space in previous NeRF-SLAM systems. Second, we introduce a novel quadric-decomposed transformer to aggregate information across quadrics. The quadric semantics are not only explicitly used for depth correction and scene decomposition, but also serve as an implicit supervision signal for the mapping network. Through rigorous experimental evaluation, our method exhibits superior performance over other approaches relying on estimated depth, and achieves comparable accuracy to methods utilizing ground truth depth on both synthetic and real-world datasets.}
}
Endnote
%0 Conference Paper
%T Q-SLAM: Quadric Representations for Monocular SLAM
%A Chensheng Peng
%A Chenfeng Xu
%A Yue Wang
%A Mingyu Ding
%A Heng Yang
%A Masayoshi Tomizuka
%A Kurt Keutzer
%A Marco Pavone
%A Wei Zhan
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-peng25b
%I PMLR
%P 1763--1781
%U https://proceedings.mlr.press/v270/peng25b.html
%V 270
%X In this paper, we reimagine volumetric representations through the lens of quadrics. We posit that rigid scene components can be effectively decomposed into quadric surfaces. Leveraging this assumption, we reshape the volumetric representations with millions of cubes by several quadric planes, which results in more accurate and efficient modeling of 3D scenes in SLAM contexts. First, we use the quadric assumption to rectify noisy depth estimations from RGB inputs. This step significantly improves depth estimation accuracy, and allows us to efficiently sample ray points around quadric planes instead of the entire volume space in previous NeRF-SLAM systems. Second, we introduce a novel quadric-decomposed transformer to aggregate information across quadrics. The quadric semantics are not only explicitly used for depth correction and scene decomposition, but also serve as an implicit supervision signal for the mapping network. Through rigorous experimental evaluation, our method exhibits superior performance over other approaches relying on estimated depth, and achieves comparable accuracy to methods utilizing ground truth depth on both synthetic and real-world datasets.
APA
Peng, C., Xu, C., Wang, Y., Ding, M., Yang, H., Tomizuka, M., Keutzer, K., Pavone, M. & Zhan, W. (2025). Q-SLAM: Quadric Representations for Monocular SLAM. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:1763-1781. Available from https://proceedings.mlr.press/v270/peng25b.html.
