Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Ziyue Feng, Longlong Jing, Peng Yin, Yingli Tian, Bing Li
Proceedings of the 5th Conference on Robot Learning, PMLR 164:685-694, 2022.

Abstract

Self-supervised monocular depth prediction provides a cost-effective solution to obtain the 3D location of each pixel. However, the existing approaches usually lead to unsatisfactory accuracy, which is critical for autonomous robots. In this paper, we propose FusionDepth, a novel two-stage network to advance the self-supervised monocular dense depth learning by leveraging low-cost sparse (e.g. 4-beam) LiDAR. Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps. Then, an efficient feed-forward refine network is further designed to correct the errors in these initial depth maps in pseudo-3D space with real-time performance. Extensive experiments show that our proposed model significantly outperforms all the state-of-the-art self-supervised methods, as well as the sparse-LiDAR-based methods on both self-supervised monocular depth prediction and completion tasks. With the accurate dense depth prediction, our model outperforms the state-of-the-art sparse-LiDAR-based method (Pseudo-LiDAR++) by more than 68% for the downstream task monocular 3D object detection on the KITTI Leaderboard. Code is available at https://github.com/AutoAILab/FusionDepth

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-feng22a,
  title = {Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR},
  author = {Feng, Ziyue and Jing, Longlong and Yin, Peng and Tian, Yingli and Li, Bing},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages = {685--694},
  year = {2022},
  editor = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = {164},
  series = {Proceedings of Machine Learning Research},
  month = {08--11 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v164/feng22a/feng22a.pdf},
  url = {https://proceedings.mlr.press/v164/feng22a.html},
  abstract = {Self-supervised monocular depth prediction provides a cost-effective solution to obtain the 3D location of each pixel. However, the existing approaches usually lead to unsatisfactory accuracy, which is critical for autonomous robots. In this paper, we propose FusionDepth, a novel two-stage network to advance the self-supervised monocular dense depth learning by leveraging low-cost sparse (e.g. 4-beam) LiDAR. Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps. Then, an efficient feed-forward refine network is further designed to correct the errors in these initial depth maps in pseudo-3D space with real-time performance. Extensive experiments show that our proposed model significantly outperforms all the state-of-the-art self-supervised methods, as well as the sparse-LiDAR-based methods on both self-supervised monocular depth prediction and completion tasks. With the accurate dense depth prediction, our model outperforms the state-of-the-art sparse-LiDAR-based method (Pseudo-LiDAR++) by more than 68\% for the downstream task monocular 3D object detection on the KITTI Leaderboard. Code is available at https://github.com/AutoAILab/FusionDepth}
}
Endnote
%0 Conference Paper
%T Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR
%A Ziyue Feng
%A Longlong Jing
%A Peng Yin
%A Yingli Tian
%A Bing Li
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-feng22a
%I PMLR
%P 685--694
%U https://proceedings.mlr.press/v164/feng22a.html
%V 164
%X Self-supervised monocular depth prediction provides a cost-effective solution to obtain the 3D location of each pixel. However, the existing approaches usually lead to unsatisfactory accuracy, which is critical for autonomous robots. In this paper, we propose FusionDepth, a novel two-stage network to advance the self-supervised monocular dense depth learning by leveraging low-cost sparse (e.g. 4-beam) LiDAR. Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps. Then, an efficient feed-forward refine network is further designed to correct the errors in these initial depth maps in pseudo-3D space with real-time performance. Extensive experiments show that our proposed model significantly outperforms all the state-of-the-art self-supervised methods, as well as the sparse-LiDAR-based methods on both self-supervised monocular depth prediction and completion tasks. With the accurate dense depth prediction, our model outperforms the state-of-the-art sparse-LiDAR-based method (Pseudo-LiDAR++) by more than 68% for the downstream task monocular 3D object detection on the KITTI Leaderboard. Code is available at https://github.com/AutoAILab/FusionDepth
APA
Feng, Z., Jing, L., Yin, P., Tian, Y., & Li, B. (2022). Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:685-694. Available from https://proceedings.mlr.press/v164/feng22a.html.
