Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances

Vitor Guizilini; Jie Li; Rares Ambrus; Sudeep Pillai; Adrien Gaidon

Proceedings of the Conference on Robot Learning, PMLR 100:503-512, 2020.

Abstract

Dense depth estimation from a single image is a key problem in computer vision, with exciting applications in a multitude of robotic tasks. Initially viewed as a direct regression problem, requiring annotated labels as supervision at training time, in the past few years a substantial amount of work has been done in self-supervised depth training based on strong geometric cues, both from stereo cameras and more recently from monocular video sequences. In this paper we investigate how these two approaches (supervised & self-supervised) can be effectively combined, so that a depth model can learn to encode true scale from sparse supervision while achieving high fidelity local accuracy by leveraging geometric cues. To this end, we propose a novel supervised loss term that complements the widely used photometric loss, and show how it can be used to train robust semi-supervised monocular depth estimation models. Furthermore, we evaluate how much supervision is actually necessary to train accurate scale-aware monocular depth models, showing that with our proposed framework, very sparse LiDAR information, with as few as 4 beams (less than 100 valid depth values per image), is enough to achieve results competitive with the current state-of-the-art.

Cite this Paper

BibTeX


@InProceedings{pmlr-v100-guizilini20a,
  title     = {Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances},
  author    = {Guizilini, Vitor and Li, Jie and Ambrus, Rares and Pillai, Sudeep and Gaidon, Adrien},
  booktitle = {Proceedings of the Conference on Robot Learning},
  pages     = {503--512},
  year      = {2020},
  editor    = {Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei},
  volume    = {100},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Oct--01 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v100/guizilini20a/guizilini20a.pdf},
  url       = {https://proceedings.mlr.press/v100/guizilini20a.html},
  abstract = 	 {Dense depth estimation from a single image is a key problem in computer vision, with exciting applications in a multitude of robotic tasks. Initially viewed as a direct regression problem, requiring annotated labels as supervision at training time, in the past few years a substantial amount of work has been done in self-supervised depth training based on strong geometric cues, both from stereo cameras and more recently from monocular video sequences. In this paper we investigate how these two approaches (supervised \& self-supervised) can be effectively combined, so that a depth model can learn to encode true scale from sparse supervision while achieving high fidelity local accuracy by leveraging geometric cues. To this end, we propose a novel supervised loss term that complements the widely used photometric loss, and show how it can be used to train robust semi-supervised monocular depth estimation models. Furthermore, we evaluate how much supervision is actually necessary to train accurate scale-aware monocular depth models, showing that with our proposed framework, very sparse LiDAR information, with as few as 4 beams (less than 100 valid depth values per image), is enough to achieve results competitive with the current state-of-the-art.}
}

Endnote

%0 Conference Paper
%T Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances
%A Vitor Guizilini
%A Jie Li
%A Rares Ambrus
%A Sudeep Pillai
%A Adrien Gaidon
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-guizilini20a
%I PMLR
%P 503--512
%U https://proceedings.mlr.press/v100/guizilini20a.html
%V 100
%X Dense depth estimation from a single image is a key problem in computer vision, with exciting applications in a multitude of robotic tasks. Initially viewed as a direct regression problem, requiring annotated labels as supervision at training time, in the past few years a substantial amount of work has been done in self-supervised depth training based on strong geometric cues, both from stereo cameras and more recently from monocular video sequences. In this paper we investigate how these two approaches (supervised & self-supervised) can be effectively combined, so that a depth model can learn to encode true scale from sparse supervision while achieving high fidelity local accuracy by leveraging geometric cues. To this end, we propose a novel supervised loss term that complements the widely used photometric loss, and show how it can be used to train robust semi-supervised monocular depth estimation models. Furthermore, we evaluate how much supervision is actually necessary to train accurate scale-aware monocular depth models, showing that with our proposed framework, very sparse LiDAR information, with as few as 4 beams (less than 100 valid depth values per image), is enough to achieve results competitive with the current state-of-the-art.

APA


Guizilini, V., Li, J., Ambrus, R., Pillai, S. & Gaidon, A. (2020). Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances. Proceedings of the Conference on Robot Learning, in Proceedings of Machine Learning Research 100:503-512. Available from https://proceedings.mlr.press/v100/guizilini20a.html.
