ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation

ZiZhang Wu, Zhuozheng Li, Zhi-Gang Fan, Yunzhe Wu, Xiaoquan Wang, Rui Tang, Jian Pu
Proceedings of The 7th Conference on Robot Learning, PMLR 229:3167-3179, 2023.

Abstract

Monocular depth estimation is challenging due to its inherent ambiguity and ill-posed nature, yet it is important to many applications. While recent works achieve only limited accuracy by designing increasingly complicated networks to extract features with limited spatial geometric cues from a single RGB image, we instead introduce spatial cues by training a teacher network that takes left-right image pairs as inputs and transferring its learned 3D geometry-aware knowledge to a monocular student network. Specifically, we present a novel knowledge distillation framework, named ADU-Depth, that leverages the well-trained teacher network to guide the learning of the student network, thus boosting precise depth estimation with the help of extra spatial scene information. To enable domain adaptation and ensure effective and smooth knowledge transfer from teacher to student, we apply both attention-adapted feature distillation and focal-depth-adapted response distillation in the training stage. In addition, we explicitly model the uncertainty of depth estimation to guide distillation in both feature space and result space, so as to better produce 3D-aware knowledge from monocular observations and enhance learning for hard-to-predict image regions. Extensive experiments on the real depth estimation datasets KITTI and DrivingStereo demonstrate the effectiveness of the proposed method, which ranked 1st on the challenging KITTI online benchmark.
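To make the training setup described above more concrete, the sketch below is a minimal, hypothetical PyTorch illustration (not the authors' released code) of one distillation step: a frozen teacher consumes the concatenated left-right pair, the student sees only the left image, and the total loss combines ground-truth supervision with an attention-based feature distillation term and an uncertainty-weighted response distillation term. The toy ConvEncoder, the spatial-attention adaptation, and the Kendall-and-Gal-style uncertainty weighting are simplifying assumptions; the paper's actual networks, attention adapter, and focal-depth adaptation are more elaborate.

# Hedged sketch under the assumptions stated above; not the ADU-Depth implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """Toy backbone standing in for the real depth networks."""
    def __init__(self, in_ch):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.depth_head = nn.Conv2d(64, 1, 3, padding=1)    # predicted depth map
        self.uncert_head = nn.Conv2d(64, 1, 3, padding=1)   # predicted log-variance (uncertainty)

    def forward(self, x):
        feat = self.features(x)
        return feat, self.depth_head(feat), self.uncert_head(feat)

def distillation_losses(student, teacher, left, right, gt_depth):
    """One training step: supervised loss + feature distillation + response distillation."""
    with torch.no_grad():                                    # teacher is trained beforehand and frozen
        t_feat, t_depth, _ = teacher(torch.cat([left, right], dim=1))
    s_feat, s_depth, s_logvar = student(left)                # student only sees the single left image

    # Attention-style feature distillation: compare normalized spatial attention maps
    # derived from channel statistics (a simplification of the paper's attention adapter).
    s_attn = F.normalize(s_feat.pow(2).mean(1, keepdim=True).flatten(1), dim=1)
    t_attn = F.normalize(t_feat.pow(2).mean(1, keepdim=True).flatten(1), dim=1)
    loss_feat = F.mse_loss(s_attn, t_attn)

    # Uncertainty-weighted response distillation: an aleatoric-style weighting used here
    # as a stand-in for the paper's uncertainty guidance of the result-space distillation.
    weight = torch.exp(-s_logvar)
    loss_resp = (weight * (s_depth - t_depth).abs() + s_logvar).mean()

    loss_sup = F.l1_loss(s_depth, gt_depth)                  # supervision from ground-truth depth
    return loss_sup + loss_feat + loss_resp

# Example usage with random tensors:
#   teacher = ConvEncoder(in_ch=6)   # stereo teacher (left + right concatenated)
#   student = ConvEncoder(in_ch=3)   # monocular student
#   left, right = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
#   loss = distillation_losses(student, teacher, left, right, torch.rand(2, 1, 64, 64))
#   loss.backward()

In the setting the abstract describes, the teacher is first trained on left-right pairs and then frozen, so only the student's parameters are updated by this combined loss during distillation.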

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-wu23c,
  title     = {ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation},
  author    = {Wu, ZiZhang and Li, Zhuozheng and Fan, Zhi-Gang and Wu, Yunzhe and Wang, Xiaoquan and Tang, Rui and Pu, Jian},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {3167--3179},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/wu23c/wu23c.pdf},
  url       = {https://proceedings.mlr.press/v229/wu23c.html},
  abstract  = {Monocular depth estimation is challenging due to its inherent ambiguity and ill-posed nature, yet it is quite important to many applications. While recent works achieve limited accuracy by designing increasingly complicated networks to extract features with limited spatial geometric cues from a single RGB image, we intend to introduce spatial cues by training a teacher network that leverages left-right image pairs as inputs and transferring the learned 3D geometry-aware knowledge to the monocular student network. Specifically, we present a novel knowledge distillation framework, named ADU-Depth, with the goal of leveraging the well-trained teacher network to guide the learning of the student network, thus boosting the precise depth estimation with the help of extra spatial scene information. To enable domain adaptation and ensure effective and smooth knowledge transfer from teacher to student, we apply both attention-adapted feature distillation and focal-depth-adapted response distillation in the training stage. In addition, we explicitly model the uncertainty of depth estimation to guide distillation in both feature space and result space to better produce 3D-aware knowledge from monocular observations and thus enhance the learning for hard-to-predict image regions. Our extensive experiments on the real depth estimation datasets KITTI and DrivingStereo demonstrate the effectiveness of the proposed method, which ranked 1st on the challenging KITTI online benchmark.}
}
Endnote
%0 Conference Paper
%T ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation
%A ZiZhang Wu
%A Zhuozheng Li
%A Zhi-Gang Fan
%A Yunzhe Wu
%A Xiaoquan Wang
%A Rui Tang
%A Jian Pu
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-wu23c
%I PMLR
%P 3167--3179
%U https://proceedings.mlr.press/v229/wu23c.html
%V 229
%X Monocular depth estimation is challenging due to its inherent ambiguity and ill-posed nature, yet it is quite important to many applications. While recent works achieve limited accuracy by designing increasingly complicated networks to extract features with limited spatial geometric cues from a single RGB image, we intend to introduce spatial cues by training a teacher network that leverages left-right image pairs as inputs and transferring the learned 3D geometry-aware knowledge to the monocular student network. Specifically, we present a novel knowledge distillation framework, named ADU-Depth, with the goal of leveraging the well-trained teacher network to guide the learning of the student network, thus boosting the precise depth estimation with the help of extra spatial scene information. To enable domain adaptation and ensure effective and smooth knowledge transfer from teacher to student, we apply both attention-adapted feature distillation and focal-depth-adapted response distillation in the training stage. In addition, we explicitly model the uncertainty of depth estimation to guide distillation in both feature space and result space to better produce 3D-aware knowledge from monocular observations and thus enhance the learning for hard-to-predict image regions. Our extensive experiments on the real depth estimation datasets KITTI and DrivingStereo demonstrate the effectiveness of the proposed method, which ranked 1st on the challenging KITTI online benchmark.
APA
Wu, Z., Li, Z., Fan, Z., Wu, Y., Wang, X., Tang, R., & Pu, J. (2023). ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:3167-3179. Available from https://proceedings.mlr.press/v229/wu23c.html.
