Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation

Jiexiong Tang; Rares Ambrus; Vitor Guizilini; Sudeep Pillai; Hanme Kim; Patric Jensfelt; Adrien Gaidon

Proceedings of the 2020 Conference on Robot Learning, PMLR 155:2085-2103, 2021.

Abstract

Detecting and matching robust viewpoint-invariant keypoints is critical for visual SLAM and Structure-from-Motion. State-of-the-art learning-based methods generate training samples via homography adaptation to create 2D synthetic views with known keypoint matches from a single image. This approach does not, however, generalize to non-planar 3D scenes with illumination variations commonly seen in real-world videos. In this work, we propose self-supervised learning depth-aware keypoints from unlabeled videos directly. We jointly learn keypoint and depth estimation networks by combining appearance and geometric matching via a differentiable structure-from-motion module based on Procrustean residual pose correction. We show how our self-supervised keypoints can be trivially incorporated into state-of-the-art visual odometry frameworks for robust and accurate ego-motion estimation of autonomous vehicles in real-world conditions.

Cite this Paper

BibTeX


@InProceedings{pmlr-v155-tang21b,
  title = {Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation},
  author = {Tang, Jiexiong and Ambrus, Rares and Guizilini, Vitor and Pillai, Sudeep and Kim, Hanme and Jensfelt, Patric and Gaidon, Adrien},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages = {2085--2103},
  year = {2021},
  editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = {155},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v155/tang21b/tang21b.pdf},
  url = {https://proceedings.mlr.press/v155/tang21b.html},
  abstract = {Detecting and matching robust viewpoint-invariant keypoints is critical for visual SLAM and Structure-from-Motion. State-of-the-art learning-based methods generate training samples via homography adaptation to create 2D synthetic views with known keypoint matches from a single image. This approach does not, however, generalize to non-planar 3D scenes with illumination variations commonly seen in real-world videos. In this work, we propose self-supervised learning depth-aware keypoints from unlabeled videos directly. We jointly learn keypoint and depth estimation networks by combining appearance and geometric matching via a differentiable structure-from-motion module based on Procrustean residual pose correction. We show how our self-supervised keypoints can be trivially incorporated into state-of-the-art visual odometry frameworks for robust and accurate ego-motion estimation of autonomous vehicles in real-world conditions.}
}

Endnote

%0 Conference Paper
%T Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation
%A Jiexiong Tang
%A Rares Ambrus
%A Vitor Guizilini
%A Sudeep Pillai
%A Hanme Kim
%A Patric Jensfelt
%A Adrien Gaidon
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-tang21b
%I PMLR
%P 2085--2103
%U https://proceedings.mlr.press/v155/tang21b.html
%V 155
%X Detecting and matching robust viewpoint-invariant keypoints is critical for visual SLAM and Structure-from-Motion. State-of-the-art learning-based methods generate training samples via homography adaptation to create 2D synthetic views with known keypoint matches from a single image. This approach does not, however, generalize to non-planar 3D scenes with illumination variations commonly seen in real-world videos. In this work, we propose self-supervised learning depth-aware keypoints from unlabeled videos directly. We jointly learn keypoint and depth estimation networks by combining appearance and geometric matching via a differentiable structure-from-motion module based on Procrustean residual pose correction. We show how our self-supervised keypoints can be trivially incorporated into state-of-the-art visual odometry frameworks for robust and accurate ego-motion estimation of autonomous vehicles in real-world conditions.

APA


Tang, J., Ambrus, R., Guizilini, V., Pillai, S., Kim, H., Jensfelt, P., & Gaidon, A. (2021). Self-Supervised 3D Keypoint Learning for Ego-Motion Estimation. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:2085-2103. Available from https://proceedings.mlr.press/v155/tang21b.html.
