TartanVO: A Generalizable Learning-based VO

Wenshan Wang; Yaoyu Hu; Sebastian Scherer

Proceedings of the 2020 Conference on Robot Learning, PMLR 155:1761-1772, 2021.

Abstract

We present the first learning-based visual odometry (VO) model, which generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data, without any finetuning, can be generalized to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over the geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
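The abstract names an up-to-scale loss but does not spell it out. As a rough illustration only, one common way to make a translation loss independent of scale is to normalize both the predicted and the ground-truth translation to unit length before comparing them. The sketch below assumes that formulation; the function names, the `eps` guard, and the squared-distance form are illustrative assumptions and are not taken from this page or from the authors' code.

```python
import math


def normalize(v, eps=1e-6):
    """Scale a vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in v))
    scale = max(norm, eps)
    return [x / scale for x in v]


def up_to_scale_translation_loss(t_pred, t_true, eps=1e-6):
    """Squared distance between the unit directions of two translations.

    Hypothetical helper: only the direction of motion is penalized, so
    multiplying either translation by a positive constant leaves the
    loss unchanged.
    """
    p = normalize(t_pred, eps)
    g = normalize(t_true, eps)
    return sum((a - b) ** 2 for a, b in zip(p, g))
```

Under this assumption, a prediction of `[2.0, 0.0, 0.0]` against a ground truth of `[1.0, 0.0, 0.0]` incurs essentially no loss, because the two differ only in magnitude, which monocular VO cannot recover in any case.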

Cite this Paper

BibTeX


@InProceedings{pmlr-v155-wang21h,
  title = 	 {TartanVO: A Generalizable Learning-based VO},
  author =       {Wang, Wenshan and Hu, Yaoyu and Scherer, Sebastian},
  booktitle = 	 {Proceedings of the 2020 Conference on Robot Learning},
  pages = 	 {1761--1772},
  year = 	 {2021},
  editor = 	 {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = 	 {155},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--18 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v155/wang21h/wang21h.pdf},
  url = 	 {https://proceedings.mlr.press/v155/wang21h.html},
  abstract = 	 {We present the first learning-based visual odometry (VO) model, which generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data, without any finetuning, can be generalized to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over the geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.}
}

Endnote

%0 Conference Paper
%T TartanVO: A Generalizable Learning-based VO
%A Wenshan Wang
%A Yaoyu Hu
%A Sebastian Scherer
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-wang21h
%I PMLR
%P 1761--1772
%U https://proceedings.mlr.press/v155/wang21h.html
%V 155
%X We present the first learning-based visual odometry (VO) model, which generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data, without any finetuning, can be generalized to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over the geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.

APA


Wang, W., Hu, Y. & Scherer, S. (2021). TartanVO: A Generalizable Learning-based VO. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:1761-1772. Available from https://proceedings.mlr.press/v155/wang21h.html.
