Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization

Paul-Edouard Sarlin, Frederic Debraine, Marcin Dymczyk, Roland Siegwart, Cesar Cadena
Proceedings of The 2nd Conference on Robot Learning, PMLR 87:456-465, 2018.

Abstract

Many robotics applications require precise pose estimates despite operating in large and changing environments. This can be addressed by visual localization, using a pre-computed 3D model of the surroundings. The pose estimation then amounts to finding correspondences between 2D keypoints in a query image and 3D points in the model using local descriptors. However, computational power is often limited on robotic platforms, making this task challenging in large-scale environments. Binary feature descriptors significantly speed up this 2D-3D matching and have become popular in the robotics community, but they also strongly impair robustness to perceptual aliasing and to changes in viewpoint, illumination, and scene structure. In this work, we propose to leverage recent advances in deep learning to perform efficient hierarchical localization. We first localize at the map level using learned image-wide global descriptors, and subsequently estimate a precise pose from 2D-3D matches computed in the candidate places only. This restricts the local search and thus makes it possible to efficiently exploit powerful non-binary descriptors usually dismissed on resource-constrained devices. Our approach achieves state-of-the-art localization performance while running in real time on a popular mobile platform, opening new prospects for robotics research.
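To make the two-stage pipeline described above concrete, below is a minimal, self-contained Python sketch under stated assumptions: synthetic random arrays stand in for the learned global and local descriptors and for the 3D map, the map-level search is a plain nearest-neighbour lookup, and the final pose is recovered with OpenCV's solvePnPRansac. None of the names or structures below come from the authors' implementation; they are illustrative only.

import numpy as np
import cv2  # opencv-python

rng = np.random.default_rng(0)

# Offline map: each "place" stores one image-wide global descriptor plus the
# 3D points visible there and their local descriptors (random stand-ins).
n_places, dim_global, dim_local = 50, 256, 128
map_global = rng.normal(size=(n_places, dim_global))
map_global /= np.linalg.norm(map_global, axis=1, keepdims=True)
places_3d = [rng.uniform(-1.0, 1.0, size=(200, 3)) + [0.0, 0.0, 5.0]
             for _ in range(n_places)]
places_desc = [rng.normal(size=(200, dim_local)) for _ in range(n_places)]

# Query: for a runnable demo, reuse place 7 viewed under a known pose, so the
# query keypoints are exact projections of that place's 3D points.
gt_place = 7
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
rvec_gt = np.array([[0.05], [-0.02], [0.01]])
tvec_gt = np.array([[0.1], [0.0], [0.2]])
query_global = map_global[gt_place] + 0.01 * rng.normal(size=dim_global)
query_desc = places_desc[gt_place]  # perfect local matches, illustration only
query_kpts, _ = cv2.projectPoints(places_3d[gt_place], rvec_gt, tvec_gt, K, None)
query_kpts = query_kpts.reshape(-1, 2)

# Stage 1: coarse localization at the map level via global-descriptor search.
scores = map_global @ (query_global / np.linalg.norm(query_global))
candidates = np.argsort(-scores)[:5]  # top-k candidate places

# Stage 2: 2D-3D matching restricted to the candidates, then PnP + RANSAC.
best = None
for p in candidates:
    dist = np.linalg.norm(query_desc[:, None] - places_desc[p][None], axis=2)
    nn12, nn21 = dist.argmin(axis=1), dist.argmin(axis=0)
    mutual = np.where(nn21[nn12] == np.arange(len(nn12)))[0]  # mutual NN check
    if len(mutual) < 4:  # PnP needs at least 4 correspondences
        continue
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        places_3d[p][nn12[mutual]], query_kpts[mutual], K, None,
        reprojectionError=3.0)
    if ok and inliers is not None and (best is None or len(inliers) > best[0]):
        best = (len(inliers), p, rvec, tvec)

if best is not None:
    n_inl, place, rvec, tvec = best
    print(f"localized in place {place} with {n_inl} RANSAC inliers")

The design point the sketch mirrors is the one argued in the abstract: the expensive local 2D-3D matching runs only inside the few retrieved candidate places rather than against the whole map, which is what makes powerful non-binary descriptors affordable on resource-constrained hardware.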

Cite this Paper

BibTeX
@InProceedings{pmlr-v87-sarlin18a,
  title     = {Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization},
  author    = {Sarlin, Paul-Edouard and Debraine, Frederic and Dymczyk, Marcin and Siegwart, Roland and Cadena, Cesar},
  booktitle = {Proceedings of The 2nd Conference on Robot Learning},
  pages     = {456--465},
  year      = {2018},
  editor    = {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume    = {87},
  series    = {Proceedings of Machine Learning Research},
  month     = {29--31 Oct},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v87/sarlin18a/sarlin18a.pdf},
  url       = {https://proceedings.mlr.press/v87/sarlin18a.html}
}
Endnote
%0 Conference Paper
%T Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization
%A Paul-Edouard Sarlin
%A Frederic Debraine
%A Marcin Dymczyk
%A Roland Siegwart
%A Cesar Cadena
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-sarlin18a
%I PMLR
%P 456--465
%U https://proceedings.mlr.press/v87/sarlin18a.html
%V 87
APA
Sarlin, P., Debraine, F., Dymczyk, M., Siegwart, R. & Cadena, C. (2018). Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:456-465. Available from https://proceedings.mlr.press/v87/sarlin18a.html.
