SeqMatchNet: Contrastive Learning with Sequence Matching for Place Recognition & Relocalization

Sourav Garg; Madhu Vankadari; Michael Milford

Proceedings of the 5th Conference on Robot Learning, PMLR 164:429-443, 2022.

Abstract

Visual Place Recognition (VPR) for mobile robot global relocalization is a well-studied problem, where contrastive learning based representation training methods have led to state-of-the-art performance. However, these methods are mainly designed for single image based VPR, where sequential information, which is ubiquitous in robotics, is only used as a post-processing step for filtering single image match scores, but is never used to guide the representation learning process itself. In this work, for the first time, we bridge the gap between single image representation learning and sequence matching through "SeqMatchNet" which transforms the single image descriptors such that they become more responsive to the sequence matching metric. We propose a novel triplet loss formulation where the distance metric is based on "sequence matching", that is, the aggregation of temporal order-based Euclidean distances computed using single images. We use the same metric for mining negatives online during the training which helps the optimization process by selecting appropriate positives and harder negatives. To overcome the computational overhead of sequence matching for negative mining, we propose a 2D convolution based formulation of sequence matching for efficiently aggregating distances within a distance matrix computed using single images. We show that our proposed method achieves consistent gains in performance as demonstrated on four benchmark datasets. Source code available at https://github.com/oravus/SeqMatchNet.
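The two ideas in the abstract, aggregating single-image distances along temporal order and doing so with a 2D convolution over a distance matrix, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes descriptors are already computed and that query and reference frames advance at the same rate, so a sequence match corresponds to a diagonal of the distance matrix. The function names, `margin`, and `pos_radius` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pairwise_dist(query, ref):
    """Euclidean distances between single-image descriptors.

    query: (Nq, D) array, ref: (Nr, D) array -> (Nq, Nr) distance matrix.
    """
    diff = query[:, None, :] - ref[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def seq_match_conv(dist, seq_len):
    """Aggregate distances along diagonals of length seq_len.

    Equivalent to a 'valid' 2D cross-correlation of the distance matrix
    with an identity kernel: out[i, j] = sum_k dist[i + k, j + k].
    Output shape is (Nq - seq_len + 1, Nr - seq_len + 1).
    """
    kernel = np.eye(seq_len)
    windows = np.lib.stride_tricks.sliding_window_view(
        dist, (seq_len, seq_len)
    )
    return (windows * kernel).sum(axis=(-2, -1))


def triplet_loss_hard_negative(seq_dist, pos_idx, margin=0.5, pos_radius=1):
    """Triplet loss on sequence distances with online hard-negative mining.

    seq_dist:   (Nq', Nr') sequence-distance matrix from seq_match_conv.
    pos_idx:    index of the positive reference sequence for each query.
    margin:     hypothetical margin value (assumption).
    pos_radius: reference sequences within this many frames of the
                positive are not eligible as negatives (assumption).
    """
    n_ref = seq_dist.shape[1]
    ref_ids = np.arange(n_ref)
    losses = []
    for i, p in enumerate(pos_idx):
        # Hardest negative: closest reference sequence outside the
        # positive neighbourhood, under the same sequence metric.
        eligible = np.abs(ref_ids - p) > pos_radius
        d_neg = seq_dist[i][eligible].min()
        d_pos = seq_dist[i, p]
        losses.append(max(0.0, d_pos - d_neg + margin))
    return float(np.mean(losses))
```

In a training setting the same aggregation would typically run through a deep learning framework's 2D convolution so that gradients reach the descriptor network. The NumPy version above shows only the arithmetic: one pass over the distance matrix yields sequence distances for every query/reference pair, which is what makes mining negatives under the sequence metric affordable.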

Cite this Paper

BibTeX


@InProceedings{pmlr-v164-garg22a,
  title = 	 {SeqMatchNet: Contrastive Learning with Sequence Matching for Place Recognition \& Relocalization},
  author =       {Garg, Sourav and Vankadari, Madhu and Milford, Michael},
  booktitle = 	 {Proceedings of the 5th Conference on Robot Learning},
  pages = 	 {429--443},
  year = 	 {2022},
  editor = 	 {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = 	 {164},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {08--11 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v164/garg22a/garg22a.pdf},
  url = 	 {https://proceedings.mlr.press/v164/garg22a.html},
  abstract = 	 {Visual Place Recognition (VPR) for mobile robot global relocalization is a well-studied problem, where contrastive learning based representation training methods have led to state-of-the-art performance. However, these methods are mainly designed for single image based VPR, where sequential information, which is ubiquitous in robotics, is only used as a post-processing step for filtering single image match scores, but is never used to guide the representation learning process itself. In this work, for the first time, we bridge the gap between single image representation learning and sequence matching through "SeqMatchNet" which transforms the single image descriptors such that they become more responsive to the sequence matching metric. We propose a novel triplet loss formulation where the distance metric is based on "sequence matching", that is, the aggregation of temporal order-based Euclidean distances computed using single images. We use the same metric for mining negatives online during the training which helps the optimization process by selecting appropriate positives and harder negatives. To overcome the computational overhead of sequence matching for negative mining, we propose a 2D convolution based formulation of sequence matching for efficiently aggregating distances within a distance matrix computed using single images. We show that our proposed method achieves consistent gains in performance as demonstrated on four benchmark datasets. Source code available at https://github.com/oravus/SeqMatchNet.}
}

Endnote

%0 Conference Paper
%T SeqMatchNet: Contrastive Learning with Sequence Matching for Place Recognition & Relocalization
%A Sourav Garg
%A Madhu Vankadari
%A Michael Milford
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-garg22a
%I PMLR
%P 429--443
%U https://proceedings.mlr.press/v164/garg22a.html
%V 164
%X Visual Place Recognition (VPR) for mobile robot global relocalization is a well-studied problem, where contrastive learning based representation training methods have led to state-of-the-art performance. However, these methods are mainly designed for single image based VPR, where sequential information, which is ubiquitous in robotics, is only used as a post-processing step for filtering single image match scores, but is never used to guide the representation learning process itself. In this work, for the first time, we bridge the gap between single image representation learning and sequence matching through "SeqMatchNet" which transforms the single image descriptors such that they become more responsive to the sequence matching metric. We propose a novel triplet loss formulation where the distance metric is based on "sequence matching", that is, the aggregation of temporal order-based Euclidean distances computed using single images. We use the same metric for mining negatives online during the training which helps the optimization process by selecting appropriate positives and harder negatives. To overcome the computational overhead of sequence matching for negative mining, we propose a 2D convolution based formulation of sequence matching for efficiently aggregating distances within a distance matrix computed using single images. We show that our proposed method achieves consistent gains in performance as demonstrated on four benchmark datasets. Source code available at https://github.com/oravus/SeqMatchNet.

APA


Garg, S., Vankadari, M. & Milford, M. (2022). SeqMatchNet: Contrastive Learning with Sequence Matching for Place Recognition & Relocalization. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:429-443. Available from https://proceedings.mlr.press/v164/garg22a.html.
