Multi-Strategy Deployment-Time Learning and Adaptation for Navigation under Uncertainty

Abhishek Paudel; Xuesu Xiao; Gregory J. Stein

Proceedings of The 8th Conference on Robot Learning, PMLR 270:3908-3923, 2025.

Abstract

We present an approach for performant point-goal navigation in unfamiliar partially-mapped environments. When deployed, our robot runs multiple strategies for deployment-time learning and visual domain adaptation in parallel and quickly selects the best-performing among them. Choosing between policies as they are learned or adapted between navigation trials requires continually updating estimates of their performance as they evolve. Leveraging recent work in model-based learning-informed planning under uncertainty, we determine lower bounds on the would-be performance of newly-updated policies on old trials without needing to re-deploy them. This information constrains and accelerates bandit-like policy selection, affording quick selection of the best-performing strategy shortly after it would start to yield good performance. We validate the effectiveness of our approach in simulated maze-like environments, showing improved navigation cost and cumulative regret versus existing baselines.
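One way to picture the bandit-like selection described above is a lower-confidence-bound rule over navigation cost, in which each strategy's optimistic cost estimate is not allowed to fall below the average of its lower bounds on would-be cost. The sketch below illustrates that idea under this assumption only; it is not the authors' implementation, and `select_strategy`, its arguments, and the `explore` constant are hypothetical names introduced here.

```python
import math


def select_strategy(observed_costs, lower_bounds, trial_index, explore=1.0):
    """Pick the strategy with the smallest optimistic cost estimate.

    observed_costs[k]: costs from trials where strategy k was actually deployed.
    lower_bounds[k]:   assumed lower bounds on the cost strategy k would have
                       incurred on past trials (hypothetical input).
    """
    best_index, best_value = None, math.inf
    for k, (costs, bounds) in enumerate(zip(observed_costs, lower_bounds)):
        if costs:
            mean_cost = sum(costs) / len(costs)
            # Exploration bonus shrinks as the strategy is deployed more often.
            bonus = explore * math.sqrt(2.0 * math.log(max(trial_index, 1)) / len(costs))
            optimistic = mean_cost - bonus
        else:
            # A strategy that was never deployed is maximally optimistic.
            optimistic = -math.inf
        if bounds:
            # The optimistic estimate cannot be lower than the mean lower bound.
            optimistic = max(optimistic, sum(bounds) / len(bounds))
        if optimistic < best_value:
            best_index, best_value = k, optimistic
    return best_index
```

For example, with `observed_costs = [[10.0], []]` at `trial_index = 2`, an unconstrained rule would pick the never-deployed second strategy. Supplying `lower_bounds = [[9.0], [50.0]]` rules it out without deploying it, and the first strategy is selected instead.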

Cite this Paper

BibTeX

@InProceedings{pmlr-v270-paudel25a,
  title     = {Multi-Strategy Deployment-Time Learning and Adaptation for Navigation under Uncertainty},
  author    = {Paudel, Abhishek and Xiao, Xuesu and Stein, Gregory J.},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {3908--3923},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/paudel25a/paudel25a.pdf},
  url       = {https://proceedings.mlr.press/v270/paudel25a.html},
  abstract  = {We present an approach for performant point-goal navigation in unfamiliar partially-mapped environments. When deployed, our robot runs multiple strategies for deployment-time learning and visual domain adaptation in parallel and quickly selects the best-performing among them. Choosing between policies as they are learned or adapted between navigation trials requires continually updating estimates of their performance as they evolve. Leveraging recent work in model-based learning-informed planning under uncertainty, we determine lower bounds on the would-be performance of newly-updated policies on old trials without needing to re-deploy them. This information constrains and accelerates bandit-like policy selection, affording quick selection of the best-performing strategy shortly after it would start to yield good performance. We validate the effectiveness of our approach in simulated maze-like environments, showing improved navigation cost and cumulative regret versus existing baselines.}
}

Endnote

%0 Conference Paper
%T Multi-Strategy Deployment-Time Learning and Adaptation for Navigation under Uncertainty
%A Abhishek Paudel
%A Xuesu Xiao
%A Gregory J. Stein
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-paudel25a
%I PMLR
%P 3908--3923
%U https://proceedings.mlr.press/v270/paudel25a.html
%V 270
%X We present an approach for performant point-goal navigation in unfamiliar partially-mapped environments. When deployed, our robot runs multiple strategies for deployment-time learning and visual domain adaptation in parallel and quickly selects the best-performing among them. Choosing between policies as they are learned or adapted between navigation trials requires continually updating estimates of their performance as they evolve. Leveraging recent work in model-based learning-informed planning under uncertainty, we determine lower bounds on the would-be performance of newly-updated policies on old trials without needing to re-deploy them. This information constrains and accelerates bandit-like policy selection, affording quick selection of the best-performing strategy shortly after it would start to yield good performance. We validate the effectiveness of our approach in simulated maze-like environments, showing improved navigation cost and cumulative regret versus existing baselines.

APA

Paudel, A., Xiao, X., & Stein, G. J. (2025). Multi-Strategy Deployment-Time Learning and Adaptation for Navigation under Uncertainty. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:3908-3923. Available from https://proceedings.mlr.press/v270/paudel25a.html.
