Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

John Martin; Jinkun Wang; Brendan Englot

Proceedings of The 2nd Conference on Robot Learning, PMLR 87:179-189, 2018.

Abstract

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
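As a rough illustration of what "reducing TD updates to Gaussian Process regression" can look like, the block below sketches the standard exact GP temporal-difference construction from the wider literature. It is not the paper's sparse SPGP-SARSA posterior, which is not reproduced here. All symbols (the discount \gamma, kernel k, noise variance \sigma^2, and the matrices H and K) are assumed notation that does not appear in the abstract, and the i.i.d. noise model is the simplest possible choice.

```latex
% Assumed notation (not from the abstract):
%   x_1, ..., x_t   visited states,  r_1, ..., r_{t-1}  observed rewards
%   \mathbf{v} = (V(x_1), ..., V(x_t))^\top,  K_{ij} = k(x_i, x_j),
%   \mathbf{k}(x)_i = k(x_i, x)
\begin{align}
  V &\sim \mathcal{GP}(0, k), \\
  r_i &= V(x_i) - \gamma V(x_{i+1}) + \varepsilon_i,
        \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
        \quad i = 1, \dots, t-1, \\
  \mathbf{r} &= \mathbf{H}\mathbf{v} + \boldsymbol{\varepsilon},
        \qquad
        \mathbf{H} =
        \begin{bmatrix}
          1 & -\gamma &        &         \\
            & \ddots  & \ddots &         \\
            &         & 1      & -\gamma
        \end{bmatrix} \in \mathbb{R}^{(t-1) \times t}, \\
  \mathbb{E}[V(x) \mid \mathbf{r}] &=
        \mathbf{k}(x)^\top \mathbf{H}^\top
        \left( \mathbf{H}\mathbf{K}\mathbf{H}^\top + \sigma^2 \mathbf{I} \right)^{-1}
        \mathbf{r}, \\
  \operatorname{Var}[V(x) \mid \mathbf{r}] &=
        k(x, x) - \mathbf{k}(x)^\top \mathbf{H}^\top
        \left( \mathbf{H}\mathbf{K}\mathbf{H}^\top + \sigma^2 \mathbf{I} \right)^{-1}
        \mathbf{H}\,\mathbf{k}(x).
\end{align}
```

Under these assumptions the rewards are a linear function of the latent values, so the value posterior follows from ordinary GP regression. The exact solve costs cubic time in the number of samples, which is the kind of cost a sparse approximation is meant to avoid in online settings.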

Cite this Paper

BibTeX


@InProceedings{pmlr-v87-martin18a,
  title     = {Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation},
  author    = {Martin, John and Wang, Jinkun and Englot, Brendan},
  booktitle = {Proceedings of The 2nd Conference on Robot Learning},
  pages     = {179--189},
  year      = {2018},
  editor    = {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume    = {87},
  series    = {Proceedings of Machine Learning Research},
  month     = {29--31 Oct},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v87/martin18a/martin18a.pdf},
  url       = {https://proceedings.mlr.press/v87/martin18a.html},
  abstract  = {We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.}
}

Endnote

%0 Conference Paper
%T Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation
%A John Martin
%A Jinkun Wang
%A Brendan Englot
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-martin18a
%I PMLR
%P 179--189
%U https://proceedings.mlr.press/v87/martin18a.html
%V 87
%X We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.

APA


Martin, J., Wang, J. & Englot, B. (2018). Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:179-189. Available from https://proceedings.mlr.press/v87/martin18a.html.
