Generalised Task Planning with First-Order Function Approximation

Jun Hao Alvin Ng, Ronald P.A. Petrick
Proceedings of the 5th Conference on Robot Learning, PMLR 164:1595-1610, 2022.

Abstract

Real-world robotics often operates in uncertain and dynamic environments where generalisation over different scenarios is of practical interest. In the absence of a model, value-based reinforcement learning can be used to learn a goal-directed policy. Typically, the interaction between robots and the objects in the environment exhibits a first-order structure. We introduce first-order, or relational, features to represent an approximation of the Q-function so that it can induce a generalised policy. Empirical results for a service robot domain show that our online relational reinforcement learning method is scalable to large-scale problems and enables transfer learning between different problems and simulation environments with dissimilar transition dynamics.
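The core idea the abstract describes, lifted (relational) features shared across object bindings so that a single weight vector induces a policy that generalises over problem instances, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature templates, the service-robot atoms, and all names below are illustrative assumptions; only the linear Q-function and one-step Q-learning update are standard.

```python
# Illustrative sketch only: linear Q-function approximation over
# first-order (relational) features. The domain atoms and feature
# templates here are assumptions, not the paper's actual design.
from collections import defaultdict

def relational_features(state, action):
    """Map a (state, action) pair to binary lifted features.

    `state` is a set of ground atoms like ("holding", "robot", "cup").
    Features pair a predicate name with the action schema, dropping the
    object bindings -- the same weights then apply to any objects, which
    is what lets the learned Q-function transfer across problems.
    """
    predicate_names = {atom[0] for atom in state}
    return {("pred", p, action[0]) for p in predicate_names}

class LinearQ:
    """Q(s, a) = sum of the weights of the active relational features."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.w = defaultdict(float)   # one weight per lifted feature
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor

    def q(self, state, action):
        return sum(self.w[f] for f in relational_features(state, action))

    def update(self, state, action, reward, next_state, next_actions):
        # One-step Q-learning update applied to the shared lifted weights.
        best_next = max((self.q(next_state, a) for a in next_actions),
                        default=0.0)
        td_error = reward + self.gamma * best_next - self.q(state, action)
        for f in relational_features(state, action):
            self.w[f] += self.alpha * td_error

agent = LinearQ()
s = {("at", "robot", "kitchen"), ("holding", "robot", "cup")}
a = ("deliver", "cup")
agent.update(s, a, 1.0, s, [a])
print(agent.q(s, a))  # positive after one rewarded update
```

Because the weights attach to lifted templates rather than ground atoms, `agent.q` returns the same value for `("deliver", "mug")` in a structurally identical state, which is the sense in which a single learned function can induce a generalised policy.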

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-ng22a,
  title     = {Generalised Task Planning with First-Order Function Approximation},
  author    = {Ng, Jun Hao Alvin and Petrick, Ronald P.A.},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages     = {1595--1610},
  year      = {2022},
  editor    = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume    = {164},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--11 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v164/ng22a/ng22a.pdf},
  url       = {https://proceedings.mlr.press/v164/ng22a.html},
  abstract  = {Real-world robotics often operates in uncertain and dynamic environments where generalisation over different scenarios is of practical interest. In the absence of a model, value-based reinforcement learning can be used to learn a goal-directed policy. Typically, the interaction between robots and the objects in the environment exhibits a first-order structure. We introduce first-order, or relational, features to represent an approximation of the Q-function so that it can induce a generalised policy. Empirical results for a service robot domain show that our online relational reinforcement learning method is scalable to large-scale problems and enables transfer learning between different problems and simulation environments with dissimilar transition dynamics.}
}
Endnote
%0 Conference Paper
%T Generalised Task Planning with First-Order Function Approximation
%A Jun Hao Alvin Ng
%A Ronald P.A. Petrick
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-ng22a
%I PMLR
%P 1595--1610
%U https://proceedings.mlr.press/v164/ng22a.html
%V 164
%X Real-world robotics often operates in uncertain and dynamic environments where generalisation over different scenarios is of practical interest. In the absence of a model, value-based reinforcement learning can be used to learn a goal-directed policy. Typically, the interaction between robots and the objects in the environment exhibits a first-order structure. We introduce first-order, or relational, features to represent an approximation of the Q-function so that it can induce a generalised policy. Empirical results for a service robot domain show that our online relational reinforcement learning method is scalable to large-scale problems and enables transfer learning between different problems and simulation environments with dissimilar transition dynamics.
APA
Ng, J.H.A. & Petrick, R.P.A. (2022). Generalised Task Planning with First-Order Function Approximation. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:1595-1610. Available from https://proceedings.mlr.press/v164/ng22a.html.