Inferring Task Goals and Constraints using Bayesian Nonparametric Inverse Reinforcement Learning

Daehyung Park, Michael Noseworthy, Rohan Paul, Subhro Roy, Nicholas Roy
Proceedings of the Conference on Robot Learning, PMLR 100:1005-1014, 2020.

Abstract

Recovering an unknown reward function for complex manipulation tasks is the fundamental problem of Inverse Reinforcement Learning (IRL). Often, the recovered reward function fails to explicitly capture implicit constraints (e.g., axis alignment, force, or relative alignment) between the manipulator, the objects of interaction, and other entities in the workspace. Standard IRL approaches do not model the presence of locally-consistent constraints that may be active only in a section of a demonstration. This work introduces Constraint-based Bayesian Nonparametric Inverse Reinforcement Learning (CBN-IRL), which models the observed behaviour as a sequence of subtasks, each consisting of a goal and a set of locally-active constraints. CBN-IRL infers locally-active constraints given a single demonstration by identifying potential constraints and their activation space. Further, the nonparametric prior over subgoals constituting the task allows the model to adapt to the complexity of the demonstration. The inferred goals and constraints are then used to recover a control policy via constrained optimization. We evaluate the proposed model in simulated navigation and manipulation domains. CBN-IRL efficiently learns a compact representation for complex tasks that allows generalization to novel environments, outperforming state-of-the-art IRL methods. Finally, we demonstrate the model on two tool-manipulation tasks using a UR5 manipulator and show generalization to novel test scenarios.

Cite this Paper


BibTeX
@InProceedings{pmlr-v100-park20a,
  title = {Inferring Task Goals and Constraints using Bayesian Nonparametric Inverse Reinforcement Learning},
  author = {Park, Daehyung and Noseworthy, Michael and Paul, Rohan and Roy, Subhro and Roy, Nicholas},
  pages = {1005--1014},
  year = {2020},
  editor = {Leslie Pack Kaelbling and Danica Kragic and Komei Sugiura},
  volume = {100},
  series = {Proceedings of Machine Learning Research},
  address = {},
  month = {30 Oct--01 Nov},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v100/park20a/park20a.pdf},
  url = {http://proceedings.mlr.press/v100/park20a.html},
  abstract = {Recovering an unknown reward function for complex manipulation tasks is the fundamental problem of Inverse Reinforcement Learning (IRL). Often, the recovered reward function fails to explicitly capture implicit constraints (e.g., axis alignment, force, or relative alignment) between the manipulator, the objects of interaction, and other entities in the workspace. The standard IRL approaches do not model the presence of locally-consistent constraints that may be active only in a section of a demonstration. This work introduces Constraint-based Bayesian Nonparametric Inverse Reinforcement Learning (CBN-IRL) that models the observed behaviour as a sequence of subtasks, each consisting of a goal and a set of locally-active constraints. CBN-IRL infers locally-active constraints given a single demonstration by identifying potential constraints and their activation space. Further, the nonparametric prior over subgoals constituting the task allows the model to adapt with the complexity of the demonstration. The inferred set of goals and constraints are then used to recover a control policy via constrained optimization. We evaluate the proposed model in simulated navigation and manipulation domains. CBN-IRL efficiently learns a compact representation for complex tasks that allows generalization in novel environments, outperforming state-of-the-art IRL methods. Finally, we demonstrate the model on two tool-manipulation tasks using a UR5 manipulator and show generalization to novel test scenarios.}
}
Endnote
%0 Conference Paper
%T Inferring Task Goals and Constraints using Bayesian Nonparametric Inverse Reinforcement Learning
%A Daehyung Park
%A Michael Noseworthy
%A Rohan Paul
%A Subhro Roy
%A Nicholas Roy
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-park20a
%I PMLR
%J Proceedings of Machine Learning Research
%P 1005--1014
%U http://proceedings.mlr.press
%V 100
%W PMLR
%X Recovering an unknown reward function for complex manipulation tasks is the fundamental problem of Inverse Reinforcement Learning (IRL). Often, the recovered reward function fails to explicitly capture implicit constraints (e.g., axis alignment, force, or relative alignment) between the manipulator, the objects of interaction, and other entities in the workspace. The standard IRL approaches do not model the presence of locally-consistent constraints that may be active only in a section of a demonstration. This work introduces Constraint-based Bayesian Nonparametric Inverse Reinforcement Learning (CBN-IRL) that models the observed behaviour as a sequence of subtasks, each consisting of a goal and a set of locally-active constraints. CBN-IRL infers locally-active constraints given a single demonstration by identifying potential constraints and their activation space. Further, the nonparametric prior over subgoals constituting the task allows the model to adapt with the complexity of the demonstration. The inferred set of goals and constraints are then used to recover a control policy via constrained optimization. We evaluate the proposed model in simulated navigation and manipulation domains. CBN-IRL efficiently learns a compact representation for complex tasks that allows generalization in novel environments, outperforming state-of-the-art IRL methods. Finally, we demonstrate the model on two tool-manipulation tasks using a UR5 manipulator and show generalization to novel test scenarios.
APA
Park, D., Noseworthy, M., Paul, R., Roy, S., & Roy, N. (2020). Inferring Task Goals and Constraints using Bayesian Nonparametric Inverse Reinforcement Learning. Proceedings of the Conference on Robot Learning, in PMLR 100:1005-1014.

Related Material