Modeling Long-horizon Tasks as Sequential Interaction Landscapes

Soeren Pirk, Karol Hausman, Alexander Toshev, Mohi Khansari
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:471-484, 2021.

Abstract

Task planning over long time horizons is a challenging and open problem in robotics, and its complexity grows exponentially with an increasing number of subtasks. In this paper we present a deep neural network that learns dependencies and transitions across subtasks solely from a set of demonstration videos. We represent each subtask as an action symbol (e.g., move cup), and show that these symbols can be learned and predicted directly from image observations. Learning symbol sequences provides the network with additional information about the most frequent transitions and relevant dependencies between subtasks and thereby structures tasks over long time horizons. Learning from images, on the other hand, allows the network to continuously monitor the task progress and thus to interactively adapt to changes in the environment. We evaluate our framework on two long-horizon tasks: (1) block stacking of puzzle pieces executed by humans, and (2) a robot manipulation task involving pick-and-place of objects and sliding a cabinet door with a 7-DoF robot arm. We show that complex plans can be carried out when executing the robotic task, and that the robot can interactively adapt to changes in the environment and recover from failure cases.
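As a rough illustration of the pipeline the abstract describes (an image encoder feeding a sequence model that predicts discrete action symbols), here is a minimal PyTorch sketch. Everything in it, including the class name SymbolPredictor, the layer sizes, and the ten-symbol vocabulary, is a hypothetical assumption for illustration and not the authors' implementation.

# Hypothetical sketch in the spirit of the abstract: a CNN encodes image
# observations, a GRU models dependencies and transitions between discrete
# action symbols (e.g., "move cup"), and a linear head predicts per-step
# symbol logits. Not the authors' implementation.
import torch
import torch.nn as nn

class SymbolPredictor(nn.Module):
    def __init__(self, num_symbols: int, hidden_dim: int = 256):
        super().__init__()
        # Small CNN encoder for 64x64 RGB frames (illustrative sizes).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(hidden_dim), nn.ReLU(),
        )
        # Recurrent core captures transitions across subtasks.
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_symbols)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) -> per-step symbol logits
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(feats)
        return self.head(out)  # (batch, time, num_symbols)

# Usage: train with cross-entropy against symbol labels extracted from
# demonstration videos (labels and shapes here are placeholders).
model = SymbolPredictor(num_symbols=10)
logits = model(torch.randn(2, 8, 3, 64, 64))
loss = nn.functional.cross_entropy(
    logits.flatten(0, 1), torch.randint(0, 10, (2 * 8,))
)

In a closed-loop deployment, the latest camera frame would be encoded at every control step and the predicted symbol distribution used to select the next subtask, which mirrors the continuous task monitoring the abstract describes.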

Cite this Paper

BibTeX
@InProceedings{pmlr-v155-pirk21a,
  title     = {Modeling Long-horizon Tasks as Sequential Interaction Landscapes},
  author    = {Pirk, Soeren and Hausman, Karol and Toshev, Alexander and Khansari, Mohi},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages     = {471--484},
  year      = {2021},
  editor    = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume    = {155},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v155/pirk21a/pirk21a.pdf},
  url       = {https://proceedings.mlr.press/v155/pirk21a.html},
  abstract  = {Task planning over long time horizons is a challenging and open problem in robotics, and its complexity grows exponentially with an increasing number of subtasks. In this paper we present a deep neural network that learns dependencies and transitions across subtasks solely from a set of demonstration videos. We represent each subtask as an action symbol (e.g., move cup), and show that these symbols can be learned and predicted directly from image observations. Learning symbol sequences provides the network with additional information about the most frequent transitions and relevant dependencies between subtasks and thereby structures tasks over long time horizons. Learning from images, on the other hand, allows the network to continuously monitor the task progress and thus to interactively adapt to changes in the environment. We evaluate our framework on two long-horizon tasks: (1) block stacking of puzzle pieces executed by humans, and (2) a robot manipulation task involving pick-and-place of objects and sliding a cabinet door with a 7-DoF robot arm. We show that complex plans can be carried out when executing the robotic task, and that the robot can interactively adapt to changes in the environment and recover from failure cases.}
}
Endnote
%0 Conference Paper
%T Modeling Long-horizon Tasks as Sequential Interaction Landscapes
%A Soeren Pirk
%A Karol Hausman
%A Alexander Toshev
%A Mohi Khansari
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-pirk21a
%I PMLR
%P 471--484
%U https://proceedings.mlr.press/v155/pirk21a.html
%V 155
%X Task planning over long time horizons is a challenging and open problem in robotics, and its complexity grows exponentially with an increasing number of subtasks. In this paper we present a deep neural network that learns dependencies and transitions across subtasks solely from a set of demonstration videos. We represent each subtask as an action symbol (e.g., move cup), and show that these symbols can be learned and predicted directly from image observations. Learning symbol sequences provides the network with additional information about the most frequent transitions and relevant dependencies between subtasks and thereby structures tasks over long time horizons. Learning from images, on the other hand, allows the network to continuously monitor the task progress and thus to interactively adapt to changes in the environment. We evaluate our framework on two long-horizon tasks: (1) block stacking of puzzle pieces executed by humans, and (2) a robot manipulation task involving pick-and-place of objects and sliding a cabinet door with a 7-DoF robot arm. We show that complex plans can be carried out when executing the robotic task, and that the robot can interactively adapt to changes in the environment and recover from failure cases.
APA
Pirk, S., Hausman, K., Toshev, A. & Khansari, M. (2021). Modeling Long-horizon Tasks as Sequential Interaction Landscapes. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:471-484. Available from https://proceedings.mlr.press/v155/pirk21a.html.
