TACO: Learning Task Decomposition via Temporal Alignment for Control

Kyriacos Shiarlis; Markus Wulfmeier; Sasha Salter; Shimon Whiteson; Ingmar Posner

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4654-4663, 2018.

Abstract

Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks. By reusing the corresponding sub-policies within and between tasks, we can provide training data for each policy from different high-level tasks and compose them to perform novel ones. Existing approaches to modular LfD focus either on learning a single high-level task or depend on domain knowledge and temporal segmentation. In contrast, we propose a weakly supervised, domain-agnostic approach based on task sketches, which include only the sequence of sub-tasks performed in each demonstration. Our approach simultaneously aligns the sketches with the observed demonstrations and learns the required sub-policies. This improves generalisation in comparison to separate optimisation procedures. We evaluate the approach on multiple domains, including a simulated 3D robot arm control task using purely image-based observations. The results show that our approach performs commensurately with fully supervised approaches, while requiring significantly less annotation effort.

Cite this Paper

BibTeX

@InProceedings{pmlr-v80-shiarlis18a,
  title = 	 {{TACO}: Learning Task Decomposition via Temporal Alignment for Control},
  author =       {Shiarlis, Kyriacos and Wulfmeier, Markus and Salter, Sasha and Whiteson, Shimon and Posner, Ingmar},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {4654--4663},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/shiarlis18a/shiarlis18a.pdf},
  url = 	 {https://proceedings.mlr.press/v80/shiarlis18a.html},
  abstract = 	 {Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks. By reusing the corresponding sub-policies within and between tasks, we can provide training data for each policy from different high-level tasks and compose them to perform novel ones. Existing approaches to modular LfD focus either on learning a single high-level task or depend on domain knowledge and temporal segmentation. In contrast, we propose a weakly supervised, domain-agnostic approach based on task sketches, which include only the sequence of sub-tasks performed in each demonstration. Our approach simultaneously aligns the sketches with the observed demonstrations and learns the required sub-policies. This improves generalisation in comparison to separate optimisation procedures. We evaluate the approach on multiple domains, including a simulated 3D robot arm control task using purely image-based observations. The results show that our approach performs commensurately with fully supervised approaches, while requiring significantly less annotation effort.}
}

Endnote

%0 Conference Paper
%T TACO: Learning Task Decomposition via Temporal Alignment for Control
%A Kyriacos Shiarlis
%A Markus Wulfmeier
%A Sasha Salter
%A Shimon Whiteson
%A Ingmar Posner
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-shiarlis18a
%I PMLR
%P 4654--4663
%U https://proceedings.mlr.press/v80/shiarlis18a.html
%V 80
%X Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks. By reusing the corresponding sub-policies within and between tasks, we can provide training data for each policy from different high-level tasks and compose them to perform novel ones. Existing approaches to modular LfD focus either on learning a single high-level task or depend on domain knowledge and temporal segmentation. In contrast, we propose a weakly supervised, domain-agnostic approach based on task sketches, which include only the sequence of sub-tasks performed in each demonstration. Our approach simultaneously aligns the sketches with the observed demonstrations and learns the required sub-policies. This improves generalisation in comparison to separate optimisation procedures. We evaluate the approach on multiple domains, including a simulated 3D robot arm control task using purely image-based observations. The results show that our approach performs commensurately with fully supervised approaches, while requiring significantly less annotation effort.

APA

Shiarlis, K., Wulfmeier, M., Salter, S., Whiteson, S. & Posner, I. (2018). TACO: Learning Task Decomposition via Temporal Alignment for Control. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4654-4663. Available from https://proceedings.mlr.press/v80/shiarlis18a.html.
