Online Multi-Task Learning for Policy Gradient Methods

Haitham Bou Ammar; Eric Eaton; Paul Ruvolo; Matthew Taylor

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1206-1214, 2014.

Abstract

Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-ammar14,
  title = 	 {Online Multi-Task Learning for Policy Gradient Methods},
  author = 	 {Ammar, Haitham Bou and Eaton, Eric and Ruvolo, Paul and Taylor, Matthew},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1206--1214},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/ammar14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/ammar14.html},
  abstract = 	 {Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics.  However, these methods often require extensive experience in a domain to achieve high performance.  To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning.  Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.}
}

Endnote

%0 Conference Paper
%T Online Multi-Task Learning for Policy Gradient Methods
%A Haitham Bou Ammar
%A Eric Eaton
%A Paul Ruvolo
%A Matthew Taylor
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-ammar14
%I PMLR
%P 1206--1214
%U https://proceedings.mlr.press/v32/ammar14.html
%V 32
%N 2
%X Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics.  However, these methods often require extensive experience in a domain to achieve high performance.  To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning.  Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.

RIS


TY  - CPAPER
TI  - Online Multi-Task Learning for Policy Gradient Methods
AU  - Haitham Bou Ammar
AU  - Eric Eaton
AU  - Paul Ruvolo
AU  - Matthew Taylor
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-ammar14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1206
EP  - 1214
L1  - http://proceedings.mlr.press/v32/ammar14.pdf
UR  - https://proceedings.mlr.press/v32/ammar14.html
AB  - Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics.  However, these methods often require extensive experience in a domain to achieve high performance.  To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning.  Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
ER  -

APA


Ammar, H.B., Eaton, E., Ruvolo, P. & Taylor, M. (2014). Online Multi-Task Learning for Policy Gradient Methods. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1206-1214. Available from https://proceedings.mlr.press/v32/ammar14.html.
