Online Multi-Task Learning for Policy Gradient Methods

Haitham Bou Ammar, Eric Eaton, Paul Ruvolo, Matthew Taylor
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1206-1214, 2014.

Abstract

Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
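For readers unfamiliar with the single-task setting the paper builds on, the sketch below shows a generic policy gradient (REINFORCE-style) update on a toy two-armed bandit. This is only an illustration of the base technique; it is not the paper's multi-task algorithm, and the task, rewards, and hyperparameters are hypothetical.

```python
import numpy as np

# Minimal policy gradient (REINFORCE) sketch on a two-armed bandit.
# Illustrates the generic single-task update that multi-task policy
# gradient methods build on; NOT the paper's algorithm.

rng = np.random.default_rng(0)

true_means = np.array([0.2, 0.8])  # hypothetical expected rewards per arm
theta = np.zeros(2)                # softmax policy parameters
alpha = 0.1                        # learning rate

def softmax(x):
    z = x - x.max()                # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

for episode in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)                 # sample an action
    r = rng.normal(true_means[a], 0.1)         # sample its reward
    # REINFORCE update: theta += alpha * r * grad log pi(a | theta)
    grad_log = -probs
    grad_log[a] += 1.0
    theta += alpha * r * grad_log

probs = softmax(theta)  # policy should concentrate on the better arm
```

After training, the policy's probability mass shifts toward the higher-reward arm; the paper's contribution is to accelerate exactly this kind of learning by transferring knowledge across a sequence of such tasks.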

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-ammar14,
  title =     {Online Multi-Task Learning for Policy Gradient Methods},
  author =    {Ammar, Haitham Bou and Eaton, Eric and Ruvolo, Paul and Taylor, Matthew},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages =     {1206--1214},
  year =      {2014},
  editor =    {Xing, Eric P. and Jebara, Tony},
  volume =    {32},
  number =    {2},
  series =    {Proceedings of Machine Learning Research},
  address =   {Beijing, China},
  month =     {22--24 Jun},
  publisher = {PMLR},
  pdf =       {http://proceedings.mlr.press/v32/ammar14.pdf},
  url =       {https://proceedings.mlr.press/v32/ammar14.html},
  abstract =  {Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.}
}
Endnote
%0 Conference Paper
%T Online Multi-Task Learning for Policy Gradient Methods
%A Haitham Bou Ammar
%A Eric Eaton
%A Paul Ruvolo
%A Matthew Taylor
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-ammar14
%I PMLR
%P 1206--1214
%U https://proceedings.mlr.press/v32/ammar14.html
%V 32
%N 2
%X Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
RIS
TY - CPAPER
TI - Online Multi-Task Learning for Policy Gradient Methods
AU - Haitham Bou Ammar
AU - Eric Eaton
AU - Paul Ruvolo
AU - Matthew Taylor
BT - Proceedings of the 31st International Conference on Machine Learning
DA - 2014/06/18
ED - Eric P. Xing
ED - Tony Jebara
ID - pmlr-v32-ammar14
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 32
IS - 2
SP - 1206
EP - 1214
L1 - http://proceedings.mlr.press/v32/ammar14.pdf
UR - https://proceedings.mlr.press/v32/ammar14.html
AB - Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we developed a multi-task policy gradient method to learn decision making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
ER -
APA
Ammar, H.B., Eaton, E., Ruvolo, P. & Taylor, M. (2014). Online Multi-Task Learning for Policy Gradient Methods. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1206-1214. Available from https://proceedings.mlr.press/v32/ammar14.html.