Multi-task Actor-Critic with Knowledge Transfer via a Shared Critic

Gengzhi Zhang; Liang Feng; Yaqing Hou

Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:580-593, 2021.

Abstract

Multi-task actor-critic is a learning paradigm proposed in the literature to improve the learning efficiency of multiple actor-critics by sharing learned policies across tasks while reinforcement learning progresses online. However, existing multi-task actor-critic algorithms can only handle reinforcement learning tasks within the same problem domain, and they may fail when tasks possess diverse state-action spaces. Taking this cue, in this paper we embark on a study of multi-task actor-critic with knowledge transfer via a shared critic, to enable multi-task learning of actor-critics in heterogeneous state-action environments. Further, for efficient learning of the proposed multi-task actor-critic, a new formula for calculating the gradient of the actor network is also presented. To evaluate the performance of our approach, we conduct comprehensive empirical studies on continuous robotic tasks with different numbers of links. The experimental results confirm the effectiveness of the proposed multi-task actor-critic algorithm.
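
The abstract does not say how the shared critic accepts tasks whose state and action vectors differ in size, so the following is only an illustrative sketch and not the authors' architecture. It assumes one simple option: zero-pad every state and action to a fixed maximum width and feed the result to a single linear Q-function trained with a semi-gradient TD(0) step. The names `pad`, `SharedLinearCritic`, `MAX_STATE_DIM`, and `MAX_ACTION_DIM` are hypothetical, and the paper's new actor-gradient formula is not reproduced here.

```python
# Illustrative sketch only: zero-padding and a linear critic are assumptions,
# not the method described in the paper.

MAX_STATE_DIM = 4   # assumed upper bound on state size across all tasks
MAX_ACTION_DIM = 3  # assumed upper bound on action size across all tasks


def pad(vec, size):
    """Zero-pad a vector so every task presents the same input width."""
    if len(vec) > size:
        raise ValueError("vector longer than the shared input width")
    return list(vec) + [0.0] * (size - len(vec))


class SharedLinearCritic:
    """One linear Q-function, Q(s, a) = w . [s; a] + b, shared by all tasks."""

    def __init__(self, lr=0.05, gamma=0.9):
        self.w = [0.0] * (MAX_STATE_DIM + MAX_ACTION_DIM)
        self.b = 0.0
        self.lr = lr
        self.gamma = gamma

    def features(self, state, action):
        # Concatenate the padded state and padded action into one input vector.
        return pad(state, MAX_STATE_DIM) + pad(action, MAX_ACTION_DIM)

    def q(self, state, action):
        x = self.features(state, action)
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def td_update(self, state, action, reward, next_state, next_action, done):
        """Apply one semi-gradient TD(0) step and return the TD error."""
        if done:
            target = reward
        else:
            target = reward + self.gamma * self.q(next_state, next_action)
        delta = target - self.q(state, action)
        x = self.features(state, action)
        self.w = [wi + self.lr * delta * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * delta
        return delta


# Two tasks with different state/action sizes update the same critic.
critic = SharedLinearCritic()
critic.td_update([1.0, 0.5], [0.2], 1.0, [0.0, 0.0], [0.0], True)         # 2-D state, 1-D action
critic.td_update([0.3, 0.1, 0.4, 0.2], [0.1, 0.0, 0.5], 0.5,
                 [0.2, 0.1, 0.3, 0.2], [0.0, 0.1, 0.4], False)            # 4-D state, 3-D action
```

Because both tasks write into the same weight vector, experience from one task changes the value estimates seen by the other, which is the kind of cross-task transfer the abstract attributes to a shared critic.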

Cite this Paper

BibTeX


@InProceedings{pmlr-v157-zhang21b,
  title = {Multi-task Actor-Critic with Knowledge Transfer via a Shared Critic},
  author = {Zhang, Gengzhi and Feng, Liang and Hou, Yaqing},
  booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
  pages = {580--593},
  year = {2021},
  editor = {Balasubramanian, Vineeth N. and Tsang, Ivor},
  volume = {157},
  series = {Proceedings of Machine Learning Research},
  month = {17--19 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v157/zhang21b/zhang21b.pdf},
  url = {https://proceedings.mlr.press/v157/zhang21b.html},
  abstract = {Multi-task actor-critic is a learning paradigm proposed in the literature to improve the learning efficiency of multiple actor-critics by sharing learned policies across tasks while reinforcement learning progresses online. However, existing multi-task actor-critic algorithms can only handle reinforcement learning tasks within the same problem domain, and they may fail when tasks possess diverse state-action spaces. Taking this cue, in this paper we embark on a study of multi-task actor-critic with knowledge transfer via a shared critic, to enable multi-task learning of actor-critics in heterogeneous state-action environments. Further, for efficient learning of the proposed multi-task actor-critic, a new formula for calculating the gradient of the actor network is also presented. To evaluate the performance of our approach, we conduct comprehensive empirical studies on continuous robotic tasks with different numbers of links. The experimental results confirm the effectiveness of the proposed multi-task actor-critic algorithm.}
}

Endnote

%0 Conference Paper
%T Multi-task Actor-Critic with Knowledge Transfer via a Shared Critic
%A Gengzhi Zhang
%A Liang Feng
%A Yaqing Hou
%B Proceedings of The 13th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Vineeth N. Balasubramanian
%E Ivor Tsang
%F pmlr-v157-zhang21b
%I PMLR
%P 580--593
%U https://proceedings.mlr.press/v157/zhang21b.html
%V 157
%X Multi-task actor-critic is a learning paradigm proposed in the literature to improve the learning efficiency of multiple actor-critics by sharing learned policies across tasks while reinforcement learning progresses online. However, existing multi-task actor-critic algorithms can only handle reinforcement learning tasks within the same problem domain, and they may fail when tasks possess diverse state-action spaces. Taking this cue, in this paper we embark on a study of multi-task actor-critic with knowledge transfer via a shared critic, to enable multi-task learning of actor-critics in heterogeneous state-action environments. Further, for efficient learning of the proposed multi-task actor-critic, a new formula for calculating the gradient of the actor network is also presented. To evaluate the performance of our approach, we conduct comprehensive empirical studies on continuous robotic tasks with different numbers of links. The experimental results confirm the effectiveness of the proposed multi-task actor-critic algorithm.

APA


Zhang, G., Feng, L. & Hou, Y. (2021). Multi-task Actor-Critic with Knowledge Transfer via a Shared Critic. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:580-593. Available from https://proceedings.mlr.press/v157/zhang21b.html.
