Learning to Branch for Multi-Task Learning

Pengsheng Guo; Chen-Yu Lee; Daniel Ulbricht

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3854-3863, 2020.

Abstract

Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.
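The abstract's central idea, casting a branching decision as Gumbel-Softmax sampling, can be illustrated with a small sketch. This is not the authors' implementation: the function names `gumbel_softmax` and `select_parent_output`, the temperature parameter `tau`, and the toy list-based "feature vectors" are all assumptions made for illustration. The sketch only shows how a child node could draw a relaxed one-hot choice over candidate parent nodes and mix their outputs accordingly.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample over len(logits) choices (illustrative)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
        u = max(rng.random(), 1e-12)  # clamp away from 0 to keep log finite
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_parent_output(parent_outputs, logits, tau=1.0, rng=random):
    """Mix candidate parent outputs using sampled branching weights.

    parent_outputs: list of equal-length feature vectors, one per parent node.
    logits: branching scores (hypothetical), one per parent node.
    """
    weights = gumbel_softmax(logits, tau, rng)
    dim = len(parent_outputs[0])
    mixed = [
        sum(w * out[i] for w, out in zip(weights, parent_outputs))
        for i in range(dim)
    ]
    return mixed, weights


if __name__ == "__main__":
    rng = random.Random(0)
    parents = [[1.0, 0.0], [0.0, 1.0]]
    mixed, weights = select_parent_output(parents, [2.0, 0.5], tau=0.5, rng=rng)
    print("weights:", weights)
    print("mixed input:", mixed)
```

At a low temperature the sampled weights approach a one-hot vector, so the child effectively attaches to a single parent. In an automatic-differentiation framework the same computation would be differentiable with respect to the branching scores, which is the property the abstract refers to as end-to-end trainable network splitting.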

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-guo20e,
  title = 	 {Learning to Branch for Multi-Task Learning},
  author =       {Guo, Pengsheng and Lee, Chen-Yu and Ulbricht, Daniel},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {3854--3863},
  year = 	 {2020},
  editor = 	 {Daumé, III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/guo20e/guo20e.pdf},
  url = 	 {https://proceedings.mlr.press/v119/guo20e.html},
  abstract = 	 {Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.}
}

Endnote

%0 Conference Paper
%T Learning to Branch for Multi-Task Learning
%A Pengsheng Guo
%A Chen-Yu Lee
%A Daniel Ulbricht
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-guo20e
%I PMLR
%P 3854--3863
%U https://proceedings.mlr.press/v119/guo20e.html
%V 119
%X Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.

APA

Guo, P., Lee, C.-Y., & Ulbricht, D. (2020). Learning to Branch for Multi-Task Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3854-3863. Available from https://proceedings.mlr.press/v119/guo20e.html.
