Learning to Branch for Multi-Task Learning

Pengsheng Guo, Chen-Yu Lee, Daniel Ulbricht
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3854-3863, 2020.

Abstract

Training multiple tasks jointly in one deep network reduces inference latency and improves performance over single-task counterparts by sharing certain layers of the network. However, over-sharing a network can erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores to design ad hoc branching structures, which yield sub-optimal results and often demand substantial trial-and-error effort. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a Gumbel-Softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.
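As a rough illustration of the mechanism the abstract describes, the sketch below (PyTorch) casts a child block's choice of parent branch as a categorical variable relaxed with Gumbel-Softmax, so gradients flow through the topology decision. The names (BranchingLayer, branch_logits, num_parents) and the detail of mixing parent outputs by soft weights are assumptions for illustration, not the authors' implementation.

# Minimal sketch, assuming branch choice = Gumbel-Softmax over parent outputs.
# All class/parameter names here are hypothetical, not from the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchingLayer(nn.Module):
    """Child block that learns which of several parent outputs to consume."""
    def __init__(self, num_parents: int, channels: int):
        super().__init__()
        # Learnable logits over candidate parent branches (hypothetical).
        self.branch_logits = nn.Parameter(torch.zeros(num_parents))
        self.block = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, parent_outputs: list, tau: float = 1.0):
        # Relaxed one-hot sample over parents; gradients reach the logits.
        weights = F.gumbel_softmax(self.branch_logits, tau=tau, hard=False)
        x = torch.stack(parent_outputs, dim=0)           # (P, B, C, H, W)
        mixed = (weights.view(-1, 1, 1, 1, 1) * x).sum(dim=0)
        return self.block(mixed)

# Usage: two candidate parent feature maps; the child learns its parent.
parents = [torch.randn(4, 16, 8, 8) for _ in range(2)]
child = BranchingLayer(num_parents=2, channels=16)
out = child(parents)  # differentiable w.r.t. child.branch_logits

Annealing tau toward zero would push the soft mixture toward a hard branch assignment, which is the usual way such a relaxation recovers a discrete tree structure at the end of training.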

Cite this Paper

BibTeX
@InProceedings{pmlr-v119-guo20e,
  title     = {Learning to Branch for Multi-Task Learning},
  author    = {Guo, Pengsheng and Lee, Chen-Yu and Ulbricht, Daniel},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {3854--3863},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/guo20e/guo20e.pdf},
  url       = {https://proceedings.mlr.press/v119/guo20e.html},
  abstract  = {Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.}
}
Endnote
%0 Conference Paper
%T Learning to Branch for Multi-Task Learning
%A Pengsheng Guo
%A Chen-Yu Lee
%A Daniel Ulbricht
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-guo20e
%I PMLR
%P 3854--3863
%U https://proceedings.mlr.press/v119/guo20e.html
%V 119
%X Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance over the single-task counterpart by sharing certain layers of a network. However, over-sharing a network could erroneously enforce over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores for ad hoc branching structures. They provide sub-optimal end results and often require huge efforts for the trial-and-error process. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, designing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts a tree branching operation as a gumbel-softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.
APA
Guo, P., Lee, C.-Y., & Ulbricht, D. (2020). Learning to Branch for Multi-Task Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3854-3863. Available from https://proceedings.mlr.press/v119/guo20e.html.