Generative Modeling for Multi-task Visual Learning

Zhipeng Bao; Martial Hebert; Yu-Xiong Wang

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:1537-1554, 2022.

Abstract

Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework, by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM framework improves the performance of all the tasks by large margins, consistently outperforming state-of-the-art multi-task approaches in different sample-size regimes.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-bao22c,
  title = 	 {Generative Modeling for Multi-task Visual Learning},
  author =       {Bao, Zhipeng and Hebert, Martial and Wang, Yu-Xiong},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {1537--1554},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/bao22c/bao22c.pdf},
  url = 	 {https://proceedings.mlr.press/v162/bao22c.html},
  abstract = 	 {Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework, by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM framework improves the performance of all the tasks by large margins, consistently outperforming state-of-the-art multi-task approaches in different sample-size regimes.}
}

Endnote

%0 Conference Paper
%T Generative Modeling for Multi-task Visual Learning
%A Zhipeng Bao
%A Martial Hebert
%A Yu-Xiong Wang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-bao22c
%I PMLR
%P 1537--1554
%U https://proceedings.mlr.press/v162/bao22c.html
%V 162
%X Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework, by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM framework improves the performance of all the tasks by large margins, consistently outperforming state-of-the-art multi-task approaches in different sample-size regimes.

APA


Bao, Z., Hebert, M. & Wang, Y. (2022). Generative Modeling for Multi-task Visual Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:1537-1554. Available from https://proceedings.mlr.press/v162/bao22c.html.
