Simple Disentanglement of Style and Content in Visual Representations

Lilian Ngweta; Subha Maity; Alex Gittens; Yuekai Sun; Mikhail Yurochkin

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:26063-26086, 2023.

Abstract

Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.
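
As a rough illustration of the "linearly entangled" assumption described in the abstract, the relationship can be sketched as follows. The notation is ours and is not taken from the paper: $z$ is a pre-trained feature vector, $c$ and $s$ are the latent content and style factors, $A$ is an unknown mixing matrix (assumed invertible here for simplicity), and $W$ is a hypothetical linear post-processing map intended to approximate $A^{-1}$.

```latex
z = A \begin{pmatrix} c \\ s \end{pmatrix},
\qquad
\begin{pmatrix} \hat{c} \\ \hat{s} \end{pmatrix} = W z,
\qquad
W \approx A^{-1}
```

Under this reading, disentanglement is a linear transformation of fixed features, with no retraining of the backbone. The paper's actual probabilistic model and estimation procedure may differ in detail.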

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-ngweta23a,
  title = 	 {Simple Disentanglement of Style and Content in Visual Representations},
  author =       {Ngweta, Lilian and Maity, Subha and Gittens, Alex and Sun, Yuekai and Yurochkin, Mikhail},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {26063--26086},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/ngweta23a/ngweta23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/ngweta23a.html},
  abstract = 	 {Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.}
}

Endnote

%0 Conference Paper
%T Simple Disentanglement of Style and Content in Visual Representations
%A Lilian Ngweta
%A Subha Maity
%A Alex Gittens
%A Yuekai Sun
%A Mikhail Yurochkin
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-ngweta23a
%I PMLR
%P 26063--26086
%U https://proceedings.mlr.press/v202/ngweta23a.html
%V 202
%X Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model the pre-trained features probabilistically as linearly entangled combinations of the latent content and style factors and develop a simple disentanglement algorithm based on the probabilistic model. We show that the method provably disentangles content and style features and verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements when the distribution shift occurs due to style changes or style-related spurious correlations.

APA


Ngweta, L., Maity, S., Gittens, A., Sun, Y. & Yurochkin, M. (2023). Simple Disentanglement of Style and Content in Visual Representations. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:26063-26086. Available from https://proceedings.mlr.press/v202/ngweta23a.html.
