InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36336-36354, 2023.

Abstract

While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning. Here, we propose InfoDiffusion, an algorithm that augments diffusion models with low-dimensional latent variables that capture high-level factors of variation in the data. InfoDiffusion relies on a learning objective regularized with the mutual information between observed and hidden variables, which improves latent space quality and prevents the latents from being ignored by expressive diffusion-based decoders. Empirically, we find that InfoDiffusion learns disentangled and human-interpretable latent representations that are competitive with state-of-the-art generative and contrastive methods, while retaining the high sample quality of diffusion models. Our method enables manipulating the attributes of generated images and has the potential to assist tasks that require exploring a learned latent space to generate quality samples, e.g., generative design.
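The core idea lends itself to a compact sketch. Below is a minimal, illustrative PyTorch training step for a latent-conditioned diffusion model with a mutual-information surrogate. The Encoder and Denoiser modules, the crude timestep embedding, and the reconstruction-based MI term are simplifying assumptions chosen for exposition; they are not the architecture or MI estimator used in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps an image x to a low-dimensional latent z (the auxiliary variable)."""
    def __init__(self, x_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))

    def forward(self, x):
        return self.net(x)

class Denoiser(nn.Module):
    """Predicts the noise eps from (x_t, t, z); conditioning on z is what
    lets the low-dimensional latent steer generation."""
    def __init__(self, x_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim + 1, 256), nn.ReLU(), nn.Linear(256, x_dim))

    def forward(self, x_t, t, z):
        t_feat = t.float().unsqueeze(-1) / 1000.0  # crude timestep embedding
        return self.net(torch.cat([x_t, z, t_feat], dim=-1))

def training_step(x0, encoder, denoiser, alphas_bar, mi_weight=0.1):
    B = x0.shape[0]
    z = encoder(x0)                                     # infer latent per image
    t = torch.randint(0, len(alphas_bar), (B,))
    a_bar = alphas_bar[t].unsqueeze(-1)
    eps = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps  # forward diffusion
    eps_hat = denoiser(x_t, t, z)
    denoise_loss = F.mse_loss(eps_hat, eps)             # standard DDPM term
    # Crude MI surrogate (an assumption, not the paper's estimator): the
    # latent re-inferred from the one-step denoised estimate should match z,
    # which penalizes a decoder that ignores its latent conditioning.
    x0_hat = (x_t - (1 - a_bar).sqrt() * eps_hat) / a_bar.sqrt()
    mi_loss = F.mse_loss(encoder(x0_hat), z.detach())
    return denoise_loss + mi_weight * mi_loss

# Example usage with hypothetical sizes: flattened 28x28 images, a
# 16-dimensional latent, and a 1000-step linear DDPM noise schedule.
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)
enc, den = Encoder(784, 16), Denoiser(784, 16)
loss = training_step(torch.randn(8, 784), enc, den, alphas_bar)
loss.backward()

Here mi_weight plays the role of the mutual-information regularizer's strength: at 0 the objective reduces to a plain conditional DDPM loss, where an expressive decoder is free to ignore z.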

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-wang23ah,
  title     = {{I}nfo{D}iffusion: Representation Learning Using Information Maximizing Diffusion Models},
  author    = {Wang, Yingheng and Schiff, Yair and Gokaslan, Aaron and Pan, Weishen and Wang, Fei and De Sa, Christopher and Kuleshov, Volodymyr},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {36336--36354},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/wang23ah/wang23ah.pdf},
  url       = {https://proceedings.mlr.press/v202/wang23ah.html},
  abstract  = {While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning. Here, we propose InfoDiffusion, an algorithm that augments diffusion models with low-dimensional latent variables that capture high-level factors of variation in the data. InfoDiffusion relies on a learning objective regularized with the mutual information between observed and hidden variables, which improves latent space quality and prevents the latents from being ignored by expressive diffusion-based decoders. Empirically, we find that InfoDiffusion learns disentangled and human-interpretable latent representations that are competitive with state-of-the-art generative and contrastive methods, while retaining the high sample quality of diffusion models. Our method enables manipulating the attributes of generated images and has the potential to assist tasks that require exploring a learned latent space to generate quality samples, e.g., generative design.}
}
Endnote
%0 Conference Paper
%T InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models
%A Yingheng Wang
%A Yair Schiff
%A Aaron Gokaslan
%A Weishen Pan
%A Fei Wang
%A Christopher De Sa
%A Volodymyr Kuleshov
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-wang23ah
%I PMLR
%P 36336--36354
%U https://proceedings.mlr.press/v202/wang23ah.html
%V 202
%X While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning. Here, we propose InfoDiffusion, an algorithm that augments diffusion models with low-dimensional latent variables that capture high-level factors of variation in the data. InfoDiffusion relies on a learning objective regularized with the mutual information between observed and hidden variables, which improves latent space quality and prevents the latents from being ignored by expressive diffusion-based decoders. Empirically, we find that InfoDiffusion learns disentangled and human-interpretable latent representations that are competitive with state-of-the-art generative and contrastive methods, while retaining the high sample quality of diffusion models. Our method enables manipulating the attributes of generated images and has the potential to assist tasks that require exploring a learned latent space to generate quality samples, e.g., generative design.
APA
Wang, Y., Schiff, Y., Gokaslan, A., Pan, W., Wang, F., De Sa, C. & Kuleshov, V. (2023). InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:36336-36354. Available from https://proceedings.mlr.press/v202/wang23ah.html.