What happens to diffusion model likelihood when your model is conditional?

Mattias Cross, Anton Ragni
Proceedings of the 1st ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications", PMLR 255:1-14, 2024.

Abstract

Diffusion Models (DMs) iteratively denoise random samples to produce high-quality data. The iterative sampling process is derived from Stochastic Differential Equations (SDEs), allowing a speed-quality trade-off to be chosen at inference. Another advantage of sampling with differential equations is exact likelihood computation. These likelihoods have been used to rank unconditional DMs and for out-of-domain classification. Despite the many existing and possible uses of DM likelihoods, the distinct properties they capture remain unknown, especially in conditional contexts such as Text-To-Image (TTI) or Text-To-Speech (TTS) synthesis. Surprisingly, we find that TTS DM likelihoods are agnostic to the text input. TTI likelihood is more expressive but cannot discern confounding prompts. Our results show that applying DMs to conditional tasks reveals inconsistencies and strengthens the claim that the properties of DM likelihood are poorly understood. Although conditional DMs maximise likelihood, the likelihood in question is not as sensitive to the conditioning input as one would expect. This investigation provides a new point of view on diffusion likelihoods.
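
For context, the exact likelihood of a DM is standardly computed via the probability-flow ODE of Song et al. (2021). Below is a minimal sketch in notation not used elsewhere on this page: f and g are the drift and diffusion coefficients of the forward SDE, s_theta is the learned score network, and c is the conditioning input (a text prompt for TTI, a transcript for TTS).

% Forward SDE, its probability-flow ODE, and the exact (conditional)
% log-likelihood via the instantaneous change-of-variables formula.
\begin{align*}
  \mathrm{d}\mathbf{x} &= f(\mathbf{x},t)\,\mathrm{d}t + g(t)\,\mathrm{d}\mathbf{w}
    && \text{(forward SDE)}\\
  \frac{\mathrm{d}\mathbf{x}}{\mathrm{d}t} &= \tilde{f}_\theta(\mathbf{x},t,c)
    = f(\mathbf{x},t) - \tfrac{1}{2}\,g(t)^2\,s_\theta(\mathbf{x},t,c)
    && \text{(probability-flow ODE)}\\
  \log p_\theta(\mathbf{x}(0)\mid c) &= \log p_T(\mathbf{x}(T))
    + \int_0^T \nabla\cdot\tilde{f}_\theta(\mathbf{x}(t),t,c)\,\mathrm{d}t
    && \text{(exact likelihood)}
\end{align*}

In practice the divergence is estimated with the Hutchinson-Skilling trace estimator and the integral is evaluated with an off-the-shelf ODE solver. Note that the conditioning c enters the likelihood only through the learned score, which is precisely the sensitivity the paper probes.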

Cite this Paper


BibTeX
@InProceedings{pmlr-v255-cross24a,
  title     = {What happens to diffusion model likelihood when your model is conditional?},
  author    = {Cross, Mattias and Ragni, Anton},
  booktitle = {Proceedings of the 1st ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications"},
  pages     = {1--14},
  year      = {2024},
  editor    = {Coelho, Cecília and Zimmering, Bernd and Costa, M. Fernanda P. and Ferrás, Luís L. and Niggemann, Oliver},
  volume    = {255},
  series    = {Proceedings of Machine Learning Research},
  month     = {20 Oct},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v255/main/assets/cross24a/cross24a.pdf},
  url       = {https://proceedings.mlr.press/v255/cross24a.html}
}
Endnote
%0 Conference Paper
%T What happens to diffusion model likelihood when your model is conditional?
%A Mattias Cross
%A Anton Ragni
%B Proceedings of the 1st ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications"
%C Proceedings of Machine Learning Research
%D 2024
%E Cecília Coelho
%E Bernd Zimmering
%E M. Fernanda P. Costa
%E Luís L. Ferrás
%E Oliver Niggemann
%F pmlr-v255-cross24a
%I PMLR
%P 1--14
%U https://proceedings.mlr.press/v255/cross24a.html
%V 255
APA
Cross, M. & Ragni, A. (2024). What happens to diffusion model likelihood when your model is conditional? Proceedings of the 1st ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications", in Proceedings of Machine Learning Research 255:1-14. Available from https://proceedings.mlr.press/v255/cross24a.html.
