Interpreting and Improving Diffusion Models from an Optimization Perspective

Frank Permenter, Chenyang Yuan
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:40461-40483, 2024.

Abstract

Denoising is intuitively related to projection. Indeed, under the manifold hypothesis, adding random noise is approximately equivalent to orthogonal perturbation. Hence, learning to denoise is approximately learning to project. In this paper, we use this observation to interpret denoising diffusion models as approximate gradient descent applied to the Euclidean distance function. We then provide a straightforward convergence analysis of the DDIM sampler under simple assumptions on the projection error of the denoiser. Finally, we propose a new gradient-estimation sampler, generalizing DDIM using insights from our theoretical results. In as few as 5-10 function evaluations, our sampler achieves state-of-the-art FID scores on pretrained CIFAR-10 and CelebA models and can generate high-quality samples on latent diffusion models.
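
To make the optimization interpretation concrete, below is a minimal PyTorch sketch of one deterministic DDIM step, not the authors' released code. The denoiser's clean-image estimate x0_hat plays the role of an approximate projection onto the data manifold, and the update moves x_t toward it, which is the approximate gradient-descent step on the Euclidean distance function described in the abstract. The ddim_step name, the eps_model interface, and the toy usage values are illustrative assumptions.

import torch

def ddim_step(eps_model, x_t, alpha_bar_t, alpha_bar_prev, t):
    """One deterministic (eta = 0) DDIM update, x_t -> x_{t-1}."""
    eps = eps_model(x_t, t)  # predicted noise at step t
    # Denoiser's estimate of the clean sample; under the manifold hypothesis
    # this acts as an approximate projection of x_t onto the data manifold.
    x0_hat = (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()
    # Step toward x0_hat: an approximate gradient-descent step on the
    # Euclidean distance function, with the noise prediction standing in
    # for the (scaled) gradient.
    return alpha_bar_prev.sqrt() * x0_hat + (1 - alpha_bar_prev).sqrt() * eps

# Toy usage with a stand-in denoiser that predicts zero noise
# (hypothetical values; a real sampler iterates over a noise schedule):
x = torch.randn(4, 3, 32, 32)
x_prev = ddim_step(lambda x, t: torch.zeros_like(x), x,
                   torch.tensor(0.5), torch.tensor(0.7), t=10)

The paper's gradient-estimation sampler generalizes this update by forming a better estimate of the gradient across steps; that variant is not reproduced in this sketch.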

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-permenter24a,
  title     = {Interpreting and Improving Diffusion Models from an Optimization Perspective},
  author    = {Permenter, Frank and Yuan, Chenyang},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {40461--40483},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/permenter24a/permenter24a.pdf},
  url       = {https://proceedings.mlr.press/v235/permenter24a.html}
}
APA
Permenter, F. & Yuan, C. (2024). Interpreting and Improving Diffusion Models from an Optimization Perspective. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:40461-40483. Available from https://proceedings.mlr.press/v235/permenter24a.html.