Directly Denoising Diffusion Models

Dan Zhang, Jingjing Wang, Feng Luo
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:59951-59974, 2024.

Abstract

In this paper, we present Directly Denoising Diffusion Models (DDDMs): a simple and generic approach for generating realistic images with few-step sampling, while multistep sampling remains available for better quality. DDDMs require neither delicately designed samplers nor distillation from pre-trained models. DDDMs train the diffusion model conditioned on an estimated target generated during previous training iterations of the model itself. To generate images, samples generated at the previous timestep are also taken into account, guiding the generation process iteratively. We further propose Pseudo-LPIPS, a novel metric loss that is more robust to the choice of its hyperparameter. Despite its simplicity, the proposed approach achieves strong performance on benchmark datasets. Our model achieves FID scores of 2.57 and 2.33 on CIFAR-10 with one-step and two-step sampling, respectively, surpassing those obtained from GANs and distillation-based models. By extending sampling to 1000 steps, we further reduce the FID score to 1.79, in line with state-of-the-art methods in the literature. On ImageNet 64x64, our approach is a competitive contender against leading models.
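The PyTorch-style sketch below unpacks the two ideas the abstract names: training conditioned on the model's own running estimate of the clean image, and the Pseudo-LPIPS loss. It is a minimal illustration of one plausible reading of the abstract, not the authors' implementation; the model signature model(x_t, x0_est, t), the per-example estimate buffer x0_est, the loader yielding dataset indices, and the pseudo-Huber-on-LPIPS form of the loss are all assumptions.

import torch

def pseudo_lpips(lpips_net, x, y, c=0.03):
    # Assumed form of Pseudo-LPIPS: a pseudo-Huber loss applied to the
    # LPIPS distance, which would explain the claimed robustness to the
    # hyperparameter c. lpips_net can be lpips.LPIPS(net='vgg') from the
    # `lpips` package.
    d = lpips_net(x, y).flatten()
    return (torch.sqrt(d ** 2 + c ** 2) - c).mean()

def train_epoch(model, lpips_net, loader, x0_est, alphas, opt, T=1000):
    # x0_est: a persistent (num_images, C, H, W) buffer holding the previous
    # epoch's estimate of x_0 for every training image; this plays the role
    # of the abstract's "estimated target generated during previous
    # training iterations".
    for x0, idx in loader:  # loader assumed to yield (image, dataset index)
        t = torch.randint(1, T + 1, (x0.size(0),))
        eps = torch.randn_like(x0)
        a = alphas[t - 1].view(-1, 1, 1, 1)          # cumulative noise schedule
        x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps   # standard forward diffusion
        pred = model(x_t, x0_est[idx], t)            # condition on own estimate
        loss = pseudo_lpips(lpips_net, pred, x0)
        opt.zero_grad(); loss.backward(); opt.step()
        x0_est[idx] = pred.detach()                  # refresh estimate for next epoch

@torch.no_grad()
def sample(model, shape, steps, T=1000):
    # One reading of the iterative sampler: keep the same noise input and
    # repeatedly feed the model its own previous estimate.
    x_T = torch.randn(shape)
    x0_hat = torch.zeros(shape)
    t = torch.full((shape[0],), T)
    for _ in range(steps):  # steps=1 already yields an image
        x0_hat = model(x_T, x0_hat, t)
    return x0_hat

Under this reading, one-step sampling is a single pass through the loop in sample(), and additional iterations refine the estimate, consistent with the abstract's FID improving from 2.57 at one step to 1.79 at 1000 steps.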

Cite this Paper

BibTeX
@InProceedings{pmlr-v235-zhang24bl,
  title     = {Directly Denoising Diffusion Models},
  author    = {Zhang, Dan and Wang, Jingjing and Luo, Feng},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {59951--59974},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhang24bl/zhang24bl.pdf},
  url       = {https://proceedings.mlr.press/v235/zhang24bl.html}
}
Endnote
%0 Conference Paper
%T Directly Denoising Diffusion Models
%A Dan Zhang
%A Jingjing Wang
%A Feng Luo
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhang24bl
%I PMLR
%P 59951--59974
%U https://proceedings.mlr.press/v235/zhang24bl.html
%V 235
APA
Zhang, D., Wang, J., & Luo, F. (2024). Directly Denoising Diffusion Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:59951-59974. Available from https://proceedings.mlr.press/v235/zhang24bl.html.