SinDDM: A Single Image Denoising Diffusion Model

Vladimir Kulikov, Shahar Yadin, Matan Kleiner, Tomer Michaeli
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:17920-17930, 2023.

Abstract

Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.
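The coarse-to-fine sampling loop the abstract describes can be sketched as follows. This is a minimal NumPy illustration of the control flow only: `toy_denoiser`, `upsample`, the number of steps, and the re-noising constant are hypothetical placeholders, not the paper's actual network, schedule, or noise levels.

```python
import numpy as np

def upsample(img, shape):
    # Nearest-neighbor upsampling to the next scale's resolution (illustrative only).
    ys = (np.arange(shape[0]) * img.shape[0] / shape[0]).astype(int)
    xs = (np.arange(shape[1]) * img.shape[1] / shape[1]).astype(int)
    return img[np.ix_(ys, xs)]

def toy_denoiser(x, t, scale):
    # Stand-in for the fully-convolutional denoiser, which the paper conditions
    # on both the noise level t and the scale index.
    return x * (1.0 - t)

def sample(shapes, steps=10, seed=None):
    """Coarse-to-fine sampling: start from pure noise at the coarsest scale,
    run a reverse diffusion loop per scale, then upsample to the next scale."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shapes[0])
    for scale, shape in enumerate(shapes):
        if scale > 0:
            # Upsample the previous scale's output and re-noise before refining.
            x = upsample(x, shape) + 0.1 * rng.standard_normal(shape)
        for step in range(steps, 0, -1):
            t = step / steps  # noise level decreases toward 0
            x = toy_denoiser(x, t, scale)
    return x

out = sample([(16, 16), (32, 32), (64, 64)])
print(out.shape)  # (64, 64)
```

Because each scale's loop is fully convolutional, the output resolution is set only by the shapes passed in, which is what allows samples of arbitrary dimensions.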

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-kulikov23a,
  title     = {{S}in{DDM}: A Single Image Denoising Diffusion Model},
  author    = {Kulikov, Vladimir and Yadin, Shahar and Kleiner, Matan and Michaeli, Tomer},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {17920--17930},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/kulikov23a/kulikov23a.pdf},
  url       = {https://proceedings.mlr.press/v202/kulikov23a.html},
  abstract  = {Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.}
}
Endnote
%0 Conference Paper
%T SinDDM: A Single Image Denoising Diffusion Model
%A Vladimir Kulikov
%A Shahar Yadin
%A Matan Kleiner
%A Tomer Michaeli
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-kulikov23a
%I PMLR
%P 17920--17930
%U https://proceedings.mlr.press/v202/kulikov23a.html
%V 202
%X Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.
APA
Kulikov, V., Yadin, S., Kleiner, M. & Michaeli, T. (2023). SinDDM: A Single Image Denoising Diffusion Model. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:17920-17930. Available from https://proceedings.mlr.press/v202/kulikov23a.html.