The Emergence of Reproducibility and Consistency in Diffusion Models

Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:60558-60590, 2024.

Abstract

In this work, we investigate an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We confirm this phenomenon through comprehensive experiments, implying that different diffusion models consistently converge to the same data distribution and score function regardless of the diffusion model framework, model architecture, or training procedure. More strikingly, our further investigation implies that diffusion models learn distinct distributions influenced by the training data size. This is evident in two distinct training regimes: (i) the "memorization regime," where the diffusion model overfits to the training data distribution, and (ii) the "generalization regime," where the model learns the underlying data distribution. Our study also finds that this valuable property generalizes to many variants of diffusion models, including those for conditional generation and solving inverse problems. Lastly, we discuss how our findings connect to existing research and highlight the practical implications of our discoveries.
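
The following is a minimal sketch (not from the paper) of how the reproducibility phenomenon described in the abstract could be probed: two independently trained diffusion models are driven from the same Gaussian noise with the same deterministic sampler, and their outputs are compared. Here `model_a`, `model_b`, and `ddim_sample` are hypothetical placeholders for two pretrained denoisers and a deterministic (e.g., DDIM-style, eta=0) sampling routine.

import torch

def reproducibility_gap(model_a, model_b, ddim_sample,
                        shape=(16, 3, 32, 32), steps=50, seed=0):
    """Mean per-sample L2 distance between the outputs of two diffusion
    models started from identical noise and sampled deterministically."""
    g = torch.Generator().manual_seed(seed)
    x_T = torch.randn(shape, generator=g)  # shared initial noise for both models
    with torch.no_grad():
        out_a = ddim_sample(model_a, x_T.clone(), num_steps=steps)  # deterministic trajectory A
        out_b = ddim_sample(model_b, x_T.clone(), num_steps=steps)  # deterministic trajectory B
    # Small distances across differently trained or architected models are
    # the signature of the "consistent model reproducibility" studied in the paper.
    return (out_a - out_b).flatten(1).norm(dim=1).mean().item()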

Cite this Paper

BibTeX
@InProceedings{pmlr-v235-zhang24cn,
  title = {The Emergence of Reproducibility and Consistency in Diffusion Models},
  author = {Zhang, Huijie and Zhou, Jinfan and Lu, Yifu and Guo, Minzhe and Wang, Peng and Shen, Liyue and Qu, Qing},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {60558--60590},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhang24cn/zhang24cn.pdf},
  url = {https://proceedings.mlr.press/v235/zhang24cn.html},
  abstract = {In this work, we investigate an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We confirm this phenomenon through comprehensive experiments, implying that different diffusion models consistently converge to the same data distribution and score function regardless of the diffusion model framework, model architecture, or training procedure. More strikingly, our further investigation implies that diffusion models learn distinct distributions influenced by the training data size. This is evident in two distinct training regimes: (i) the "memorization regime," where the diffusion model overfits to the training data distribution, and (ii) the "generalization regime," where the model learns the underlying data distribution. Our study also finds that this valuable property generalizes to many variants of diffusion models, including those for conditional generation and solving inverse problems. Lastly, we discuss how our findings connect to existing research and highlight the practical implications of our discoveries.}
}
Endnote
%0 Conference Paper
%T The Emergence of Reproducibility and Consistency in Diffusion Models
%A Huijie Zhang
%A Jinfan Zhou
%A Yifu Lu
%A Minzhe Guo
%A Peng Wang
%A Liyue Shen
%A Qing Qu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhang24cn
%I PMLR
%P 60558--60590
%U https://proceedings.mlr.press/v235/zhang24cn.html
%V 235
%X In this work, we investigate an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We confirm this phenomenon through comprehensive experiments, implying that different diffusion models consistently converge to the same data distribution and score function regardless of the diffusion model framework, model architecture, or training procedure. More strikingly, our further investigation implies that diffusion models learn distinct distributions influenced by the training data size. This is evident in two distinct training regimes: (i) the "memorization regime," where the diffusion model overfits to the training data distribution, and (ii) the "generalization regime," where the model learns the underlying data distribution. Our study also finds that this valuable property generalizes to many variants of diffusion models, including those for conditional generation and solving inverse problems. Lastly, we discuss how our findings connect to existing research and highlight the practical implications of our discoveries.
APA
Zhang, H., Zhou, J., Lu, Y., Guo, M., Wang, P., Shen, L. & Qu, Q. (2024). The Emergence of Reproducibility and Consistency in Diffusion Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:60558-60590. Available from https://proceedings.mlr.press/v235/zhang24cn.html.
