Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Minshuo Chen; Kaixuan Huang; Tuo Zhao; Mengdi Wang

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:4672-4712, 2023.

Abstract

Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Further, the generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution. The convergence rate depends on subspace dimension, implying that diffusion models can circumvent the curse of data ambient dimensionality.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-chen23o,
  title     = {Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data},
  author    = {Chen, Minshuo and Huang, Kaixuan and Zhao, Tuo and Wang, Mengdi},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {4672--4712},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/chen23o/chen23o.pdf},
  url       = {https://proceedings.mlr.press/v202/chen23o.html},
  abstract  = {Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Further, the generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution. The convergence rate depends on subspace dimension, implying that diffusion models can circumvent the curse of data ambient dimensionality.}
}

Endnote

%0 Conference Paper
%T Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
%A Minshuo Chen
%A Kaixuan Huang
%A Tuo Zhao
%A Mengdi Wang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-chen23o
%I PMLR
%P 4672--4712
%U https://proceedings.mlr.press/v202/chen23o.html
%V 202
%X Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Further, the generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution. The convergence rate depends on subspace dimension, implying that diffusion models can circumvent the curse of data ambient dimensionality.

APA


Chen, M., Huang, K., Zhao, T. & Wang, M. (2023). Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:4672-4712. Available from https://proceedings.mlr.press/v202/chen23o.html.
