To smooth a cloud or to pin it down: Expressiveness guarantees and insights on score matching in denoising diffusion models
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:3094-3120, 2024.
Abstract
Denoising diffusion models are a class of generative models that have recently achieved state-of-the-art results across many domains. Gradual noise is added to the data using a diffusion process, which transforms the data distribution into a Gaussian. Samples from the generative model are then obtained by simulating an approximation of the time reversal of this diffusion, initialized with Gaussian samples. Recent research has explored the sampling error incurred by diffusion models under the assumption that the neural approximation of the score attains an absolute error $\epsilon$. To the best of our knowledge, no work formally quantifies the error of such a neural approximation to the score. In this paper, we close this gap and present quantitative error bounds for approximating the score of denoising diffusion models with neural networks, leveraging ideas from stochastic control. Finally, through simulation, we explore some of the insights arising from our results, confirming that diffusion models based on the Ornstein-Uhlenbeck (OU) process require fewer parameters to better approximate the score than those based on the Föllmer drift / pinned Brownian motion.
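To make the setup the abstract describes concrete, here is a minimal sketch of the two SDEs involved, written in the standard notation for the OU-based case; the unit drift and diffusion coefficient $\sqrt{2}$ are the conventional normalization assumed here, not taken from the paper. The forward process noises the data into an approximate Gaussian, and its time reversal transports Gaussian samples back, provided the score $\nabla \log p_t$ of the marginal density is available:

$$\mathrm{d}X_t = -X_t\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t, \qquad X_0 \sim p_{\mathrm{data}},$$

$$\mathrm{d}Y_t = \bigl[\,Y_t + 2\,\nabla \log p_{T-t}(Y_t)\,\bigr]\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}\bar{W}_t, \qquad Y_0 \sim \mathcal{N}(0, I).$$

In practice the exact score in the reverse-time equation is replaced by a neural network approximation; quantifying the error of that approximation is what the bounds in this paper address.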