Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Auto Speculation

Hengyuan Hu, Aniket Das, Dorsa Sadigh, Nima Anari
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:24270-24289, 2025.

Abstract

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful tools for generative modeling. However, their sequential computation requirements lead to significant inference-time bottlenecks. In this work, we utilize the connection between DDPMs and Stochastic Localization to prove that, under an appropriate reparametrization, the increments of DDPM satisfy an exchangeability property. This general insight enables near-black-box adaptation of various performance optimization techniques from autoregressive models to the diffusion setting. To demonstrate this, we introduce Autospeculative Decoding (ASD), an extension of the widely used speculative decoding algorithm to DDPMs that does not require any auxiliary draft models. Our theoretical analysis shows that ASD achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$ step sequential DDPM. We also demonstrate that a practical implementation of autospeculative decoding accelerates DDPM inference significantly in various domains.
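
As a rough reading of the stated bound (an illustration, not a result taken from the paper): a $\tilde{O}(K^{\frac{1}{3}})$ speedup over $K$ sequential steps corresponds to a parallel runtime of order $K^{\frac{2}{3}}$, up to the logarithmic factors hidden by the $\tilde{O}$ notation. The value $K = 1000$ below is a hypothetical choice made only for the arithmetic.

$$
\frac{K}{K^{1/3}} = K^{2/3}, \qquad K = 1000 \;\Rightarrow\; K^{1/3} = 10, \quad K^{2/3} = 100 .
$$

Constants and logarithmic factors are ignored here, so the numbers indicate scaling only.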

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-hu25d,
  title = {Diffusion Models are Secretly Exchangeable: Parallelizing {DDPM}s via Auto Speculation},
  author = {Hu, Hengyuan and Das, Aniket and Sadigh, Dorsa and Anari, Nima},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {24270--24289},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/hu25d/hu25d.pdf},
  url = {https://proceedings.mlr.press/v267/hu25d.html},
  abstract = {Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful tools for generative modeling. However, their sequential computation requirements lead to significant inference-time bottlenecks. In this work, we utilize the connection between DDPMs and Stochastic Localization to prove that, under an appropriate reparametrization, the increments of DDPM satisfy an exchangeability property. This general insight enables near-black-box adaptation of various performance optimization techniques from autoregressive models to the diffusion setting. To demonstrate this, we introduce Autospeculative Decoding (ASD), an extension of the widely used speculative decoding algorithm to DDPMs that does not require any auxiliary draft models. Our theoretical analysis shows that ASD achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$ step sequential DDPM. We also demonstrate that a practical implementation of autospeculative decoding accelerates DDPM inference significantly in various domains.}
}
Endnote
%0 Conference Paper
%T Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Auto Speculation
%A Hengyuan Hu
%A Aniket Das
%A Dorsa Sadigh
%A Nima Anari
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-hu25d
%I PMLR
%P 24270--24289
%U https://proceedings.mlr.press/v267/hu25d.html
%V 267
%X Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful tools for generative modeling. However, their sequential computation requirements lead to significant inference-time bottlenecks. In this work, we utilize the connection between DDPMs and Stochastic Localization to prove that, under an appropriate reparametrization, the increments of DDPM satisfy an exchangeability property. This general insight enables near-black-box adaptation of various performance optimization techniques from autoregressive models to the diffusion setting. To demonstrate this, we introduce Autospeculative Decoding (ASD), an extension of the widely used speculative decoding algorithm to DDPMs that does not require any auxiliary draft models. Our theoretical analysis shows that ASD achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$ step sequential DDPM. We also demonstrate that a practical implementation of autospeculative decoding accelerates DDPM inference significantly in various domains.
APA
Hu, H., Das, A., Sadigh, D., & Anari, N. (2025). Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Auto Speculation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:24270-24289. Available from https://proceedings.mlr.press/v267/hu25d.html.
