Accelerated Diffusion Models via Speculative Sampling

Valentin De Bortoli, Alexandre Galashov, Arthur Gretton, Arnaud Doucet
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:12590-12631, 2025.

Abstract

Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and then accepting or rejecting them based on the target model’s distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains. In this context, the target model is a high-quality but computationally expensive diffusion model. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model and is applicable out-of-the-box to any diffusion model. We demonstrate significant generation speedup on various diffusion models, halving the number of function evaluations while generating exact samples from the target model. Finally, we also show how this procedure can be used to accelerate Langevin diffusions to sample unnormalized distributions.
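The accept/reject rule behind speculative sampling can be illustrated with a minimal toy, independent of the paper's diffusion-specific constructions: a cheap draft density q proposes a sample, the target p accepts it with probability min(1, p(x)/q(x)), and a rejection falls back to the residual distribution proportional to max(p − q, 0), so the output is distributed exactly according to p. The Gaussian densities and helper names below are illustrative assumptions, not the authors' algorithm.

```python
# Illustrative sketch of generic speculative rejection for continuous
# densities (toy example; not the paper's diffusion algorithm).
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def speculative_sample(p, q, draw_q, draw_p, rng):
    x = draw_q(rng)                          # cheap draft proposal
    if rng.random() < min(1.0, p(x) / q(x)):
        return x                             # accepted draft sample
    # Residual step: draw y ~ p, keep with prob (p - q)_+ / p,
    # which yields the residual distribution proportional to (p - q)_+.
    while True:
        y = draw_p(rng)
        if rng.random() < max(0.0, 1.0 - q(y) / p(y)):
            return y

rng = random.Random(0)
p = lambda x: normal_pdf(x, 0.0, 1.0)        # "expensive" target: N(0, 1)
q = lambda x: normal_pdf(x, 0.5, 1.3)        # cheap, slightly-off draft
samples = [
    speculative_sample(
        p, q,
        lambda r: r.gauss(0.5, 1.3),         # sampler for the draft q
        lambda r: r.gauss(0.0, 1.0),         # sampler for the target p
        rng,
    )
    for _ in range(20000)
]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Since min(p, q) + (p − q)_+ = p pointwise, the two branches combine to an exact sample from the target; the empirical mean and variance of `samples` should match N(0, 1). The residual step here resamples from p directly for simplicity, which a practical scheme would avoid.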

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-de-bortoli25a,
  title     = {Accelerated Diffusion Models via Speculative Sampling},
  author    = {De Bortoli, Valentin and Galashov, Alexandre and Gretton, Arthur and Doucet, Arnaud},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {12590--12631},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/de-bortoli25a/de-bortoli25a.pdf},
  url       = {https://proceedings.mlr.press/v267/de-bortoli25a.html},
  abstract  = {Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and then accepting or rejecting them based on the target model’s distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains. In this context, the target model is a high-quality but computationally expensive diffusion model. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model and is applicable out-of-the-box to any diffusion model. We demonstrate significant generation speedup on various diffusion models, halving the number of function evaluations while generating exact samples from the target model. Finally, we also show how this procedure can be used to accelerate Langevin diffusions to sample unnormalized distributions.}
}
Endnote
%0 Conference Paper
%T Accelerated Diffusion Models via Speculative Sampling
%A Valentin De Bortoli
%A Alexandre Galashov
%A Arthur Gretton
%A Arnaud Doucet
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-de-bortoli25a
%I PMLR
%P 12590--12631
%U https://proceedings.mlr.press/v267/de-bortoli25a.html
%V 267
%X Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and then accepting or rejecting them based on the target model’s distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains. In this context, the target model is a high-quality but computationally expensive diffusion model. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model and is applicable out-of-the-box to any diffusion model. We demonstrate significant generation speedup on various diffusion models, halving the number of function evaluations while generating exact samples from the target model. Finally, we also show how this procedure can be used to accelerate Langevin diffusions to sample unnormalized distributions.
APA
De Bortoli, V., Galashov, A., Gretton, A. & Doucet, A. (2025). Accelerated Diffusion Models via Speculative Sampling. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:12590-12631. Available from https://proceedings.mlr.press/v267/de-bortoli25a.html.