Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization

Weizhi Gao, Zhichao Hou, Junqi Yin, Feiyi Wang, Linyu Peng, Xiaorui Liu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:18337-18362, 2025.

Abstract

Diffusion models have emerged as powerful generative models, but their high computational cost in iterative sampling remains a significant bottleneck. In this work, we present an in-depth and insightful study of state-of-the-art acceleration techniques for diffusion models, including caching and quantization, and reveal their limitations in computation error and generation quality. To break these limits, this work introduces Modulated Diffusion (MoDiff), an innovative, rigorous, and principled framework that accelerates generative modeling through modulated quantization and error compensation. MoDiff not only inherits the advantages of existing caching and quantization methods but also serves as a general framework to accelerate all diffusion models. The advantages of MoDiff are supported by solid theoretical insight and analysis. In addition, extensive experiments on CIFAR-10 and LSUN demonstrate that MoDiff significantly reduces activation quantization from 8 bits to 3 bits without performance degradation in post-training quantization (PTQ). Our code implementation is available at https://github.com/WeizhiGao/MoDiff.
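Note: the abstract describes MoDiff only at a high level. The sketch below is one plausible reading of "modulated quantization with error compensation": quantize the step-to-step change in an activation rather than the full activation, and carry the quantization error forward so it is compensated at the next denoising step. It is written in PyTorch purely for illustration; the class and function names (ModulatedQuantizer, quantize_uniform), the bit-widths, and the tensor shapes are assumptions of this sketch, not the authors' implementation. The actual code is in the linked repository.

    import torch

    def quantize_uniform(x, bits=3):
        # Symmetric uniform quantizer to the given bit-width (illustrative helper).
        qmax = 2 ** (bits - 1) - 1
        scale = x.abs().max().clamp(min=1e-8) / qmax
        return torch.round(x / scale).clamp(-qmax, qmax) * scale

    class ModulatedQuantizer:
        # Hypothetical sketch: quantize the residual between consecutive denoising
        # steps and feed the quantization error back into the next step.
        def __init__(self, bits=3):
            self.bits = bits
            self.reference = None   # cached (quantized) activation from the previous step
            self.error = None       # quantization error carried over to the next step

        def __call__(self, activation):
            if self.reference is None:
                # First denoising step: quantize the full activation at higher precision.
                out = quantize_uniform(activation, bits=8)
                self.reference = out
                self.error = activation - out
                return out
            # Later steps: the residual between steps is small, so low-bit
            # quantization of (residual + carried error) loses little information.
            residual = activation - self.reference + self.error
            q_residual = quantize_uniform(residual, bits=self.bits)
            self.error = residual - q_residual   # compensated at the next step
            out = self.reference + q_residual
            self.reference = out
            return out

    # Toy usage: wrap an activation inside the denoiser's forward pass at each timestep.
    mq = ModulatedQuantizer(bits=3)
    for t in range(50):                    # stand-in for a sampling loop
        act = torch.randn(1, 64, 8, 8)     # placeholder activation tensor
        act_q = mq(act)

The design intuition under this assumption is that consecutive diffusion steps produce highly correlated activations, so their difference has a much smaller dynamic range than the activations themselves and tolerates aggressive (e.g., 3-bit) quantization, while the error-feedback term prevents quantization errors from accumulating across steps.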

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-gao25d,
  title     = {Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization},
  author    = {Gao, Weizhi and Hou, Zhichao and Yin, Junqi and Wang, Feiyi and Peng, Linyu and Liu, Xiaorui},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {18337--18362},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/gao25d/gao25d.pdf},
  url       = {https://proceedings.mlr.press/v267/gao25d.html},
  abstract  = {Diffusion models have emerged as powerful generative models, but their high computational cost in iterative sampling remains a significant bottleneck. In this work, we present an in-depth and insightful study of state-of-the-art acceleration techniques for diffusion models, including caching and quantization, and reveal their limitations in computation error and generation quality. To break these limits, this work introduces Modulated Diffusion (MoDiff), an innovative, rigorous, and principled framework that accelerates generative modeling through modulated quantization and error compensation. MoDiff not only inherits the advantages of existing caching and quantization methods but also serves as a general framework to accelerate all diffusion models. The advantages of MoDiff are supported by solid theoretical insight and analysis. In addition, extensive experiments on CIFAR-10 and LSUN demonstrate that MoDiff significantly reduces activation quantization from 8 bits to 3 bits without performance degradation in post-training quantization (PTQ). Our code implementation is available at https://github.com/WeizhiGao/MoDiff.}
}
Endnote
%0 Conference Paper
%T Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization
%A Weizhi Gao
%A Zhichao Hou
%A Junqi Yin
%A Feiyi Wang
%A Linyu Peng
%A Xiaorui Liu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-gao25d
%I PMLR
%P 18337--18362
%U https://proceedings.mlr.press/v267/gao25d.html
%V 267
%X Diffusion models have emerged as powerful generative models, but their high computational cost in iterative sampling remains a significant bottleneck. In this work, we present an in-depth and insightful study of state-of-the-art acceleration techniques for diffusion models, including caching and quantization, and reveal their limitations in computation error and generation quality. To break these limits, this work introduces Modulated Diffusion (MoDiff), an innovative, rigorous, and principled framework that accelerates generative modeling through modulated quantization and error compensation. MoDiff not only inherits the advantages of existing caching and quantization methods but also serves as a general framework to accelerate all diffusion models. The advantages of MoDiff are supported by solid theoretical insight and analysis. In addition, extensive experiments on CIFAR-10 and LSUN demonstrate that MoDiff significantly reduces activation quantization from 8 bits to 3 bits without performance degradation in post-training quantization (PTQ). Our code implementation is available at https://github.com/WeizhiGao/MoDiff.
APA
Gao, W., Hou, Z., Yin, J., Wang, F., Peng, L. & Liu, X. (2025). Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:18337-18362. Available from https://proceedings.mlr.press/v267/gao25d.html.
