Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation

Jiaming Song, Qinsheng Zhang, Hongxu Yin, Morteza Mardani, Ming-Yu Liu, Jan Kautz, Yongxin Chen, Arash Vahdat
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:32483-32498, 2023.

Abstract

We consider guiding denoising diffusion models with general differentiable loss functions in a plug-and-play fashion, enabling controllable generation without additional training. This paradigm, termed Loss-Guided Diffusion (LGD), can easily be integrated into all diffusion models and leverage various efficient samplers. Despite the benefits, the resulting guidance term is, unfortunately, an intractable integral and needs to be approximated. Existing methods compute the guidance term based on a point estimate. However, we show that such approaches have significant errors over the scale of the approximations. To address this issue, we propose a Monte Carlo method that uses multiple samples from a suitable distribution to reduce bias. Our method is effective in various synthetic and real-world settings, including image super-resolution, text or label-conditional image generation, and controllable motion synthesis. Notably, we show how our method can be applied to control a pretrained motion diffusion model to follow certain paths and avoid obstacles that are proven challenging to prior methods.
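The core idea in the abstract — replacing the point-estimate guidance term with a Monte Carlo average over multiple samples drawn around the denoiser's posterior-mean estimate — can be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the names `denoiser`, `loss_fn`, and the simple isotropic-Gaussian proposal with scale `r` are assumptions made for the example.

```python
import torch

def mc_loss_guidance(x_t, denoiser, loss_fn, sigma_t, n_samples=4, r=0.1):
    """Hedged sketch of a Monte Carlo loss-guidance gradient.

    A point-estimate method would evaluate loss_fn once at x0_hat, the
    denoiser's estimate of the clean sample. Here we instead average the
    loss over several perturbed samples around x0_hat (a crude stand-in
    for "a suitable distribution" in the abstract) to reduce the bias of
    the approximated guidance term.
    """
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, sigma_t)  # estimate of x_0 given the noisy x_t
    losses = []
    for _ in range(n_samples):
        # Illustrative Gaussian proposal centered at the point estimate.
        x0_sample = x0_hat + r * torch.randn_like(x0_hat)
        losses.append(loss_fn(x0_sample))
    guidance = torch.stack(losses).mean()
    # Gradient w.r.t. x_t; a sampler would subtract a scaled version of
    # this from the score/noise prediction at each denoising step.
    (grad,) = torch.autograd.grad(guidance, x_t)
    return grad
```

Because the guidance is computed purely from a differentiable `loss_fn` and a frozen `denoiser`, it can be bolted onto an existing sampler without any retraining, which is the plug-and-play property the abstract emphasizes.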

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-song23k,
  title     = {Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation},
  author    = {Song, Jiaming and Zhang, Qinsheng and Yin, Hongxu and Mardani, Morteza and Liu, Ming-Yu and Kautz, Jan and Chen, Yongxin and Vahdat, Arash},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {32483--32498},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/song23k/song23k.pdf},
  url       = {https://proceedings.mlr.press/v202/song23k.html},
  abstract  = {We consider guiding denoising diffusion models with general differentiable loss functions in a plug-and-play fashion, enabling controllable generation without additional training. This paradigm, termed Loss-Guided Diffusion (LGD), can easily be integrated into all diffusion models and leverage various efficient samplers. Despite the benefits, the resulting guidance term is, unfortunately, an intractable integral and needs to be approximated. Existing methods compute the guidance term based on a point estimate. However, we show that such approaches have significant errors over the scale of the approximations. To address this issue, we propose a Monte Carlo method that uses multiple samples from a suitable distribution to reduce bias. Our method is effective in various synthetic and real-world settings, including image super-resolution, text or label-conditional image generation, and controllable motion synthesis. Notably, we show how our method can be applied to control a pretrained motion diffusion model to follow certain paths and avoid obstacles that are proven challenging to prior methods.}
}
Endnote
%0 Conference Paper
%T Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation
%A Jiaming Song
%A Qinsheng Zhang
%A Hongxu Yin
%A Morteza Mardani
%A Ming-Yu Liu
%A Jan Kautz
%A Yongxin Chen
%A Arash Vahdat
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-song23k
%I PMLR
%P 32483--32498
%U https://proceedings.mlr.press/v202/song23k.html
%V 202
%X We consider guiding denoising diffusion models with general differentiable loss functions in a plug-and-play fashion, enabling controllable generation without additional training. This paradigm, termed Loss-Guided Diffusion (LGD), can easily be integrated into all diffusion models and leverage various efficient samplers. Despite the benefits, the resulting guidance term is, unfortunately, an intractable integral and needs to be approximated. Existing methods compute the guidance term based on a point estimate. However, we show that such approaches have significant errors over the scale of the approximations. To address this issue, we propose a Monte Carlo method that uses multiple samples from a suitable distribution to reduce bias. Our method is effective in various synthetic and real-world settings, including image super-resolution, text or label-conditional image generation, and controllable motion synthesis. Notably, we show how our method can be applied to control a pretrained motion diffusion model to follow certain paths and avoid obstacles that are proven challenging to prior methods.
APA
Song, J., Zhang, Q., Yin, H., Mardani, M., Liu, M., Kautz, J., Chen, Y. & Vahdat, A. (2023). Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:32483-32498. Available from https://proceedings.mlr.press/v202/song23k.html.