A First-order Generative Bilevel Optimization Framework for Diffusion Models

Quan Xiao, Hui Yuan, A F M Saif, Gaowen Liu, Ramana Rao Kompella, Mengdi Wang, Tianyi Chen
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:68535-68558, 2025.

Abstract

Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensional probability space and prohibitive sampling costs. We formalize this challenge as a generative bilevel optimization problem and address two key scenarios: (1) fine-tuning pre-trained models via an inference-only lower-level solver paired with a sample-efficient gradient estimator for the upper level, and (2) training a diffusion model from scratch with noise schedule optimization by reparameterizing the lower-level problem and designing a computationally tractable gradient estimator. Our first-order bilevel framework overcomes the incompatibility of conventional bilevel methods with diffusion processes, offering theoretical grounding and computational practicality. Experiments demonstrate that our method outperforms existing fine-tuning and hyperparameter search baselines.
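
As a rough schematic of the generative bilevel structure described above (illustrative symbols only; the paper's own notation may differ), the problem can be written as

    \min_{\lambda}\; F\bigl(\lambda,\, p^{*}_{\lambda}\bigr)
    \quad \text{s.t.} \quad
    p^{*}_{\lambda} \in \operatorname*{arg\,min}_{p \in \mathcal{P}}\; G(\lambda, p),

where the upper-level variable \lambda collects the hyperparameters (e.g., those of a fine-tuning task or a noise schedule), the lower-level variable p ranges over a space of probability distributions \mathcal{P} realized by the diffusion model, and both objectives F and G are accessible only through generated samples. The infinite-dimensional lower level and the cost of sampling are what make traditional bilevel methods inapplicable here, motivating the first-order gradient estimators used in the two scenarios above.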

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-xiao25i,
  title     = {A First-order Generative Bilevel Optimization Framework for Diffusion Models},
  author    = {Xiao, Quan and Yuan, Hui and Saif, A F M and Liu, Gaowen and Kompella, Ramana Rao and Wang, Mengdi and Chen, Tianyi},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {68535--68558},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/xiao25i/xiao25i.pdf},
  url       = {https://proceedings.mlr.press/v267/xiao25i.html},
  abstract  = {Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensional probability space and prohibitive sampling costs. We formalize this challenge as a generative bilevel optimization problem and address two key scenarios: (1) fine-tuning pre-trained models via an inference-only lower-level solver paired with a sample-efficient gradient estimator for the upper level, and (2) training diffusion model from scratch with noise schedule optimization by reparameterizing the lower-level problem and designing a computationally tractable gradient estimator. Our first-order bilevel framework overcomes the incompatibility of conventional bilevel methods with diffusion processes, offering theoretical grounding and computational practicality. Experiments demonstrate that our method outperforms existing fine-tuning and hyperparameter search baselines.}
}
Endnote
%0 Conference Paper
%T A First-order Generative Bilevel Optimization Framework for Diffusion Models
%A Quan Xiao
%A Hui Yuan
%A A F M Saif
%A Gaowen Liu
%A Ramana Rao Kompella
%A Mengdi Wang
%A Tianyi Chen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-xiao25i
%I PMLR
%P 68535--68558
%U https://proceedings.mlr.press/v267/xiao25i.html
%V 267
%X Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensional probability space and prohibitive sampling costs. We formalize this challenge as a generative bilevel optimization problem and address two key scenarios: (1) fine-tuning pre-trained models via an inference-only lower-level solver paired with a sample-efficient gradient estimator for the upper level, and (2) training diffusion model from scratch with noise schedule optimization by reparameterizing the lower-level problem and designing a computationally tractable gradient estimator. Our first-order bilevel framework overcomes the incompatibility of conventional bilevel methods with diffusion processes, offering theoretical grounding and computational practicality. Experiments demonstrate that our method outperforms existing fine-tuning and hyperparameter search baselines.
APA
Xiao, Q., Yuan, H., Saif, A.F.M., Liu, G., Kompella, R.R., Wang, M. & Chen, T. (2025). A First-order Generative Bilevel Optimization Framework for Diffusion Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:68535-68558. Available from https://proceedings.mlr.press/v267/xiao25i.html.