FlexControl: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation

Zheng Fang, Lichuan Xiang, Xu Cai, Kaicheng Zhou, Hongkai Wen
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:16058-16077, 2025.

Abstract

Spatial conditioning control offers a powerful way to guide diffusion-based generative models. Yet most implementations (e.g., ControlNet) rely on ad-hoc heuristics to choose which network blocks to control, an approach that varies unpredictably across tasks. To address this gap, we propose FlexControl, a novel framework that equips all diffusion blocks with control signals during training and employs a trainable gating mechanism to dynamically select which control signals to activate at each denoising step. By introducing a computation-aware loss, we encourage a control signal to activate only when it benefits generation quality. By eliminating manual selection of control units, FlexControl improves adaptability across diverse tasks and streamlines the design pipeline, with the computation-aware loss trained end to end. Through comprehensive experiments on both UNet and DiT architectures with different control methods, we show that our method upgrades existing controllable generative models in key aspects of interest. As evidenced by both quantitative and qualitative evaluations, FlexControl preserves or enhances image fidelity while reducing computational overhead by selectively activating only the most relevant blocks for control. These results underscore the potential of a flexible, data-driven approach to controlled diffusion and open new avenues for efficient generative model design. The code will soon be available at https://github.com/Daryu-Fan/FlexControl.
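The abstract describes a differentiable router that decides, per denoising step, which blocks receive a control residual, trained jointly with a computation-aware loss. The sketch below (PyTorch) is a minimal illustration of that idea under our own assumptions, not the authors' implementation: the router design, the straight-through gating trick, the pooled-feature input, and the penalty weight `gate_penalty_weight` are all hypothetical choices used only to make the mechanism concrete.

```python
# Minimal sketch (assumption, not the authors' released code) of a differentiable
# per-block router with a computation-aware penalty, in a ControlNet-style setup
# where every diffusion block i can receive a control residual c_i. All names
# (BlockRouter, apply_control, gate_penalty_weight) are illustrative.
import torch
import torch.nn as nn


class BlockRouter(nn.Module):
    """Predicts a near-binary gate per block from a latent summary and the timestep."""

    def __init__(self, num_blocks: int, feat_dim: int, hidden: int = 128):
        super().__init__()
        self.gate_mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, num_blocks)
        )

    def forward(self, pooled_feat: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # pooled_feat: (B, feat_dim) pooled latent features; t: (B,) diffusion timestep.
        logits = self.gate_mlp(torch.cat([pooled_feat, t[:, None].float()], dim=-1))
        soft = torch.sigmoid(logits)          # differentiable relaxation in (0, 1)
        hard = (soft > 0.5).float()           # binary on/off decision per block
        # Straight-through estimator: hard values in the forward pass,
        # gradients flow through the soft sigmoid in the backward pass.
        return hard + soft - soft.detach()    # shape (B, num_blocks)


def apply_control(block_outs, control_residuals, gates):
    """Add each control residual only where its gate is active (per sample, per block)."""
    gated = []
    for i, (h, c) in enumerate(zip(block_outs, control_residuals)):
        g = gates[:, i].view(-1, *([1] * (h.dim() - 1)))  # broadcast over feature dims
        gated.append(h + g * c)
    return gated


def computation_aware_loss(diffusion_loss, gates, gate_penalty_weight=1e-2):
    # Penalise the expected fraction of activated control blocks, so gates stay on
    # only where the control signal actually helps the denoising objective.
    return diffusion_loss + gate_penalty_weight * gates.mean()
```

At inference, the hard decisions mean skipped blocks incur no control computation; the exact router inputs, relaxation, and penalty form used in FlexControl may differ from this sketch.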

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-fang25j,
  title     = {{F}lex{C}ontrol: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation},
  author    = {Fang, Zheng and Xiang, Lichuan and Cai, Xu and Zhou, Kaicheng and Wen, Hongkai},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {16058--16077},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/fang25j/fang25j.pdf},
  url       = {https://proceedings.mlr.press/v267/fang25j.html},
  abstract  = {Spatial conditioning control offers a powerful way to guide diffusion-based generative models. Yet, most implementations (e.g., ControlNet) rely on ad-hoc heuristics to choose which network blocks to control — an approach that varies unpredictably with different tasks. To address this gap, we propose FlexControl, a novel framework that equips all diffusion blocks with control signals during training and employs a trainable gating mechanism to dynamically select which control signal to activate at each denoising step. By introducing a computation-aware loss, we can encourage the control signal to activate only when it benefits the generation quality. By eliminating manual control unit selection, FlexControl enhances adaptability across diverse tasks and streamlines the design pipeline with computation-aware training loss in an end-to-end training manner. Through comprehensive experiments on both UNet and DiT architectures on different control methods, we show that our method can upgrade existing controllable generative models in certain key aspects of interest. As evidenced by both quantitative and qualitative evaluations, FlexControl preserves or enhances image fidelity while also reducing computational overhead by selectively activating the most relevant blocks to control. These results underscore the potential of a flexible, data-driven approach for controlled diffusion and open new avenues for efficient generative model design. The code will soon be available at https://github.com/Daryu-Fan/FlexControl.}
}
Endnote
%0 Conference Paper
%T FlexControl: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation
%A Zheng Fang
%A Lichuan Xiang
%A Xu Cai
%A Kaicheng Zhou
%A Hongkai Wen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-fang25j
%I PMLR
%P 16058--16077
%U https://proceedings.mlr.press/v267/fang25j.html
%V 267
%X Spatial conditioning control offers a powerful way to guide diffusion-based generative models. Yet, most implementations (e.g., ControlNet) rely on ad-hoc heuristics to choose which network blocks to control — an approach that varies unpredictably with different tasks. To address this gap, we propose FlexControl, a novel framework that equips all diffusion blocks with control signals during training and employs a trainable gating mechanism to dynamically select which control signal to activate at each denoising step. By introducing a computation-aware loss, we can encourage the control signal to activate only when it benefits the generation quality. By eliminating manual control unit selection, FlexControl enhances adaptability across diverse tasks and streamlines the design pipeline with computation-aware training loss in an end-to-end training manner. Through comprehensive experiments on both UNet and DiT architectures on different control methods, we show that our method can upgrade existing controllable generative models in certain key aspects of interest. As evidenced by both quantitative and qualitative evaluations, FlexControl preserves or enhances image fidelity while also reducing computational overhead by selectively activating the most relevant blocks to control. These results underscore the potential of a flexible, data-driven approach for controlled diffusion and open new avenues for efficient generative model design. The code will soon be available at https://github.com/Daryu-Fan/FlexControl.
APA
Fang, Z., Xiang, L., Cai, X., Zhou, K. & Wen, H. (2025). FlexControl: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:16058-16077. Available from https://proceedings.mlr.press/v267/fang25j.html.
