Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation

Yihao Yang, Wenke Huang, Guancheng Wan, Bin Yang, Mang Ye
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:70745-70755, 2025.

Abstract

Federated Parameter-Efficient Fine-Tuning aims to adapt Vision-Language Models for downstream tasks in distributed environments. However, data heterogeneity across participants hinders collaborative effectiveness, necessitating personalized adaptation to cover distinct data distributions. Current personalized methods suffer from two limitations. 1) Textual Property Loss: Existing methods make decoupled prompts collaborate at the feature level, which can undermine the textual properties of the prompts. 2) Visual Feature Diversity: The diversity of visual features makes it difficult to use raw image features directly for image-text alignment in downstream tasks. In this work, we propose Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation (FedDDA) to overcome these limitations. Specifically, we decouple prompts in a way that maximizes the efficacy of prior knowledge, which is essential for maintaining a coherent linguistic context. Furthermore, we design a visual adaptation module that reshapes the visual space to align optimally with the textual space. Extensive experiments on various image classification tasks demonstrate the effectiveness of our method in addressing data heterogeneity. The code is released at https://github.com/MoratalYang/FedDDA.
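To make the two ingredients concrete, below is a minimal, illustrative PyTorch sketch, not the authors' implementation (see the linked repository for that). It assumes a CLIP-style setup and shows one plausible reading of the abstract: a prompt split into a globally shared context and a client-private context, prepended to frozen class-name embeddings so the textual prior is preserved, plus a residual visual adapter that reshapes image features before image-text matching. All module and parameter names (DecoupledPrompt, VisualAdapter, alpha, and so on) are hypothetical.

```python
# Illustrative sketch only -- hypothetical names, random tensors stand in
# for frozen CLIP encoders. It is NOT the FedDDA implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledPrompt(nn.Module):
    """A globally aggregated context plus a client-private context,
    prepended to frozen class-name token embeddings so the textual
    prior of the class names is kept intact."""
    def __init__(self, n_ctx: int, dim: int, class_embeds: torch.Tensor):
        super().__init__()
        self.global_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)  # averaged by the server
        self.local_ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)   # stays on the client
        self.register_buffer("class_embeds", class_embeds)              # frozen prior knowledge

    def forward(self) -> torch.Tensor:
        n_cls = self.class_embeds.size(0)
        ctx = torch.cat([self.global_ctx, self.local_ctx], dim=0)       # (2*n_ctx, dim)
        ctx = ctx.unsqueeze(0).expand(n_cls, -1, -1)                    # one copy per class
        return torch.cat([ctx, self.class_embeds], dim=1)               # (n_cls, 2*n_ctx + L, dim)

class VisualAdapter(nn.Module):
    """A residual bottleneck that reshapes image features toward the
    textual space; the blend weight alpha is learned."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.alpha = nn.Parameter(torch.tensor(0.2))

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        a = torch.sigmoid(self.alpha)
        return F.normalize(a * self.net(v) + (1 - a) * v, dim=-1)

# Toy usage with random features in place of the frozen encoders.
dim, n_cls, n_ctx, seq = 512, 10, 4, 8
prompt = DecoupledPrompt(n_ctx, dim, torch.randn(n_cls, seq, dim))
adapter = VisualAdapter(dim)
text_feat = F.normalize(prompt().mean(dim=1), dim=-1)   # stand-in for the text encoder
img_feat = adapter(torch.randn(2, dim))                 # adapted image features
logits = 100.0 * img_feat @ text_feat.t()               # CLIP-style image-text similarity
print(logits.shape)                                     # torch.Size([2, 10])
```

In a federated round, one would typically send only global_ctx (and possibly the adapter) to the server for averaging while local_ctx remains on each client; this is an assumption about how such decoupling is usually deployed, not a claim about FedDDA's exact protocol.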

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-yang25k,
  title     = {Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation},
  author    = {Yang, Yihao and Huang, Wenke and Wan, Guancheng and Yang, Bin and Ye, Mang},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {70745--70755},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/yang25k/yang25k.pdf},
  url       = {https://proceedings.mlr.press/v267/yang25k.html}
}
Endnote
%0 Conference Paper
%T Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation
%A Yihao Yang
%A Wenke Huang
%A Guancheng Wan
%A Bin Yang
%A Mang Ye
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-yang25k
%I PMLR
%P 70745--70755
%U https://proceedings.mlr.press/v267/yang25k.html
%V 267
APA
Yang, Y., Huang, W., Wan, G., Yang, B. & Ye, M. (2025). Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:70745-70755. Available from https://proceedings.mlr.press/v267/yang25k.html.