Enhancing Foundation Models with Federated Domain Knowledge Infusion

Jiaqi Wang, Jingtao Li, Weiming Zhuang, Chen Chen, Lingjuan Lyu, Fenglong Ma
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:63621-63635, 2025.

Abstract

Vision foundation models (FMs) like CLIP have exhibited exceptional capabilities in visual and linguistic understanding, particularly in zero-shot inference tasks. However, these models struggle with data that deviates significantly from their training samples, necessitating fine-tuning, which is often infeasible in centralized settings due to data privacy concerns. Federated learning (FL) combined with parameter-efficient fine-tuning (PEFT) offers a potential solution, yet existing methods struggle to capture domain-specific characteristics and to generalize out of domain. We propose Federated Adapter Generalization (FedAG), a novel cross-silo federated fine-tuning approach that leverages multiple fine-grained adapters to capture domain-specific knowledge while enhancing out-of-domain generalization. Our method uses quality-aware in-domain mutual learning and attention-regularized cross-domain learning to integrate domain-specific insights effectively. Experiments with the CLIP model on three domain-shift datasets, ImageCLEF-DA, Office-Home, and DomainNet, demonstrate the superior performance of FedAG in both in-domain and out-of-domain scenarios. We envision this work as a milestone in generalizing CLIP to handle out-of-domain knowledge under the federated learning setting.
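The page gives no implementation details, but the basic recipe the abstract describes (a frozen CLIP backbone, lightweight per-domain adapters trained at each silo, and a weighted server-side merge) can be sketched concretely. Below is a minimal PyTorch sketch under stated assumptions: `BottleneckAdapter`, `aggregate_adapters`, the 512-dimensional feature width (CLIP ViT-B/32's embedding size), and the fixed merge weights are all illustrative choices, not the paper's actual FedAG procedure, which learns quality-aware and attention-regularized weights rather than fixing them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BottleneckAdapter(nn.Module):
    """A small residual bottleneck adapter, a standard PEFT building block."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.up = nn.Linear(dim // reduction, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the frozen backbone's features intact.
        return x + self.up(F.relu(self.down(x)))

def aggregate_adapters(adapters, weights):
    """Weighted average of client adapter parameters (FedAvg-style merge).

    In FedAG the per-domain weights would come from learned quality/attention
    scores; here they are simply a fixed, normalized list (an assumption).
    """
    weights = torch.as_tensor(weights, dtype=torch.float32)
    weights = weights / weights.sum()
    merged_state = {}
    for name in adapters[0].state_dict():
        merged_state[name] = sum(
            w * a.state_dict()[name] for w, a in zip(weights, adapters)
        )
    return merged_state

# Toy round: three domain clients, each with its own adapter over
# 512-dim features from a frozen CLIP image encoder.
clients = [BottleneckAdapter(dim=512) for _ in range(3)]
feats = torch.randn(8, 512)            # stand-in for frozen CLIP features
merged = BottleneckAdapter(dim=512)
merged.load_state_dict(aggregate_adapters(clients, [0.5, 0.3, 0.2]))
out = merged(feats)                     # adapted features, same shape
print(out.shape)                        # torch.Size([8, 512])
```

The sketch only shows the server-side merge that makes the federated part concrete; in a full round each client would first fine-tune its adapter on local data before the aggregation step.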

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-wang25bk,
  title     = {Enhancing Foundation Models with Federated Domain Knowledge Infusion},
  author    = {Wang, Jiaqi and Li, Jingtao and Zhuang, Weiming and Chen, Chen and Lyu, Lingjuan and Ma, Fenglong},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {63621--63635},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wang25bk/wang25bk.pdf},
  url       = {https://proceedings.mlr.press/v267/wang25bk.html},
  abstract  = {Vision foundation models (FMs) like CLIP have exhibited exceptional capabilities in visual and linguistic understanding, particularly in zero-shot inference tasks. However, these models struggle with data that deviates significantly from their training samples, necessitating fine-tuning, which is often infeasible in centralized settings due to data privacy concerns. Federated learning (FL) combined with parameter-efficient fine-tuning (PEFT) offers a potential solution, yet existing methods struggle to capture domain-specific characteristics and to generalize out of domain. We propose Federated Adapter Generalization (FedAG), a novel cross-silo federated fine-tuning approach that leverages multiple fine-grained adapters to capture domain-specific knowledge while enhancing out-of-domain generalization. Our method uses quality-aware in-domain mutual learning and attention-regularized cross-domain learning to integrate domain-specific insights effectively. Experiments with the CLIP model on three domain-shift datasets, ImageCLEF-DA, Office-Home, and DomainNet, demonstrate the superior performance of FedAG in both in-domain and out-of-domain scenarios. We envision this work as a milestone in generalizing CLIP to handle out-of-domain knowledge under the federated learning setting.}
}
Endnote
%0 Conference Paper
%T Enhancing Foundation Models with Federated Domain Knowledge Infusion
%A Jiaqi Wang
%A Jingtao Li
%A Weiming Zhuang
%A Chen Chen
%A Lingjuan Lyu
%A Fenglong Ma
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wang25bk
%I PMLR
%P 63621--63635
%U https://proceedings.mlr.press/v267/wang25bk.html
%V 267
%X Vision foundation models (FMs) like CLIP have exhibited exceptional capabilities in visual and linguistic understanding, particularly in zero-shot inference tasks. However, these models struggle with data that deviates significantly from their training samples, necessitating fine-tuning, which is often infeasible in centralized settings due to data privacy concerns. Federated learning (FL) combined with parameter-efficient fine-tuning (PEFT) offers a potential solution, yet existing methods struggle to capture domain-specific characteristics and to generalize out of domain. We propose Federated Adapter Generalization (FedAG), a novel cross-silo federated fine-tuning approach that leverages multiple fine-grained adapters to capture domain-specific knowledge while enhancing out-of-domain generalization. Our method uses quality-aware in-domain mutual learning and attention-regularized cross-domain learning to integrate domain-specific insights effectively. Experiments with the CLIP model on three domain-shift datasets, ImageCLEF-DA, Office-Home, and DomainNet, demonstrate the superior performance of FedAG in both in-domain and out-of-domain scenarios. We envision this work as a milestone in generalizing CLIP to handle out-of-domain knowledge under the federated learning setting.
APA
Wang, J., Li, J., Zhuang, W., Chen, C., Lyu, L. & Ma, F. (2025). Enhancing Foundation Models with Federated Domain Knowledge Infusion. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:63621-63635. Available from https://proceedings.mlr.press/v267/wang25bk.html.
