Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models

Zixin Zhang, Fan Qi, Changsheng Xu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:59685-59699, 2024.

Abstract

The remarkable generalization of large-scale models has recently gained significant attention in multimodal research. However, deploying heterogeneous large-scale models with different modalities under Federated Learning (FL) to protect data privacy imposes tremendous challenges on clients’ limited computation and storage. In this work, we propose M$^2$FedSA to address the above issue. We realize modularized decomposition of large-scale models via Split Learning (SL) and only retain privacy-sensitive modules on clients, alleviating storage overhead. By freezing large-scale models and introducing two specialized lightweight adapters, the models can better focus on task-specific knowledge and enhance modality-specific knowledge, improving the model’s adaptability to different tasks while balancing efficiency. In addition, M$^2$FedSA further improves performance by transferring multimodal knowledge to unimodal clients at both the feature and decision levels, which leverages the complementarity of different modalities. Extensive experiments on various multimodal classification tasks validate the effectiveness of our proposed M$^2$FedSA. The code is made available publicly at https://github.com/M2FedSA/M-2FedSA.
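To make the adapter idea in the abstract concrete, here is a minimal sketch (not the authors' released implementation; see the repository linked above for that). Module names such as BottleneckAdapter and AdaptedEncoder, and the bottleneck width, are illustrative assumptions: a frozen pre-trained encoder is wrapped with a small trainable bottleneck adapter, so only the lightweight adapter parameters are updated on the client.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Lightweight adapter: down-project, non-linearity, up-project, plus a residual connection."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class AdaptedEncoder(nn.Module):
    """A frozen large-scale encoder followed by a trainable adapter (illustrative sketch)."""
    def __init__(self, backbone: nn.Module, feature_dim: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False          # freeze the large-scale model
        self.adapter = BottleneckAdapter(feature_dim)

    def forward(self, x):
        with torch.no_grad():
            feats = self.backbone(x)         # frozen forward pass, no gradients stored
        return self.adapter(feats)           # only the adapter is trained

In this sketch only adapter.parameters() would be passed to the optimizer (and, in a federated setting, communicated), which is what keeps client-side storage and computation light while the frozen backbone provides general-purpose features.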

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-zhang24az,
  title     = {Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models},
  author    = {Zhang, Zixin and Qi, Fan and Xu, Changsheng},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {59685--59699},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhang24az/zhang24az.pdf},
  url       = {https://proceedings.mlr.press/v235/zhang24az.html},
  abstract  = {The remarkable generalization of large-scale models has recently gained significant attention in multimodal research. However, deploying heterogeneous large-scale models with different modalities under Federated Learning (FL) to protect data privacy imposes tremendous challenges on clients’ limited computation and storage. In this work, we propose M$^2$FedSA to address the above issue. We realize modularized decomposition of large-scale models via Split Learning (SL) and only retain privacy-sensitive modules on clients, alleviating storage overhead. By freezing large-scale models and introducing two specialized lightweight adapters, the models can better focus on task-specific knowledge and enhance modality-specific knowledge, improving the model’s adaptability to different tasks while balancing efficiency. In addition, M$^2$FedSA further improves performance by transferring multimodal knowledge to unimodal clients at both the feature and decision levels, which leverages the complementarity of different modalities. Extensive experiments on various multimodal classification tasks validate the effectiveness of our proposed M$^2$FedSA. The code is made available publicly at https://github.com/M2FedSA/M-2FedSA.}
}
Endnote
%0 Conference Paper
%T Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models
%A Zixin Zhang
%A Fan Qi
%A Changsheng Xu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhang24az
%I PMLR
%P 59685--59699
%U https://proceedings.mlr.press/v235/zhang24az.html
%V 235
%X The remarkable generalization of large-scale models has recently gained significant attention in multimodal research. However, deploying heterogeneous large-scale models with different modalities under Federated Learning (FL) to protect data privacy imposes tremendous challenges on clients’ limited computation and storage. In this work, we propose M$^2$FedSA to address the above issue. We realize modularized decomposition of large-scale models via Split Learning (SL) and only retain privacy-sensitive modules on clients, alleviating storage overhead. By freezing large-scale models and introducing two specialized lightweight adapters, the models can better focus on task-specific knowledge and enhance modality-specific knowledge, improving the model’s adaptability to different tasks while balancing efficiency. In addition, M$^2$FedSA further improves performance by transferring multimodal knowledge to unimodal clients at both the feature and decision levels, which leverages the complementarity of different modalities. Extensive experiments on various multimodal classification tasks validate the effectiveness of our proposed M$^2$FedSA. The code is made available publicly at https://github.com/M2FedSA/M-2FedSA.
APA
Zhang, Z., Qi, F. & Xu, C. (2024). Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:59685-59699. Available from https://proceedings.mlr.press/v235/zhang24az.html.