Dual-Module Collaborative LoRA for Effective Large Language Model Fine-Tuning

Lecheng Cao, Junhao Zhang, Xiaohan Zhang, Huaxiong Li
Proceedings of the 17th Asian Conference on Machine Learning, PMLR 304:782-797, 2025.

Abstract

To enable parameter-efficient fine-tuning of large language models (LLMs), Low-Rank Adaptation (LoRA) reduces trainable parameters by freezing the pretrained weights $W_0$ and approximating updates via low-rank matrices $\Delta W = BA$. However, standard LoRA neglects the differential impact of low-rank matrix components on model performance and suffers from slow convergence due to random initialization. To address this, we propose a dual-module architecture: the shared module inherits the pretrained weights' core semantic representations through principal component initialization, retaining the residual in the original model; the expert module incorporates a selection mechanism guided by importance screening, with orthogonality constraints imposed through loss regularization to ensure independence of parameter update directions. The shared module accelerates convergence by updating world knowledge, while the expert module dynamically screens domain knowledge to allocate the update budget efficiently. Extensive experiments under identical configurations show our method achieves 76.8% average accuracy on Commonsense 170k (Llama 2-7B), surpassing LoRA by 2.1%. On GSM8K and HumanEval, it outperforms LoRA by 2.3% and 9.7%, respectively.
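To make the two ideas in the abstract concrete, here is a minimal sketch (not the authors' code) of the LoRA update $\Delta W = BA$ and of a principal-component initialization in the spirit of the shared module: the top-$r$ singular directions of $W_0$ seed the adapter, the residual stays in the frozen weight, and the initial output is unchanged. All names (`r`, `W0`, `A_pc`, `B_pc`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2

W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight

# Standard LoRA init: A random, B zero, so Delta_W = B @ A starts at zero
# and the adapted model initially matches the pretrained one.
A = rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))

# Principal-component initialization (shared-module flavor): take the
# top-r singular directions of W0 as the adapter, and keep the residual
# frozen so W_res + B_pc @ A_pc == W0 exactly at initialization.
U, S, Vt = np.linalg.svd(W0, full_matrices=False)
B_pc = U[:, :r]                            # (d_out, r)
A_pc = np.diag(S[:r]) @ Vt[:r]             # (r, d_in)
W_res = W0 - B_pc @ A_pc                   # residual, kept frozen

x = rng.standard_normal(d_in)
y_pretrained = W0 @ x
y_adapted = W_res @ x + B_pc @ (A_pc @ x)  # identical output at init
assert np.allclose(y_pretrained, y_adapted)
```

During fine-tuning only `A_pc` and `B_pc` (and the zero-initialized standard pair) would receive gradients; the frozen terms `W0`/`W_res` are never updated, which is what keeps the method parameter-efficient.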

Cite this Paper


BibTeX
@InProceedings{pmlr-v304-cao25b,
  title     = {Dual-Module Collaborative LoRA for Effective Large Language Model Fine-Tuning},
  author    = {Cao, Lecheng and Zhang, Junhao and Zhang, Xiaohan and Li, Huaxiong},
  booktitle = {Proceedings of the 17th Asian Conference on Machine Learning},
  pages     = {782--797},
  year      = {2025},
  editor    = {Lee, Hung-yi and Liu, Tongliang},
  volume    = {304},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Dec},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v304/main/assets/cao25b/cao25b.pdf},
  url       = {https://proceedings.mlr.press/v304/cao25b.html},
  abstract  = {To enable parameter-efficient fine-tuning of large language models (LLMs), Low-Rank Adaptation (LoRA) reduces trainable parameters by freezing the pretrained weights $W_0$ and approximating updates via low-rank matrices $\Delta W = BA$. However, standard LoRA neglects the differential impact of low-rank matrix components on model performance and suffers from slow convergence due to random initialization. To address this, we propose a dual-module architecture: the shared module inherits the pretrained weights' core semantic representations through principal component initialization, retaining the residual in the original model; the expert module incorporates a selection mechanism guided by importance screening, with orthogonality constraints imposed through loss regularization to ensure independence of parameter update directions. The shared module accelerates convergence by updating world knowledge, while the expert module dynamically screens domain knowledge to allocate the update budget efficiently. Extensive experiments under identical configurations show our method achieves 76.8% average accuracy on Commonsense 170k (Llama 2-7B), surpassing LoRA by 2.1%. On GSM8K and HumanEval, it outperforms LoRA by 2.3% and 9.7%, respectively.}
}
Endnote
%0 Conference Paper %T Dual-Module Collaborative LoRA for Effective Large Language Model Fine-Tuning %A Lecheng Cao %A Junhao Zhang %A Xiaohan Zhang %A Huaxiong Li %B Proceedings of the 17th Asian Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Hung-yi Lee %E Tongliang Liu %F pmlr-v304-cao25b %I PMLR %P 782--797 %U https://proceedings.mlr.press/v304/cao25b.html %V 304 %X To enable parameter-efficient fine-tuning of large language models (LLMs), Low-Rank Adaptation (LoRA) reduces trainable parameters by freezing the pretrained weights $W_0$ and approximating updates via low-rank matrices $\Delta W = BA$. However, standard LoRA neglects the differential impact of low-rank matrix components on model performance and suffers from slow convergence due to random initialization. To address this, we propose a dual-module architecture: the shared module inherits the pretrained weights' core semantic representations through principal component initialization, retaining the residual in the original model; the expert module incorporates a selection mechanism guided by importance screening, with orthogonality constraints imposed through loss regularization to ensure independence of parameter update directions. The shared module accelerates convergence by updating world knowledge, while the expert module dynamically screens domain knowledge to allocate the update budget efficiently. Extensive experiments under identical configurations show our method achieves 76.8% average accuracy on Commonsense 170k (Llama 2-7B), surpassing LoRA by 2.1%. On GSM8K and HumanEval, it outperforms LoRA by 2.3% and 9.7%, respectively.
APA
Cao, L., Zhang, J., Zhang, X. & Li, H. (2025). Dual-Module Collaborative LoRA for Effective Large Language Model Fine-Tuning. Proceedings of the 17th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 304:782-797. Available from https://proceedings.mlr.press/v304/cao25b.html.