WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

Chinmay Savadikar; Xi Song; Tianfu Wu

WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

Chinmay Savadikar, Xi Song, Tianfu Wu

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:53019-53041, 2025.

Abstract

Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced wee-gift), a novel approach that learns to generate fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multi-faceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-savadikar25a,
  title = 	 {{W}e{G}e{FT}: {W}eight-{G}enerative {F}ine-{T}uning for {M}ulti-{F}aceted Efficient Adaptation of Large Models},
  author =       {Savadikar, Chinmay and Song, Xi and Wu, Tianfu},
  booktitle = 	 {Proceedings of the 42nd International Conference on Machine Learning},
  pages = 	 {53019--53041},
  year = 	 {2025},
  editor = 	 {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = 	 {267},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v267/main/assets/savadikar25a/savadikar25a.pdf},
  url = 	 {https://proceedings.mlr.press/v267/savadikar25a.html},
  abstract = 	 {Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced wee-gift), a novel approach that learns to generate fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multi-faceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.}
}

Endnote

%0 Conference Paper
%T WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models
%A Chinmay Savadikar
%A Xi Song
%A Tianfu Wu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu	
%F pmlr-v267-savadikar25a
%I PMLR
%P 53019--53041
%U https://proceedings.mlr.press/v267/savadikar25a.html
%V 267
%X Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced wee-gift), a novel approach that learns to generate fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multi-faceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.

APA

Savadikar, C., Song, X. & Wu, T.. (2025). WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:53019-53041 Available from https://proceedings.mlr.press/v267/savadikar25a.html.

WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

Abstract

Cite this Paper

Related Material