Neutral residues: revisiting adapters for model extension

Franck Signe Talla, Edouard Grave, Herve Jegou
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:58431-58447, 2025.

Abstract

We address the problem of extending a pre-trained large language model to a new domain that was not seen during training. Standard techniques, such as fine-tuning or low-rank adaptation (LoRA), are successful at domain adaptation but do not formally add capacity to the model. This often leads to a trade-off between performing well on the new domain and degrading performance on the original one. Here, we propose to revisit and improve adapters to extend LLMs. Our paper analyzes this extension problem from three angles: data, architecture and training procedure, which are best considered jointly. The resulting method, called neutral residues, modifies adapters so that each new residual block outputs near-zero values on the original domain. This solution leads to strong results when adapting a state-of-the-art model originally trained on English to a new language. Neutral residues significantly outperform competing approaches such as fine-tuning, LoRA and vanilla adapters in the trade-off between learning the new language and not forgetting English.
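To illustrate the idea of an adapter whose residual contribution stays near zero on the original domain, here is a minimal PyTorch sketch. The gated adapter layout, the zero initialization and the neutrality penalty below are assumptions made for illustration only, not the exact architecture or training loss described in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAdapter(nn.Module):
    # Illustrative adapter added in parallel to a frozen transformer block.
    # The gating mechanism and initialization are assumptions, not the
    # paper's exact formulation.
    def __init__(self, d_model: int, d_adapter: int):
        super().__init__()
        self.down = nn.Linear(d_model, d_adapter, bias=False)
        self.gate = nn.Linear(d_model, d_adapter, bias=False)
        self.up = nn.Linear(d_adapter, d_model, bias=False)
        nn.init.zeros_(self.up.weight)  # block initially adds nothing to the residual stream

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = F.silu(self.gate(x)) * self.down(x)  # gated hidden activations
        return self.up(h)  # near-zero output when the gate stays closed

def neutrality_penalty(adapter_out: torch.Tensor) -> torch.Tensor:
    # Hypothetical auxiliary term applied on original-domain batches to push
    # the adapter output toward zero.
    return adapter_out.abs().mean()

# Sketch of use inside a residual block of a frozen pre-trained model:
#   y = frozen_block(x) + adapter(x)
# Only the adapter parameters receive gradients; on original-domain batches
# the penalty keeps the added block neutral, while new-domain batches train
# the extra capacity for the new language.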

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-talla25a,
  title     = {Neutral residues: revisiting adapters for model extension},
  author    = {Talla, Franck Signe and Grave, Edouard and Jegou, Herve},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {58431--58447},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/talla25a/talla25a.pdf},
  url       = {https://proceedings.mlr.press/v267/talla25a.html}
}
Endnote
%0 Conference Paper
%T Neutral residues: revisiting adapters for model extension
%A Franck Signe Talla
%A Edouard Grave
%A Herve Jegou
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-talla25a
%I PMLR
%P 58431--58447
%U https://proceedings.mlr.press/v267/talla25a.html
%V 267
APA
Talla, F.S., Grave, E. & Jegou, H. (2025). Neutral residues: revisiting adapters for model extension. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:58431-58447. Available from https://proceedings.mlr.press/v267/talla25a.html.
