MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs

Tommaso Mencattini, Robert Adrian Minut, Donato Crisostomi, Andrea Santilli, Emanuele Rodolà
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:43694-43715, 2025.

Abstract

Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
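To make the abstract's IRT step concrete, below is a minimal sketch of ability estimation under the standard two-parameter logistic (2PL) IRT model: given a small set of pre-calibrated benchmark items, a candidate merge's latent ability is the maximum-likelihood fit to its binary right/wrong responses. The 2PL choice, the function names, and the item parameters here are illustrative assumptions, not the paper's actual estimator.

    # A minimal sketch of IRT-based ability estimation, assuming the standard
    # two-parameter logistic (2PL) model; the paper's exact parametrization,
    # item calibration, and estimator may differ.
    import numpy as np
    from scipy.optimize import minimize_scalar

    def prob_correct(theta, a, b):
        """2PL: P(correct | ability theta, discrimination a, difficulty b)."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def estimate_ability(responses, a, b):
        """Maximum-likelihood estimate of a candidate merge's latent ability
        from binary responses on a small, pre-calibrated item subset."""
        eps = 1e-9  # numerical guard against log(0)

        def neg_log_likelihood(theta):
            p = prob_correct(theta, a, b)
            return -np.sum(responses * np.log(p + eps)
                           + (1.0 - responses) * np.log(1.0 - p + eps))

        return minimize_scalar(neg_log_likelihood, bounds=(-4.0, 4.0),
                               method="bounded").x

    # Illustrative usage: 20 calibrated items instead of a full benchmark.
    rng = np.random.default_rng(0)
    a = rng.uniform(0.5, 2.0, size=20)  # hypothetical item discriminations
    b = rng.normal(0.0, 1.0, size=20)   # hypothetical item difficulties
    responses = (rng.random(20) < prob_correct(1.0, a, b)).astype(float)
    print(f"estimated ability: {estimate_ability(responses, a, b):.2f}")

Under this reading, scoring a candidate merge during evolution only requires answering the reduced item set and mapping the estimated ability to expected benchmark performance, which is where the abstract's 50$\times$ reduction in fitness-computation cost would come from.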

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-mencattini25a,
  title     = {{MERGE}$^3$: Efficient Evolutionary Merging on Consumer-grade {GPU}s},
  author    = {Mencattini, Tommaso and Minut, Robert Adrian and Crisostomi, Donato and Santilli, Andrea and Rodol\`{a}, Emanuele},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {43694--43715},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/mencattini25a/mencattini25a.pdf},
  url       = {https://proceedings.mlr.press/v267/mencattini25a.html},
  abstract  = {Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.}
}
Endnote
%0 Conference Paper
%T MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
%A Tommaso Mencattini
%A Robert Adrian Minut
%A Donato Crisostomi
%A Andrea Santilli
%A Emanuele Rodolà
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-mencattini25a
%I PMLR
%P 43694--43715
%U https://proceedings.mlr.press/v267/mencattini25a.html
%V 267
%X Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
APA
Mencattini, T., Minut, R.A., Crisostomi, D., Santilli, A. & Rodolà, E. (2025). MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:43694-43715. Available from https://proceedings.mlr.press/v267/mencattini25a.html.

Related Material

Download PDF: https://raw.githubusercontent.com/mlresearch/v267/main/assets/mencattini25a/mencattini25a.pdf