MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs

Tommaso Mencattini, Robert Adrian Minut, Donato Crisostomi, Andrea Santilli, Emanuele Rodolà
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:43694-43715, 2025.

Abstract

Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
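To make the abstract's IRT step concrete, below is a minimal sketch of ability estimation under the standard two-parameter logistic (2PL) IRT model: given a small set of pre-calibrated benchmark items, a candidate merge's latent ability is the maximum-likelihood fit to its binary right/wrong responses. The 2PL choice, the function names, and the item parameters here are illustrative assumptions, not the paper's actual estimator.

    # A minimal sketch of IRT-based ability estimation, assuming the standard
    # two-parameter logistic (2PL) model; the paper's exact parametrization,
    # item calibration, and estimator may differ.
    import numpy as np
    from scipy.optimize import minimize_scalar

    def prob_correct(theta, a, b):
        """2PL: P(correct | ability theta, discrimination a, difficulty b)."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def estimate_ability(responses, a, b):
        """Maximum-likelihood estimate of a candidate merge's latent ability
        from binary responses on a small, pre-calibrated item subset."""
        eps = 1e-9  # numerical guard against log(0)

        def neg_log_likelihood(theta):
            p = prob_correct(theta, a, b)
            return -np.sum(responses * np.log(p + eps)
                           + (1.0 - responses) * np.log(1.0 - p + eps))

        return minimize_scalar(neg_log_likelihood, bounds=(-4.0, 4.0),
                               method="bounded").x

    # Illustrative usage: 20 calibrated items instead of a full benchmark.
    rng = np.random.default_rng(0)
    a = rng.uniform(0.5, 2.0, size=20)  # hypothetical item discriminations
    b = rng.normal(0.0, 1.0, size=20)   # hypothetical item difficulties
    responses = (rng.random(20) < prob_correct(1.0, a, b)).astype(float)
    print(f"estimated ability: {estimate_ability(responses, a, b):.2f}")

Under this reading, scoring a candidate merge during evolution only requires answering the reduced item set and mapping the estimated ability to expected benchmark performance, which is where the abstract's 50$\times$ reduction in fitness-computation cost would come from.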

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-mencattini25a,
  title     = {{MERGE}$^3$: Efficient Evolutionary Merging on Consumer-grade {GPU}s},
  author    = {Mencattini, Tommaso and Minut, Robert Adrian and Crisostomi, Donato and Santilli, Andrea and Rodol\`{a}, Emanuele},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {43694--43715},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/mencattini25a/mencattini25a.pdf},
  url       = {https://proceedings.mlr.press/v267/mencattini25a.html},
  abstract  = {Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.}
}
Endnote
%0 Conference Paper
%T MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
%A Tommaso Mencattini
%A Robert Adrian Minut
%A Donato Crisostomi
%A Andrea Santilli
%A Emanuele Rodolà
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-mencattini25a
%I PMLR
%P 43694--43715
%U https://proceedings.mlr.press/v267/mencattini25a.html
%V 267
%X Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging of Large Language Models (LLMs) feasible on a single GPU by reducing fitness computation costs 50$\times$ while retaining a large fraction of the original performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
APA
Mencattini, T., Minut, R.A., Crisostomi, D., Santilli, A. & Rodolà, E. (2025). MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:43694-43715. Available from https://proceedings.mlr.press/v267/mencattini25a.html.

Related Material

Download PDF: https://raw.githubusercontent.com/mlresearch/v267/main/assets/mencattini25a/mencattini25a.pdf