Representation Surgery in Model Merging with Probabilistic Modeling

Qi Wei, Shuo He, Enneng Yang, Tingcong Liu, Haobo Wang, Lei Feng, Bo An
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:66015-66032, 2025.

Abstract

Model merging aims to achieve multitask performance by merging multiple expert models without access to the raw training data. Recent research identified the representation bias of model merging: a discrepancy between the representation distributions of the merged model and the individual models that hinders the performance of model merging methods. To mitigate this bias, a task-specific MLP, Surgery, was built to model the bias, which is then subtracted from the merged representation. However, this strategy remains suboptimal due to the limited modeling capability of its deterministic design. To address this issue, we present ProbSurgery, a probabilistic module specifically designed to accurately model the representation bias. This module generates an embedding distribution for each sample and outputs the representation bias through a sampling process. ProbSurgery offers superior representational capacity by naturally handling the uncertainty that results from parameter interference when merging multiple models. In addition, we provide a theoretical analysis revealing the advantage of the probabilistic approach and propose an extension of ProbSurgery that adapts to the task-sharing setting. Extensive experiments verify the effectiveness of ProbSurgery for representation surgery while maintaining generalization capabilities in real-world scenarios, including out-of-distribution and domain-shift challenges.
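The abstract describes a module that maps each merged representation to an embedding distribution and draws the representation bias from it by sampling. A minimal sketch of this idea follows; the layer shapes, the shared-hidden-layer design, the Gaussian parameterization with a reparameterized sample, and the simple mean-over-samples correction are all illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def init_params(dim, hidden, rng):
    """Randomly initialize a tiny two-head MLP (hypothetical shapes)."""
    return {
        "W": rng.normal(scale=0.1, size=(dim, hidden)),        # shared layer
        "W_mu": rng.normal(scale=0.1, size=(hidden, dim)),     # mean head
        "W_logvar": rng.normal(scale=0.1, size=(hidden, dim)), # log-variance head
    }

def prob_surgery(z_merged, params, rng, n_samples=8):
    """Model the representation bias probabilistically: map the merged
    representation to a Gaussian over bias vectors, draw samples via the
    reparameterization trick, and subtract the averaged sampled bias."""
    h = np.maximum(z_merged @ params["W"], 0.0)         # shared hidden layer (ReLU)
    mu = h @ params["W_mu"]                             # mean of the bias distribution
    logvar = h @ params["W_logvar"]                     # log-variance of the bias
    eps = rng.standard_normal((n_samples,) + mu.shape)  # noise for reparameterization
    bias_samples = mu + np.exp(0.5 * logvar) * eps      # bias ~ N(mu, diag(sigma^2))
    bias = bias_samples.mean(axis=0)                    # average over drawn samples
    return z_merged - bias                              # corrected representation

rng = np.random.default_rng(0)
dim, hidden = 16, 32
params = init_params(dim, hidden, rng)
z = rng.standard_normal(dim)          # a merged-model representation (toy input)
z_corrected = prob_surgery(z, params, rng)
print(z_corrected.shape)              # same dimensionality as the input
```

Compared with a deterministic MLP that outputs a single bias vector, the sampling step lets the module express per-sample uncertainty in the bias estimate, which is the capacity advantage the abstract attributes to the probabilistic design.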

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wei25c,
  title = {Representation Surgery in Model Merging with Probabilistic Modeling},
  author = {Wei, Qi and He, Shuo and Yang, Enneng and Liu, Tingcong and Wang, Haobo and Feng, Lei and An, Bo},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {66015--66032},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wei25c/wei25c.pdf},
  url = {https://proceedings.mlr.press/v267/wei25c.html},
  abstract = {Model merging aims to achieve multitask performance by merging multiple expert models without the need to access the raw training data. Recent research identified the representation bias of model merging, characterized by a discrepancy in the representation distribution between the merged and individual models, hindering the performance of model merging methods. To mitigate the representation bias, a task-specific MLP, Surgery, was built to model the bias that is subsequently decreased on the merged representation. However, this strategy is still suboptimal due to the limited modeling capability within the deterministic manner. To address this issue, we present ProbSurgery, a probabilistic module specifically designed to accurately model the representation bias. This module generates an embedding distribution for each sample and outputs the representation bias through a sampling process. ProbSurgery offers superior representational capacity by naturally handling the uncertainty resulting from parameter interference of merging multiple models. Besides, we provide a theoretical analysis to reveal the advance of the probabilistic manner and propose an extension of ProbSurgery for adapting to the task-sharing setting. Extensive experiments verify the effectiveness of ProbSurgery for representation surgery while maintaining generalization capabilities in real-world scenarios, including out-of-distribution and domain shift challenges.}
}
Endnote
%0 Conference Paper
%T Representation Surgery in Model Merging with Probabilistic Modeling
%A Qi Wei
%A Shuo He
%A Enneng Yang
%A Tingcong Liu
%A Haobo Wang
%A Lei Feng
%A Bo An
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wei25c
%I PMLR
%P 66015--66032
%U https://proceedings.mlr.press/v267/wei25c.html
%V 267
%X Model merging aims to achieve multitask performance by merging multiple expert models without the need to access the raw training data. Recent research identified the representation bias of model merging, characterized by a discrepancy in the representation distribution between the merged and individual models, hindering the performance of model merging methods. To mitigate the representation bias, a task-specific MLP, Surgery, was built to model the bias that is subsequently decreased on the merged representation. However, this strategy is still suboptimal due to the limited modeling capability within the deterministic manner. To address this issue, we present ProbSurgery, a probabilistic module specifically designed to accurately model the representation bias. This module generates an embedding distribution for each sample and outputs the representation bias through a sampling process. ProbSurgery offers superior representational capacity by naturally handling the uncertainty resulting from parameter interference of merging multiple models. Besides, we provide a theoretical analysis to reveal the advance of the probabilistic manner and propose an extension of ProbSurgery for adapting to the task-sharing setting. Extensive experiments verify the effectiveness of ProbSurgery for representation surgery while maintaining generalization capabilities in real-world scenarios, including out-of-distribution and domain shift challenges.
APA
Wei, Q., He, S., Yang, E., Liu, T., Wang, H., Feng, L., & An, B. (2025). Representation Surgery in Model Merging with Probabilistic Modeling. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:66015-66032. Available from https://proceedings.mlr.press/v267/wei25c.html.

Related Material