NegMerge: Sign-Consensual Weight Merging for Machine Unlearning

Hyo Seo Kim, Dongyoon Han, Junsuk Choe
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:30067-30085, 2025.

Abstract

Machine unlearning aims to selectively remove specific knowledge from a trained model. Existing approaches, such as Task Arithmetic, fine-tune the model on the forget set to create a task vector (i.e., a direction in weight space) for subtraction from the original model’s weight. However, their effectiveness is highly sensitive to hyperparameter selection, requiring extensive validation to identify the optimal vector from many fine-tuned candidates. In this paper, we propose a novel method that utilizes all fine-tuned models trained with varying hyperparameters instead of a single selection. Specifically, we aggregate the computed task vectors by retaining only the elements with consistent shared signs. The merged task vector is then negated to induce unlearning on the original model. Evaluations on zero-shot and standard image recognition tasks across twelve datasets and four backbone architectures show that our approach outperforms state-of-the-art methods while requiring similar or fewer computational resources. Code is available at https://github.com/naver-ai/negmerge.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-kim25f, title = {{N}eg{M}erge: Sign-Consensual Weight Merging for Machine Unlearning}, author = {Kim, Hyo Seo and Han, Dongyoon and Choe, Junsuk}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {30067--30085}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/kim25f/kim25f.pdf}, url = {https://proceedings.mlr.press/v267/kim25f.html}, abstract = {Machine unlearning aims to selectively remove specific knowledge from a trained model. Existing approaches, such as Task Arithmetic, fine-tune the model on the forget set to create a task vector (i.e., a direction in weight space) for subtraction from the original model’s weight. However, their effectiveness is highly sensitive to hyperparameter selection, requiring extensive validation to identify the optimal vector from many fine-tuned candidates. In this paper, we propose a novel method that utilizes all fine-tuned models trained with varying hyperparameters instead of a single selection. Specifically, we aggregate the computed task vectors by retaining only the elements with consistent shared signs. The merged task vector is then negated to induce unlearning on the original model. Evaluations on zero-shot and standard image recognition tasks across twelve datasets and four backbone architectures show that our approach outperforms state-of-the-art methods while requiring similar or fewer computational resources. Code is available at https://github.com/naver-ai/negmerge.} }
Endnote
%0 Conference Paper %T NegMerge: Sign-Consensual Weight Merging for Machine Unlearning %A Hyo Seo Kim %A Dongyoon Han %A Junsuk Choe %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-kim25f %I PMLR %P 30067--30085 %U https://proceedings.mlr.press/v267/kim25f.html %V 267 %X Machine unlearning aims to selectively remove specific knowledge from a trained model. Existing approaches, such as Task Arithmetic, fine-tune the model on the forget set to create a task vector (i.e., a direction in weight space) for subtraction from the original model’s weight. However, their effectiveness is highly sensitive to hyperparameter selection, requiring extensive validation to identify the optimal vector from many fine-tuned candidates. In this paper, we propose a novel method that utilizes all fine-tuned models trained with varying hyperparameters instead of a single selection. Specifically, we aggregate the computed task vectors by retaining only the elements with consistent shared signs. The merged task vector is then negated to induce unlearning on the original model. Evaluations on zero-shot and standard image recognition tasks across twelve datasets and four backbone architectures show that our approach outperforms state-of-the-art methods while requiring similar or fewer computational resources. Code is available at https://github.com/naver-ai/negmerge.
APA
Kim, H.S., Han, D. & Choe, J.. (2025). NegMerge: Sign-Consensual Weight Merging for Machine Unlearning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:30067-30085 Available from https://proceedings.mlr.press/v267/kim25f.html.

Related Material