BECAME: Bayesian Continual Learning with Adaptive Model Merging

Mei Li, Yuxiang Lu, Qinyan Dai, Suizhi Huang, Yue Ding, Hongtao Lu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:35481-35501, 2025.

Abstract

Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies. Our code is available at https://github.com/limei0818/BECAME.
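
To make the merging step concrete, here is a minimal, hypothetical sketch of coefficient-based model merging between the model kept from previous tasks and the model fine-tuned on the current task. The function name merge_models and the fixed coefficient alpha are illustrative assumptions only; BECAME instead derives a closed-form, task-adaptive coefficient, which this sketch does not reproduce.

# Hypothetical sketch: parameter-wise merging of two models with a scalar
# coefficient. In BECAME the coefficient is obtained from the paper's
# closed-form solution; here `alpha` is just a placeholder value.
import copy
import torch

@torch.no_grad()
def merge_models(prev_model, new_model, alpha: float):
    """Return a model whose parameters are alpha * prev + (1 - alpha) * new."""
    merged = copy.deepcopy(new_model)
    for p_merged, p_prev, p_new in zip(
        merged.parameters(), prev_model.parameters(), new_model.parameters()
    ):
        p_merged.copy_(alpha * p_prev + (1.0 - alpha) * p_new)
    return merged

# Usage (assumed workflow): after training new_model on task t, e.g. with a
# gradient-projection optimizer for stability, merge it with the model kept
# from task t-1:
# merged_model = merge_models(prev_model, new_model, alpha=0.5)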

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-li25bk,
  title     = {{BECAME}: {B}ayesian Continual Learning with Adaptive Model Merging},
  author    = {Li, Mei and Lu, Yuxiang and Dai, Qinyan and Huang, Suizhi and Ding, Yue and Lu, Hongtao},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {35481--35501},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25bk/li25bk.pdf},
  url       = {https://proceedings.mlr.press/v267/li25bk.html},
  abstract  = {Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies. Our code is available at https://github.com/limei0818/BECAME.}
}
Endnote
%0 Conference Paper
%T BECAME: Bayesian Continual Learning with Adaptive Model Merging
%A Mei Li
%A Yuxiang Lu
%A Qinyan Dai
%A Suizhi Huang
%A Yue Ding
%A Hongtao Lu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-li25bk
%I PMLR
%P 35481--35501
%U https://proceedings.mlr.press/v267/li25bk.html
%V 267
%X Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies. Our code is available at https://github.com/limei0818/BECAME.
APA
Li, M., Lu, Y., Dai, Q., Huang, S., Ding, Y. & Lu, H. (2025). BECAME: Bayesian Continual Learning with Adaptive Model Merging. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:35481-35501. Available from https://proceedings.mlr.press/v267/li25bk.html.