An Effective Dynamic Gradient Calibration Method for Continual Learning

Weichen Lin; Jiaxiang Chen; Ruomin Huang; Hu Ding

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:29872-29889, 2024.

Abstract

Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks. Due to the memory limit, we cannot store all the historical data, and therefore confront the “catastrophic forgetting” problem, i.e., the performance on the previous tasks can substantially decrease because of the missing information in the latter period. Though a number of elegant methods have been proposed, the catastrophic forgetting phenomenon still cannot be well avoided in practice. In this paper, we study the problem from the gradient perspective, where our aim is to develop an effective algorithm to calibrate the gradient in each updating step of the model; namely, our goal is to guide the model to be updated in the right direction under the situation that a large amount of historical data are unavailable. Our idea is partly inspired by the seminal stochastic variance reduction methods (e.g., SVRG and SAGA) for reducing the variance of gradient estimation in stochastic gradient descent algorithms. Another benefit is that our approach can be used as a general tool, which is able to be incorporated with several existing popular CL methods to achieve better performance. We also conduct a set of experiments on several benchmark datasets to evaluate the performance in practice.
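The abstract names SVRG as one source of inspiration. As a rough illustration of what a variance-reduced gradient looks like, the following is a minimal sketch of plain SVRG on a one-dimensional least-squares problem. It is not the paper's gradient calibration method, and it is not a continual learning algorithm: the function name `svrg_least_squares`, the toy objective, and all parameter values are assumptions chosen only for this example.

```python
import random


def svrg_least_squares(xs, ys, epochs=40, inner_steps=None, lr=0.5, seed=0):
    """Minimise (1/n) * sum_i 0.5 * (w * x_i - y_i)**2 over a scalar w with SVRG.

    Hypothetical toy example: names, objective, and defaults are illustrative.
    """
    rng = random.Random(seed)
    n = len(xs)
    if inner_steps is None:
        inner_steps = 2 * n  # a common heuristic, assumed here

    def grad_i(w, i):
        # Gradient of the i-th term 0.5 * (w * x_i - y_i)**2 with respect to w.
        return (w * xs[i] - ys[i]) * xs[i]

    w = 0.0
    for _ in range(epochs):
        # Take a snapshot and compute the full gradient at the snapshot once.
        w_snap = w
        full_grad = sum(grad_i(w_snap, i) for i in range(n)) / n

        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced estimate: the stochastic gradient at w,
            # corrected by the same sample's gradient at the snapshot and
            # re-centred with the full snapshot gradient. Averaged over i,
            # this equals the full gradient at w.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= lr * g
    return w


if __name__ == "__main__":
    xs = [1.0, 1.2, 0.8, 1.1]
    ys = [2.0 * x for x in xs]  # data generated with true slope 2.0
    print(svrg_least_squares(xs, ys))
```

On this toy data the returned value should be close to the true slope 2.0. The correction term needs the full gradient at the snapshot, which in turn needs access to all data; that is the ingredient the abstract says is unavailable in the continual learning setting.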

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-lin24a,
  title = 	 {An Effective Dynamic Gradient Calibration Method for Continual Learning},
  author =       {Lin, Weichen and Chen, Jiaxiang and Huang, Ruomin and Ding, Hu},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {29872--29889},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lin24a/lin24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/lin24a.html},
  abstract = 	 {Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks. Due to the memory limit, we cannot store all the historical data, and therefore confront the “catastrophic forgetting” problem, i.e., the performance on the previous tasks can substantially decrease because of the missing information in the latter period. Though a number of elegant methods have been proposed, the catastrophic forgetting phenomenon still cannot be well avoided in practice. In this paper, we study the problem from the gradient perspective, where our aim is to develop an effective algorithm to calibrate the gradient in each updating step of the model; namely, our goal is to guide the model to be updated in the right direction under the situation that a large amount of historical data are unavailable. Our idea is partly inspired by the seminal stochastic variance reduction methods (e.g., SVRG and SAGA) for reducing the variance of gradient estimation in stochastic gradient descent algorithms. Another benefit is that our approach can be used as a general tool, which is able to be incorporated with several existing popular CL methods to achieve better performance. We also conduct a set of experiments on several benchmark datasets to evaluate the performance in practice.}
}

Endnote

%0 Conference Paper
%T An Effective Dynamic Gradient Calibration Method for Continual Learning
%A Weichen Lin
%A Jiaxiang Chen
%A Ruomin Huang
%A Hu Ding
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lin24a
%I PMLR
%P 29872--29889
%U https://proceedings.mlr.press/v235/lin24a.html
%V 235
%X Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks. Due to the memory limit, we cannot store all the historical data, and therefore confront the “catastrophic forgetting” problem, i.e., the performance on the previous tasks can substantially decrease because of the missing information in the latter period. Though a number of elegant methods have been proposed, the catastrophic forgetting phenomenon still cannot be well avoided in practice. In this paper, we study the problem from the gradient perspective, where our aim is to develop an effective algorithm to calibrate the gradient in each updating step of the model; namely, our goal is to guide the model to be updated in the right direction under the situation that a large amount of historical data are unavailable. Our idea is partly inspired by the seminal stochastic variance reduction methods (e.g., SVRG and SAGA) for reducing the variance of gradient estimation in stochastic gradient descent algorithms. Another benefit is that our approach can be used as a general tool, which is able to be incorporated with several existing popular CL methods to achieve better performance. We also conduct a set of experiments on several benchmark datasets to evaluate the performance in practice.

APA


Lin, W., Chen, J., Huang, R. & Ding, H. (2024). An Effective Dynamic Gradient Calibration Method for Continual Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:29872-29889. Available from https://proceedings.mlr.press/v235/lin24a.html.
