ZeroFlow: Overcoming Catastrophic Forgetting is Easier than You Think

Tao Feng; Wei Li; Didi Zhu; Hangjie Yuan; Wendi Zheng; Dan Zhang; Jie Tang

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:16808-16824, 2025.

Abstract

Backpropagation provides a generalized configuration for overcoming catastrophic forgetting. Optimizers such as SGD and Adam are commonly used for weight updates in continual learning and continual pre-training. However, access to gradient information is not always feasible in practice due to black-box APIs, hardware constraints, or non-differentiable systems, a challenge we refer to as the gradient bans. To bridge this gap, we introduce ZeroFlow, the first benchmark designed to evaluate gradient-free optimization algorithms for overcoming forgetting. ZeroFlow examines a suite of forward pass-based methods across various algorithms, forgetting scenarios, and datasets. Our results show that forward passes alone can be sufficient to mitigate forgetting. We uncover novel optimization principles that highlight the potential of forward pass-based methods in mitigating forgetting, managing task conflicts, and reducing memory demands. Additionally, we propose new enhancements that further improve forgetting resistance using only forward passes. This work provides essential tools and insights to advance the development of forward-pass-based methods for continual learning.
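
The abstract does not say which forward pass-based algorithms the benchmark covers. As a rough illustration of the general idea only, the sketch below shows one widely known gradient-free update: a two-point zeroth-order estimate with a random Gaussian perturbation, which queries the loss twice and never calls backpropagation. The function names (`zo_gradient`, `zo_sgd`), the toy `quadratic` loss, and all hyperparameter values are assumptions made for this example; none of them come from the paper.

```python
import random


def zo_gradient(loss, params, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate from loss evaluations only.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = rng or random.Random(0)
    # Sample a random Gaussian perturbation direction z.
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Central finite difference along z, using two forward passes.
    scale = (loss(plus) - loss(minus)) / (2.0 * eps)
    # Project the directional estimate back onto the parameter space.
    return [scale * zi for zi in z]


def zo_sgd(loss, params, lr=0.05, steps=500, eps=1e-3, seed=0):
    """Plain SGD driven by the zeroth-order estimate above."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(steps):
        grad = zo_gradient(loss, params, eps=eps, rng=rng)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic(w):
    """Toy stand-in for a model loss, minimized at (1, -2)."""
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


start = [0.0, 0.0]
final = zo_sgd(quadratic, start)
```

Because the update needs only loss values, the same loop applies when the model sits behind a black-box API or contains non-differentiable components, which is the setting the abstract describes. The estimate is noisier than a true gradient, so step sizes usually need to be smaller than with backpropagation.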

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-feng25j,
  title     = {{Z}ero{F}low: Overcoming Catastrophic Forgetting is Easier than You Think},
  author    = {Feng, Tao and Li, Wei and Zhu, Didi and Yuan, Hangjie and Zheng, Wendi and Zhang, Dan and Tang, Jie},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {16808--16824},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/feng25j/feng25j.pdf},
  url       = {https://proceedings.mlr.press/v267/feng25j.html},
  abstract  = {Backpropagation provides a generalized configuration for overcoming catastrophic forgetting. Optimizers such as SGD and Adam are commonly used for weight updates in continual learning and continual pre-training. However, access to gradient information is not always feasible in practice due to black-box APIs, hardware constraints, or non-differentiable systems, a challenge we refer to as the gradient bans. To bridge this gap, we introduce ZeroFlow, the first benchmark designed to evaluate gradient-free optimization algorithms for overcoming forgetting. ZeroFlow examines a suite of forward pass-based methods across various algorithms, forgetting scenarios, and datasets. Our results show that forward passes alone can be sufficient to mitigate forgetting. We uncover novel optimization principles that highlight the potential of forward pass-based methods in mitigating forgetting, managing task conflicts, and reducing memory demands. Additionally, we propose new enhancements that further improve forgetting resistance using only forward passes. This work provides essential tools and insights to advance the development of forward-pass-based methods for continual learning.}
}

Endnote

%0 Conference Paper
%T ZeroFlow: Overcoming Catastrophic Forgetting is Easier than You Think
%A Tao Feng
%A Wei Li
%A Didi Zhu
%A Hangjie Yuan
%A Wendi Zheng
%A Dan Zhang
%A Jie Tang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-feng25j
%I PMLR
%P 16808--16824
%U https://proceedings.mlr.press/v267/feng25j.html
%V 267
%X Backpropagation provides a generalized configuration for overcoming catastrophic forgetting. Optimizers such as SGD and Adam are commonly used for weight updates in continual learning and continual pre-training. However, access to gradient information is not always feasible in practice due to black-box APIs, hardware constraints, or non-differentiable systems, a challenge we refer to as the gradient bans. To bridge this gap, we introduce ZeroFlow, the first benchmark designed to evaluate gradient-free optimization algorithms for overcoming forgetting. ZeroFlow examines a suite of forward pass-based methods across various algorithms, forgetting scenarios, and datasets. Our results show that forward passes alone can be sufficient to mitigate forgetting. We uncover novel optimization principles that highlight the potential of forward pass-based methods in mitigating forgetting, managing task conflicts, and reducing memory demands. Additionally, we propose new enhancements that further improve forgetting resistance using only forward passes. This work provides essential tools and insights to advance the development of forward-pass-based methods for continual learning.

APA

Feng, T., Li, W., Zhu, D., Yuan, H., Zheng, W., Zhang, D., & Tang, J. (2025). ZeroFlow: Overcoming Catastrophic Forgetting is Easier than You Think. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:16808-16824. Available from https://proceedings.mlr.press/v267/feng25j.html.
