Memory Head for Pre-Trained Backbones in Continual Learning
Proceedings of The 3rd Conference on Lifelong Learning Agents, PMLR 274:179-197, 2025.
Abstract
This paper focuses on the role of classification heads for pre-trained backbones in the context of continual learning. A novel neuron model is proposed as the basic constituent of what we refer to as a Memory Head, which naturally includes self-organized memorization capabilities going beyond those of classic neurons and is specifically designed for continual learning purposes. Memory Heads are based on memory units that are indexed depending on the input, through a mechanism that resembles attention models. Such a computational structure allows the head to adapt to different regions of the input space without altering its behavior in other regions, which might be associated with previously acquired knowledge. The neuron model is generic, as it does not exploit any supervisory information and does not require any experience replay strategy. When stacked on top of pre-trained backbones, the proposed head allows the network to adapt to new knowledge and to memorize the properties of the temporally streamed data. The experimental activity of the paper covers both frozen and fine-tuned backbones, showing that Memory Heads outperform recent state-of-the-art competitors working in the same setting. Moreover, continual online learning is explored in the class-domain incremental setting, a more challenging scenario that is less frequently analyzed in the literature. We demonstrate that Memory Heads are more flexible than “vanilla” heads, and more effective than several experience-replay-based approaches.
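To make the idea of input-indexed memory units concrete, the sketch below shows one plausible reading of the abstract: a head whose memory units each hold a key and a set of classification weights, with the key compared against the incoming backbone features through an attention-like softmax so that only units matching the current input region are (softly) selected. All names, shapes, and the specific softmax-blending scheme are assumptions for illustration; this is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryHeadSketch(nn.Module):
    """Minimal sketch of an attention-indexed memory head (hypothetical).

    Each of the `num_units` memory units stores a key (used to index the unit
    from the input) and its own classifier weights; the forward pass blends
    the per-unit weights according to input-key similarity.
    """

    def __init__(self, in_features: int, num_classes: int, num_units: int = 16):
        super().__init__()
        # One key per memory unit, compared against the incoming feature vector.
        self.keys = nn.Parameter(torch.randn(num_units, in_features))
        # One set of classification weights and biases stored in each memory unit.
        self.values_w = nn.Parameter(torch.zeros(num_units, num_classes, in_features))
        self.values_b = nn.Parameter(torch.zeros(num_units, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Attention-like indexing: softmax similarity between input and keys.
        scores = F.softmax(x @ self.keys.t(), dim=-1)            # (B, num_units)
        # Blend per-unit weights by the scores, so units far from the current
        # input region contribute little and are barely updated by gradients.
        w = torch.einsum('bu,uci->bci', scores, self.values_w)   # (B, C, in_features)
        b = scores @ self.values_b                                # (B, C)
        return torch.einsum('bci,bi->bc', w, x) + b               # (B, C) logits


# Usage sketch: stack the head on top of (frozen or fine-tuned) backbone features.
features = torch.randn(8, 512)                # e.g., pooled backbone output
head = MemoryHeadSketch(in_features=512, num_classes=10)
logits = head(features)
```

Because gradients flow through the softmax scores, updates concentrate on the memory units that respond to the current input region, which is one way the localized-adaptation behavior described in the abstract could be obtained.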