CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds

Vaishnavi Nagabhushana; Kartikay Agrawal; Ayon Borthakur

Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, PMLR 308:84-92, 2026.

Abstract

Although deep neural networks perform extremely well in controlled environments, they fail in real-world scenarios where the data isn’t available all at once, and the model requires an update to adapt itself to the new data distribution, which might or might not follow the initial distribution. Previously acquired knowledge is lost during such subsequent updates from new data, a phenomenon commonly known as catastrophic forgetting. In contrast, the brain can learn without such catastrophic forgetting, irrespective of the number of tasks it encounters. Existing spiking neural networks (SNNs) for class-incremental learning (CIL) suffer a sharp performance drop as tasks accumulate. We here introduce CATFormer (Context Adaptive Threshold Transformer), a scalable framework that overcomes this limitation. We observe that the key to preventing forgetting in SNNs lies not only in synaptic plasticity, but in modulating neuronal excitability too. At the core of CATFormer is the Dynamic Threshold Leaky Integrate-and-Fire (DTLIF) neuron model, which leverages context-adaptive thresholds as the primary mechanism for knowledge retention. This is paired with a Gated Dynamic Head Selection (G-DHS) mechanism for task-agnostic inference. Extensive evaluation on both static (CIFAR-10/100/Tiny-ImageNet) and neuromorphic (CIFAR10-DVS/SHD) datasets reveals that CATFormer outperforms existing rehearsal-free CIL algorithms across various task splits, establishing it as an ideal architecture for energy-efficient and true class-incremental learning.

Cite this Paper

BibTeX

@InProceedings{pmlr-v308-nagabhushana26a,
  title = 	 {CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds},
  author =       {Nagabhushana, Vaishnavi and Agrawal, Kartikay and Borthakur, Ayon},
  booktitle = 	 {Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026},
  pages = 	 {84--92},
  year = 	 {2026},
  editor = 	 {Abbasi-Asl, Reza and Iqbal, Asim and Ito, Shinya and Arkhipov, Anton and Sanborn, Sophia},
  volume = 	 {308},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {27 Jan},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v308/main/assets/nagabhushana26a/nagabhushana26a.pdf},
  url = 	 {https://proceedings.mlr.press/v308/nagabhushana26a.html},
  abstract = 	 {Although deep neural networks perform extremely well in controlled environments, they fail in real-world scenarios where the data isn’t available all at once, and the model requires an update to adapt itself to the new data distribution, which might or might not follow the initial distribution. Previously acquired knowledge is lost during such subsequent updates from new data, a phenomenon commonly known as catastrophic forgetting. In contrast, the brain can learn without such catastrophic forgetting, irrespective of the number of tasks it encounters. Existing spiking neural networks (SNNs) for class-incremental learning (CIL) suffer a sharp performance drop as tasks accumulate. We here introduce CATFormer (Context Adaptive Threshold Transformer), a scalable framework that overcomes this limitation. We observe that the key to preventing forgetting in SNNs lies not only in synaptic plasticity, but in modulating neuronal excitability too. At the core of CATFormer is the Dynamic Threshold Leaky Integrate-and-Fire (DTLIF) neuron model, which leverages context-adaptive thresholds as the primary mechanism for knowledge retention. This is paired with a Gated Dynamic Head Selection (G-DHS) mechanism for task-agnostic inference. Extensive evaluation on both static (CIFAR-10/100/Tiny-ImageNet) and neuromorphic (CIFAR10-DVS/SHD) datasets reveals that CATFormer outperforms existing rehearsal-free CIL algorithms across various task splits, establishing it as an ideal architecture for energy-efficient and true class-incremental learning.}
}

Endnote

%0 Conference Paper
%T CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds
%A Vaishnavi Nagabhushana
%A Kartikay Agrawal
%A Ayon Borthakur
%B Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026
%C Proceedings of Machine Learning Research
%D 2026
%E Reza Abbasi-Asl
%E Asim Iqbal
%E Shinya Ito
%E Anton Arkhipov
%E Sophia Sanborn
%F pmlr-v308-nagabhushana26a
%I PMLR
%P 84--92
%U https://proceedings.mlr.press/v308/nagabhushana26a.html
%V 308
%X Although deep neural networks perform extremely well in controlled environments, they fail in real-world scenarios where the data isn’t available all at once, and the model requires an update to adapt itself to the new data distribution, which might or might not follow the initial distribution. Previously acquired knowledge is lost during such subsequent updates from new data, a phenomenon commonly known as catastrophic forgetting. In contrast, the brain can learn without such catastrophic forgetting, irrespective of the number of tasks it encounters. Existing spiking neural networks (SNNs) for class-incremental learning (CIL) suffer a sharp performance drop as tasks accumulate. We here introduce CATFormer (Context Adaptive Threshold Transformer), a scalable framework that overcomes this limitation. We observe that the key to preventing forgetting in SNNs lies not only in synaptic plasticity, but in modulating neuronal excitability too. At the core of CATFormer is the Dynamic Threshold Leaky Integrate-and-Fire (DTLIF) neuron model, which leverages context-adaptive thresholds as the primary mechanism for knowledge retention. This is paired with a Gated Dynamic Head Selection (G-DHS) mechanism for task-agnostic inference. Extensive evaluation on both static (CIFAR-10/100/Tiny-ImageNet) and neuromorphic (CIFAR10-DVS/SHD) datasets reveals that CATFormer outperforms existing rehearsal-free CIL algorithms across various task splits, establishing it as an ideal architecture for energy-efficient and true class-incremental learning.

APA

Nagabhushana, V., Agrawal, K. & Borthakur, A. (2026). CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds. Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, in Proceedings of Machine Learning Research 308:84-92. Available from https://proceedings.mlr.press/v308/nagabhushana26a.html.
