Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations

Ali Abbasi, Parsa Nooralinejad, Vladimir Braverman, Hamed Pirsiavash, Soheil Kolouri
Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:617-628, 2022.

Abstract

Continual/lifelong learning from a non-stationary input data stream is a cornerstone of intelligence. Despite their phenomenal performance in a wide variety of applications, deep neural networks are prone to forgetting their previously learned information upon learning new information. This phenomenon is called “catastrophic forgetting” and is deeply rooted in the stability-plasticity dilemma. Overcoming catastrophic forgetting in deep neural networks has become an active field of research in recent years. In particular, gradient projection-based methods have recently shown exceptional performance at overcoming catastrophic forgetting. This paper proposes two biologically-inspired mechanisms based on sparsity and heterogeneous dropout that significantly increase a continual learner’s performance over a long sequence of tasks. Our proposed approach builds on the Gradient Projection Memory (GPM) framework. We leverage k-winner activations in each layer of a neural network to enforce layer-wise sparse activations for each task, together with a between-task heterogeneous dropout that encourages the network to use non-overlapping activation patterns between different tasks. In addition, we introduce two new benchmarks for continual learning under distributional shift, namely Continual Swiss Roll and ImageNet SuperDog-40. Lastly, we provide an in-depth analysis of our proposed method and demonstrate a significant performance boost on various benchmark continual learning problems.
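
The k-winner activation mentioned in the abstract can be illustrated with a minimal sketch. The function name `k_winners`, the use of NumPy, and the example values are assumptions made for illustration only; this is not the authors' implementation, and the abstract does not say how k is chosen or how the activation is combined with GPM or with the heterogeneous dropout.

```python
import numpy as np


def k_winners(activations: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest activations in each row and zero out the rest.

    Hypothetical helper for illustration; rows are samples, columns are units.
    """
    n_units = activations.shape[-1]
    if not 0 < k <= n_units:
        raise ValueError("k must be between 1 and the number of units")
    # Indices of the k largest values per row (order within the top k is arbitrary).
    top_idx = np.argpartition(activations, -k, axis=-1)[..., -k:]
    # Boolean mask that is True only at the winning units.
    mask = np.zeros(activations.shape, dtype=bool)
    np.put_along_axis(mask, top_idx, True, axis=-1)
    return np.where(mask, activations, 0.0)


# Example: a batch of two samples over a layer with four units.
x = np.array([[0.1, 0.9, 0.5, 0.3],
              [2.0, -1.0, 0.0, 1.0]])
sparse = k_winners(x, k=2)  # each row retains only its two largest entries
```

Selecting winners per sample, as done here, is one possible design choice; it yields a fixed number of active units per input, which is the layer-wise sparsity the abstract refers to.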

Cite this Paper


BibTeX
@InProceedings{pmlr-v199-abbasi22a,
  title = {Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations},
  author = {Abbasi, Ali and Nooralinejad, Parsa and Braverman, Vladimir and Pirsiavash, Hamed and Kolouri, Soheil},
  booktitle = {Proceedings of The 1st Conference on Lifelong Learning Agents},
  pages = {617--628},
  year = {2022},
  editor = {Chandar, Sarath and Pascanu, Razvan and Precup, Doina},
  volume = {199},
  series = {Proceedings of Machine Learning Research},
  month = {22--24 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v199/abbasi22a/abbasi22a.pdf},
  url = {https://proceedings.mlr.press/v199/abbasi22a.html},
  abstract = {Continual/lifelong learning from a non-stationary input data stream is a cornerstone of intelligence. Despite their phenomenal performance in a wide variety of applications, deep neural networks are prone to forgetting their previously learned information upon learning new ones. This phenomenon is called “catastrophic forgetting” and is deeply rooted in the stability-plasticity dilemma. Overcoming catastrophic forgetting in deep neural networks has become an active field of research in recent years. In particular, gradient projection-based methods have recently shown exceptional performance at overcoming catastrophic forgetting. This paper proposes two biologically-inspired mechanisms based on sparsity and heterogeneous dropout that significantly increase a continual learner’s performance over a long sequence of tasks. Our proposed approach builds on the Gradient Projection Memory (GPM) framework. We leverage k-winner activations in each layer of a neural network to enforce layer-wise sparse activations for each task, together with a between-task heterogeneous dropout that encourages the network to use non-overlapping activation patterns between different tasks. In addition, we introduce two new benchmarks for continual learning under distributional shift, namely Continual Swiss Roll and ImageNet SuperDog-40. Lastly, we provide an in-depth analysis of our proposed method and demonstrate a significant performance boost on various benchmark continual learning problems.}
}
Endnote
%0 Conference Paper
%T Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations
%A Ali Abbasi
%A Parsa Nooralinejad
%A Vladimir Braverman
%A Hamed Pirsiavash
%A Soheil Kolouri
%B Proceedings of The 1st Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2022
%E Sarath Chandar
%E Razvan Pascanu
%E Doina Precup
%F pmlr-v199-abbasi22a
%I PMLR
%P 617--628
%U https://proceedings.mlr.press/v199/abbasi22a.html
%V 199
%X Continual/lifelong learning from a non-stationary input data stream is a cornerstone of intelligence. Despite their phenomenal performance in a wide variety of applications, deep neural networks are prone to forgetting their previously learned information upon learning new ones. This phenomenon is called “catastrophic forgetting” and is deeply rooted in the stability-plasticity dilemma. Overcoming catastrophic forgetting in deep neural networks has become an active field of research in recent years. In particular, gradient projection-based methods have recently shown exceptional performance at overcoming catastrophic forgetting. This paper proposes two biologically-inspired mechanisms based on sparsity and heterogeneous dropout that significantly increase a continual learner’s performance over a long sequence of tasks. Our proposed approach builds on the Gradient Projection Memory (GPM) framework. We leverage k-winner activations in each layer of a neural network to enforce layer-wise sparse activations for each task, together with a between-task heterogeneous dropout that encourages the network to use non-overlapping activation patterns between different tasks. In addition, we introduce two new benchmarks for continual learning under distributional shift, namely Continual Swiss Roll and ImageNet SuperDog-40. Lastly, we provide an in-depth analysis of our proposed method and demonstrate a significant performance boost on various benchmark continual learning problems.
APA
Abbasi, A., Nooralinejad, P., Braverman, V., Pirsiavash, H., & Kolouri, S. (2022). Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations. Proceedings of The 1st Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 199:617-628. Available from https://proceedings.mlr.press/v199/abbasi22a.html.
