Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning

Yusong Hu, De Cheng, Dingwen Zhang, Nannan Wang, Tongliang Liu, Xinbo Gao
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:19153-19164, 2024.

Abstract

Continual learning (CL) aims to learn from sequentially arriving tasks without catastrophic forgetting (CF). By partitioning the network into two parts based on the Lottery Ticket Hypothesis, one holding the knowledge of old tasks and the other learning the knowledge of the new task, recent progress has achieved forget-free CL. Although such methods address the CF issue well, they suffer serious under-fitting in long-term CL, where the learning process continues for a long time and the number of new tasks grows much higher. To solve this problem, this paper partitions the network into three parts, with a new part for exploring the knowledge shared between the old and new tasks. With the shared knowledge, this part of the network can be learned to simultaneously consolidate the old tasks and fit the new task. To achieve this goal, we propose a task-aware Orthogonal Sparse Network (OSN), which consists of shared-knowledge-induced network partition and sharpness-aware orthogonal sparse network learning. The former partitions the network to select the shared parameters, while the latter guides the exploration of shared knowledge through those parameters. Qualitative and quantitative analyses show that the proposed OSN induces minimal to no interference with past tasks, i.e., approximately no forgetting, while greatly improving model plasticity and capacity, and finally achieves state-of-the-art performance.
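To make the three-way partition concrete, below is a minimal PyTorch-style sketch, not the authors' implementation: the magnitude-based sharing criterion, the stored old-task gradient direction `old_grad_dir`, and the helper names `partition_masks` and `orthogonal_shared_step` are all hypothetical stand-ins for the paper's shared-knowledge selection and orthogonality constraint.

```python
import torch

def partition_masks(weight, old_mask, share_ratio=0.1):
    """Split parameters into frozen-old, shared, and free (new-task) sets.

    `old_mask` marks parameters already claimed by past tasks. As a
    stand-in for the paper's shared-knowledge criterion, the largest-
    magnitude fraction of those is re-opened as 'shared'; the rest stay
    frozen, and everything unclaimed is free capacity for the new task.
    """
    flat_w, flat_old = weight.detach().flatten(), old_mask.flatten()
    shared = torch.zeros_like(flat_old)
    old_idx = flat_old.nonzero(as_tuple=True)[0]
    n_shared = int(share_ratio * old_idx.numel())
    if n_shared > 0:
        top = flat_w[old_idx].abs().topk(n_shared).indices
        shared[old_idx[top]] = True
    frozen = flat_old & ~shared        # old-task knowledge, kept fixed
    free = ~flat_old                   # untouched capacity for the new task
    return frozen.view_as(weight), shared.view_as(weight), free.view_as(weight)

def orthogonal_shared_step(weight, grad, frozen, shared, old_grad_dir, lr=1e-2):
    """One SGD step respecting the partition: frozen parameters receive no
    update, and the shared-parameter gradient is projected onto the
    orthogonal complement of a stored old-task gradient direction, so the
    update (approximately) leaves past-task losses unchanged."""
    g = grad.clone()
    g[frozen] = 0.0                    # never touch frozen old parameters
    gs, d = g[shared], old_grad_dir[shared]
    d = d / (d.norm() + 1e-12)
    g[shared] = gs - (gs @ d) * d      # remove the component along old tasks
    weight.data.add_(g, alpha=-lr)

# Toy usage (all values illustrative):
w = torch.randn(8, 8, requires_grad=True)
old = torch.zeros(8, 8, dtype=torch.bool)
old[:4] = True                         # pretend the top half belongs to old tasks
frozen, shared, free = partition_masks(w, old, share_ratio=0.25)
loss = (w ** 2).sum()
loss.backward()
orthogonal_shared_step(w, w.grad, frozen, shared, torch.randn_like(w))
```

The design point the sketch mirrors is that only the shared subset moves, and only in directions orthogonal to old-task gradients; this is what allows those parameters to consolidate old tasks while still fitting the new one.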

Cite this Paper

BibTeX
@InProceedings{pmlr-v235-hu24b,
  title     = {Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning},
  author    = {Hu, Yusong and Cheng, De and Zhang, Dingwen and Wang, Nannan and Liu, Tongliang and Gao, Xinbo},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {19153--19164},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/hu24b/hu24b.pdf},
  url       = {https://proceedings.mlr.press/v235/hu24b.html},
  abstract  = {Continual learning (CL) aims to learn from sequentially arriving tasks without catastrophic forgetting (CF). By partitioning the network into two parts based on the Lottery Ticket Hypothesis, one holding the knowledge of old tasks and the other learning the knowledge of the new task, recent progress has achieved forget-free CL. Although such methods address the CF issue well, they suffer serious under-fitting in long-term CL, where the learning process continues for a long time and the number of new tasks grows much higher. To solve this problem, this paper partitions the network into three parts, with a new part for exploring the knowledge shared between the old and new tasks. With the shared knowledge, this part of the network can be learned to simultaneously consolidate the old tasks and fit the new task. To achieve this goal, we propose a task-aware Orthogonal Sparse Network (OSN), which consists of shared-knowledge-induced network partition and sharpness-aware orthogonal sparse network learning. The former partitions the network to select the shared parameters, while the latter guides the exploration of shared knowledge through those parameters. Qualitative and quantitative analyses show that the proposed OSN induces minimal to no interference with past tasks, i.e., approximately no forgetting, while greatly improving model plasticity and capacity, and finally achieves state-of-the-art performance.}
}
Endnote
%0 Conference Paper
%T Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
%A Yusong Hu
%A De Cheng
%A Dingwen Zhang
%A Nannan Wang
%A Tongliang Liu
%A Xinbo Gao
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-hu24b
%I PMLR
%P 19153--19164
%U https://proceedings.mlr.press/v235/hu24b.html
%V 235
%X Continual learning (CL) aims to learn from sequentially arriving tasks without catastrophic forgetting (CF). By partitioning the network into two parts based on the Lottery Ticket Hypothesis, one holding the knowledge of old tasks and the other learning the knowledge of the new task, recent progress has achieved forget-free CL. Although such methods address the CF issue well, they suffer serious under-fitting in long-term CL, where the learning process continues for a long time and the number of new tasks grows much higher. To solve this problem, this paper partitions the network into three parts, with a new part for exploring the knowledge shared between the old and new tasks. With the shared knowledge, this part of the network can be learned to simultaneously consolidate the old tasks and fit the new task. To achieve this goal, we propose a task-aware Orthogonal Sparse Network (OSN), which consists of shared-knowledge-induced network partition and sharpness-aware orthogonal sparse network learning. The former partitions the network to select the shared parameters, while the latter guides the exploration of shared knowledge through those parameters. Qualitative and quantitative analyses show that the proposed OSN induces minimal to no interference with past tasks, i.e., approximately no forgetting, while greatly improving model plasticity and capacity, and finally achieves state-of-the-art performance.
APA
Hu, Y., Cheng, D., Zhang, D., Wang, N., Liu, T., & Gao, X. (2024). Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:19153-19164. Available from https://proceedings.mlr.press/v235/hu24b.html.
