Kernel Continual Learning

Mohammad Mahdi Derakhshani, Xiantong Zhen, Ling Shao, Cees Snoek
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2621-2631, 2021.

Abstract

This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically avoids task interference in the classifiers. We further introduce variational random features to learn a data-driven kernel for each task. To do so, we formulate kernel continual learning as a variational inference problem, where a random Fourier basis is incorporated as the latent variable. The variational posterior distribution over the random Fourier basis is inferred from the coreset of each task. In this way, we are able to generate more informative kernels specific to each task, and, more importantly, the coreset size can be reduced to achieve more compact memory, resulting in more efficient continual learning based on episodic memory. Extensive evaluation on four benchmarks demonstrates the effectiveness and promise of kernels for continual learning.
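To make the mechanism described above concrete, below is a minimal NumPy sketch of per-task kernel ridge regression on a stored coreset with random Fourier features. It is an illustration under simplifying assumptions, not the authors' implementation: inputs are used directly rather than features from a learned deep backbone, the Fourier basis is sampled from a fixed RBF prior instead of the variational posterior inferred from each task's coreset, and the names (`KernelTaskClassifier`, `fit_task`) are invented for the example.

```python
import numpy as np

def random_fourier_features(X, W, b):
    # Map inputs X of shape (n, d) to D random Fourier features,
    # approximating an RBF kernel: k(x, x') ~ phi(x) . phi(x').
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

class KernelTaskClassifier:
    """Per-task classifiers via kernel ridge regression on stored coresets.

    Each task keeps its own coreset, Fourier basis, and ridge solution, so
    fitting a new task never overwrites a previous task's classifier.
    """

    def __init__(self, n_bases=256, lengthscale=1.0, ridge=0.1, seed=0):
        self.n_bases = n_bases
        self.lengthscale = lengthscale
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)
        self.tasks = {}  # task_id -> (W, b, coreset feature map, alpha)

    def fit_task(self, task_id, X_coreset, y_coreset, n_classes):
        n, d = X_coreset.shape
        # Task-specific random Fourier basis. Here it is sampled from a fixed
        # RBF prior; the paper instead infers a variational posterior over the
        # basis from the coreset to obtain a data-driven, task-specific kernel.
        W = self.rng.normal(0.0, 1.0 / self.lengthscale, size=(d, self.n_bases))
        b = self.rng.uniform(0.0, 2.0 * np.pi, size=self.n_bases)
        Phi = random_fourier_features(X_coreset, W, b)    # (n, D)
        K = Phi @ Phi.T                                   # approximate kernel matrix
        Y = np.eye(n_classes)[y_coreset]                  # one-hot targets
        # Closed-form kernel ridge regression: alpha = (K + lambda I)^{-1} Y.
        alpha = np.linalg.solve(K + self.ridge * np.eye(n), Y)
        self.tasks[task_id] = (W, b, Phi, alpha)

    def predict(self, task_id, X_query):
        W, b, Phi_coreset, alpha = self.tasks[task_id]
        K_query = random_fourier_features(X_query, W, b) @ Phi_coreset.T  # (m, n)
        return np.argmax(K_query @ alpha, axis=1)

# Toy usage: two tasks, each with its own coreset and classifier.
rng = np.random.default_rng(1)
clf = KernelTaskClassifier()
for task_id in range(2):
    X = rng.normal(size=(40, 8)) + task_id        # stand-in for backbone features
    y = rng.integers(0, 5, size=40)               # 5 classes per task
    clf.fit_task(task_id, X, y, n_classes=5)
print(clf.predict(0, rng.normal(size=(3, 8))))    # predictions for task 0
```

Because each task's classifier is solved in closed form from its own coreset, there is no replay of stored samples during training and no shared classifier weights for later tasks to interfere with; only the coreset per task needs to be retained.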

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-derakhshani21a,
  title     = {Kernel Continual Learning},
  author    = {Derakhshani, Mohammad Mahdi and Zhen, Xiantong and Shao, Ling and Snoek, Cees},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2621--2631},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/derakhshani21a/derakhshani21a.pdf},
  url       = {https://proceedings.mlr.press/v139/derakhshani21a.html},
  abstract  = {This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically avoids task interference in the classifiers. We further introduce variational random features to learn a data-driven kernel for each task. To do so, we formulate kernel continual learning as a variational inference problem, where a random Fourier basis is incorporated as the latent variable. The variational posterior distribution over the random Fourier basis is inferred from the coreset of each task. In this way, we are able to generate more informative kernels specific to each task, and, more importantly, the coreset size can be reduced to achieve more compact memory, resulting in more efficient continual learning based on episodic memory. Extensive evaluation on four benchmarks demonstrates the effectiveness and promise of kernels for continual learning.}
}
Endnote
%0 Conference Paper
%T Kernel Continual Learning
%A Mohammad Mahdi Derakhshani
%A Xiantong Zhen
%A Ling Shao
%A Cees Snoek
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-derakhshani21a
%I PMLR
%P 2621--2631
%U https://proceedings.mlr.press/v139/derakhshani21a.html
%V 139
%X This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically avoids task interference in the classifiers. We further introduce variational random features to learn a data-driven kernel for each task. To do so, we formulate kernel continual learning as a variational inference problem, where a random Fourier basis is incorporated as the latent variable. The variational posterior distribution over the random Fourier basis is inferred from the coreset of each task. In this way, we are able to generate more informative kernels specific to each task, and, more importantly, the coreset size can be reduced to achieve more compact memory, resulting in more efficient continual learning based on episodic memory. Extensive evaluation on four benchmarks demonstrates the effectiveness and promise of kernels for continual learning.
APA
Derakhshani, M.M., Zhen, X., Shao, L. & Snoek, C. (2021). Kernel Continual Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2621-2631. Available from https://proceedings.mlr.press/v139/derakhshani21a.html.
