Online Continual Learning from Imbalanced Data

Aristotelis Chrysakis, Marie-Francine Moens
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1952-1961, 2020.

Abstract

A well-documented weakness of neural networks is that they suffer from catastrophic forgetting when trained on data drawn from a non-stationary distribution. Recent work in the field of continual learning attempts to understand and overcome this issue. Unfortunately, the majority of relevant work implicitly assumes that the distribution of observed data is perfectly balanced, despite the fact that, in the real world, humans and animals learn from observations that are temporally correlated and severely imbalanced. Motivated by this observation, we evaluate the memory population methods used in online continual learning when they face highly imbalanced and temporally correlated streams of data. More importantly, we introduce a new memory population approach, which we call class-balancing reservoir sampling (CBRS). We demonstrate that CBRS outperforms state-of-the-art memory population algorithms in this considerably challenging learning setting, over a range of datasets and for multiple architectures.
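The abstract names CBRS but leaves the update rule to the paper body. As a rough illustration only, the sketch below contrasts standard reservoir sampling, the usual memory population baseline, with a class-balancing variant in the spirit of CBRS. The identifiers (ClassBalancingReservoir, full_classes, and so on) and the tie-breaking details are our assumptions, not the paper's specification.

import random
from collections import defaultdict

def reservoir_update(memory, capacity, item, n_seen):
    # Standard reservoir sampling: after n_seen stream items, each one
    # is retained with probability capacity / n_seen.
    if len(memory) < capacity:
        memory.append(item)
    else:
        j = random.randrange(n_seen)
        if j < capacity:
            memory[j] = item

class ClassBalancingReservoir:
    # Sketch of a class-balancing buffer: while memory has room, store
    # everything; once full, items from under-represented classes evict
    # items of the currently largest classes, while items from a "full"
    # class are kept via a per-class reservoir step.
    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []              # list of (x, label) pairs
        self.seen = defaultdict(int)  # n_c: stream count per class
        self.full_classes = set()     # classes that have been largest

    def update(self, x, y):
        self.seen[y] += 1
        if len(self.memory) < self.capacity:
            self.memory.append((x, y))
            return
        counts = defaultdict(int)
        for _, c in self.memory:
            counts[c] += 1
        largest = max(counts.values())
        self.full_classes.update(c for c, m in counts.items() if m == largest)
        if y not in self.full_classes:
            # Evict a random stored item of a currently largest class.
            victims = [i for i, (_, c) in enumerate(self.memory)
                       if counts[c] == largest]
            self.memory[random.choice(victims)] = (x, y)
        else:
            # Class y is full: keep the new item with probability m_c / n_c.
            if random.random() < counts[y] / self.seen[y]:
                idx = random.choice(
                    [i for i, (_, c) in enumerate(self.memory) if c == y])
                self.memory[idx] = (x, y)

Under a heavily skewed stream, a buffer of this kind keeps per-class counts in memory roughly equal, whereas plain reservoir sampling reproduces the stream's imbalance in the buffer.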

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chrysakis20a,
  title     = {Online Continual Learning from Imbalanced Data},
  author    = {Chrysakis, Aristotelis and Moens, Marie-Francine},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1952--1961},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chrysakis20a/chrysakis20a.pdf},
  url       = {https://proceedings.mlr.press/v119/chrysakis20a.html}
}
EndNote
%0 Conference Paper
%T Online Continual Learning from Imbalanced Data
%A Aristotelis Chrysakis
%A Marie-Francine Moens
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chrysakis20a
%I PMLR
%P 1952--1961
%U https://proceedings.mlr.press/v119/chrysakis20a.html
%V 119
APA
Chrysakis, A. & Moens, M.-F. (2020). Online Continual Learning from Imbalanced Data. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1952-1961. Available from https://proceedings.mlr.press/v119/chrysakis20a.html.
