Online-MC-Queue: Learning from Imbalanced Multi-Class Streams

Farnaz Sadeghi, Herna L. Viktor
Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 154:21-34, 2021.

Abstract

Online supervised learning from fast-evolving data streams has applications in many areas. The development of techniques for handling highly skewed class distributions (or ‘class imbalance’) is an important area of research in domains such as manufacturing, the environment, and health. Solutions should not only be able to analyse large repositories in near real-time but also be capable of providing accurate models to describe rare classes that may appear infrequently or in bursts, while continuously accommodating new instances. Although online learning methods have been proposed to handle binary class imbalance, solutions suitable for evolving multi-class streams with varying degrees of imbalance have received limited attention. To address this knowledge gap, this paper introduces the Online-MC-Queue (OMCQ) algorithm for online learning in multi-class imbalanced settings. Our approach utilises a queue-based resampling method that dynamically creates an instance queue for each class. The number of instances per class is balanced by maintaining a queue threshold and removing older samples during training. In addition, new and rare classes are dynamically added to the training process as they appear. Our experimental results confirm a noticeable improvement in minority-class detection and in classification performance. A comparative evaluation shows that the OMCQ algorithm outperforms the state-of-the-art.
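The queue-based resampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `PerClassQueues`, the parameter `max_per_class` (standing in for the queue threshold), and the `training_batch` helper are all hypothetical, and the sketch assumes a single fixed threshold shared by every class.

```python
from collections import deque
import random


class PerClassQueues:
    """Hypothetical helper: one bounded FIFO queue of instances per class label."""

    def __init__(self, max_per_class):
        # Assumed fixed queue threshold, shared by all classes.
        self.max_per_class = max_per_class
        self.queues = {}  # label -> deque of feature vectors

    def add(self, x, y):
        # A label seen for the first time gets its own queue on the fly.
        if y not in self.queues:
            self.queues[y] = deque(maxlen=self.max_per_class)
        # A full deque with maxlen drops its oldest item on append.
        self.queues[y].append(x)

    def training_batch(self, rng=random):
        # Pool the queued instances of all classes; no class can contribute
        # more than max_per_class items, which caps the imbalance.
        batch = [(x, y) for y, q in self.queues.items() for x in q]
        rng.shuffle(batch)
        return batch


# Usage: ten majority instances and one rare instance arrive in a stream.
buf = PerClassQueues(max_per_class=3)
for i in range(10):
    buf.add([i], "major")
buf.add([99], "rare")
sizes = {label: len(q) for label, q in buf.queues.items()}
```

In this toy run the majority queue retains only its three most recent instances while the rare class keeps its single instance, so a learner trained on `training_batch()` would see a 3:1 ratio rather than 10:1. How the actual algorithm sets the threshold and feeds the learner is not specified in the abstract.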

Cite this Paper


BibTeX
@InProceedings{pmlr-v154-sadeghi21a,
  title = {Online-MC-Queue: Learning from Imbalanced Multi-Class Streams},
  author = {Sadeghi, Farnaz and Viktor, Herna L.},
  booktitle = {Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications},
  pages = {21--34},
  year = {2021},
  editor = {Moniz, Nuno and Branco, Paula and Torgo, Luis and Japkowicz, Nathalie and Woźniak, Michał and Wang, Shuo},
  volume = {154},
  series = {Proceedings of Machine Learning Research},
  month = {17 Sep},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v154/sadeghi21a/sadeghi21a.pdf},
  url = {https://proceedings.mlr.press/v154/sadeghi21a.html},
  abstract = {Online supervised learning from fast-evolving data streams has application in many areas. The development of techniques with highly skewed class distributions (or ‘class imbalance’) is an important area of research in domains such as manufacturing, the environment, and health. Solutions should not only be able to analyse large repositories in near real-time but also be capable of providing accurate models to describe rare classes that may appear infrequently or in bursts, while continuously accommodating new instances. Although online learning methods have been proposed to handle binary class imbalance, solutions suitable for multi-class streams with varying degrees of imbalance in evolving streams have received limited attention. In order to address this knowledge gap, this paper introduces the Online-MC-Queue (OMCQ) algorithm for online learning in multi-class imbalanced settings. Our approach utilises a queue-based resampling method that dynamically creates an instance queue for each class. The number of instances is balanced by maintaining a queue threshold and removing older samples during training. In addition, new and rare classes are dynamically added to the training process as they appear. Our experimental results confirm a noticeable improvement in minority-class detection and in classification performance. A comparative evaluation shows that the OMCQ algorithm outperforms the state-of-the-art.}
}
Endnote
%0 Conference Paper
%T Online-MC-Queue: Learning from Imbalanced Multi-Class Streams
%A Farnaz Sadeghi
%A Herna L. Viktor
%B Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications
%C Proceedings of Machine Learning Research
%D 2021
%E Nuno Moniz
%E Paula Branco
%E Luis Torgo
%E Nathalie Japkowicz
%E Michał Woźniak
%E Shuo Wang
%F pmlr-v154-sadeghi21a
%I PMLR
%P 21--34
%U https://proceedings.mlr.press/v154/sadeghi21a.html
%V 154
%X Online supervised learning from fast-evolving data streams has application in many areas. The development of techniques with highly skewed class distributions (or ‘class imbalance’) is an important area of research in domains such as manufacturing, the environment, and health. Solutions should not only be able to analyse large repositories in near real-time but also be capable of providing accurate models to describe rare classes that may appear infrequently or in bursts, while continuously accommodating new instances. Although online learning methods have been proposed to handle binary class imbalance, solutions suitable for multi-class streams with varying degrees of imbalance in evolving streams have received limited attention. In order to address this knowledge gap, this paper introduces the Online-MC-Queue (OMCQ) algorithm for online learning in multi-class imbalanced settings. Our approach utilises a queue-based resampling method that dynamically creates an instance queue for each class. The number of instances is balanced by maintaining a queue threshold and removing older samples during training. In addition, new and rare classes are dynamically added to the training process as they appear. Our experimental results confirm a noticeable improvement in minority-class detection and in classification performance. A comparative evaluation shows that the OMCQ algorithm outperforms the state-of-the-art.
APA
Sadeghi, F. & Viktor, H.L. (2021). Online-MC-Queue: Learning from Imbalanced Multi-Class Streams. Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, in Proceedings of Machine Learning Research 154:21-34. Available from https://proceedings.mlr.press/v154/sadeghi21a.html.
