Online-MC-Queue: Learning from Imbalanced Multi-Class Streams
Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 154:21-34, 2021.
Abstract
Online supervised learning from fast-evolving data streams has applications in many areas. The development of techniques for highly skewed class distributions (or ‘class imbalance’) is an important area of research in domains such as manufacturing, the environment, and health. Solutions should not only be able to analyse large repositories in near real-time but also be capable of providing accurate models to describe rare classes that may appear infrequently or in bursts, while continuously accommodating new instances. Although online learning methods have been proposed to handle binary class imbalance, solutions for evolving multi-class streams with varying degrees of imbalance have received limited attention. To address this knowledge gap, this paper introduces the Online-MC-Queue (OMCQ) algorithm for online learning in multi-class imbalanced settings. Our approach utilises a queue-based resampling method that dynamically creates an instance queue for each class. The number of instances per class is balanced by maintaining a queue threshold and removing older samples during training. In addition, new and rare classes are dynamically added to the training process as they appear. Our experimental results confirm a noticeable improvement in minority-class detection and in classification performance. A comparative evaluation shows that the OMCQ algorithm outperforms the state of the art.
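The queue-based resampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `PerClassQueues`, the parameter `max_per_class` (standing in for the queue threshold), and the toy stream are all hypothetical, and the sketch covers only the buffering step, not the classifier that would be trained on the buffered instances.

```python
from collections import deque


class PerClassQueues:
    """Keeps one bounded FIFO queue of recent instances per class label."""

    def __init__(self, max_per_class):
        # Hypothetical stand-in for the queue threshold in the abstract.
        self.max_per_class = max_per_class
        # Maps each class label to a deque of feature vectors.
        self.queues = {}

    def add(self, x, y):
        # A label seen for the first time gets its own queue on the fly.
        if y not in self.queues:
            self.queues[y] = deque(maxlen=self.max_per_class)
        # Appending to a full deque evicts its oldest instance.
        self.queues[y].append(x)

    def training_batch(self):
        # Pool the queued instances of all classes into one training set.
        return [(x, y) for y, q in self.queues.items() for x in q]


# Toy stream: class "a" dominates; classes "b" and "c" appear late and rarely.
stream = [([i], "a") for i in range(8)] + [([100], "b"), ([200], "c")]

buffer = PerClassQueues(max_per_class=3)
for x, y in stream:
    buffer.add(x, y)

# Number of instances currently buffered for each class.
counts = {y: len(q) for y, q in buffer.queues.items()}
```

In this sketch the majority class is capped at the threshold while rare classes keep every instance they have contributed so far, so the pooled batch is less skewed than the raw stream. How the threshold is chosen and how the learner consumes the queues are left open here.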