[edit]
The Influence of Multiple Classes on Learning from Imbalanced Data Streams
Proceedings of the Fourth International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 183:187-198, 2022.
Abstract
This work is aimed at examining the influence of local data characteristics and drifts on the difficulties of learning online classifiers from multi-class imbalanced data streams. The results of many experiments with synthetically generated data streams have shown a much greater role of the overlapping between many minority classes (the type of borderline examples) than for streams with one minority class. The presence of rare examples in the stream is the most difficult single factor. Unlike binary streams, the specialized UOB and OOB classifiers perform well enough for even high imbalance ratios. The most challenging for all classifiers are complex scenarios integrating many drifts and factors simultaneously, which worsen the evaluation measures stronger than for binary ones.