Tunable Plug-In Rules with Reduced Posterior Certainty Loss in Imbalanced Datasets
Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 74:116-128, 2017.
Abstract
Because they focus on minimizing overall misclassification error, classifiers have difficulty recognizing under-represented minority classes in imbalanced datasets, which introduces predictive bias against those classes. Post-processing plug-in rules are a popular way to tackle class imbalance, but they often reduce the certainty of the base classifier's posteriors even when those posteriors already yield correct classifications. This shortcoming makes them ill-suited to scoring tasks, where informative posterior scores are required for human interpretation. To address this, we propose the $ILoss$ metric, which measures the impact of imbalance-aware classifiers on the certainty of posterior distributions. We then generalize post-processing plug-in rules into an easily tunable framework and show theoretically that this framework tends to improve performance balance. Finally, we show experimentally that, for binary problems, appropriate use of our framework can reduce $ILoss$ while achieving performance similar to that of existing plug-in rules on common imbalance-aware measures.