Tunable Plug-In Rules with Reduced Posterior Certainty Loss in Imbalanced Datasets
Proceedings of the First International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 74:116-128, 2017.
Abstract
Classifiers have difficulty recognizing under-represented minority classes in imbalanced datasets because they focus on minimizing the overall misclassification error, which introduces predictive biases against those classes. Post-processing plug-in rules are a popular remedy for class imbalance, but they often distort the certainty of the base classifier's posteriors even when those posteriors already yield correct classifications. This shortcoming makes them ill-suited to scoring tasks, where informative posterior scores are required for human interpretation. To address this, we propose the ILoss metric to measure the impact of imbalance-aware classifiers on the certainty of posterior distributions. We then generalize post-processing plug-in rules into an easily tunable framework and show theoretically that this framework tends to improve performance balance. Finally, we experimentally show that appropriate use of our framework can reduce ILoss while yielding performance similar to existing plug-in rules for binary problems with respect to common imbalance-aware measures.
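As a rough illustration of the kind of post-processing the abstract refers to (not the paper's specific framework), a standard plug-in rule rethresholds the base classifier's posterior probabilities rather than retraining the model. The function name and threshold value below are hypothetical, chosen only for the sketch:

```python
import numpy as np

def plugin_rule(posteriors, threshold):
    """Classify as positive (minority) when P(y=1 | x) meets the threshold.

    The standard Bayes rule uses threshold 0.5; imbalance-aware plug-in
    rules shift the threshold (e.g. toward the minority-class prior) so
    that more minority examples are recovered, without touching the base
    classifier itself.
    """
    return (posteriors >= threshold).astype(int)

# Posteriors P(y=1 | x) from some base classifier (illustrative values).
posteriors = np.array([0.2, 0.45, 0.6, 0.05])

print(plugin_rule(posteriors, 0.5))  # standard rule      -> [0 0 1 0]
print(plugin_rule(posteriors, 0.1))  # shifted threshold  -> [1 1 1 0]
```

Note that such a rule changes only the decision, not the reported posterior itself; the certainty loss the abstract discusses arises when the posteriors are additionally reweighted to match the shifted decision boundary.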