From Cost-Sensitive to Tight F-measure Bounds

Kevin Bascol, Rémi Emonet, Elisa Fromont, Amaury Habrard, Guillaume Metzler, Marc Sebban;
Proceedings of Machine Learning Research, PMLR 89:1245-1253, 2019.

Abstract

The F-measure is a classification performance measure, especially suited to imbalanced datasets, that provides a compromise between the precision and the recall of a classifier. As this measure is non-convex and non-linear, it is often optimized indirectly through cost-sensitive learning (which assigns different costs to false positives and false negatives). In this article, we derive theoretical guarantees that give tight bounds on the best F-measure achievable through cost-sensitive learning. We also give an original geometric interpretation of these bounds, which serves as the inspiration for CONE, a new algorithm for optimizing the F-measure. Using 10 datasets exhibiting varied class imbalance, we illustrate that our bounds are much tighter than those of previous work, and we show that CONE learns models with either higher F-measures than existing methods or comparable F-measures in fewer iterations.
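For reference, the quantities the abstract discusses can be made concrete. The following is a minimal sketch of the standard definitions (the paper's exact notation is not reproduced here; the weighted Fβ form is included as an assumption, with β = 1 giving the usual F1):

```latex
% Precision (P) and recall (R) in terms of true positives (TP),
% false positives (FP), and false negatives (FN):
P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}

% The F-measure is their (weighted) harmonic mean; \beta = 1 yields F_1:
F_\beta = \frac{(1+\beta^2)\,P\,R}{\beta^2 P + R}, \qquad
F_1 = \frac{2\,P\,R}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
```

Because TP, FP, and FN are counts over discrete classification decisions, Fβ is non-convex and non-linear in the classifier's parameters, which is what motivates the cost-sensitive surrogate described in the abstract.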
