Tailoring Rulesets to Misclassification Costs
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:88-94, 1995.
This paper studies the capabilities obtained by modifying Quinlan’s C4.5 programs for inducing decision trees and rules to permit the specification of unequal misclassification costs for binary classification tasks. Setting this cost value allows important parameters, such as the percentage of cases classified as a given class, to be moved over their complete range. In some applications such parameters require precise control, but a considerable degree of variation appears difficult to suppress, particularly with rules: it is present even in the unmodified versions that treat all errors as equal. Cross-validation over a range of cost values seems the appropriate way to tune such parameters. Independent of misclassification costs, the ability to explore a spectrum of classifiers can considerably assist exploratory data analysis, delivering clearer rules than the standard version may provide. These conclusions are illustrated on a simulated version of the game Blackjack.
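The tuning procedure mentioned above, cross-validating over a range of cost values to steer the percentage classified as a given class, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the modified C4.5 programs: a one-threshold "stump" stands in for the tree and rule inducers, the data are synthetic Gaussians rather than Blackjack hands, and every name here (`fit_stump`, `cv_fraction_positive`, `tune_cost`, `COST_GRID`) is hypothetical.

```python
import random

# Hypothetical grid of costs for a missed positive (a false alarm costs 1).
COST_GRID = [1 / 64, 1 / 8, 1 / 2, 1, 2, 8, 64]


def fit_stump(xs, ys, cost_fn, cost_fp=1.0):
    """Return the threshold t (predict 1 when x > t) with the lowest
    cost-weighted training error. A stand-in learner, not C4.5."""
    candidates = [float("-inf")] + sorted(set(xs))

    def total_cost(t):
        missed = sum(1 for x, y in zip(xs, ys) if y == 1 and x <= t)
        false_alarms = sum(1 for x, y in zip(xs, ys) if y == 0 and x > t)
        return cost_fn * missed + cost_fp * false_alarms

    # min() keeps the first (smallest) threshold among ties.
    return min(candidates, key=total_cost)


def cv_fraction_positive(xs, ys, cost_fn, k=5, seed=0):
    """Average, over k folds, the held-out fraction classified positive."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)  # same folds for every cost value
    fractions = []
    for i in range(k):
        held_out = set(idx[i::k])
        train = [j for j in idx if j not in held_out]
        t = fit_stump([xs[j] for j in train], [ys[j] for j in train], cost_fn)
        fractions.append(sum(xs[j] > t for j in held_out) / len(held_out))
    return sum(fractions) / k


def make_data(n_per_class=100, seed=1):
    """Synthetic 1-D data: positives near +1, negatives near -1 (assumed)."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, 1.0) for _ in range(n_per_class)]
    xs += [rng.gauss(-1.0, 1.0) for _ in range(n_per_class)]
    ys = [1] * n_per_class + [0] * n_per_class
    return xs, ys


def tune_cost(xs, ys, target, cost_grid=COST_GRID):
    """Pick the cost whose cross-validated fraction positive is nearest target."""
    rates = [cv_fraction_positive(xs, ys, c) for c in cost_grid]
    best = min(range(len(cost_grid)), key=lambda i: abs(rates[i] - target))
    return cost_grid[best], rates


xs, ys = make_data()
chosen_cost, rates = tune_cost(xs, ys, target=0.30)
for cost, rate in zip(COST_GRID, rates):
    print(f"cost of a missed positive = {cost:<8} fraction classified positive = {rate:.2f}")
print("cost chosen for a 30% target:", chosen_cost)
```

In this sketch, raising the cost of a missed positive pushes the fraction classified positive upward, and the cross-validated estimates give one way to select a cost for a target fraction. How closely the achieved fraction tracks the target on fresh data is exactly the kind of variation the abstract cautions about.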