Tailoring Rulesets to Misclassification Costs

Jason Catlett
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:88-94, 1995.

Abstract

This paper studies the capabilities obtained by modifying Quinlan’s [9] C4.5 programs for inducing decision trees and rules to permit the specification of unequal misclassification costs for binary classification tasks. Setting this cost value allows important parameters, such as the percentage classified as a given class, to be moved over their complete range. In some applications such parameters require precise control, but a considerable degree of variation appears difficult to suppress, particularly with rules: it is present even in the unmodified versions that treat all errors as equal. Cross-validation over a range of cost values seems the appropriate way to tune such parameters. Independent of misclassification costs, the ability to explore a spectrum of classifiers can considerably assist exploratory data analysis, delivering clearer rules than the standard version may provide. These conclusions are illustrated on a simulated version of the game Blackjack.
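The effect described in the abstract, where changing a single cost value moves the percentage classified as a given class across its full range, can be illustrated with a minimal sketch. This is not the paper's modified C4.5. It assumes a generic binary classifier that outputs a probability for the positive class and applies the standard minimum-expected-cost decision rule. The function names, the cost values, and the scores in `probs` are all hypothetical and chosen only for illustration.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting positive has lower expected cost.

    Predicting positive costs (1 - p) * cost_fp in expectation, predicting
    negative costs p * cost_fn, so positive wins when
    p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def fraction_positive(probs, cost_fp, cost_fn):
    """Fraction of cases classified positive under the given costs."""
    threshold = cost_threshold(cost_fp, cost_fn)
    return sum(p > threshold for p in probs) / len(probs)


# Made-up positive-class probabilities (an assumption, not data from the paper).
probs = [0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.95]

# Sweep the false-negative cost while holding the false-positive cost at 1.
for cost_fn in (0.01, 0.5, 1.0, 2.0, 100.0):
    print(cost_fn, fraction_positive(probs, 1.0, cost_fn))
```

With these made-up scores, a very small false-negative cost classifies nothing as positive and a very large one classifies everything as positive. In practice the sweep would be repeated inside cross-validation folds, since the abstract notes that the resulting percentage varies considerably from run to run.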

Cite this Paper


BibTeX
@InProceedings{pmlr-vR0-catlett95a,
  title = {Tailoring Rulesets to Misclassification Costs},
  author = {Catlett, Jason},
  booktitle = {Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics},
  pages = {88--94},
  year = {1995},
  editor = {Fisher, Doug and Lenz, Hans-Joachim},
  volume = {R0},
  series = {Proceedings of Machine Learning Research},
  month = {04--07 Jan},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/r0/catlett95a/catlett95a.pdf},
  url = {https://proceedings.mlr.press/r0/catlett95a.html},
  abstract = {This paper studies the capabilities obtained by modifying Quinlan’s [9] C4.5 programs for inducing decision trees and rules to permit the specification of unequal misclassification costs for binary classification tasks. Setting this cost value allows important parameters, such as the percentage classified as a given class, to be moved over their complete range. In some applications such parameters require precise control, but a considerable degree of variation appears difficult to suppress, particularly with rules: it is present even in the unmodified versions that treat all errors as equal. Cross-validation over a range of cost values seems the appropriate way to tune such parameters. Independent of misclassification costs, the ability to explore a spectrum of classifiers can considerably assist exploratory data analysis, delivering clearer rules than the standard version may provide. These conclusions are illustrated on a simulated version of the game Blackjack.},
  note = {Reissued by PMLR on 01 May 2022.}
}
Endnote
%0 Conference Paper
%T Tailoring Rulesets to Misclassification Costs
%A Jason Catlett
%B Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1995
%E Doug Fisher
%E Hans-Joachim Lenz
%F pmlr-vR0-catlett95a
%I PMLR
%P 88--94
%U https://proceedings.mlr.press/r0/catlett95a.html
%V R0
%X This paper studies the capabilities obtained by modifying Quinlan’s [9] C4.5 programs for inducing decision trees and rules to permit the specification of unequal misclassification costs for binary classification tasks. Setting this cost value allows important parameters, such as the percentage classified as a given class, to be moved over their complete range. In some applications such parameters require precise control, but a considerable degree of variation appears difficult to suppress, particularly with rules: it is present even in the unmodified versions that treat all errors as equal. Cross-validation over a range of cost values seems the appropriate way to tune such parameters. Independent of misclassification costs, the ability to explore a spectrum of classifiers can considerably assist exploratory data analysis, delivering clearer rules than the standard version may provide. These conclusions are illustrated on a simulated version of the game Blackjack.
%Z Reissued by PMLR on 01 May 2022.
APA
Catlett, J. (1995). Tailoring Rulesets to Misclassification Costs. Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R0:88-94. Available from https://proceedings.mlr.press/r0/catlett95a.html. Reissued by PMLR on 01 May 2022.
