Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option

Malik Sajjad Ahmed Nadeem, Jean-Daniel Zucker, Blaise Hanczar
Proceedings of the third International Workshop on Machine Learning in Systems Biology, PMLR 8:65-81, 2009.

Abstract

Data extracted from microarrays are now considered an important source of knowledge about various diseases. Several studies based on microarray data and the use of receiver operating characteristics (ROC) graphs have compared supervised machine learning approaches. These comparisons are based on classification schemes in which all samples are classified, regardless of the degree of confidence associated with the classification of a particular sample on the basis of a given classifier. In the domain of healthcare, it is safer to refrain from classifying a sample if the confidence assigned to the classification is not high enough, rather than classifying all samples even if confidence is low. We describe an approach in which the performance of different classifiers is compared, with the possibility of rejection, based on several reject areas. Using a tradeoff between accuracy and rejection, we propose the use of accuracy-rejection curves (ARCs) and three types of relationship between ARCs for comparisons of the ARCs of two classifiers. Empirical results based on purely synthetic data, semi-synthetic data (generated from real data obtained from patients) and public microarray data for binary classification problems demonstrate the efficacy of this method.
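As a rough illustration of the accuracy-rejection tradeoff described in the abstract, the following minimal sketch computes the points of an accuracy-rejection curve from per-sample confidence scores. It is not the authors' implementation: the function name `accuracy_rejection_curve`, the rule of rejecting the least confident samples first, and the use of `None` when every sample is rejected are all assumptions made for this example.

```python
def accuracy_rejection_curve(confidences, predictions, labels, rejection_rates):
    """Return a list of (rejection_rate, accuracy) pairs.

    Hypothetical helper: for each rejection rate, the least confident
    samples are rejected and accuracy is measured on the rest.
    """
    n = len(labels)
    # Sample indices ordered from most to least confident.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    curve = []
    for rate in rejection_rates:
        n_keep = n - int(round(rate * n))
        kept = order[:n_keep]
        if not kept:
            # Assumption: accuracy is left undefined when all samples are rejected.
            curve.append((rate, None))
            continue
        correct = sum(1 for i in kept if predictions[i] == labels[i])
        curve.append((rate, correct / len(kept)))
    return curve


# Toy data (invented for illustration): four samples from a binary problem.
confidences = [0.9, 0.6, 0.8, 0.55]
predictions = [1, 0, 1, 1]
labels = [1, 1, 1, 0]

curve = accuracy_rejection_curve(confidences, predictions, labels, [0.0, 0.5, 1.0])
```

Computing such a curve for each of two classifiers on the same data and plotting them together would show, at each rejection rate, which classifier is more accurate on the samples it accepts.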

Cite this Paper


BibTeX
@InProceedings{pmlr-v8-nadeem10a,
  title = {Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option},
  author = {Nadeem, Malik Sajjad Ahmed and Zucker, Jean-Daniel and Hanczar, Blaise},
  booktitle = {Proceedings of the third International Workshop on Machine Learning in Systems Biology},
  pages = {65--81},
  year = {2009},
  editor = {Džeroski, Sašo and Guerts, Pierre and Rousu, Juho},
  volume = {8},
  series = {Proceedings of Machine Learning Research},
  address = {Ljubljana, Slovenia},
  month = {05--06 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v8/nadeem10a/nadeem10a.pdf},
  url = {https://proceedings.mlr.press/v8/nadeem10a.html},
  abstract = {Data extracted from microarrays are now considered an important source of knowledge about various diseases. Several studies based on microarray data and the use of receiver operating characteristics (ROC) graphs have compared supervised machine learning approaches. These comparisons are based on classification schemes in which all samples are classified, regardless of the degree of confidence associated with the classification of a particular sample on the basis of a given classifier. In the domain of healthcare, it is safer to refrain from classifying a sample if the confidence assigned to the classification is not high enough, rather than classifying all samples even if confidence is low. We describe an approach in which the performance of different classifiers is compared, with the possibility of rejection, based on several reject areas. Using a tradeoff between accuracy and rejection, we propose the use of accuracy-rejection curves (ARCs) and three types of relationship between ARCs for comparisons of the ARCs of two classifiers. Empirical results based on purely synthetic data, semi-synthetic data (generated from real data obtained from patients) and public microarray data for binary classification problems demonstrate the efficacy of this method.}
}
Endnote
%0 Conference Paper
%T Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option
%A Malik Sajjad Ahmed Nadeem
%A Jean-Daniel Zucker
%A Blaise Hanczar
%B Proceedings of the third International Workshop on Machine Learning in Systems Biology
%C Proceedings of Machine Learning Research
%D 2009
%E Sašo Džeroski
%E Pierre Guerts
%E Juho Rousu
%F pmlr-v8-nadeem10a
%I PMLR
%P 65--81
%U https://proceedings.mlr.press/v8/nadeem10a.html
%V 8
%X Data extracted from microarrays are now considered an important source of knowledge about various diseases. Several studies based on microarray data and the use of receiver operating characteristics (ROC) graphs have compared supervised machine learning approaches. These comparisons are based on classification schemes in which all samples are classified, regardless of the degree of confidence associated with the classification of a particular sample on the basis of a given classifier. In the domain of healthcare, it is safer to refrain from classifying a sample if the confidence assigned to the classification is not high enough, rather than classifying all samples even if confidence is low. We describe an approach in which the performance of different classifiers is compared, with the possibility of rejection, based on several reject areas. Using a tradeoff between accuracy and rejection, we propose the use of accuracy-rejection curves (ARCs) and three types of relationship between ARCs for comparisons of the ARCs of two classifiers. Empirical results based on purely synthetic data, semi-synthetic data (generated from real data obtained from patients) and public microarray data for binary classification problems demonstrate the efficacy of this method.
RIS
TY  - CPAPER
TI  - Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option
AU  - Malik Sajjad Ahmed Nadeem
AU  - Jean-Daniel Zucker
AU  - Blaise Hanczar
BT  - Proceedings of the third International Workshop on Machine Learning in Systems Biology
DA  - 2009/03/02
ED  - Sašo Džeroski
ED  - Pierre Guerts
ED  - Juho Rousu
ID  - pmlr-v8-nadeem10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 8
SP  - 65
EP  - 81
L1  - http://proceedings.mlr.press/v8/nadeem10a/nadeem10a.pdf
UR  - https://proceedings.mlr.press/v8/nadeem10a.html
AB  - Data extracted from microarrays are now considered an important source of knowledge about various diseases. Several studies based on microarray data and the use of receiver operating characteristics (ROC) graphs have compared supervised machine learning approaches. These comparisons are based on classification schemes in which all samples are classified, regardless of the degree of confidence associated with the classification of a particular sample on the basis of a given classifier. In the domain of healthcare, it is safer to refrain from classifying a sample if the confidence assigned to the classification is not high enough, rather than classifying all samples even if confidence is low. We describe an approach in which the performance of different classifiers is compared, with the possibility of rejection, based on several reject areas. Using a tradeoff between accuracy and rejection, we propose the use of accuracy-rejection curves (ARCs) and three types of relationship between ARCs for comparisons of the ARCs of two classifiers. Empirical results based on purely synthetic data, semi-synthetic data (generated from real data obtained from patients) and public microarray data for binary classification problems demonstrate the efficacy of this method.
ER  -
APA
Nadeem, M.S.A., Zucker, J.-D. & Hanczar, B. (2009). Accuracy-Rejection Curves (ARCs) for Comparing Classification Methods with a Reject Option. Proceedings of the third International Workshop on Machine Learning in Systems Biology, in Proceedings of Machine Learning Research 8:65-81. Available from https://proceedings.mlr.press/v8/nadeem10a.html.
