Efficient Bayes Risk Estimation for Cost-Sensitive Classification

Daniel Andrade, Yuzuru Okajima
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:3372-3381, 2019.

Abstract

In some real world applications, acquiring covariates for classification can be cost-intensive and should be limited as much as possible. For example, in the medical setting, a doctor cannot just perform all possible types of tests to classify whether the patient has diabetes or not. The decision of classifying or acquiring more covariates before classifying is dependent on the costs of new covariates and the expected optimal cost of misclassification (Bayes risk). However, estimating the latter is a formidable task due to the estimation of a high dimensional probability density and intractable integrals. In this work, we show that for linear classifiers this task can be considerably simplified, leading to a one dimensional integral for which we propose an efficient approximation. Experimental results on three datasets show consistent improvements over previously proposed methods for cost-sensitive classification. We also demonstrate that our proposed Bayes risk estimation procedure can benefit from additional unlabeled data which can be helpful when only small amount of labeled data is available.

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-andrade19a, title = {Efficient Bayes Risk Estimation for Cost-Sensitive Classification}, author = {Andrade, Daniel and Okajima, Yuzuru}, booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics}, pages = {3372--3381}, year = {2019}, editor = {Chaudhuri, Kamalika and Sugiyama, Masashi}, volume = {89}, series = {Proceedings of Machine Learning Research}, month = {16--18 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v89/andrade19a/andrade19a.pdf}, url = {https://proceedings.mlr.press/v89/andrade19a.html}, abstract = {In some real world applications, acquiring covariates for classification can be cost-intensive and should be limited as much as possible. For example, in the medical setting, a doctor cannot just perform all possible types of tests to classify whether the patient has diabetes or not. The decision of classifying or acquiring more covariates before classifying is dependent on the costs of new covariates and the expected optimal cost of misclassification (Bayes risk). However, estimating the latter is a formidable task due to the estimation of a high dimensional probability density and intractable integrals. In this work, we show that for linear classifiers this task can be considerably simplified, leading to a one dimensional integral for which we propose an efficient approximation. Experimental results on three datasets show consistent improvements over previously proposed methods for cost-sensitive classification. We also demonstrate that our proposed Bayes risk estimation procedure can benefit from additional unlabeled data which can be helpful when only small amount of labeled data is available.} }
Endnote
%0 Conference Paper %T Efficient Bayes Risk Estimation for Cost-Sensitive Classification %A Daniel Andrade %A Yuzuru Okajima %B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Masashi Sugiyama %F pmlr-v89-andrade19a %I PMLR %P 3372--3381 %U https://proceedings.mlr.press/v89/andrade19a.html %V 89 %X In some real world applications, acquiring covariates for classification can be cost-intensive and should be limited as much as possible. For example, in the medical setting, a doctor cannot just perform all possible types of tests to classify whether the patient has diabetes or not. The decision of classifying or acquiring more covariates before classifying is dependent on the costs of new covariates and the expected optimal cost of misclassification (Bayes risk). However, estimating the latter is a formidable task due to the estimation of a high dimensional probability density and intractable integrals. In this work, we show that for linear classifiers this task can be considerably simplified, leading to a one dimensional integral for which we propose an efficient approximation. Experimental results on three datasets show consistent improvements over previously proposed methods for cost-sensitive classification. We also demonstrate that our proposed Bayes risk estimation procedure can benefit from additional unlabeled data which can be helpful when only small amount of labeled data is available.
APA
Andrade, D. & Okajima, Y.. (2019). Efficient Bayes Risk Estimation for Cost-Sensitive Classification. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:3372-3381 Available from https://proceedings.mlr.press/v89/andrade19a.html.

Related Material