Using Credal C4.5 for Calibrated Label Ranking in Multi-Label Classification

Serafín Moral García, Javier García Castellano, Carlos J. Mantas Ruiz, Joaquín Abellán
Proceedings of the Twelveth International Symposium on Imprecise Probability: Theories and Applications, PMLR 147:220-228, 2021.

Abstract

The Multi-Label Classification (MLC) task aims to predict the set of labels that correspond to an instance. It differs from traditional classification, which assumes that each instance is associated with a single value of a class variable. Within MLC, the Calibrated Label Ranking algorithm (CLR) considers a binary classification problem for each pair of labels to determine a label ranking for a given instance, thereby exploiting correlations between pairs of labels. Moreover, CLR mitigates the class imbalance problem that frequently appears in MLC, where usually very few instances are associated with a given label. Solving the binary classification problems requires a traditional classification algorithm. The C4.5 algorithm, based on Decision Trees, has been widely employed in this domain. In this work, we show that the Credal C4.5 method, a recently proposed version of C4.5 that uses imprecise probabilities, is more suitable than C4.5 for solving the binary classification problems in CLR. An exhaustive experimental analysis carried out in this research shows that Credal C4.5 performs better than C4.5 when both algorithms are employed in CLR, and that the improvement becomes more notable as the noise in the labels increases.

Cite this Paper


BibTeX
@InProceedings{pmlr-v147-moral-garci-a21a,
  title = {Using Credal C4.5 for Calibrated Label Ranking in Multi-Label Classification},
  author = {Moral Garc\'{\i}a, Seraf\'{\i}n and Castellano, Javier Garc\'{\i}a and Ruiz, Carlos J. Mantas and Abell\'an, Joaqu\'{\i}n},
  booktitle = {Proceedings of the Twelveth International Symposium on Imprecise Probability: Theories and Applications},
  pages = {220--228},
  year = {2021},
  editor = {Cano, Andrés and De Bock, Jasper and Miranda, Enrique and Moral, Serafín},
  volume = {147},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v147/moral-garci-a21a/moral-garci-a21a.pdf},
  url = {https://proceedings.mlr.press/v147/moral-garci-a21a.html},
  abstract = {The Multi-Label Classification (MLC) task aims to predict the set of labels that correspond to an instance. It differs from traditional classification, which assumes that each instance has associated a single value of a class variable. Within MLC, the Calibrated Label Ranking algorithm (CLR) considers a binary classification problem for each pair of labels to determine a label ranking for a given instance, exploiting in this way correlations between pairs of labels. Moreover, CLR mitigates the class imbalance problem that frequently appears in MLC motivated by the fact that, in MLC, there are usually very few instances that have associated a certain label. For solving the binary classification problems, a traditional classification algorithm is needed. The C4.5 algorithm, based on Decision Trees, has been widely employed in this domain. In this work, we show that the Credal C4.5 method, a version of C4.5 recently proposed that uses imprecise probabilities, is more suitable than C4.5 for solving the binary classification problems in CLR. An exhaustive experimental analysis carried out in this research shows that Credal C4.5 performs better than C4.5 when both algorithms are employed in CLR, being the improvement more notable as there is more noise in the labels.}
}
Endnote
%0 Conference Paper
%T Using Credal C4.5 for Calibrated Label Ranking in Multi-Label Classification
%A Serafín Moral García
%A Javier García Castellano
%A Carlos J. Mantas Ruiz
%A Joaquín Abellán
%B Proceedings of the Twelveth International Symposium on Imprecise Probability: Theories and Applications
%C Proceedings of Machine Learning Research
%D 2021
%E Andrés Cano
%E Jasper De Bock
%E Enrique Miranda
%E Serafín Moral
%F pmlr-v147-moral-garci-a21a
%I PMLR
%P 220--228
%U https://proceedings.mlr.press/v147/moral-garci-a21a.html
%V 147
%X The Multi-Label Classification (MLC) task aims to predict the set of labels that correspond to an instance. It differs from traditional classification, which assumes that each instance has associated a single value of a class variable. Within MLC, the Calibrated Label Ranking algorithm (CLR) considers a binary classification problem for each pair of labels to determine a label ranking for a given instance, exploiting in this way correlations between pairs of labels. Moreover, CLR mitigates the class imbalance problem that frequently appears in MLC motivated by the fact that, in MLC, there are usually very few instances that have associated a certain label. For solving the binary classification problems, a traditional classification algorithm is needed. The C4.5 algorithm, based on Decision Trees, has been widely employed in this domain. In this work, we show that the Credal C4.5 method, a version of C4.5 recently proposed that uses imprecise probabilities, is more suitable than C4.5 for solving the binary classification problems in CLR. An exhaustive experimental analysis carried out in this research shows that Credal C4.5 performs better than C4.5 when both algorithms are employed in CLR, being the improvement more notable as there is more noise in the labels.
APA
Moral García, S., Castellano, J.G., Ruiz, C.J.M. & Abellán, J. (2021). Using Credal C4.5 for Calibrated Label Ranking in Multi-Label Classification. Proceedings of the Twelveth International Symposium on Imprecise Probability: Theories and Applications, in Proceedings of Machine Learning Research 147:220-228. Available from https://proceedings.mlr.press/v147/moral-garci-a21a.html.

Related Material