Consistent Hierarchical Classification with A Generalized Metric

Yuzhou Cao, Lei Feng, Bo An
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4825-4833, 2024.

Abstract

In multi-class hierarchical classification, a natural evaluation metric is the tree distance loss, which takes the value of two labels’ distance on the pre-defined tree hierarchy. This metric is motivated by the fact that its Bayes optimal solution is the deepest label on the tree whose induced superclass (the subtree rooted at it) includes the true label with probability at least $\frac{1}{2}$. However, it can hardly handle the risk sensitivity of different tasks, since its accuracy requirement for induced superclasses is fixed at $\frac{1}{2}$. In this paper, we first introduce a new evaluation metric that generalizes the tree distance loss, whose solution’s accuracy constraint $\frac{1+c}{2}$ can be controlled by a penalty value $c$ tailored to different tasks: a higher $c$ emphasizes the prediction’s accuracy, while a lower one emphasizes its specificity. Then, we propose a novel class of consistent surrogate losses based on an intuitive presentation of our generalized metric and its regret, which is compatible with various binary losses. Finally, we theoretically derive regret transfer bounds for our proposed surrogates and empirically validate their usefulness on benchmark datasets.
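To make the abstract's two key objects concrete, the sketch below (not the paper's implementation; the toy hierarchy, function names, and probabilities are all illustrative assumptions) computes the tree distance loss on a small label tree and applies the decision rule the abstract describes: predict the deepest node whose induced superclass carries probability mass at least $\frac{1+c}{2}$.

```python
# Illustrative sketch of the tree distance loss and the (1+c)/2 decision
# rule on a toy two-level hierarchy. Names and values are assumptions.

# Hierarchy encoded as child -> parent; "root" is the implicit top node.
parent = {
    "animal": "root", "vehicle": "root",
    "dog": "animal", "cat": "animal",
    "car": "vehicle", "bike": "vehicle",
}

def ancestors(node):
    """Path from a node up to the root, inclusive of both endpoints."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def tree_distance(u, v):
    """Number of edges between u and v on the hierarchy tree."""
    au, av = ancestors(u), ancestors(v)
    common = set(au) & set(av)
    # Hop count from each node down to their lowest common ancestor.
    return au.index(next(a for a in au if a in common)) + \
           av.index(next(a for a in av if a in common))

def bayes_predict(leaf_probs, c=0.0):
    """Deepest node whose subtree (induced superclass) has mass >= (1+c)/2."""
    mass = {}
    for leaf, p in leaf_probs.items():
        for a in ancestors(leaf):
            mass[a] = mass.get(a, 0.0) + p
    threshold = (1.0 + c) / 2.0
    candidates = [n for n, m in mass.items() if m >= threshold]
    # The root always qualifies (mass 1), so candidates is never empty.
    return max(candidates, key=lambda n: len(ancestors(n)))

probs = {"dog": 0.4, "cat": 0.2, "car": 0.3, "bike": 0.1}
print(bayes_predict(probs, c=0.0))  # "animal": mass 0.6 >= 0.5, deepest
print(bayes_predict(probs, c=0.5))  # "root": only node with mass >= 0.75
```

With $c = 0$ the rule recovers the tree distance loss's $\frac{1}{2}$ threshold and commits to the superclass "animal"; raising $c$ to $0.5$ tightens the accuracy requirement to $0.75$, so the prediction retreats to the less specific but safer "root", matching the accuracy-versus-specificity trade-off the abstract describes.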

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-cao24a,
  title     = {Consistent Hierarchical Classification with A Generalized Metric},
  author    = {Cao, Yuzhou and Feng, Lei and An, Bo},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4825--4833},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/cao24a/cao24a.pdf},
  url       = {https://proceedings.mlr.press/v238/cao24a.html},
  abstract  = {In multi-class hierarchical classification, a natural evaluation metric is the tree distance loss that takes the value of two labels’ distance on the pre-defined tree hierarchy. This metric is motivated by that its Bayes optimal solution is the deepest label on the tree whose induced superclass (subtree rooted at it) includes the true label with probability at least $\frac{1}{2}$. However, it can hardly handle the risk sensitivity of different tasks since its accuracy requirement for induced superclasses is fixed at $\frac{1}{2}$. In this paper, we first introduce a new evaluation metric that generalizes the tree distance loss, whose solution’s accuracy constraint $\frac{1+c}{2}$ can be controlled by a penalty value $c$ tailored for different tasks: a higher c indicates the emphasis on prediction’s accuracy and a lower one indicates that on specificity. Then, we propose a novel class of consistent surrogate losses based on an intuitive presentation of our generalized metric and its regret, which can be compatible with various binary losses. Finally, we theoretically derive the regret transfer bounds for our proposed surrogates and empirically validate their usefulness on benchmark datasets.}
}
Endnote
%0 Conference Paper
%T Consistent Hierarchical Classification with A Generalized Metric
%A Yuzhou Cao
%A Lei Feng
%A Bo An
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-cao24a
%I PMLR
%P 4825--4833
%U https://proceedings.mlr.press/v238/cao24a.html
%V 238
%X In multi-class hierarchical classification, a natural evaluation metric is the tree distance loss that takes the value of two labels’ distance on the pre-defined tree hierarchy. This metric is motivated by that its Bayes optimal solution is the deepest label on the tree whose induced superclass (subtree rooted at it) includes the true label with probability at least $\frac{1}{2}$. However, it can hardly handle the risk sensitivity of different tasks since its accuracy requirement for induced superclasses is fixed at $\frac{1}{2}$. In this paper, we first introduce a new evaluation metric that generalizes the tree distance loss, whose solution’s accuracy constraint $\frac{1+c}{2}$ can be controlled by a penalty value $c$ tailored for different tasks: a higher c indicates the emphasis on prediction’s accuracy and a lower one indicates that on specificity. Then, we propose a novel class of consistent surrogate losses based on an intuitive presentation of our generalized metric and its regret, which can be compatible with various binary losses. Finally, we theoretically derive the regret transfer bounds for our proposed surrogates and empirically validate their usefulness on benchmark datasets.
APA
Cao, Y., Feng, L., & An, B. (2024). Consistent Hierarchical Classification with A Generalized Metric. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4825-4833. Available from https://proceedings.mlr.press/v238/cao24a.html.