A Multiclass Classification Approach to Label Ranking
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1421-1430, 2020.
Abstract
In multiclass classification, the goal is to learn how to predict a random label Y, valued in Y = {1, …, K} with K ≥ 3, based upon observing a r.v. X, taking its values in R^q with q ≥ 1 say, by means of a classification rule g : R^q → Y with minimum probability of error P{Y ≠ g(X)}. However, in a wide variety of situations, the targeted task may be more ambitious, consisting in sorting all the possible label values y that may be assigned to X by decreasing order of the posterior probability ηy(X) = P{Y = y | X}. This article is devoted to the analysis of this statistical learning problem, halfway between multiclass classification and posterior probability estimation (regression) and referred to as label ranking here. We highlight the fact that it can be viewed as a specific variant of ranking median regression (RMR), where, rather than observing a random permutation Σ assigned to the input vector X and drawn from a Bradley-Terry-Luce-Plackett model with conditional preference vector (η1(X), …, ηK(X)), the sole information available for training a label ranking rule is the label Y ranked on top, namely Σ^{-1}(1). Inspired by recent results in RMR, we prove that under appropriate noise conditions, the One-Versus-One (OVO) approach to multiclass classification yields, as a by-product, an optimal ranking of the labels with overwhelming probability. Beyond theoretical guarantees, the relevance of the approach to label ranking promoted in this article is supported by experimental results.
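To give a concrete picture of the OVO approach mentioned above, here is a minimal sketch (not the authors' exact procedure): one binary classifier is trained per label pair, and labels are then ranked for a given input by their number of pairwise wins, a Copeland-style aggregation. The synthetic dataset, logistic-regression base learner, and tie-breaking rule are assumptions made purely for illustration.

```python
# Illustrative sketch only: One-Versus-One (OVO) pairwise classification,
# with labels ranked by their number of pairwise "wins" (Copeland score).
# Dataset, base classifier, and tie-breaking are illustrative assumptions,
# not choices prescribed by the paper.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy multiclass data: K = 4 labels, q = 10 features (arbitrary choices).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
labels = np.unique(y)

# Train one binary classifier per unordered label pair (k, l),
# using only the examples whose label is k or l.
pairwise = {}
for k, l in combinations(labels, 2):
    mask = np.isin(y, [k, l])
    pairwise[(k, l)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

def rank_labels(x):
    """Rank all labels for a single input x by decreasing number of
    pairwise wins (ties broken arbitrarily by label index)."""
    wins = {k: 0 for k in labels}
    for (k, l), clf in pairwise.items():
        winner = clf.predict(x.reshape(1, -1))[0]
        wins[winner] += 1
    return sorted(labels, key=lambda k: -wins[k])

print(rank_labels(X[0]))  # e.g. [2, 0, 3, 1]
```

Under the noise conditions studied in the paper, the ordering produced by such pairwise comparisons agrees, with high probability, with the ordering of the posterior probabilities ηy(X); the sketch above only illustrates the aggregation mechanics, not those guarantees.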