A Multiclass Classification Approach to Label Ranking

Robin Vogel, Stéphan Clémençon

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1421-1430, 2020.

Abstract

In multiclass classification, the goal is to learn how to predict a random label $Y$, valued in $\mathcal{Y}=\{1,\; \ldots,\; K\}$ with $K\geq 3$, based upon observing a r.v. $X$, taking its values in $\mathbb{R}^q$ with $q\geq 1$ say, by means of a classification rule $g:\mathbb{R}^q\to \mathcal{Y}$ with minimum probability of error $\mathbb{P}\{Y\neq g(X)\}$. However, in a wide variety of situations, the task targeted may be more ambitious, consisting in sorting all the possible label values $y$ that may be assigned to $X$ by decreasing order of the posterior probability $\eta_y(X)=\mathbb{P}\{Y=y \mid X \}$. This article is devoted to the analysis of this statistical learning problem, halfway between multiclass classification and posterior probability estimation (regression) and referred to as \textit{label ranking} here. We highlight the fact that it can be viewed as a specific variant of \textit{ranking median regression} (RMR), where, rather than observing a random permutation $\Sigma$ assigned to the input vector $X$ and drawn from a Bradley-Terry-Luce-Plackett model with conditional preference vector $(\eta_1(X),\; \ldots,\; \eta_K(X))$, the sole information available for training a label ranking rule is the label $Y$ ranked on top, namely $\Sigma^{-1}(1)$. Inspired by recent results in RMR, we prove that under appropriate noise conditions, the One-Versus-One (OVO) approach to multiclassification yields, as a by-product, an optimal ranking of the labels with overwhelming probability. Beyond theoretical guarantees, the relevance of the approach to label ranking promoted in this article is supported by experimental results.
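
As a rough illustration of how a One-Versus-One scheme can produce a label ranking as a by-product, the sketch below counts, for each label, how many pairwise duels it wins and sorts the labels by decreasing win count. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`ovo_label_ranking`, `make_oracle_classifiers`), the tie-breaking rule, and the toy posterior `eta` are all hypothetical. The pairwise classifiers are taken to be oracle rules that pick whichever of the two labels has the larger posterior probability; in practice they would be learned from the training examples carrying one of the two labels.

```python
from itertools import combinations


def ovo_label_ranking(x, pairwise_classifiers, labels):
    """Rank labels for input x by counting pairwise (One-Versus-One) wins.

    pairwise_classifiers maps each pair (k, l) with k < l to a callable
    that takes x and returns either k or l (hypothetical interface).
    """
    wins = {y: 0 for y in labels}
    for k, l in combinations(labels, 2):
        winner = pairwise_classifiers[(k, l)](x)
        wins[winner] += 1
    # Sort by decreasing number of wins; ties are broken by label value,
    # which is an arbitrary choice made for this sketch.
    return sorted(labels, key=lambda y: (-wins[y], y))


def make_oracle_classifiers(eta, labels):
    """Build oracle pairwise rules from a known posterior eta(x) -> dict.

    Assumption for illustration only: eta is known. Each rule returns
    the label with the larger posterior probability among the two.
    """
    return {
        (k, l): (lambda x, k=k, l=l: k if eta(x)[k] >= eta(x)[l] else l)
        for k, l in combinations(labels, 2)
    }


# Toy example with K = 3 labels and a constant, made-up posterior.
labels = [1, 2, 3]
eta = lambda x: {1: 0.2, 2: 0.5, 3: 0.3}
classifiers = make_oracle_classifiers(eta, labels)
ranking = ovo_label_ranking(None, classifiers, labels)
print(ranking)  # labels sorted by decreasing posterior: [2, 3, 1]
```

With oracle pairwise rules and distinct posterior values, the win counts are all different, so sorting by wins recovers the ordering of the labels by decreasing $\eta_y(x)$.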

Cite this Paper

BibTeX

@InProceedings{pmlr-v108-vogel20a,
  title = 	 {A Multiclass Classification Approach to Label Ranking},
  author =       {Vogel, Robin and Cl\'emen{\c{c}}on, St\'ephan},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1421--1430},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/vogel20a/vogel20a.pdf},
  url = 	 {https://proceedings.mlr.press/v108/vogel20a.html},
  abstract = 	 {In multiclass classification, the goal is to learn how to predict a random label $Y$, valued in $\mathcal{Y}=\{1,\; \ldots,\; K\}$ with $K\geq 3$, based upon observing a r.v. $X$, taking its values in $\mathbb{R}^q$ with $q\geq 1$ say, by means of a classification rule $g:\mathbb{R}^q\to \mathcal{Y}$ with minimum probability of error $\mathbb{P}\{Y\neq g(X)\}$. However, in a wide variety of situations, the task targeted may be more ambitious, consisting in sorting all the possible label values $y$ that may be assigned to $X$ by decreasing order of the posterior probability $\eta_y(X)=\mathbb{P}\{Y=y \mid X \}$. This article is devoted to the analysis of this statistical learning problem, halfway between multiclass classification and posterior probability estimation (regression) and referred to as \textit{label ranking} here. We highlight the fact that it can be viewed as a specific variant of \textit{ranking median regression} (RMR), where, rather than observing a random permutation $\Sigma$ assigned to the input vector $X$ and drawn from a Bradley-Terry-Luce-Plackett model with conditional preference vector $(\eta_1(X),\; \ldots,\; \eta_K(X))$, the sole information available for training a label ranking rule is the label $Y$ ranked on top, namely $\Sigma^{-1}(1)$. Inspired by recent results in RMR, we prove that under appropriate noise conditions, the One-Versus-One (OVO) approach to multiclassification yields, as a by-product, an optimal ranking of the labels with overwhelming probability. Beyond theoretical guarantees, the relevance of the approach to label ranking promoted in this article is supported by experimental results.}
}

Endnote

%0 Conference Paper
%T A Multiclass Classification Approach to Label Ranking
%A Robin Vogel
%A Stéphan Clémençon
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-vogel20a
%I PMLR
%P 1421--1430
%U https://proceedings.mlr.press/v108/vogel20a.html
%V 108
%X In multiclass classification, the goal is to learn how to predict a random label $Y$, valued in $\mathcal{Y}=\{1,\; \ldots,\; K\}$ with $K\geq 3$, based upon observing a r.v. $X$, taking its values in $\mathbb{R}^q$ with $q\geq 1$ say, by means of a classification rule $g:\mathbb{R}^q\to \mathcal{Y}$ with minimum probability of error $\mathbb{P}\{Y\neq g(X)\}$. However, in a wide variety of situations, the task targeted may be more ambitious, consisting in sorting all the possible label values $y$ that may be assigned to $X$ by decreasing order of the posterior probability $\eta_y(X)=\mathbb{P}\{Y=y \mid X \}$. This article is devoted to the analysis of this statistical learning problem, halfway between multiclass classification and posterior probability estimation (regression) and referred to as \textit{label ranking} here. We highlight the fact that it can be viewed as a specific variant of \textit{ranking median regression} (RMR), where, rather than observing a random permutation $\Sigma$ assigned to the input vector $X$ and drawn from a Bradley-Terry-Luce-Plackett model with conditional preference vector $(\eta_1(X),\; \ldots,\; \eta_K(X))$, the sole information available for training a label ranking rule is the label $Y$ ranked on top, namely $\Sigma^{-1}(1)$. Inspired by recent results in RMR, we prove that under appropriate noise conditions, the One-Versus-One (OVO) approach to multiclassification yields, as a by-product, an optimal ranking of the labels with overwhelming probability. Beyond theoretical guarantees, the relevance of the approach to label ranking promoted in this article is supported by experimental results.

APA

Vogel, R. & Clémençon, S. (2020). A Multiclass Classification Approach to Label Ranking. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1421-1430. Available from https://proceedings.mlr.press/v108/vogel20a.html.
