Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues

Morgane Goibert, Clément Calauzènes, Ekhine Irurozki, Stephan Clémençon
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:11584-11597, 2023.

Abstract

As the issue of robustness in AI systems becomes vital, statistical learning techniques that are reliable even in presence of partly contaminated data have to be developed. Preference data, in the form of (complete) rankings in the simplest situations, are no exception and the demand for appropriate concepts and tools is all the more pressing given that technologies fed by or producing this type of data ($\textit{e.g.}$ search engines, recommending systems) are now massively deployed. However, the lack of vector space structure for the set of rankings ($\textit{i.e.}$ the symmetric group $\mathfrak{S}_n$) and the complex nature of statistics considered in ranking data analysis make the formulation of robustness objectives in this domain challenging. In this paper, we introduce notions of robustness, together with dedicated statistical methods, for $\textit{Consensus Ranking}$ the flagship problem in ranking data analysis, aiming at summarizing a probability distribution on $\mathfrak{S}_n$ by a $\textit{median}$ ranking. Precisely, we propose specific extensions of the popular concept of breakdown point, tailored to consensus ranking, and address the related computational issues. Beyond the theoretical contributions, the relevance of the approach proposed is supported by an experimental study.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-goibert23a, title = {Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues}, author = {Goibert, Morgane and Calauz\`{e}nes, Cl\'{e}ment and Irurozki, Ekhine and Cl\'{e}men\c{c}on, Stephan}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {11584--11597}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/goibert23a/goibert23a.pdf}, url = {https://proceedings.mlr.press/v202/goibert23a.html}, abstract = {As the issue of robustness in AI systems becomes vital, statistical learning techniques that are reliable even in presence of partly contaminated data have to be developed. Preference data, in the form of (complete) rankings in the simplest situations, are no exception and the demand for appropriate concepts and tools is all the more pressing given that technologies fed by or producing this type of data ($\textit{e.g.}$ search engines, recommending systems) are now massively deployed. However, the lack of vector space structure for the set of rankings ($\textit{i.e.}$ the symmetric group $\mathfrak{S}_n$) and the complex nature of statistics considered in ranking data analysis make the formulation of robustness objectives in this domain challenging. In this paper, we introduce notions of robustness, together with dedicated statistical methods, for $\textit{Consensus Ranking}$ the flagship problem in ranking data analysis, aiming at summarizing a probability distribution on $\mathfrak{S}_n$ by a $\textit{median}$ ranking. Precisely, we propose specific extensions of the popular concept of breakdown point, tailored to consensus ranking, and address the related computational issues. Beyond the theoretical contributions, the relevance of the approach proposed is supported by an experimental study.} }
Endnote
%0 Conference Paper %T Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues %A Morgane Goibert %A Clément Calauzènes %A Ekhine Irurozki %A Stephan Clémençon %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-goibert23a %I PMLR %P 11584--11597 %U https://proceedings.mlr.press/v202/goibert23a.html %V 202 %X As the issue of robustness in AI systems becomes vital, statistical learning techniques that are reliable even in presence of partly contaminated data have to be developed. Preference data, in the form of (complete) rankings in the simplest situations, are no exception and the demand for appropriate concepts and tools is all the more pressing given that technologies fed by or producing this type of data ($\textit{e.g.}$ search engines, recommending systems) are now massively deployed. However, the lack of vector space structure for the set of rankings ($\textit{i.e.}$ the symmetric group $\mathfrak{S}_n$) and the complex nature of statistics considered in ranking data analysis make the formulation of robustness objectives in this domain challenging. In this paper, we introduce notions of robustness, together with dedicated statistical methods, for $\textit{Consensus Ranking}$ the flagship problem in ranking data analysis, aiming at summarizing a probability distribution on $\mathfrak{S}_n$ by a $\textit{median}$ ranking. Precisely, we propose specific extensions of the popular concept of breakdown point, tailored to consensus ranking, and address the related computational issues. Beyond the theoretical contributions, the relevance of the approach proposed is supported by an experimental study.
APA
Goibert, M., Calauzènes, C., Irurozki, E. & Clémençon, S.. (2023). Robust Consensus in Ranking Data Analysis: Definitions, Properties and Computational Issues. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:11584-11597 Available from https://proceedings.mlr.press/v202/goibert23a.html.

Related Material