Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1078-1087, 2017.
We consider the problem of statistical inference for ranking data, specifically rank aggregation, under the assumption that samples are incomplete in the sense of not comprising all choice alternatives. In contrast to most existing methods, we explicitly model the process of turning a full ranking into an incomplete one, which we call the coarsening process. To this end, we propose the concept of rank-dependent coarsening, which assumes that incomplete rankings are produced by projecting a full ranking to a random subset of ranks. For a concrete instantiation of our model, in which full rankings are drawn from a Plackett-Luce distribution and observations take the form of pairwise preferences, we study the performance of various rank aggregation methods. In addition to predictive accuracy in the finite sample setting, we address the theoretical question of consistency, by which we mean the ability to recover a target ranking when the sample size goes to infinity, despite a potential bias in the observations caused by the (unknown) coarsening.