Ranking Items from Discrete Ratings: The Cost of Unknown User Thresholds

Oscar Villemaud, Suryanarayana Sankagiri, Matthias Grossglauser
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-51, 2026.

Abstract

Ranking items is a central task in many information retrieval and recommender systems. User input for the ranking task often comes in the form of ratings on a coarse discrete scale. We ask whether it is possible to recover a fine-grained item ranking from such coarse-grained ratings. We model items as having scores and users as having thresholds; a user likes an item if the score exceeds the threshold, and dislikes it otherwise. Although all users implicitly agree on the total item order, estimating that order is challenging when both the scores and the thresholds are latent. Under our model, any ranking method naturally partitions the $n$ items into bins; the bins are ordered, but the items inside each bin are still unordered. Users arrive sequentially, and every new user can be queried to refine the current ranking. We prove that achieving a near-perfect ranking, measured by Spearman distance, requires $\Theta(n^2)$ users (and therefore $\Omega(n^2)$ queries). This is significantly worse than the $O(n\log n)$ queries needed to rank either from comparisons or from ratings with known user thresholds; the gap reflects the additional queries needed to estimate each user’s latent threshold. Our bound also quantifies the impact of a mismatch between the score and threshold distributions via a quadratic divergence factor. To show the tightness of our results, we provide a ranking algorithm whose query complexity matches our bound up to a logarithmic factor. Our work reveals a tension in online ranking: diversity in thresholds is necessary to merge coarse ratings from many users into a fine-grained ranking, but this diversity has a cost if the thresholds are a priori unknown.
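The rating model in the abstract can be made concrete with a short simulation. The sketch below is illustrative only and is not the paper's algorithm: it draws item scores and user thresholds from uniform distributions (an assumption for this example; the paper studies general score and threshold distributions), generates binary like/dislike ratings, and shows how items with identical rating patterns collapse into ordered bins whose internal order stays unresolved.

```python
import random

def simulate_ratings(n_items=8, n_users=5, seed=0):
    """Draw latent scores and thresholds, return the binary rating matrix."""
    rng = random.Random(seed)
    scores = [rng.random() for _ in range(n_items)]      # latent item scores
    thresholds = [rng.random() for _ in range(n_users)]  # latent user thresholds
    # User u "likes" item i iff the item's score exceeds u's threshold.
    ratings = [[scores[i] > thresholds[u] for i in range(n_items)]
               for u in range(n_users)]
    return scores, thresholds, ratings

def bins_from_ratings(ratings):
    """Group items with identical rating columns; such items are
    indistinguishable from the observed ratings and form one bin."""
    n_items = len(ratings[0])
    column = lambda i: tuple(r[i] for r in ratings)
    groups = {}
    for i in range(n_items):
        groups.setdefault(column(i), []).append(i)
    # More "likes" means the score exceeds more thresholds, so sorting
    # bins by like count orders them by score range.
    return [groups[c] for c in sorted(groups, key=sum)]

if __name__ == "__main__":
    scores, thresholds, ratings = simulate_ratings()
    for b in bins_from_ratings(scores and ratings):
        print(sorted(round(scores[i], 3) for i in b))
```

With a handful of users the bins are coarse; refining them to a near-perfect ranking is exactly where, per the abstract, the $\Theta(n^2)$ user cost arises when thresholds are unknown.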

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-villemaud26a,
  title     = {Ranking Items from Discrete Ratings: The Cost of Unknown User Thresholds},
  author    = {Villemaud, Oscar and Sankagiri, Suryanarayana and Grossglauser, Matthias},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--51},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/villemaud26a/villemaud26a.pdf},
  url       = {https://proceedings.mlr.press/v313/villemaud26a.html}
}
Endnote
%0 Conference Paper
%T Ranking Items from Discrete Ratings: The Cost of Unknown User Thresholds
%A Oscar Villemaud
%A Suryanarayana Sankagiri
%A Matthias Grossglauser
%B Proceedings of The 37th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Matus Telgarsky
%E Jonathan Ullman
%F pmlr-v313-villemaud26a
%I PMLR
%P 1--51
%U https://proceedings.mlr.press/v313/villemaud26a.html
%V 313
APA
Villemaud, O., Sankagiri, S. & Grossglauser, M. (2026). Ranking Items from Discrete Ratings: The Cost of Unknown User Thresholds. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-51. Available from https://proceedings.mlr.press/v313/villemaud26a.html.