Fast Differentiable Sorting and Ranking

Mathieu Blondel, Olivier Teboul, Quentin Berthet, Josip Djolonga
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:950-959, 2020.

Abstract

The sorting operation is one of the most commonly used building blocks in computer programming. In machine learning, it is often used for robust statistics. However, seen as a function, it is piecewise linear and as a result includes many kinks where it is non-differentiable. More problematic is the related ranking operator, often used for order statistics and ranking metrics. It is a piecewise constant function, meaning that its derivatives are null or undefined. While numerous works have proposed differentiable proxies to sorting and ranking, they do not achieve the $O(n \log n)$ time complexity one would expect from sorting and ranking operations. In this paper, we propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity. Our proposal in addition enjoys exact computation and differentiation. We achieve this feat by constructing differentiable operators as projections onto the permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization. Empirically, we confirm that our approach is an order of magnitude faster than existing approaches and showcase two novel applications: differentiable Spearman’s rank correlation coefficient and least trimmed squares.
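The construction named in the abstract (projection onto the permutahedron, computed through isotonic optimization) can be illustrated with a minimal NumPy sketch. It rests on assumptions the abstract does not state: a Euclidean (quadratic) projection, a regularization strength `eps`, and the convention that the largest input receives rank 1. The function names `isotonic_decreasing`, `project_permutahedron`, and `soft_rank` are hypothetical and are not the authors' API. The sketch covers only the forward computation, not the differentiation step.

```python
import numpy as np


def isotonic_decreasing(y):
    """Pool-adjacent-violators: argmin over v1 >= ... >= vn of 0.5 * ||v - y||^2."""
    sums, counts = [], []  # each block is stored as (sum of values, block size)
    for val in y:
        sums.append(float(val))
        counts.append(1)
        # Merge blocks while the previous block mean is below the last one,
        # since that would violate the non-increasing constraint.
        while len(sums) > 1 and sums[-2] / counts[-2] < sums[-1] / counts[-1]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])


def project_permutahedron(z, w):
    """Euclidean projection of z onto the permutahedron generated by w.

    Assumes w is sorted in decreasing order.
    """
    perm = np.argsort(-z)            # indices that sort z in decreasing order
    s = z[perm]
    v = isotonic_decreasing(s - w)   # reduction to isotonic regression
    out = np.empty_like(z)
    out[perm] = s - v                # undo the sort
    return out


def soft_rank(theta, eps=1.0):
    """Soft ranks of theta; smaller eps gives values closer to the hard ranks."""
    theta = np.asarray(theta, dtype=float)
    n = theta.shape[0]
    rho = np.arange(n, 0, -1, dtype=float)  # (n, n-1, ..., 1)
    return project_permutahedron(-theta / eps, rho)


if __name__ == "__main__":
    theta = [2.9, 0.1, 1.2]
    print(soft_rank(theta, eps=0.1))    # close to the hard ranks (1, 3, 2)
    print(soft_rank(theta, eps=100.0))  # all entries pulled toward the mean rank 2
```

The sort costs $O(n \log n)$ and the pooling pass is linear in $n$, which is consistent with the complexity stated in the abstract. In this sketch the soft ranks always sum to $n(n+1)/2$, because every point of the permutahedron has the same coordinate sum.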

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-blondel20a,
  title = {Fast Differentiable Sorting and Ranking},
  author = {Blondel, Mathieu and Teboul, Olivier and Berthet, Quentin and Djolonga, Josip},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {950--959},
  year = {2020},
  editor = {Hal Daumé III and Aarti Singh},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/blondel20a/blondel20a.pdf},
  url = {http://proceedings.mlr.press/v119/blondel20a.html},
  abstract = {The sorting operation is one of the most commonly used building blocks in computer programming. In machine learning, it is often used for robust statistics. However, seen as a function, it is piecewise linear and as a result includes many kinks where it is non-differentiable. More problematic is the related ranking operator, often used for order statistics and ranking metrics. It is a piecewise constant function, meaning that its derivatives are null or undefined. While numerous works have proposed differentiable proxies to sorting and ranking, they do not achieve the $O(n \log n)$ time complexity one would expect from sorting and ranking operations. In this paper, we propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity. Our proposal in addition enjoys exact computation and differentiation. We achieve this feat by constructing differentiable operators as projections onto the permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization. Empirically, we confirm that our approach is an order of magnitude faster than existing approaches and showcase two novel applications: differentiable Spearman’s rank correlation coefficient and least trimmed squares.}
}
Endnote
%0 Conference Paper
%T Fast Differentiable Sorting and Ranking
%A Mathieu Blondel
%A Olivier Teboul
%A Quentin Berthet
%A Josip Djolonga
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-blondel20a
%I PMLR
%P 950--959
%U http://proceedings.mlr.press/v119/blondel20a.html
%V 119
%X The sorting operation is one of the most commonly used building blocks in computer programming. In machine learning, it is often used for robust statistics. However, seen as a function, it is piecewise linear and as a result includes many kinks where it is non-differentiable. More problematic is the related ranking operator, often used for order statistics and ranking metrics. It is a piecewise constant function, meaning that its derivatives are null or undefined. While numerous works have proposed differentiable proxies to sorting and ranking, they do not achieve the $O(n \log n)$ time complexity one would expect from sorting and ranking operations. In this paper, we propose the first differentiable sorting and ranking operators with $O(n \log n)$ time and $O(n)$ space complexity. Our proposal in addition enjoys exact computation and differentiation. We achieve this feat by constructing differentiable operators as projections onto the permutahedron, the convex hull of permutations, and using a reduction to isotonic optimization. Empirically, we confirm that our approach is an order of magnitude faster than existing approaches and showcase two novel applications: differentiable Spearman’s rank correlation coefficient and least trimmed squares.
APA
Blondel, M., Teboul, O., Berthet, Q., & Djolonga, J. (2020). Fast Differentiable Sorting and Ranking. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:950-959. Available from http://proceedings.mlr.press/v119/blondel20a.html
