LegendreTron: Uprising Proper Multiclass Loss Learning

Kevin H Lam, Christian Walder, Spiridon Penev, Richard Nock
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:18454-18470, 2023.

Abstract

Loss functions serve as the foundation of supervised learning and are often chosen prior to model development. To avoid potentially ad hoc choices of losses, statistical decision theory describes a desirable property for losses known as properness, which asserts that Bayes’ rule is optimal. Recent works have sought to learn losses and models jointly. Existing methods do this by fitting an inverse canonical link function which monotonically maps $\mathbb{R}$ to $[0,1]$ to estimate probabilities for binary problems. In this paper, we extend monotonicity to maps between $\mathbb{R}^{C-1}$ and the projected probability simplex $\tilde{\Delta}^{C-1}$ by using the monotonicity of gradients of convex functions. We present LegendreTron as a novel and practical method that jointly learns proper canonical losses and probabilities for multiclass problems. Tested on a benchmark of domains with up to 1,000 classes, our method consistently outperforms the natural multiclass baseline under a $t$-test at 99% significance on all datasets with more than 10 classes.
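For intuition on the gradient-of-a-convex-function view above: the binary sigmoid is the derivative of the convex softplus $\log(1+e^v)$, and the classical multiclass softmax link is the gradient of the convex log-sum-exp potential $\phi(v) = \log(1 + \sum_j e^{v_j})$ on $\mathbb{R}^{C-1}$, whose output lands in $\tilde{\Delta}^{C-1}$ automatically. The following minimal NumPy sketch illustrates only this fixed-potential baseline; it is not the paper's implementation, which learns the convex potential rather than fixing it:

    import numpy as np

    # Convex potential phi(v) = log(1 + sum_j exp(v_j)) on R^{C-1}.
    # Its gradient is the classical multinomial-logistic (softmax) inverse
    # link, mapping R^{C-1} into the projected probability simplex: every
    # coordinate is positive and the coordinates sum to less than 1.
    def lse_potential(v):
        m = max(0.0, float(np.max(v)))   # shift for numerical stability
        return m + np.log(np.exp(-m) + np.exp(v - m).sum())

    def lse_gradient(v):
        m = max(0.0, float(np.max(v)))
        e = np.exp(v - m)
        return e / (np.exp(-m) + e.sum())  # analytic gradient of lse_potential

    v = np.array([1.0, -0.5, 2.0])  # a point in R^{C-1} for a C = 4 class problem
    p_tilde = lse_gradient(v)       # probabilities of the first C-1 classes
    p_C = 1.0 - p_tilde.sum()       # leftover mass: probability of the anchor class
    print(p_tilde, p_C)             # together, a valid distribution over C classes

Per the abstract, LegendreTron instead learns the convex potential jointly with the model; its gradient remains a monotone map into $\tilde{\Delta}^{C-1}$, so the induced link, and with it the proper canonical loss, is learned rather than fixed in advance.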

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-lam23b,
  title     = {{L}egendre{T}ron: Uprising Proper Multiclass Loss Learning},
  author    = {Lam, Kevin H and Walder, Christian and Penev, Spiridon and Nock, Richard},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {18454--18470},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/lam23b/lam23b.pdf},
  url       = {https://proceedings.mlr.press/v202/lam23b.html}
}
Endnote
%0 Conference Paper
%T LegendreTron: Uprising Proper Multiclass Loss Learning
%A Kevin H Lam
%A Christian Walder
%A Spiridon Penev
%A Richard Nock
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-lam23b
%I PMLR
%P 18454--18470
%U https://proceedings.mlr.press/v202/lam23b.html
%V 202
APA
Lam, K.H., Walder, C., Penev, S. & Nock, R. (2023). LegendreTron: Uprising Proper Multiclass Loss Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:18454-18470. Available from https://proceedings.mlr.press/v202/lam23b.html.