Differentiable Top-k Classification Learning

Felix Petersen; Hilde Kuehne; Christian Borgelt; Oliver Deussen

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:17656-17668, 2022.

Abstract

The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a family of differentiable top-k cross-entropy classification losses. This allows training while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed losses for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.
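
To make the metric concrete, here is a minimal sketch of how top-k accuracy can be computed for several k at once. It covers only the standard, non-differentiable evaluation metric, not the losses proposed in the paper, which relax this quantity through differentiable sorting and ranking. The function name `top_k_accuracy` and its parameters are hypothetical, chosen for illustration.

```python
from typing import Dict, Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   ks: Sequence[int] = (1, 5)) -> Dict[int, float]:
    """Fraction of samples whose true label is among the k highest scores.

    Hypothetical helper for illustration; this is the hard evaluation
    metric, not a differentiable training loss.
    """
    hits = {k: 0 for k in ks}
    for row, label in zip(scores, labels):
        # Rank class indices by descending score (ties broken by index).
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))
        position = ranked.index(label)  # 0-based rank of the true class
        for k in ks:
            if position < k:
                hits[k] += 1
    n = len(labels)
    return {k: hits[k] / n for k in ks}


# Three samples, three classes (made-up scores).
example_scores = [[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.3, 0.5]]
example_labels = [1, 1, 0]
result = top_k_accuracy(example_scores, example_labels, ks=(1, 2))
print(result)
```

In this toy example the true class is ranked first for one sample, second for another, and third for the last, so top-1 accuracy is 1/3 and top-2 accuracy is 2/3. Because the ranking step is piecewise constant in the scores, this quantity cannot be optimized directly by gradient descent, which is the gap the differentiable losses described in the abstract address.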

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-petersen22a,
  title = {Differentiable Top-k Classification Learning},
  author = {Petersen, Felix and Kuehne, Hilde and Borgelt, Christian and Deussen, Oliver},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {17656--17668},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/petersen22a/petersen22a.pdf},
  url = {https://proceedings.mlr.press/v162/petersen22a.html},
  abstract = {The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a family of differentiable top-k cross-entropy classification losses. This allows training while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed losses for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.}
}

Endnote

%0 Conference Paper
%T Differentiable Top-k Classification Learning
%A Felix Petersen
%A Hilde Kuehne
%A Christian Borgelt
%A Oliver Deussen
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-petersen22a
%I PMLR
%P 17656--17668
%U https://proceedings.mlr.press/v162/petersen22a.html
%V 162
%X The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a family of differentiable top-k cross-entropy classification losses. This allows training while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed losses for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.

APA


Petersen, F., Kuehne, H., Borgelt, C., & Deussen, O. (2022). Differentiable Top-k Classification Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:17656-17668. Available from https://proceedings.mlr.press/v162/petersen22a.html.
