Weighted Sampling without Replacement for Deep Top-$k$ Classification

Dieqiao Feng; Yuanqi Du; Carla P Gomes; Bart Selman

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:9910-9920, 2023.

Abstract

The top-$k$ classification accuracy is a crucial metric in machine learning and is often used to evaluate the performance of deep neural networks. These networks are typically trained using the cross-entropy loss, which optimizes for top-$1$ classification and is considered optimal in the case of infinite data. However, in real-world scenarios, data is often noisy and limited, leading to the need for more robust losses. In this paper, we propose using the Weighted Sampling Without Replacement (WSWR) method as a learning objective for top-$k$ loss. While traditional methods for evaluating WSWR-based top-$k$ loss are computationally impractical, we show a novel connection between WSWR and Reinforcement Learning (RL) and apply well-established RL algorithms to estimate gradients. We compared our method with recently proposed top-$k$ losses in various regimes of noise and data size for the prevalent use case of $k = 5$. Our experimental results reveal that our method consistently outperforms all other methods on the top-$k$ metric for noisy datasets, has more robustness on extreme testing scenarios, and achieves competitive results on training with limited data.
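As a rough illustration of the sampling scheme the abstract refers to, the sketch below draws $k$ distinct classes one at a time, each with probability proportional to its weight among the classes not yet chosen, and then estimates by Monte Carlo how often the true label lands in the sample. It is not the paper's training procedure: the abstract states that gradients are estimated with RL algorithms, and that part is not reproduced here. The function names (`sample_without_replacement`, `topk_hit_rate`) and the example weights are hypothetical and chosen only for this sketch.

```python
import random


def sample_without_replacement(weights, k, rng):
    """Draw k distinct class indices sequentially.

    At each step an index is picked with probability proportional to its
    weight among the indices that have not been chosen yet.
    Assumes all weights are strictly positive and k <= len(weights).
    """
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        step_weights = [weights[i] for i in remaining]
        idx = rng.choices(remaining, weights=step_weights, k=1)[0]
        chosen.append(idx)
        remaining.remove(idx)  # exclude the chosen class from later draws
    return chosen


def topk_hit_rate(weights, label, k, num_samples, rng):
    """Monte Carlo estimate of the chance that `label` is among k sampled classes."""
    hits = 0
    for _ in range(num_samples):
        if label in sample_without_replacement(weights, k, rng):
            hits += 1
    return hits / num_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical class weights, e.g. a softmax output over 6 classes.
    weights = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]
    print(sample_without_replacement(weights, k=5, rng=rng))
    print(topk_hit_rate(weights, label=3, k=5, num_samples=2000, rng=rng))
```

The estimate grows with both $k$ and the weight assigned to the true label, which is the quantity a sampling-based top-$k$ objective would aim to increase.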

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-feng23a,
  title = 	 {Weighted Sampling without Replacement for Deep Top-$k$ Classification},
  author =       {Feng, Dieqiao and Du, Yuanqi and Gomes, Carla P and Selman, Bart},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {9910--9920},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/feng23a/feng23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/feng23a.html},
  abstract = 	 {The top-$k$ classification accuracy is a crucial metric in machine learning and is often used to evaluate the performance of deep neural networks. These networks are typically trained using the cross-entropy loss, which optimizes for top-$1$ classification and is considered optimal in the case of infinite data. However, in real-world scenarios, data is often noisy and limited, leading to the need for more robust losses. In this paper, we propose using the Weighted Sampling Without Replacement (WSWR) method as a learning objective for top-$k$ loss. While traditional methods for evaluating WSWR-based top-$k$ loss are computationally impractical, we show a novel connection between WSWR and Reinforcement Learning (RL) and apply well-established RL algorithms to estimate gradients. We compared our method with recently proposed top-$k$ losses in various regimes of noise and data size for the prevalent use case of $k = 5$. Our experimental results reveal that our method consistently outperforms all other methods on the top-$k$ metric for noisy datasets, has more robustness on extreme testing scenarios, and achieves competitive results on training with limited data.}
}

Endnote

%0 Conference Paper
%T Weighted Sampling without Replacement for Deep Top-$k$ Classification
%A Dieqiao Feng
%A Yuanqi Du
%A Carla P Gomes
%A Bart Selman
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett	
%F pmlr-v202-feng23a
%I PMLR
%P 9910--9920
%U https://proceedings.mlr.press/v202/feng23a.html
%V 202
%X The top-$k$ classification accuracy is a crucial metric in machine learning and is often used to evaluate the performance of deep neural networks. These networks are typically trained using the cross-entropy loss, which optimizes for top-$1$ classification and is considered optimal in the case of infinite data. However, in real-world scenarios, data is often noisy and limited, leading to the need for more robust losses. In this paper, we propose using the Weighted Sampling Without Replacement (WSWR) method as a learning objective for top-$k$ loss. While traditional methods for evaluating WSWR-based top-$k$ loss are computationally impractical, we show a novel connection between WSWR and Reinforcement Learning (RL) and apply well-established RL algorithms to estimate gradients. We compared our method with recently proposed top-$k$ losses in various regimes of noise and data size for the prevalent use case of $k = 5$. Our experimental results reveal that our method consistently outperforms all other methods on the top-$k$ metric for noisy datasets, has more robustness on extreme testing scenarios, and achieves competitive results on training with limited data.

APA


Feng, D., Du, Y., Gomes, C.P. & Selman, B. (2023). Weighted Sampling without Replacement for Deep Top-$k$ Classification. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:9910-9920. Available from https://proceedings.mlr.press/v202/feng23a.html.
