Weighted Sampling without Replacement for Deep Top-$k$ Classification

Dieqiao Feng, Yuanqi Du, Carla P Gomes, Bart Selman
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:9910-9920, 2023.

Abstract

The top-$k$ classification accuracy is a crucial metric in machine learning and is often used to evaluate the performance of deep neural networks. These networks are typically trained with the cross-entropy loss, which optimizes for top-$1$ classification and is considered optimal in the infinite-data limit. In real-world scenarios, however, data is often noisy and limited, which calls for more robust losses. In this paper, we propose using Weighted Sampling Without Replacement (WSWR) as a learning objective for the top-$k$ loss. While traditional methods for evaluating the WSWR-based top-$k$ loss are computationally impractical, we show a novel connection between WSWR and Reinforcement Learning (RL) and apply well-established RL algorithms to estimate gradients. We compare our method with recently proposed top-$k$ losses in various regimes of noise and data size for the prevalent use case of $k = 5$. Our experimental results reveal that our method consistently outperforms all other methods on the top-$k$ metric for noisy datasets, is more robust in extreme testing scenarios, and achieves competitive results when training with limited data.
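To make the abstract's central idea concrete, here is a minimal PyTorch sketch of how WSWR induces a top-$k$ loss and how a score-function (REINFORCE) estimator can stand in for the intractable exact gradient. This is illustrative code under our own assumptions, not the authors' implementation: the function name `wswr_topk_loss` and all shapes are hypothetical, and the variance-reduction baselines a practical RL estimator would use are omitted.

```python
# Illustrative sketch only -- not the authors' released code.
import torch

def wswr_topk_loss(logits, target, k=5):
    """REINFORCE-style estimate of a WSWR top-k loss.

    logits: (B, C) class scores from a network; target: (B,) labels.
    The k draws of weighted sampling without replacement are treated
    as a k-step episode: at each step a class is sampled from the
    softmax over the classes not yet drawn. The reward is 1 if the
    true label ends up in the sample, so the expected loss is the
    top-k miss probability; exact evaluation would sum over all
    ordered k-subsets, hence the score-function estimator below.
    """
    B, C = logits.shape
    log_probs = []                                    # log pi(a_t | s_t)
    mask = torch.zeros_like(logits, dtype=torch.bool)
    for _ in range(k):
        # Renormalize over the remaining classes (no replacement).
        step_logits = logits.masked_fill(mask, float("-inf"))
        dist = torch.distributions.Categorical(logits=step_logits)
        action = dist.sample()                        # (B,)
        log_probs.append(dist.log_prob(action))
        mask = mask.scatter(1, action.unsqueeze(1), True)
    # Reward: did the true label appear among the k sampled classes?
    hit = mask[torch.arange(B), target].float()       # (B,), no gradient
    # Score-function (REINFORCE) surrogate: its gradient estimates
    # d/dtheta E[1 - hit]. A baseline would reduce variance; omitted here.
    return ((1.0 - hit) * torch.stack(log_probs).sum(dim=0)).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(4, 10, requires_grad=True)   # 4 examples, 10 classes
    target = torch.randint(0, 10, (4,))
    loss = wswr_topk_loss(logits, target, k=5)
    loss.backward()
    print(loss.item(), logits.grad.shape)
```

A well-known equivalent way to realize the $k$ draws in one shot is the Gumbel-top-$k$ trick: perturb the log-weights with i.i.d. Gumbel noise and keep the $k$ largest, which coincides in distribution with sequential weighted sampling without replacement.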

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-feng23a,
  title     = {Weighted Sampling without Replacement for Deep Top-$k$ Classification},
  author    = {Feng, Dieqiao and Du, Yuanqi and Gomes, Carla P and Selman, Bart},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {9910--9920},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/feng23a/feng23a.pdf},
  url       = {https://proceedings.mlr.press/v202/feng23a.html}
}
Endnote
%0 Conference Paper
%T Weighted Sampling without Replacement for Deep Top-$k$ Classification
%A Dieqiao Feng
%A Yuanqi Du
%A Carla P Gomes
%A Bart Selman
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-feng23a
%I PMLR
%P 9910--9920
%U https://proceedings.mlr.press/v202/feng23a.html
%V 202
APA
Feng, D., Du, Y., Gomes, C.P. & Selman, B. (2023). Weighted Sampling without Replacement for Deep Top-$k$ Classification. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:9910-9920. Available from https://proceedings.mlr.press/v202/feng23a.html.