Multiclass Online Learning and Uniform Convergence

Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, Ambuj Tewari
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:5682-5696, 2023.

Abstract

We study multiclass classification in the agnostic adversarial online learning setting. As our main result, we prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite. This solves an open problem studied by Daniely, Sabato, Ben-David, and Shalev-Shwartz (2011, 2015), who handled the case when the number of classes (or labels) is bounded. We also prove a separation between online learnability and online uniform convergence by exhibiting an easy-to-learn class whose sequential Rademacher complexity is unbounded. Our learning algorithm uses the multiplicative weights algorithm, with a set of experts defined by executions of the Standard Optimal Algorithm on subsequences of size equal to the Littlestone dimension. We argue that the best expert has regret at most the Littlestone dimension relative to the best concept in the class. This differs from the well-known covering technique of Ben-David, Pál, and Shalev-Shwartz (2009) for binary classification, where the best expert has regret zero.
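To make the experts construction described above concrete, here is a minimal schematic sketch in Python. It is not the authors' pseudocode: the make_soa factory and its predict/update interface are hypothetical stand-ins for the Standard Optimal Algorithm, and the enumeration of all subsequences of size at most d (exponentially many) is shown purely for illustration.

import itertools
import math
import random

def experts_mw(xs, ys, d, make_soa):
    """Schematic sketch: multiplicative weights over experts, one per
    subsequence of rounds of size at most d (the Littlestone dimension).
    Each expert runs its own copy of the Standard Optimal Algorithm (SOA),
    updating it only on the rounds in its subsequence; per the abstract,
    the best such expert has regret at most d against the best concept."""
    T = len(xs)
    # One expert per subset of {0, ..., T-1} of size at most d.
    subseqs = [frozenset(s)
               for k in range(d + 1)
               for s in itertools.combinations(range(T), k)]
    experts = {S: make_soa() for S in subseqs}  # independent SOA copies
    weights = {S: 1.0 for S in subseqs}
    eta = math.sqrt(8 * math.log(max(len(subseqs), 2)) / max(T, 1))
    mistakes = 0
    for t in range(T):
        preds = {S: experts[S].predict(xs[t]) for S in subseqs}
        # Randomized MW: sample an expert with probability
        # proportional to its current weight.
        r = random.random() * sum(weights.values())
        acc, chosen = 0.0, subseqs[-1]
        for S in subseqs:
            acc += weights[S]
            if acc >= r:
                chosen = S
                break
        mistakes += int(preds[chosen] != ys[t])
        # Penalize experts that erred; feed round t's example only to
        # the SOA copies of experts whose subsequence contains t.
        for S in subseqs:
            weights[S] *= math.exp(-eta * int(preds[S] != ys[t]))
            if t in S:
                experts[S].update(xs[t], ys[t])
    return mistakes

By the standard multiplicative weights guarantee, the learner's expected number of mistakes exceeds the best expert's by at most O(sqrt(T log N)) for N experts, so a best expert with regret at most d yields the sublinear agnostic regret claimed in the abstract.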

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-hanneke23b,
  title     = {Multiclass Online Learning and Uniform Convergence},
  author    = {Hanneke, Steve and Moran, Shay and Raman, Vinod and Subedi, Unique and Tewari, Ambuj},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {5682--5696},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/hanneke23b/hanneke23b.pdf},
  url       = {https://proceedings.mlr.press/v195/hanneke23b.html},
  abstract  = {We study multiclass classification in the agnostic adversarial online learning setting. As our main result, we prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite. This solves an open problem studied by Daniely, Sabato, Ben-David, and Shalev-Shwartz (2011, 2015), who handled the case when the number of classes (or labels) is bounded. We also prove a separation between online learnability and online uniform convergence by exhibiting an easy-to-learn class whose sequential Rademacher complexity is unbounded. Our learning algorithm uses the multiplicative weights algorithm, with a set of experts defined by executions of the Standard Optimal Algorithm on subsequences of size equal to the Littlestone dimension. We argue that the best expert has regret at most the Littlestone dimension relative to the best concept in the class. This differs from the well-known covering technique of Ben-David, Pál, and Shalev-Shwartz (2009) for binary classification, where the best expert has regret zero.}
}
EndNote
%0 Conference Paper
%T Multiclass Online Learning and Uniform Convergence
%A Steve Hanneke
%A Shay Moran
%A Vinod Raman
%A Unique Subedi
%A Ambuj Tewari
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-hanneke23b
%I PMLR
%P 5682--5696
%U https://proceedings.mlr.press/v195/hanneke23b.html
%V 195
%X We study multiclass classification in the agnostic adversarial online learning setting. As our main result, we prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite. This solves an open problem studied by Daniely, Sabato, Ben-David, and Shalev-Shwartz (2011, 2015), who handled the case when the number of classes (or labels) is bounded. We also prove a separation between online learnability and online uniform convergence by exhibiting an easy-to-learn class whose sequential Rademacher complexity is unbounded. Our learning algorithm uses the multiplicative weights algorithm, with a set of experts defined by executions of the Standard Optimal Algorithm on subsequences of size equal to the Littlestone dimension. We argue that the best expert has regret at most the Littlestone dimension relative to the best concept in the class. This differs from the well-known covering technique of Ben-David, Pál, and Shalev-Shwartz (2009) for binary classification, where the best expert has regret zero.
APA
Hanneke, S., Moran, S., Raman, V., Subedi, U., & Tewari, A. (2023). Multiclass Online Learning and Uniform Convergence. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:5682-5696. Available from https://proceedings.mlr.press/v195/hanneke23b.html.
