Multiclass Learnability and the ERM principle


Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz;
Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:207-232, 2011.

Abstract

Multiclass learning is an area of growing practical relevance, for which the currently available theory is still far from providing satisfactory understanding. We study the learnability of multiclass prediction, and derive upper and lower bounds on the sample complexity of multiclass hypothesis classes in different learning models: batch/online, realizable/unrealizable, full information/bandit feedback. Our analysis reveals a surprising phenomenon: in the multiclass setting, in sharp contrast to binary classification, not all Empirical Risk Minimization (ERM) algorithms are equally successful. We show that there exist hypothesis classes for which some ERM learners have lower sample complexity than others. Furthermore, there are classes that are learnable by some ERM learners, while other ERM learners will fail to learn them. We propose a principle for designing good ERM learners, and use this principle to prove tight bounds on the sample complexity of learning symmetric multiclass hypothesis classes (that is, classes that are invariant under any permutation of label names). We demonstrate the relevance of the theory by analyzing the sample complexity of two widely used hypothesis classes: generalized linear multiclass models and reduction trees. We also obtain some practically relevant conclusions.
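To make the ERM principle discussed in the abstract concrete, below is a minimal Python sketch (not from the paper; all names are illustrative) of an ERM learner over a finite multiclass hypothesis class. It also marks the point the abstract highlights: when several hypotheses attain the minimal empirical error, the rule for choosing among them defines which ERM learner you get, and in the multiclass setting that choice can affect sample complexity and even learnability.

def erm_learner(hypothesis_class, X, y):
    # Empirical Risk Minimization over a finite multiclass hypothesis
    # class: return a hypothesis minimizing the average 0-1 loss on
    # the labeled sample (X, y). Names here are illustrative.
    best_h, best_err = None, float("inf")
    for h in hypothesis_class:
        err = sum(h(x) != label for x, label in zip(X, y)) / len(X)
        if err < best_err:
            best_h, best_err = h, err
    # Ties among empirical minimizers are broken here by enumeration
    # order. The paper's observation is that, unlike in binary
    # classification, this tie-breaking choice is not innocuous:
    # different choices yield ERM learners with different sample
    # complexity, and some may fail to learn classes that others learn.
    return best_h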
