Refining the Sample Complexity of Comparative Learning
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:67-88, 2025.
Abstract
Comparative learning, a recently introduced variation of the PAC (Probably Approximately Correct) framework, interpolates between the two standard extreme settings of realizable and agnostic PAC learning. In comparative learning, the labeling is assumed to be from one hypothesis class (the source class), while the learner’s performance is measured against another hypothesis class (the benchmark class). This setup allows for incorporating more specific prior knowledge into PAC-type learning bounds, which are known to be otherwise overly pessimistic. It also naturally represents model distillation tasks, where a predictor with specific requirements (such as being bounded in size or being interpretable) is trained on the labels from another model. A first sample complexity analysis of comparative learning established upper and lower bounds for general comparative learning. In this work, we propose a more fine-grained view of this setting, distinguishing between proper learning and general (improper) learning. We derive novel upper and lower sample complexity bounds for both settings. In particular, we identify conditions for each of the regimes, and thereby exhibit how the rate depends on the relatedness of the two classes in perhaps unexpected ways.
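To make the setup concrete, the comparative learning guarantee can be sketched as follows; the notation ($\mathcal{H}_S$ for the source class, $\mathcal{H}_B$ for the benchmark class, $\mathcal{D}$ for the marginal distribution) is assumed here for illustration and need not match the paper's.

```latex
% Sketch of the comparative PAC guarantee (illustrative notation).
% Realizability: the true labeling function lies in the source class,
%   h^\ast \in \mathcal{H}_S.
% The learner outputs \hat{h}, and with probability at least 1 - \delta
% over an i.i.d. sample of size m from \mathcal{D}, it must satisfy
\[
  \operatorname{err}_{\mathcal{D}}(\hat{h})
  \;\le\;
  \inf_{b \in \mathcal{H}_B} \operatorname{err}_{\mathcal{D}}(b)
  \;+\; \varepsilon,
  \qquad \text{where }
  \operatorname{err}_{\mathcal{D}}(h)
  = \Pr_{x \sim \mathcal{D}}\bigl[h(x) \neq h^\ast(x)\bigr].
\]
% Realizable PAC learning corresponds to \mathcal{H}_S = \mathcal{H}_B,
% and agnostic PAC learning to \mathcal{H}_S being all functions.
% Proper learning additionally requires \hat{h} \in \mathcal{H}_B.
```

The distinction between proper and improper learning studied in this work is thus whether the learner's output $\hat{h}$ must itself belong to the benchmark class.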