Generalization Guarantees via Algorithm-dependent Rademacher Complexity

Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Şimşekli
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:4863-4880, 2023.

Abstract

Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms. In this context, there exist information-theoretic generalization bounds that involve (various forms of) mutual information, as well as bounds based on hypothesis set stability. We propose a conceptually related but technically distinct complexity measure to control the generalization error: the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class. Combining standard properties of Rademacher complexity with the convenient structure of this class, we are able to (i) obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal dimension-type bounds from continuous to finite hypothesis classes and (b) avoid a mutual information term that was required in prior work; (ii) greatly simplify the proof of a recent dimension-independent generalization bound for stochastic gradient descent; and (iii) easily recover results for VC classes and compression schemes, similar to approaches based on conditional mutual information.
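For orientation, the following is the standard textbook definition of empirical Rademacher complexity that the proposed complexity measure instantiates; note that the specific algorithm- and data-dependent hypothesis class to which it is applied is constructed in the paper itself and is not reproduced here. For a sample $S = (z_1, \dots, z_n)$ and a class $\mathcal{F}$ of real-valued functions,

$$\widehat{\mathfrak{R}}_S(\mathcal{F}) \;=\; \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, f(z_i)\right],$$

where $\sigma_1, \dots, \sigma_n$ are i.i.d. Rademacher random variables, each taking the values $-1$ and $+1$ with equal probability. In the paper's setting, $\mathcal{F}$ is allowed to depend on both the learning algorithm and the data, which is what makes the resulting bounds algorithm- and data-dependent.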

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-sachs23a,
  title     = {Generalization Guarantees via Algorithm-dependent Rademacher Complexity},
  author    = {Sachs, Sarah and van Erven, Tim and Hodgkinson, Liam and Khanna, Rajiv and {\c{S}}im{\c{s}}ekli, Umut},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {4863--4880},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/sachs23a/sachs23a.pdf},
  url       = {https://proceedings.mlr.press/v195/sachs23a.html},
  abstract  = {Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms. In this context, there exists information theoretic generalization bounds that involve (various forms of) mutual information, as well as bounds based on hypothesis set stability. We propose a conceptually related, but technically distinct complexity measure to control generalization error, which is the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class. Combining standard properties of Rademacher complexity with the convenient structure of this class, we are able to (i) obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal dimension-type bounds from continuous to finite hypothesis classes, and (b) avoid a mutual information term that was required in prior work; (ii) we greatly simplify the proof of a recent dimension-independent generalization bound for stochastic gradient descent; and (iii) we easily recover results for VC classes and compression schemes, similar to approaches based on conditional mutual information.}
}
Endnote
%0 Conference Paper
%T Generalization Guarantees via Algorithm-dependent Rademacher Complexity
%A Sarah Sachs
%A Tim van Erven
%A Liam Hodgkinson
%A Rajiv Khanna
%A Umut Şimşekli
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-sachs23a
%I PMLR
%P 4863--4880
%U https://proceedings.mlr.press/v195/sachs23a.html
%V 195
%X Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms. In this context, there exists information theoretic generalization bounds that involve (various forms of) mutual information, as well as bounds based on hypothesis set stability. We propose a conceptually related, but technically distinct complexity measure to control generalization error, which is the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class. Combining standard properties of Rademacher complexity with the convenient structure of this class, we are able to (i) obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal dimension-type bounds from continuous to finite hypothesis classes, and (b) avoid a mutual information term that was required in prior work; (ii) we greatly simplify the proof of a recent dimension-independent generalization bound for stochastic gradient descent; and (iii) we easily recover results for VC classes and compression schemes, similar to approaches based on conditional mutual information.
APA
Sachs, S., van Erven, T., Hodgkinson, L., Khanna, R. & Şimşekli, U. (2023). Generalization Guarantees via Algorithm-dependent Rademacher Complexity. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:4863-4880. Available from https://proceedings.mlr.press/v195/sachs23a.html.