PAC-Bayes, MAC-Bayes and Conditional Mutual Information: Fast rate bounds that handle general VC classes

Peter Grunwald, Thomas Steinke, Lydia Zakynthinou
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2217-2247, 2021.

Abstract

We give a novel, unified derivation of conditional PAC-Bayesian and mutual information (MI) generalization bounds. We derive conditional MI bounds as an instance, with special choice of prior, of conditional MAC-Bayesian (Mean Approximately Correct) bounds, themselves derived from conditional PAC-Bayesian bounds, where ‘conditional’ means that one can use priors conditioned on a joint training and ghost sample. This allows us to get nontrivial PAC-Bayes and MI-style bounds for general VC classes, something recently shown to be impossible with standard PAC-Bayesian/MI bounds. Second, it allows us to get fast rates of order $O((\text{KL}/n)^{\gamma})$ for $\gamma > 1/2$ if a Bernstein condition holds and for exp-concave losses (with $\gamma=1$), which is impossible with both standard PAC-Bayes generalization and MI bounds. Our work extends the recent work by Steinke and Zakynthinou (2020) who handle MI with VC but neither PAC-Bayes nor fast rates and Mhammedi et al. (2019) who initiated fast rate PAC-Bayes generalization error bounds but handle neither MI nor general VC classes.
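For orientation, the contrast the abstract draws can be written schematically in standard PAC-Bayes notation; this is an editorial sketch, not a statement taken from the paper, and it ignores constants, lower-order terms, and the ghost-sample conditioning that the paper's bounds actually use. Write $L(h)$ for the population risk, $\hat{L}_n(h)$ for the empirical risk on $n$ samples, $\pi$ for the prior, $\rho$ for the posterior, and $\delta$ for the confidence parameter. A classical slow-rate PAC-Bayes bound states that, with probability at least $1-\delta$,
\[
\mathbb{E}_{h \sim \rho}\bigl[L(h)\bigr] \;\le\; \mathbb{E}_{h \sim \rho}\bigl[\hat{L}_n(h)\bigr]
\;+\; O\!\left(\sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(1/\delta)}{n}}\right),
\]
i.e. a rate of order $(\mathrm{KL}/n)^{1/2}$. The fast-rate bounds described above replace the square root by a larger exponent,
\[
\mathbb{E}_{h \sim \rho}\bigl[L(h)\bigr] \;\le\; \mathbb{E}_{h \sim \rho}\bigl[\hat{L}_n(h)\bigr]
\;+\; O\!\left(\left(\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(1/\delta)}{n}\right)^{\gamma}\right),
\qquad \tfrac{1}{2} < \gamma \le 1,
\]
with $\gamma$ determined by the Bernstein condition and $\gamma = 1$ for exp-concave losses.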

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-grunwald21a,
  title     = {PAC-Bayes, MAC-Bayes and Conditional Mutual Information: Fast rate bounds that handle general VC classes},
  author    = {Grunwald, Peter and Steinke, Thomas and Zakynthinou, Lydia},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {2217--2247},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/grunwald21a/grunwald21a.pdf},
  url       = {https://proceedings.mlr.press/v134/grunwald21a.html},
  abstract  = {We give a novel, unified derivation of conditional PAC-Bayesian and mutual information (MI) generalization bounds. We derive conditional MI bounds as an instance, with special choice of prior, of conditional MAC-Bayesian (Mean Approximately Correct) bounds, themselves derived from conditional PAC-Bayesian bounds, where ‘conditional’ means that one can use priors conditioned on a joint training and ghost sample. This allows us to get nontrivial PAC-Bayes and MI-style bounds for general VC classes, something recently shown to be impossible with standard PAC-Bayesian/MI bounds. Second, it allows us to get fast rates of order $O((\text{KL}/n)^{\gamma})$ for $\gamma > 1/2$ if a Bernstein condition holds and for exp-concave losses (with $\gamma=1$), which is impossible with both standard PAC-Bayes generalization and MI bounds. Our work extends the recent work by Steinke and Zakynthinou (2020) who handle MI with VC but neither PAC-Bayes nor fast rates and Mhammedi et al. (2019) who initiated fast rate PAC-Bayes generalization error bounds but handle neither MI nor general VC classes.}
}
Endnote
%0 Conference Paper
%T PAC-Bayes, MAC-Bayes and Conditional Mutual Information: Fast rate bounds that handle general VC classes
%A Peter Grunwald
%A Thomas Steinke
%A Lydia Zakynthinou
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-grunwald21a
%I PMLR
%P 2217--2247
%U https://proceedings.mlr.press/v134/grunwald21a.html
%V 134
%X We give a novel, unified derivation of conditional PAC-Bayesian and mutual information (MI) generalization bounds. We derive conditional MI bounds as an instance, with special choice of prior, of conditional MAC-Bayesian (Mean Approximately Correct) bounds, themselves derived from conditional PAC-Bayesian bounds, where ‘conditional’ means that one can use priors conditioned on a joint training and ghost sample. This allows us to get nontrivial PAC-Bayes and MI-style bounds for general VC classes, something recently shown to be impossible with standard PAC-Bayesian/MI bounds. Second, it allows us to get fast rates of order $O((\text{KL}/n)^{\gamma})$ for $\gamma > 1/2$ if a Bernstein condition holds and for exp-concave losses (with $\gamma=1$), which is impossible with both standard PAC-Bayes generalization and MI bounds. Our work extends the recent work by Steinke and Zakynthinou (2020) who handle MI with VC but neither PAC-Bayes nor fast rates and Mhammedi et al. (2019) who initiated fast rate PAC-Bayes generalization error bounds but handle neither MI nor general VC classes.
APA
Grunwald, P., Steinke, T., & Zakynthinou, L. (2021). PAC-Bayes, MAC-Bayes and Conditional Mutual Information: Fast rate bounds that handle general VC classes. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2217-2247. Available from https://proceedings.mlr.press/v134/grunwald21a.html.