Stronger Generalization Bounds for Deep Nets via a Compression Approach
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:254-263, 2018.
Abstract
Deep nets generalize well despite having more parameters than the number of training samples. Recent works try to give an explanation using PAC-Bayes and margin-based analyses, but do not as yet result in sample complexity bounds better than naive parameter counting. The current paper shows generalization bounds that are orders of magnitude better in practice. These rely upon new succinct reparametrizations of the trained net: a compression that is explicit and efficient. The compressed reparametrizations yield generalization bounds via a simple compression-based framework introduced here. Our results also provide some theoretical justification for the widespread empirical success in compressing deep nets. The correctness analysis of our compression relies upon newly identified noise-stability properties of trained deep nets, which are also experimentally verified. The study of these properties and the resulting generalization bounds are also extended to convolutional nets, which had eluded earlier attempts at proving generalization.
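The noise-stability property the abstract refers to can be made concrete with a small numerical sketch. The snippet below is a minimal illustration, not the authors' code: it estimates the noise sensitivity of a linear map M at an input x as the expected relative change in output when Gaussian noise of the same scale as x is injected at the input. The exact noise scaling, the function name `noise_sensitivity`, and the rank-1 toy comparison are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a noise-sensitivity estimate for a single linear layer.
# Noise eta with expected squared norm ||x||^2 is injected at the input, and
# the relative change in the output is measured:
#     psi(M, x) ~= E[ ||M(x + eta) - Mx||^2 ] / ||Mx||^2
# (This scaling is one plausible reading of the idea, not a quote of the paper.)
import numpy as np

rng = np.random.default_rng(0)

def noise_sensitivity(M, x, trials=200):
    Mx = M @ x
    d = x.shape[0]
    scale = np.linalg.norm(x) / np.sqrt(d)  # so that E[||eta||^2] = ||x||^2
    err = 0.0
    for _ in range(trials):
        eta = rng.normal(scale=scale, size=d)
        diff = M @ (x + eta) - Mx
        err += diff @ diff
    return err / (trials * (Mx @ Mx))

d = 256
x = rng.normal(size=d)

# A generic random matrix amplifies noise as much as signal: sensitivity ~ 1.
M_random = rng.normal(size=(d, d)) / np.sqrt(d)

# A matrix whose significant direction is aligned with x barely responds to
# noise: sensitivity ~ 1/d. The paper's experiments suggest trained layers
# behave more like this on real data than the worst case would allow.
M_aligned = np.outer(rng.normal(size=d), x) / (x @ x)

print(f"random  matrix sensitivity: {noise_sensitivity(M_random, x):.4f}")
print(f"aligned matrix sensitivity: {noise_sensitivity(M_aligned, x):.6f}")
```

The connection to compression, informally: a layer that attenuates random noise on the actual data distribution is effectively using only a few significant directions there, so it can be replaced by a much smaller parametrization with little error on that data, which is what drives the bounds described above.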