Uniform Convergence, Adversarial Spheres and a Simple Remedy

Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Thomas Hofmann
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:490-499, 2021.

Abstract

Previous work has cast doubt on the general framework of uniform convergence and its ability to explain generalization in neural networks. By considering a specific dataset, it was observed that a neural network completely misclassifies a projection of the training data (the adversarial set), rendering any existing generalization bound based on uniform convergence vacuous. We provide an extensive theoretical investigation of this data setting through the lens of infinitely wide models. We prove that the Neural Tangent Kernel (NTK) also suffers from the same phenomenon, and we uncover its origin. We highlight the important role of the output bias and show, both theoretically and empirically, how a sensible choice completely mitigates the problem. We identify sharp phase transitions in the accuracy on the adversarial set and study how this accuracy depends on the training sample size. As a result, we are able to characterize critical sample sizes beyond which the effect disappears. Moreover, we study the decomposition of a neural network into a clean and a noisy part via its canonical expansion into eigenfunctions, and show empirically that when the bias is too small the adversarial phenomenon still persists.
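For intuition, here is a minimal sketch (not the authors' code) of the kind of data setting the abstract refers to: points on two concentric spheres labeled by radius, with the adversarial set obtained by projecting each training point onto the opposite sphere and flipping its label. The dimension, radii, and sample sizes below are illustrative assumptions, not values taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def sample_sphere(n, d, radius):
    # Sample n points uniformly from the radius-r sphere in R^d.
    x = rng.standard_normal((n, d))
    return radius * x / np.linalg.norm(x, axis=1, keepdims=True)

d, n = 50, 200                 # illustrative dimension and per-class sample size
r_in, r_out = 1.0, 1.1         # illustrative radii
X = np.vstack([sample_sphere(n, d, r_in), sample_sphere(n, d, r_out)])
y = np.concatenate([-np.ones(n), np.ones(n)])   # label = which sphere

# Adversarial set: project every training point onto the *other* sphere and
# flip its label; a classifier that keys on the radius misclassifies all of it.
X_adv = np.vstack([X[:n] * (r_out / r_in), X[n:] * (r_in / r_out)])
y_adv = -y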
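Continuing the sketch above, one way an output bias can enter a kernel predictor is f(x) = b + k(x, X) K^{-1} (y - b), i.e. regressing the residuals around a constant b. The paper's analysis concerns the NTK; the RBF kernel and the bias values below are stand-in assumptions used only to illustrate the mechanism, not the paper's prescription.

def rbf_kernel(A, B, gamma=1.0):
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kernel_fit_predict(X_tr, y_tr, X_te, bias=0.0, reg=1e-6):
    # Ridge-regularized kernel regression of the residuals y - bias.
    K = rbf_kernel(X_tr, X_tr) + reg * np.eye(len(X_tr))
    alpha = np.linalg.solve(K, y_tr - bias)
    return bias + rbf_kernel(X_te, X_tr) @ alpha

for b in (0.0, 1.0):   # illustrative bias values
    pred = kernel_fit_predict(X, y, X_adv, bias=b)
    print(f"bias={b}: adversarial-set accuracy = {np.mean(np.sign(pred) == y_adv):.2f}")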

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-bachmann21a,
  title     = {Uniform Convergence, Adversarial Spheres and a Simple Remedy},
  author    = {Bachmann, Gregor and Moosavi-Dezfooli, Seyed-Mohsen and Hofmann, Thomas},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {490--499},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/bachmann21a/bachmann21a.pdf},
  url       = {https://proceedings.mlr.press/v139/bachmann21a.html}
}
APA
Bachmann, G., Moosavi-Dezfooli, S.-M., & Hofmann, T. (2021). Uniform Convergence, Adversarial Spheres and a Simple Remedy. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:490-499. Available from https://proceedings.mlr.press/v139/bachmann21a.html.
