Adversarial Robustness Guarantees for Random Deep Neural Networks

Giacomo De Palma, Bobak Kiani, Seth Lloyd
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2522-2534, 2021.

Abstract

The reliability of deep learning algorithms is fundamentally challenged by the existence of adversarial examples, which are incorrectly classified inputs that are extremely close to a correctly classified input. We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any $p \geq 1$, the $\ell^p$ distance of any given input from the classification boundary scales as one over the square root of the dimension of the input times the $\ell^p$ norm of the input. The results are based on the recently proved equivalence between Gaussian processes and deep neural networks in the limit of infinite width of the hidden layers, and are validated with experiments on both random deep neural networks and deep neural networks trained on the MNIST and CIFAR10 datasets. The results constitute a fundamental advance in the theoretical understanding of adversarial examples, and open the way to a thorough theoretical characterization of the relation between network architecture and robustness to adversarial perturbations.
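The scaling claim can be probed numerically. Below is a minimal NumPy sketch (not taken from the paper) that samples random fully connected ReLU networks with weights of variance 1/fan-in, uses the first-order margin estimate |f(x)| / ||grad f(x)||_2 as a proxy for the ell^2 distance of an input x to the classification boundary, and checks whether dist * sqrt(n) / ||x||_2 stays roughly constant as the input dimension n grows. The widths, depths, bias variance, and sample counts are illustrative assumptions, not the paper's experimental setup.

import numpy as np

def random_mlp(n_in, width, depth, rng):
    """Sample weights and biases for a fully connected ReLU net with scalar output."""
    sizes = [n_in] + [width] * depth + [1]
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_out, fan_in)),
         rng.normal(0.0, 0.1, size=fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]

def forward_and_grad(params, x):
    """Return f(x) and the gradient of the scalar output with respect to x."""
    h, preacts = x, []
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        preacts.append(z)
        h = z if i == len(params) - 1 else np.maximum(z, 0.0)
    g = np.ones(1)                              # df/dz at the last (linear) layer
    for i in reversed(range(len(params))):
        W, _ = params[i]
        g = W.T @ g                             # df/dh_i
        if i > 0:
            g = g * (preacts[i - 1] > 0)        # ReLU derivative
    return h.item(), g

rng = np.random.default_rng(0)
for n in (64, 256, 1024):                       # input dimensions (illustrative)
    ratios = []
    for _ in range(50):                         # random networks and inputs per dimension
        params = random_mlp(n, width=512, depth=3, rng=rng)
        x = rng.normal(size=n)
        f, grad = forward_and_grad(params, x)
        dist = abs(f) / np.linalg.norm(grad)    # first-order l2 distance to the boundary
        ratios.append(dist * np.sqrt(n) / np.linalg.norm(x))
    print(f"n = {n:5d}   median dist * sqrt(n) / ||x||_2 = {np.median(ratios):.4f}")

The first-order margin is only a linear approximation of the true distance to the boundary, and this sketch checks the p = 2 case only; the paper's guarantees concern the exact distance for all p >= 1 and rest on the Gaussian-process limit of infinitely wide hidden layers.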

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-de-palma21a,
  title     = {Adversarial Robustness Guarantees for Random Deep Neural Networks},
  author    = {De Palma, Giacomo and Kiani, Bobak and Lloyd, Seth},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2522--2534},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/de-palma21a/de-palma21a.pdf},
  url       = {https://proceedings.mlr.press/v139/de-palma21a.html},
  abstract  = {The reliability of deep learning algorithms is fundamentally challenged by the existence of adversarial examples, which are incorrectly classified inputs that are extremely close to a correctly classified input. We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any $p \geq 1$, the $\ell^p$ distance of any given input from the classification boundary scales as one over the square root of the dimension of the input times the $\ell^p$ norm of the input. The results are based on the recently proved equivalence between Gaussian processes and deep neural networks in the limit of infinite width of the hidden layers, and are validated with experiments on both random deep neural networks and deep neural networks trained on the MNIST and CIFAR10 datasets. The results constitute a fundamental advance in the theoretical understanding of adversarial examples, and open the way to a thorough theoretical characterization of the relation between network architecture and robustness to adversarial perturbations.}
}
Endnote
%0 Conference Paper
%T Adversarial Robustness Guarantees for Random Deep Neural Networks
%A Giacomo De Palma
%A Bobak Kiani
%A Seth Lloyd
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-de-palma21a
%I PMLR
%P 2522--2534
%U https://proceedings.mlr.press/v139/de-palma21a.html
%V 139
%X The reliability of deep learning algorithms is fundamentally challenged by the existence of adversarial examples, which are incorrectly classified inputs that are extremely close to a correctly classified input. We explore the properties of adversarial examples for deep neural networks with random weights and biases, and prove that for any $p \geq 1$, the $\ell^p$ distance of any given input from the classification boundary scales as one over the square root of the dimension of the input times the $\ell^p$ norm of the input. The results are based on the recently proved equivalence between Gaussian processes and deep neural networks in the limit of infinite width of the hidden layers, and are validated with experiments on both random deep neural networks and deep neural networks trained on the MNIST and CIFAR10 datasets. The results constitute a fundamental advance in the theoretical understanding of adversarial examples, and open the way to a thorough theoretical characterization of the relation between network architecture and robustness to adversarial perturbations.
APA
De Palma, G., Kiani, B. & Lloyd, S. (2021). Adversarial Robustness Guarantees for Random Deep Neural Networks. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2522-2534. Available from https://proceedings.mlr.press/v139/de-palma21a.html.
