Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

Francesco Croce; Matthias Hein

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:2196-2205, 2020.

Abstract

The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high quality results, minimizes the size of the perturbation (so that it returns the robust accuracy at every threshold with a single run). It performs better or similar to state-of-the-art attacks which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient obfuscation.
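
The remark that a minimum-norm attack "returns the robust accuracy at every threshold with a single run" can be illustrated with a small sketch. It is not taken from the paper: the function name `robust_accuracy_curve`, the input convention (0.0 for points that are already misclassified, `None` where no adversarial example was found) and the sample numbers are all assumptions made for illustration. Because an empirical attack only finds an upper bound on the true minimal perturbation, the resulting values are upper bounds on robust accuracy.

```python
from typing import List, Optional, Sequence


def robust_accuracy_curve(
    min_norms: Sequence[Optional[float]],
    thresholds: Sequence[float],
) -> List[float]:
    """Robust accuracy at each threshold from one set of per-point norms.

    min_norms[i] is the norm of the smallest adversarial perturbation found
    for point i (hypothetical convention: 0.0 if the point is misclassified
    without any perturbation, None if the attack found no adversarial example).
    """
    n = len(min_norms)
    if n == 0:
        raise ValueError("min_norms must not be empty")
    curve = []
    for eps in thresholds:
        # A point counts as robust at eps if no perturbation of norm <= eps
        # was found that changes its class.
        robust = sum(1 for d in min_norms if d is None or d > eps)
        curve.append(robust / n)
    return curve


# Illustrative (made-up) norms for five test points.
norms = [0.0, 0.3, 0.8, 1.5, None]
curve = robust_accuracy_curve(norms, [0.0, 0.5, 1.0, 2.0])
print(curve)
```

The value at threshold 0 is the clean accuracy, and every other threshold is read off the same list of norms without re-running the attack, which is the practical advantage over attacks that must be run once per fixed threshold.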

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-croce20a,
  title = 	 {Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack},
  author =       {Croce, Francesco and Hein, Matthias},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {2196--2205},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/croce20a/croce20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/croce20a.html},
  abstract = 	 {The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high quality results, minimizes the size of the perturbation (so that it returns the robust accuracy at every threshold with a single run). It performs better or similar to state-of-the-art attacks which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient obfuscation.}
}

Endnote

%0 Conference Paper
%T Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
%A Francesco Croce
%A Matthias Hein
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-croce20a
%I PMLR
%P 2196--2205
%U https://proceedings.mlr.press/v119/croce20a.html
%V 119
%X The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high quality results, minimizes the size of the perturbation (so that it returns the robust accuracy at every threshold with a single run). It performs better or similar to state-of-the-art attacks which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient obfuscation.

APA


Croce, F. & Hein, M. (2020). Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:2196-2205. Available from https://proceedings.mlr.press/v119/croce20a.html.
