How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy

Raphael Olivier, Bhiksha Raj
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:26583-26598, 2023.

Abstract

Robustness to adversarial attacks is typically evaluated with adversarial accuracy. While essential, this metric does not capture all aspects of robustness and in particular leaves out the question of how many perturbations can be found for each point. In this work, we introduce an alternative approach, adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation. We show that sparsity provides valuable insight into neural networks in multiple ways: for instance, it highlights important differences between current state-of-the-art robust models that accuracy analysis does not, and suggests approaches for improving their robustness. When applied to broken defenses that are effective against weak attacks but not strong ones, sparsity can discriminate between totally ineffective and partially effective defenses. Finally, with sparsity we can measure increases in robustness that do not affect accuracy: for example, we show that data augmentation alone can increase adversarial robustness, without adversarial training.
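The abstract does not spell out how adversarial sparsity is computed, so the sketch below is only a minimal illustration of the idea, not the authors' procedure. It assumes an image classifier with inputs in [0, 1], uses random sign patterns as the direction constraints on an L-infinity ball, and reports the fraction of sampled constraints under which a bounded perturbation flips the prediction. The names model, x, y, eps, and n_directions are placeholders introduced for this example.

import torch

def sparsity_proxy(model, x, y, eps=8 / 255, n_directions=100):
    """Illustrative proxy: fraction of sampled direction constraints that
    admit a successful perturbation for input x (shape (1, C, H, W)) with
    true label y. Higher values mean perturbations are easier to find."""
    model.eval()
    successes = 0
    for _ in range(n_directions):
        # Constrain the perturbation to a random orthant of the L-inf ball:
        # each coordinate may only move in one fixed sign direction.
        signs = torch.sign(torch.randn_like(x))
        delta = eps * signs  # maximal step allowed under this constraint
        with torch.no_grad():
            pred = model((x + delta).clamp(0, 1)).argmax(dim=-1)
        successes += int(pred.item() != y)
    return successes / n_directions

A stronger variant would run a gradient-based attack restricted to each sampled constraint rather than a single maximal step; the simple version above only conveys the counting intuition behind the metric.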

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-olivier23a, title = {How Many Perturbations Break This Model? {E}valuating Robustness Beyond Adversarial Accuracy}, author = {Olivier, Raphael and Raj, Bhiksha}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {26583--26598}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/olivier23a/olivier23a.pdf}, url = {https://proceedings.mlr.press/v202/olivier23a.html}, abstract = {Robustness to adversarial attacks is typically evaluated with adversarial accuracy. While essential, this metric does not capture all aspects of robustness and in particular leaves out the question of how many perturbations can be found for each point. In this work, we introduce an alternative approach, adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation. We show that sparsity provides valuable insight into neural networks in multiple ways: for instance, it highlights important differences between current state-of-the-art robust models that accuracy analysis does not, and suggests approaches for improving their robustness. When applied to broken defenses that are effective against weak attacks but not strong ones, sparsity can discriminate between totally ineffective and partially effective defenses. Finally, with sparsity we can measure increases in robustness that do not affect accuracy: for example, we show that data augmentation alone can increase adversarial robustness, without adversarial training.} }
Endnote
%0 Conference Paper %T How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy %A Raphael Olivier %A Bhiksha Raj %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-olivier23a %I PMLR %P 26583--26598 %U https://proceedings.mlr.press/v202/olivier23a.html %V 202 %X Robustness to adversarial attacks is typically evaluated with adversarial accuracy. While essential, this metric does not capture all aspects of robustness and in particular leaves out the question of how many perturbations can be found for each point. In this work, we introduce an alternative approach, adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation. We show that sparsity provides valuable insight into neural networks in multiple ways: for instance, it highlights important differences between current state-of-the-art robust models that accuracy analysis does not, and suggests approaches for improving their robustness. When applied to broken defenses that are effective against weak attacks but not strong ones, sparsity can discriminate between totally ineffective and partially effective defenses. Finally, with sparsity we can measure increases in robustness that do not affect accuracy: for example, we show that data augmentation alone can increase adversarial robustness, without adversarial training.
APA
Olivier, R. & Raj, B. (2023). How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:26583-26598. Available from https://proceedings.mlr.press/v202/olivier23a.html.
