Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer

Liang Liu, Yanan Guo, Youtao Zhang, Jun Yang
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:21631-21657, 2023.

Abstract

Vision Transformer (ViT) is an attention-based model architecture that has demonstrated superior performance on many computer vision tasks. However, its security properties, in particular its robustness against adversarial attacks, are yet to be thoroughly studied. Recent works have shown that ViT is vulnerable to attention-based adversarial patch attacks, which cover 1-3% of the input image area with adversarial patches and degrade the model accuracy to 0%. This work provides a generic study of the attention-based patch attack. First, we experimentally observe that adversarial patches activate in only a few layers and become lazy during attention updating. Building on these experiments, we study in theory how a small adversarial patch perturbs the whole model. Based on this understanding of adversarial patch attacks, we propose a simple but efficient defense that correctly detects more than 95% of adversarial patches.
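
The abstract's observation suggests a natural detection strategy: if adversarial patch tokens dominate attention in only a few layers, a defender can flag tokens whose received attention is a statistical outlier. The Python sketch below illustrates that intuition only; the function name, the z-score statistic, and the threshold are illustrative assumptions, not the authors' actual defense.

import torch

def detect_suspicious_tokens(attn_maps, z_thresh=3.0):
    """Flag patch tokens that attract abnormally high attention.

    attn_maps: one tensor per layer, each of shape (heads, tokens, tokens),
    holding softmaxed attention weights. Returns a boolean mask over the
    image tokens (True = suspicious). The z-score statistic and the
    threshold are illustrative assumptions, not the paper's detector.
    """
    # Attention "received" by each token: sum each column over the query
    # rows, then average over heads and layers.
    received = torch.stack(
        [a.mean(dim=0).sum(dim=0) for a in attn_maps]
    ).mean(dim=0)
    received = received[1:]  # drop the CLS token

    # Flag tokens that attract far more attention than a typical token.
    z = (received - received.mean()) / (received.std() + 1e-8)
    return z > z_thresh

# Example with random maps from a hypothetical 12-layer, 12-head ViT
# with 197 tokens (CLS + 14x14 patches):
# attn_maps = [torch.softmax(torch.randn(12, 197, 197), -1) for _ in range(12)]
# mask = detect_suspicious_tokens(attn_maps)

In practice one would calibrate the threshold on clean images and inspect individual layers rather than an average, but the outlier test captures the core idea that attention-dominating patch tokens separate easily from benign ones.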

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-liu23n,
  title     = {Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer},
  author    = {Liu, Liang and Guo, Yanan and Zhang, Youtao and Yang, Jun},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {21631--21657},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/liu23n/liu23n.pdf},
  url       = {https://proceedings.mlr.press/v202/liu23n.html},
  abstract  = {Vision Transformer (ViT) is an attention-based model architecture that has demonstrated superior performance on many computer vision tasks. However, its security properties, in particular its robustness against adversarial attacks, are yet to be thoroughly studied. Recent works have shown that ViT is vulnerable to attention-based adversarial patch attacks, which cover 1-3% of the input image area with adversarial patches and degrade the model accuracy to 0%. This work provides a generic study of the attention-based patch attack. First, we experimentally observe that adversarial patches activate in only a few layers and become lazy during attention updating. Building on these experiments, we study in theory how a small adversarial patch perturbs the whole model. Based on this understanding of adversarial patch attacks, we propose a simple but efficient defense that correctly detects more than 95% of adversarial patches.}
}
EndNote
%0 Conference Paper
%T Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer
%A Liang Liu
%A Yanan Guo
%A Youtao Zhang
%A Jun Yang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-liu23n
%I PMLR
%P 21631--21657
%U https://proceedings.mlr.press/v202/liu23n.html
%V 202
%X Vision Transformer (ViT) is an attention-based model architecture that has demonstrated superior performance on many computer vision tasks. However, its security properties, in particular its robustness against adversarial attacks, are yet to be thoroughly studied. Recent works have shown that ViT is vulnerable to attention-based adversarial patch attacks, which cover 1-3% of the input image area with adversarial patches and degrade the model accuracy to 0%. This work provides a generic study of the attention-based patch attack. First, we experimentally observe that adversarial patches activate in only a few layers and become lazy during attention updating. Building on these experiments, we study in theory how a small adversarial patch perturbs the whole model. Based on this understanding of adversarial patch attacks, we propose a simple but efficient defense that correctly detects more than 95% of adversarial patches.
APA
Liu, L., Guo, Y., Zhang, Y. & Yang, J. (2023). Understanding and Defending Patched-based Adversarial Attacks for Vision Transformer. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:21631-21657. Available from https://proceedings.mlr.press/v202/liu23n.html.