Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency

Runqi Lin, Chaojian Yu, Bo Han, Hang Su, Tongliang Liu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:30427-30439, 2024.

Abstract

Catastrophic overfitting (CO) presents a significant challenge in single-step adversarial training (AT), manifesting as highly distorted deep neural networks (DNNs) that are vulnerable to multi-step adversarial attacks. However, the underlying factors that lead to the distortion of decision boundaries remain unclear. In this work, we delve into the specific changes within different DNN layers and discover that during CO the earlier layers are more susceptible, becoming distorted sooner and more severely, while the later layers show relative insensitivity. Our analysis further reveals that this increased sensitivity in the earlier layers stems from the formation of pseudo-robust shortcuts, which alone can perfectly defend against single-step adversarial attacks but bypass genuinely robust learning, resulting in distorted decision boundaries. Eliminating these shortcuts can partially restore robustness in DNNs from the CO state, thereby verifying that dependence on them triggers the occurrence of CO. This understanding motivates us to apply adaptive weight perturbations across different layers to hinder the generation of pseudo-robust shortcuts, consequently mitigating CO. Extensive experiments demonstrate that our proposed method, Layer-Aware Adversarial Weight Perturbation (LAP), can effectively prevent CO and further enhance robustness.
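The method itself is specified in the full paper; as a rough illustration of the idea the abstract describes (single-step AT combined with adversarial weight perturbations whose magnitude depends on layer depth), a minimal PyTorch-style sketch might look like the following. The geometric depth-decay schedule, the per-tensor treatment of "layers", and the hyperparameter names (rho, gamma) are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only -- NOT the authors' LAP implementation.
# Assumes PyTorch; the decay schedule and hyperparameters (rho, gamma)
# are placeholder choices, and inputs are assumed to lie in [0, 1].
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    # Single-step (FGSM) adversarial example, as used in single-step AT.
    delta = torch.zeros_like(x, requires_grad=True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def perturb_weights_layer_aware(model, x_adv, y, rho=5e-3, gamma=0.5):
    # Adversarially perturb weights, with larger perturbations in earlier
    # layers (where pseudo-robust shortcuts are reported to form) and a
    # geometric decay toward later layers. Each parameter tensor is used
    # as a crude proxy for one "layer".
    params = [p for p in model.parameters() if p.requires_grad]
    loss = F.cross_entropy(model(x_adv), y)
    grads = torch.autograd.grad(loss, params)
    n = max(len(params) - 1, 1)
    vs = []
    for depth, (p, g) in enumerate(zip(params, grads)):
        scale = rho * gamma ** (depth / n)            # decays with depth
        v = scale * p.norm() * g / (g.norm() + 1e-12)
        p.data.add_(v)                                # ascend in weight space
        vs.append(v)
    return vs

def restore_weights(model, vs):
    params = [p for p in model.parameters() if p.requires_grad]
    for p, v in zip(params, vs):
        p.data.sub_(v)

def training_step(model, optimizer, x, y, eps=8 / 255):
    x_adv = fgsm_attack(model, x, y, eps)             # inner maximization
    vs = perturb_weights_layer_aware(model, x_adv, y)
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()       # grads at perturbed weights
    restore_weights(model, vs)                        # undo before the update
    optimizer.step()

A training loop would simply call training_step per batch. The restore-before-step pattern follows the usual adversarial-weight-perturbation recipe (compute gradients at perturbed weights, apply them at the original weights); the actual layer-wise perturbation rule should be taken from the paper.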

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-lin24v,
  title     = {Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency},
  author    = {Lin, Runqi and Yu, Chaojian and Han, Bo and Su, Hang and Liu, Tongliang},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {30427--30439},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lin24v/lin24v.pdf},
  url       = {https://proceedings.mlr.press/v235/lin24v.html}
}
Endnote
%0 Conference Paper
%T Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
%A Runqi Lin
%A Chaojian Yu
%A Bo Han
%A Hang Su
%A Tongliang Liu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lin24v
%I PMLR
%P 30427--30439
%U https://proceedings.mlr.press/v235/lin24v.html
%V 235
APA
Lin, R., Yu, C., Han, B., Su, H. & Liu, T. (2024). Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:30427-30439. Available from https://proceedings.mlr.press/v235/lin24v.html.
