Phase-shifted adversarial training

Yeachan Kim, Seongyeon Kim, Ihyeok Seo, Bonggun Shin
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1068-1077, 2023.

Abstract

Adversarial training (AT) has been considered an imperative component for safely deploying neural network-based applications. However, it typically comes with slow convergence and worse performance on clean samples (i.e., non-adversarial samples). In this work, we analyze the behavior of neural networks during learning with adversarial samples through the lens of response frequency. Interestingly, we observe that AT causes neural networks to converge slowly to high-frequency information, resulting in highly oscillatory predictions near each data point. To learn high-frequency content efficiently, we first prove that a universal phenomenon, the frequency principle (i.e., lower frequencies are learned first), still holds in AT. Building upon this theoretical foundation, we present a novel approach to AT, which we call phase-shifted adversarial training (PhaseAT). In PhaseAT, the high-frequency components, which are a contributing factor to slow convergence, are adaptively shifted into the low-frequency range where faster convergence occurs. For evaluation, we conduct extensive experiments on CIFAR-10 and ImageNet, using an adaptive attack that is carefully designed for reliable evaluation. Comprehensive results show that PhaseAT substantially improves convergence for high-frequency information, thereby leading to improved adversarial robustness.

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-kim23b,
  title     = {Phase-shifted adversarial training},
  author    = {Kim, Yeachan and Kim, Seongyeon and Seo, Ihyeok and Shin, Bonggun},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1068--1077},
  year      = {2023},
  editor    = {Evans, Robin J. and Shpitser, Ilya},
  volume    = {216},
  series    = {Proceedings of Machine Learning Research},
  month     = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v216/kim23b/kim23b.pdf},
  url       = {https://proceedings.mlr.press/v216/kim23b.html},
  abstract  = {Adversarial training (AT) has been considered an imperative component for safely deploying neural network-based applications. However, it typically comes with slow convergence and worse performance on clean samples (i.e., non-adversarial samples). In this work, we analyze the behavior of neural networks during learning with adversarial samples through the lens of response frequency. Interestingly, we observe that AT causes neural networks to converge slowly to high-frequency information, resulting in highly oscillatory predictions near each data point. To learn high-frequency content efficiently, we first prove that a universal phenomenon, the frequency principle (i.e., lower frequencies are learned first), still holds in AT. Building upon this theoretical foundation, we present a novel approach to AT, which we call phase-shifted adversarial training (PhaseAT). In PhaseAT, the high-frequency components, which are a contributing factor to slow convergence, are adaptively shifted into the low-frequency range where faster convergence occurs. For evaluation, we conduct extensive experiments on CIFAR-10 and ImageNet, using an adaptive attack that is carefully designed for reliable evaluation. Comprehensive results show that PhaseAT substantially improves convergence for high-frequency information, thereby leading to improved adversarial robustness.}
}
Endnote
%0 Conference Paper
%T Phase-shifted adversarial training
%A Yeachan Kim
%A Seongyeon Kim
%A Ihyeok Seo
%A Bonggun Shin
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-kim23b
%I PMLR
%P 1068--1077
%U https://proceedings.mlr.press/v216/kim23b.html
%V 216
%X Adversarial training (AT) has been considered an imperative component for safely deploying neural network-based applications. However, it typically comes with slow convergence and worse performance on clean samples (i.e., non-adversarial samples). In this work, we analyze the behavior of neural networks during learning with adversarial samples through the lens of response frequency. Interestingly, we observe that AT causes neural networks to converge slowly to high-frequency information, resulting in highly oscillatory predictions near each data point. To learn high-frequency content efficiently, we first prove that a universal phenomenon, the frequency principle (i.e., lower frequencies are learned first), still holds in AT. Building upon this theoretical foundation, we present a novel approach to AT, which we call phase-shifted adversarial training (PhaseAT). In PhaseAT, the high-frequency components, which are a contributing factor to slow convergence, are adaptively shifted into the low-frequency range where faster convergence occurs. For evaluation, we conduct extensive experiments on CIFAR-10 and ImageNet, using an adaptive attack that is carefully designed for reliable evaluation. Comprehensive results show that PhaseAT substantially improves convergence for high-frequency information, thereby leading to improved adversarial robustness.
APA
Kim, Y., Kim, S., Seo, I., & Shin, B. (2023). Phase-shifted adversarial training. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1068-1077. Available from https://proceedings.mlr.press/v216/kim23b.html.
