Adversarial Purification with Score-based Generative Models

Jongmin Yoon, Sung Ju Hwang, Juho Lee
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:12062-12072, 2021.

Abstract

While adversarial training is considered as a standard defense method against adversarial attacks for image classifiers, adversarial purification, which purifies attacked images into clean images with a standalone purification, model has shown promises as an alternative defense method. Recently, an EBM trained with MCMC has been highlighted as a purification model, where an attacked image is purified by running a long Markov-chain using the gradients of the EBM. Yet, the practicality of the adversarial purification using an EBM remains questionable because the number of MCMC steps required for such purification is too large. In this paper, we propose a novel adversarial purification method based on an EBM trained with DSM. We show that an EBM trained with DSM can quickly purify attacked images within a few steps. We further introduce a simple yet effective randomized purification scheme that injects random noises into images before purification. This process screens the adversarial perturbations imposed on images by the random noises and brings the images to the regime where the EBM can denoise well. We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performances.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-yoon21a, title = {Adversarial Purification with Score-based Generative Models}, author = {Yoon, Jongmin and Hwang, Sung Ju and Lee, Juho}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {12062--12072}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/yoon21a/yoon21a.pdf}, url = {https://proceedings.mlr.press/v139/yoon21a.html}, abstract = {While adversarial training is considered as a standard defense method against adversarial attacks for image classifiers, adversarial purification, which purifies attacked images into clean images with a standalone purification, model has shown promises as an alternative defense method. Recently, an EBM trained with MCMC has been highlighted as a purification model, where an attacked image is purified by running a long Markov-chain using the gradients of the EBM. Yet, the practicality of the adversarial purification using an EBM remains questionable because the number of MCMC steps required for such purification is too large. In this paper, we propose a novel adversarial purification method based on an EBM trained with DSM. We show that an EBM trained with DSM can quickly purify attacked images within a few steps. We further introduce a simple yet effective randomized purification scheme that injects random noises into images before purification. This process screens the adversarial perturbations imposed on images by the random noises and brings the images to the regime where the EBM can denoise well. We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performances.} }
Endnote
%0 Conference Paper %T Adversarial Purification with Score-based Generative Models %A Jongmin Yoon %A Sung Ju Hwang %A Juho Lee %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-yoon21a %I PMLR %P 12062--12072 %U https://proceedings.mlr.press/v139/yoon21a.html %V 139 %X While adversarial training is considered as a standard defense method against adversarial attacks for image classifiers, adversarial purification, which purifies attacked images into clean images with a standalone purification, model has shown promises as an alternative defense method. Recently, an EBM trained with MCMC has been highlighted as a purification model, where an attacked image is purified by running a long Markov-chain using the gradients of the EBM. Yet, the practicality of the adversarial purification using an EBM remains questionable because the number of MCMC steps required for such purification is too large. In this paper, we propose a novel adversarial purification method based on an EBM trained with DSM. We show that an EBM trained with DSM can quickly purify attacked images within a few steps. We further introduce a simple yet effective randomized purification scheme that injects random noises into images before purification. This process screens the adversarial perturbations imposed on images by the random noises and brings the images to the regime where the EBM can denoise well. We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performances.
APA
Yoon, J., Hwang, S.J. & Lee, J.. (2021). Adversarial Purification with Score-based Generative Models. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:12062-12072 Available from https://proceedings.mlr.press/v139/yoon21a.html.

Related Material