On Breaking Deep Generative Model-based Defenses and Beyond

Yanzhi Chen, Renjie Xie, Zhanxing Zhu
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1736-1745, 2020.

Abstract

Deep neural networks have been proven vulnerable to so-called adversarial attacks. Recently there have been efforts to defend against such attacks with deep generative models. These defenses often predict by inverting the deep generative model rather than by simple feedforward propagation, and they are difficult to attack due to the obfuscated gradients caused by the inversion. In this work, we propose a new white-box attack to break these defenses. The idea is to view the inversion phase as a dynamical system, through which we extract the gradient w.r.t. the image by backtracking its trajectory. An amortized strategy is also developed to accelerate the attack. Experiments show that our attack breaks state-of-the-art defenses (e.g., DefenseGAN, ABS) better than other attacks (e.g., BPDA) do. Additionally, our empirical results provide insights into the weaknesses of deep generative model defenses.
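The core idea — treating the inversion as a dynamical system and backtracking its trajectory to recover the gradient w.r.t. the input image — can be illustrated on a toy problem. The sketch below is not the paper's algorithm; it assumes a hypothetical linear "generator" G(z) = Wz inverted by T steps of gradient descent on ||G(z) − x||², so each inversion step is the linear map z ← Az + Bx. Differentiating this recurrence step by step (J ← AJ + B) yields the Jacobian dz_T/dx that a trajectory-backtracking attacker would feed into the attack, and we verify it against finite differences. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_z = 6, 3
W = rng.normal(size=(d_x, d_z))   # toy linear "generator": G(z) = W z
x = rng.normal(size=d_x)          # input image (flattened)
eta, T = 0.02, 300                # inversion step size and number of steps

# One gradient-descent step on ||W z - x||^2 is the affine map z <- A z + B x.
A = np.eye(d_z) - 2 * eta * W.T @ W
B = 2 * eta * W.T

# Run the inversion dynamics while accumulating the Jacobian dz_T/dx
# via the same recurrence (this is the "backtracking" through the trajectory).
z = np.zeros(d_z)
J = np.zeros((d_z, d_x))
for _ in range(T):
    z = A @ z + B @ x
    J = A @ J + B

# Sanity check: compare against finite differences over the unrolled inversion.
eps = 1e-6
J_fd = np.zeros_like(J)
for i in range(d_x):
    xp = x.copy()
    xp[i] += eps
    zp = np.zeros(d_z)
    for _ in range(T):
        zp = A @ zp + B @ xp
    J_fd[:, i] = (zp - z) / eps

print(np.allclose(J, J_fd, atol=1e-4))  # → True
```

With a nonlinear generator the per-step map is no longer constant, so the attacker stores the trajectory and backtracks through the linearization of each step (as automatic differentiation of the unrolled loop would do), which is precisely what makes the inversion differentiable despite its obfuscated gradients.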

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chen20w,
  title     = {On Breaking Deep Generative Model-based Defenses and Beyond},
  author    = {Chen, Yanzhi and Xie, Renjie and Zhu, Zhanxing},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1736--1745},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chen20w/chen20w.pdf},
  url       = {https://proceedings.mlr.press/v119/chen20w.html},
  abstract  = {Deep neural networks have been proven to be vulnerable to the so-called adversarial attacks. Recently there have been efforts to defend such attacks with deep generative models. These defenses often predict by inverting the deep generative models rather than simple feedforward propagation. Such defenses are difficult to attack due to the obfuscated gradients caused by inversion. In this work, we propose a new white-box attack to break these defenses. The idea is to view the inversion phase as a dynamical system, through which we extract the gradient w.r.t the image by backtracking its trajectory. An amortized strategy is also developed to accelerate the attack. Experiments show that our attack better breaks state-of-the-art defenses (e.g DefenseGAN, ABS) than other attacks (e.g BPDA). Additionally, our empirical results provide insights for understanding the weaknesses of deep generative model defenses.}
}
Endnote
%0 Conference Paper
%T On Breaking Deep Generative Model-based Defenses and Beyond
%A Yanzhi Chen
%A Renjie Xie
%A Zhanxing Zhu
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chen20w
%I PMLR
%P 1736--1745
%U https://proceedings.mlr.press/v119/chen20w.html
%V 119
%X Deep neural networks have been proven to be vulnerable to the so-called adversarial attacks. Recently there have been efforts to defend such attacks with deep generative models. These defenses often predict by inverting the deep generative models rather than simple feedforward propagation. Such defenses are difficult to attack due to the obfuscated gradients caused by inversion. In this work, we propose a new white-box attack to break these defenses. The idea is to view the inversion phase as a dynamical system, through which we extract the gradient w.r.t the image by backtracking its trajectory. An amortized strategy is also developed to accelerate the attack. Experiments show that our attack better breaks state-of-the-art defenses (e.g DefenseGAN, ABS) than other attacks (e.g BPDA). Additionally, our empirical results provide insights for understanding the weaknesses of deep generative model defenses.
APA
Chen, Y., Xie, R. & Zhu, Z. (2020). On Breaking Deep Generative Model-based Defenses and Beyond. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1736-1745. Available from https://proceedings.mlr.press/v119/chen20w.html.