Adversarial Robustness of Stabilized Neural ODE Might be from Obfuscated Gradients

Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, Yuan Yao
Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:497-515, 2022.

Abstract

In this paper we introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) that achieves non-trivial adversarial robustness under white-box adversarial attacks even when the network is trained naturally. Most existing defense methods that withstand strong white-box attacks require adversarial training to improve the robustness of neural networks, and hence must strike a trade-off between natural accuracy and adversarial robustness. Inspired by dynamical systems theory, we design a stabilized neural ODE network named SONet whose ODE blocks are skew-symmetric and provably input-output stable. With natural training, SONet achieves robustness comparable to state-of-the-art adversarial defense methods, without sacrificing natural accuracy. Even replacing only the first layer of a ResNet by such an ODE block yields further improvement in robustness: e.g., under the PGD-20 ($\ell_\infty=0.031$) attack on the CIFAR-10 dataset, it achieves 91.57% natural accuracy and 62.35% robust accuracy, while a counterpart ResNet architecture trained with TRADES achieves 76.29% natural accuracy and 45.24% robust accuracy. To understand the possible reasons behind this surprisingly good result, we further explore the mechanism underlying such robustness. We show that adaptive-stepsize numerical ODE solvers, such as adaptive HEUN2, BOSH3, and DOPRI5, have a gradient masking effect that defeats PGD attacks, which are sensitive to gradient information of the training loss; on the other hand, they cannot fool the CW attack, which relies on more robust gradients, or the gradient-free SPSA attack. This provides a new explanation that the adversarial robustness of ODE-based networks mainly comes from the obfuscated gradients in numerical ODE solvers with adaptive step sizes. (Source code: \url{https://github.com/silkylove/SONet}; \url{https://github.com/yao-lab/SONet})
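For a concrete picture of the construction described in the abstract, the sketch below (not the authors' released code; see the repositories above for that) shows one way a skew-symmetric ODE block can be parameterized and integrated with an adaptive-step solver. The torchdiffeq package, the fully-connected parameterization, and the damping constant gamma are illustrative assumptions. The key point is that A = W - W^T - gamma*I satisfies A + A^T = -2*gamma*I, a strictly dissipative linear part of the kind underlying the input-output stability claim, and that adaptive solvers such as DOPRI5 or BOSH3 are exactly the integrators the paper links to obfuscated gradients.

```python
# Minimal sketch of a stabilized (skew-symmetric) ODE block, assuming the
# torchdiffeq package (pip install torchdiffeq). Illustrative only, not the
# authors' implementation.
import torch
import torch.nn as nn
from torchdiffeq import odeint


class SkewSymmetricODEFunc(nn.Module):
    """Dynamics dh/dt = tanh((W - W^T - gamma*I) h + b).

    The matrix A = W - W^T - gamma*I satisfies A + A^T = -2*gamma*I, so the
    linear part is strictly dissipative regardless of the learned W, which is
    the stabilization idea behind a skew-symmetric ODE block.
    """

    def __init__(self, dim: int, gamma: float = 0.1):
        super().__init__()
        self.W = nn.Parameter(torch.randn(dim, dim) / dim ** 0.5)
        self.b = nn.Parameter(torch.zeros(dim))
        self.gamma = gamma

    def forward(self, t, h):
        A = self.W - self.W.t() - self.gamma * torch.eye(
            self.W.shape[0], device=h.device)
        # Apply A to each row of the batch: (A h)_i = sum_j A_ij h_j.
        return torch.tanh(h @ A.t() + self.b)


class ODEBlock(nn.Module):
    """Integrates the dynamics over t in [0, 1] with an adaptive-step solver
    ('dopri5', 'bosh3', ...); adaptive step sizes are what the paper
    identifies as the source of gradient masking."""

    def __init__(self, func: nn.Module, method: str = "dopri5"):
        super().__init__()
        self.func = func
        self.method = method

    def forward(self, h0):
        t = torch.tensor([0.0, 1.0], device=h0.device)
        # odeint returns the solution at every requested time; keep the final state.
        return odeint(self.func, h0, t, method=self.method,
                      rtol=1e-3, atol=1e-3)[-1]


# Usage sketch:
# block = ODEBlock(SkewSymmetricODEFunc(dim=64))
# y = block(torch.randn(8, 64))
```

A gradient-based attack such as PGD must backpropagate through the solver, including its step-size controller; per the paper's analysis, it is this adaptive control, rather than the stable dynamics alone, that obfuscates the gradients PGD relies on, while CW and the gradient-free SPSA remain effective.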

Cite this Paper


BibTeX
@InProceedings{pmlr-v145-huang22a,
  title     = {Adversarial Robustness of Stabilized Neural ODE Might be from Obfuscated Gradients},
  author    = {Huang, Yifei and Yu, Yaodong and Zhang, Hongyang and Ma, Yi and Yao, Yuan},
  booktitle = {Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference},
  pages     = {497--515},
  year      = {2022},
  editor    = {Bruna, Joan and Hesthaven, Jan and Zdeborova, Lenka},
  volume    = {145},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--19 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v145/huang22a/huang22a.pdf},
  url       = {https://proceedings.mlr.press/v145/huang22a.html},
  abstract  = {In this paper we introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) that achieves non-trivial adversarial robustness under white-box adversarial attacks even when the network is trained naturally. Most existing defense methods that withstand strong white-box attacks require adversarial training to improve the robustness of neural networks, and hence must strike a trade-off between natural accuracy and adversarial robustness. Inspired by dynamical systems theory, we design a stabilized neural ODE network named SONet whose ODE blocks are skew-symmetric and provably input-output stable. With natural training, SONet achieves robustness comparable to state-of-the-art adversarial defense methods, without sacrificing natural accuracy. Even replacing only the first layer of a ResNet by such an ODE block yields further improvement in robustness: e.g., under the PGD-20 ($\ell_\infty=0.031$) attack on the CIFAR-10 dataset, it achieves 91.57% natural accuracy and 62.35% robust accuracy, while a counterpart ResNet architecture trained with TRADES achieves 76.29% natural accuracy and 45.24% robust accuracy. To understand the possible reasons behind this surprisingly good result, we further explore the mechanism underlying such robustness. We show that adaptive-stepsize numerical ODE solvers, such as adaptive HEUN2, BOSH3, and DOPRI5, have a gradient masking effect that defeats PGD attacks, which are sensitive to gradient information of the training loss; on the other hand, they cannot fool the CW attack, which relies on more robust gradients, or the gradient-free SPSA attack. This provides a new explanation that the adversarial robustness of ODE-based networks mainly comes from the obfuscated gradients in numerical ODE solvers with adaptive step sizes. (Source code: \url{https://github.com/silkylove/SONet}; \url{https://github.com/yao-lab/SONet})}
}
Endnote
%0 Conference Paper
%T Adversarial Robustness of Stabilized Neural ODE Might be from Obfuscated Gradients
%A Yifei Huang
%A Yaodong Yu
%A Hongyang Zhang
%A Yi Ma
%A Yuan Yao
%B Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Joan Bruna
%E Jan Hesthaven
%E Lenka Zdeborova
%F pmlr-v145-huang22a
%I PMLR
%P 497--515
%U https://proceedings.mlr.press/v145/huang22a.html
%V 145
%X In this paper we introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) that achieves non-trivial adversarial robustness under white-box adversarial attacks even when the network is trained naturally. Most existing defense methods that withstand strong white-box attacks require adversarial training to improve the robustness of neural networks, and hence must strike a trade-off between natural accuracy and adversarial robustness. Inspired by dynamical systems theory, we design a stabilized neural ODE network named SONet whose ODE blocks are skew-symmetric and provably input-output stable. With natural training, SONet achieves robustness comparable to state-of-the-art adversarial defense methods, without sacrificing natural accuracy. Even replacing only the first layer of a ResNet by such an ODE block yields further improvement in robustness: e.g., under the PGD-20 ($\ell_\infty=0.031$) attack on the CIFAR-10 dataset, it achieves 91.57% natural accuracy and 62.35% robust accuracy, while a counterpart ResNet architecture trained with TRADES achieves 76.29% natural accuracy and 45.24% robust accuracy. To understand the possible reasons behind this surprisingly good result, we further explore the mechanism underlying such robustness. We show that adaptive-stepsize numerical ODE solvers, such as adaptive HEUN2, BOSH3, and DOPRI5, have a gradient masking effect that defeats PGD attacks, which are sensitive to gradient information of the training loss; on the other hand, they cannot fool the CW attack, which relies on more robust gradients, or the gradient-free SPSA attack. This provides a new explanation that the adversarial robustness of ODE-based networks mainly comes from the obfuscated gradients in numerical ODE solvers with adaptive step sizes. (Source code: \url{https://github.com/silkylove/SONet}; \url{https://github.com/yao-lab/SONet})
APA
Huang, Y., Yu, Y., Zhang, H., Ma, Y., & Yao, Y. (2022). Adversarial Robustness of Stabilized Neural ODE Might be from Obfuscated Gradients. Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, in Proceedings of Machine Learning Research 145:497-515. Available from https://proceedings.mlr.press/v145/huang22a.html.
