On the Convergence and Robustness of Adversarial Training

Yisen Wang; Xingjun Ma; James Bailey; Jinfeng Yi; Bowen Zhou; Quanquan Gu

On the Convergence and Robustness of Adversarial Training

Yisen Wang, Xingjun Ma, James Bailey, Jinfeng Yi, Bowen Zhou, Quanquan Gu

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6586-6595, 2019.

Abstract

Improving the robustness of deep neural networks (DNNs) to adversarial examples is an important yet challenging problem for secure deep learning. Across existing defense techniques, adversarial training with Projected Gradient Decent (PGD) is amongst the most effective. Adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial examples by maximizing the classification loss, and the outer minimization finding model parameters by minimizing the loss on adversarial examples generated from the inner maximization. A criterion that measures how well the inner maximization is solved is therefore crucial for adversarial training. In this paper, we propose such a criterion, namely First-Order Stationary Condition for constrained optimization (FOSC), to quantitatively evaluate the convergence quality of adversarial examples found in the inner maximization. With FOSC, we find that to ensure better robustness, it is essential to use adversarial examples with better convergence quality at the later stages of training. Yet at the early stages, high convergence quality adversarial examples are not necessary and may even lead to poor robustness. Based on these observations, we propose a dynamic training strategy to gradually increase the convergence quality of the generated adversarial examples, which significantly improves the robustness of adversarial training. Our theoretical and empirical results show the effectiveness of the proposed method.

Cite this Paper

BibTeX

@InProceedings{pmlr-v97-wang19i,
  title = 	 {On the Convergence and Robustness of Adversarial Training},
  author =       {Wang, Yisen and Ma, Xingjun and Bailey, James and Yi, Jinfeng and Zhou, Bowen and Gu, Quanquan},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {6586--6595},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/wang19i/wang19i.pdf},
  url = 	 {https://proceedings.mlr.press/v97/wang19i.html},
  abstract = 	 {Improving the robustness of deep neural networks (DNNs) to adversarial examples is an important yet challenging problem for secure deep learning. Across existing defense techniques, adversarial training with Projected Gradient Decent (PGD) is amongst the most effective. Adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial examples by maximizing the classification loss, and the outer minimization finding model parameters by minimizing the loss on adversarial examples generated from the inner maximization. A criterion that measures how well the inner maximization is solved is therefore crucial for adversarial training. In this paper, we propose such a criterion, namely First-Order Stationary Condition for constrained optimization (FOSC), to quantitatively evaluate the convergence quality of adversarial examples found in the inner maximization. With FOSC, we find that to ensure better robustness, it is essential to use adversarial examples with better convergence quality at the later stages of training. Yet at the early stages, high convergence quality adversarial examples are not necessary and may even lead to poor robustness. Based on these observations, we propose a dynamic training strategy to gradually increase the convergence quality of the generated adversarial examples, which significantly improves the robustness of adversarial training. Our theoretical and empirical results show the effectiveness of the proposed method.}
}

Endnote

%0 Conference Paper
%T On the Convergence and Robustness of Adversarial Training
%A Yisen Wang
%A Xingjun Ma
%A James Bailey
%A Jinfeng Yi
%A Bowen Zhou
%A Quanquan Gu
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov	
%F pmlr-v97-wang19i
%I PMLR
%P 6586--6595
%U https://proceedings.mlr.press/v97/wang19i.html
%V 97
%X Improving the robustness of deep neural networks (DNNs) to adversarial examples is an important yet challenging problem for secure deep learning. Across existing defense techniques, adversarial training with Projected Gradient Decent (PGD) is amongst the most effective. Adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial examples by maximizing the classification loss, and the outer minimization finding model parameters by minimizing the loss on adversarial examples generated from the inner maximization. A criterion that measures how well the inner maximization is solved is therefore crucial for adversarial training. In this paper, we propose such a criterion, namely First-Order Stationary Condition for constrained optimization (FOSC), to quantitatively evaluate the convergence quality of adversarial examples found in the inner maximization. With FOSC, we find that to ensure better robustness, it is essential to use adversarial examples with better convergence quality at the later stages of training. Yet at the early stages, high convergence quality adversarial examples are not necessary and may even lead to poor robustness. Based on these observations, we propose a dynamic training strategy to gradually increase the convergence quality of the generated adversarial examples, which significantly improves the robustness of adversarial training. Our theoretical and empirical results show the effectiveness of the proposed method.

APA

Wang, Y., Ma, X., Bailey, J., Yi, J., Zhou, B. & Gu, Q.. (2019). On the Convergence and Robustness of Adversarial Training. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6586-6595 Available from https://proceedings.mlr.press/v97/wang19i.html.

On the Convergence and Robustness of Adversarial Training

Abstract

Cite this Paper

Related Material