Learning to Defend by Learning to Attack

Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:577-585, 2021.

Abstract

Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training essentially solves a bilevel optimization problem: the leader problem learns a robust classifier, while the follower maximization generates adversarial samples. Unfortunately, such a bilevel problem is difficult to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying existing hand-designed algorithms to the inner problem, we learn an optimizer, which is parametrized as a convolutional neural network. At the same time, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that L2L outperforms existing adversarial training methods in both classification accuracy and computational efficiency. Moreover, our L2L framework can be extended to generative adversarial imitation learning, where it stabilizes training.
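The bilevel structure described above can be illustrated with a toy sketch: an inner (follower) step that perturbs the inputs to increase the loss, and an outer (leader) step that updates the classifier on those perturbed inputs. Note this is a generic illustration, not the paper's L2L method: the inner attacker here is a hand-designed signed-gradient (FGSM-style) step on a logistic-regression model, whereas the paper replaces it with a learned CNN optimizer.

```python
import numpy as np

# Toy data: 2-D points with a linearly separable label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)   # classifier weights (leader's variables)
b = 0.0
eps, lr = 0.1, 0.5  # attack radius and learning rate (illustrative values)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(100):
    # Inner (follower) problem: craft adversarial samples that increase
    # the loss -- one signed-gradient step within an L_inf ball of radius eps.
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]   # d(logistic loss)/dx
    X_adv = X + eps * np.sign(grad_x)

    # Outer (leader) problem: update the classifier on adversarial samples.
    p_adv = sigmoid(X_adv @ w + b)
    w -= lr * (X_adv.T @ (p_adv - y)) / len(y)
    b -= lr * np.mean(p_adv - y)

# Clean accuracy of the adversarially trained classifier.
acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5))
```

In the L2L framework, the hand-designed `eps * np.sign(grad_x)` update would be replaced by the output of a learned optimizer network, trained jointly with the classifier.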

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-jiang21a,
  title = {Learning to Defend by Learning to Attack},
  author = {Jiang, Haoming and Chen, Zhehui and Shi, Yuyang and Dai, Bo and Zhao, Tuo},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = {577--585},
  year = {2021},
  editor = {Banerjee, Arindam and Fukumizu, Kenji},
  volume = {130},
  series = {Proceedings of Machine Learning Research},
  month = {13--15 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v130/jiang21a/jiang21a.pdf},
  url = {https://proceedings.mlr.press/v130/jiang21a.html},
  abstract = {Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training essentially solves a bilevel optimization problem: the leader problem learns a robust classifier, while the follower maximization generates adversarial samples. Unfortunately, such a bilevel problem is difficult to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying existing hand-designed algorithms to the inner problem, we learn an optimizer, which is parametrized as a convolutional neural network. At the same time, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that L2L outperforms existing adversarial training methods in both classification accuracy and computational efficiency. Moreover, our L2L framework can be extended to generative adversarial imitation learning, where it stabilizes training.}
}
Endnote
%0 Conference Paper
%T Learning to Defend by Learning to Attack
%A Haoming Jiang
%A Zhehui Chen
%A Yuyang Shi
%A Bo Dai
%A Tuo Zhao
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-jiang21a
%I PMLR
%P 577--585
%U https://proceedings.mlr.press/v130/jiang21a.html
%V 130
%X Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training essentially solves a bilevel optimization problem: the leader problem learns a robust classifier, while the follower maximization generates adversarial samples. Unfortunately, such a bilevel problem is difficult to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying existing hand-designed algorithms to the inner problem, we learn an optimizer, which is parametrized as a convolutional neural network. At the same time, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that L2L outperforms existing adversarial training methods in both classification accuracy and computational efficiency. Moreover, our L2L framework can be extended to generative adversarial imitation learning, where it stabilizes training.
APA
Jiang, H., Chen, Z., Shi, Y., Dai, B. & Zhao, T. (2021). Learning to Defend by Learning to Attack. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:577-585. Available from https://proceedings.mlr.press/v130/jiang21a.html.