Improving Adversarial Robustness via Unlabeled Out-of-Domain Data

Zhun Deng, Linjun Zhang, Amirata Ghorbani, James Zou
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2845-2853, 2021.

Abstract

Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction especially when there is limited labeled data. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify the extent to which this gap can be bridged by leveraging unlabeled samples from a shifted domain by providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between labeled and unlabeled domains. Experimentally, we augment object recognition datasets (CIFAR-10, CINIC-10, and SVHN) with easy-to-obtain and unlabeled out-of-domain data and demonstrate substantial improvement in the model’s robustness against $\ell_\infty$ adversarial attacks on the original domain.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-deng21b,
  title     = {Improving Adversarial Robustness via Unlabeled Out-of-Domain Data},
  author    = {Deng, Zhun and Zhang, Linjun and Ghorbani, Amirata and Zou, James},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {2845--2853},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/deng21b/deng21b.pdf},
  url       = {https://proceedings.mlr.press/v130/deng21b.html},
  abstract  = {Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction especially when there is limited labeled data. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify the extent to which this gap can be bridged by leveraging unlabeled samples from a shifted domain by providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between labeled and unlabeled domains. Experimentally, we augment object recognition datasets (CIFAR-10, CINIC-10, and SVHN) with easy-to-obtain and unlabeled out-of-domain data and demonstrate substantial improvement in the model's robustness against $\ell_\infty$ adversarial attacks on the original domain.}
}
Endnote
%0 Conference Paper
%T Improving Adversarial Robustness via Unlabeled Out-of-Domain Data
%A Zhun Deng
%A Linjun Zhang
%A Amirata Ghorbani
%A James Zou
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-deng21b
%I PMLR
%P 2845--2853
%U https://proceedings.mlr.press/v130/deng21b.html
%V 130
%X Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction especially when there is limited labeled data. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify the extent to which this gap can be bridged by leveraging unlabeled samples from a shifted domain by providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between labeled and unlabeled domains. Experimentally, we augment object recognition datasets (CIFAR-10, CINIC-10, and SVHN) with easy-to-obtain and unlabeled out-of-domain data and demonstrate substantial improvement in the model's robustness against $\ell_\infty$ adversarial attacks on the original domain.
APA
Deng, Z., Zhang, L., Ghorbani, A. & Zou, J. (2021). Improving Adversarial Robustness via Unlabeled Out-of-Domain Data. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2845-2853. Available from https://proceedings.mlr.press/v130/deng21b.html.