More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models

Lin Chen, Yifei Min, Mingrui Zhang, Amin Karbasi
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1670-1680, 2020.

Abstract

Despite remarkable success in practice, modern machine learning models have been found to be susceptible to adversarial attacks that make human-imperceptible perturbations to the data, but result in serious and potentially dangerous prediction errors. To address this issue, practitioners often use adversarial training to learn models that are robust against such attacks at the cost of higher generalization error on unperturbed test sets. The conventional wisdom is that more training data should shrink the gap between the generalization error of adversarially-trained models and standard models. However, we study the training of robust classifiers for both Gaussian and Bernoulli models under $\ell_\infty$ attacks, and we prove that more data may actually increase this gap. Furthermore, our theoretical results identify if and when additional data will finally begin to shrink the gap. Lastly, we experimentally demonstrate that our results also hold for linear regression models, which may indicate that this phenomenon occurs more broadly.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chen20q,
  title     = {More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models},
  author    = {Chen, Lin and Min, Yifei and Zhang, Mingrui and Karbasi, Amin},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1670--1680},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chen20q/chen20q.pdf},
  url       = {https://proceedings.mlr.press/v119/chen20q.html},
  abstract  = {Despite remarkable success in practice, modern machine learning models have been found to be susceptible to adversarial attacks that make human-imperceptible perturbations to the data, but result in serious and potentially dangerous prediction errors. To address this issue, practitioners often use adversarial training to learn models that are robust against such attacks at the cost of higher generalization error on unperturbed test sets. The conventional wisdom is that more training data should shrink the gap between the generalization error of adversarially-trained models and standard models. However, we study the training of robust classifiers for both Gaussian and Bernoulli models under $\ell_\infty$ attacks, and we prove that more data may actually increase this gap. Furthermore, our theoretical results identify if and when additional data will finally begin to shrink the gap. Lastly, we experimentally demonstrate that our results also hold for linear regression models, which may indicate that this phenomenon occurs more broadly.}
}
Endnote
%0 Conference Paper
%T More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models
%A Lin Chen
%A Yifei Min
%A Mingrui Zhang
%A Amin Karbasi
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chen20q
%I PMLR
%P 1670--1680
%U https://proceedings.mlr.press/v119/chen20q.html
%V 119
%X Despite remarkable success in practice, modern machine learning models have been found to be susceptible to adversarial attacks that make human-imperceptible perturbations to the data, but result in serious and potentially dangerous prediction errors. To address this issue, practitioners often use adversarial training to learn models that are robust against such attacks at the cost of higher generalization error on unperturbed test sets. The conventional wisdom is that more training data should shrink the gap between the generalization error of adversarially-trained models and standard models. However, we study the training of robust classifiers for both Gaussian and Bernoulli models under $\ell_\infty$ attacks, and we prove that more data may actually increase this gap. Furthermore, our theoretical results identify if and when additional data will finally begin to shrink the gap. Lastly, we experimentally demonstrate that our results also hold for linear regression models, which may indicate that this phenomenon occurs more broadly.
APA
Chen, L., Min, Y., Zhang, M. &amp; Karbasi, A. (2020). More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1670-1680. Available from https://proceedings.mlr.press/v119/chen20q.html.