Adversarial Training May Induce Deteriorating Distributions

Runzhi Tian, Yongyi Mao
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:4185-4203, 2025.

Abstract

The interactions between the update of model parameters and the update of perturbation operators complicate the dynamics of adversarial training (AT). This paper reveals a surprising behavior in AT, namely that the distribution induced by adversarial perturbations during AT becomes progressively more difficult to learn. We derive a generalization bound that theoretically attributes this behavior to an increase in a quantity associated with the perturbation operator, namely, its local dispersion. We corroborate this explanation with concrete experimental validations and show that this deteriorating behavior of the induced distributions is correlated with robust overfitting in AT.
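To make the setting concrete: AT alternates an inner step that perturbs each input to increase the loss with an outer step that updates the model on the perturbed inputs; the perturbed inputs form the "induced distribution" the abstract refers to. Below is a minimal, hypothetical sketch of this loop on a toy logistic-regression problem with an FGSM-style l-infinity perturbation (all names and the setup are illustrative, not the paper's actual experimental protocol):

```python
import numpy as np

# Toy data: linearly separable labels from a random direction (illustrative only).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)
eps, lr = 0.1, 0.5          # perturbation budget and learning rate (arbitrary choices)
for _ in range(100):
    # Inner step: move each input in the direction that increases its loss.
    # For logistic loss with logit z = x @ w, d(loss)/dx = (p - y) * w.
    p = sigmoid(X @ w)
    grad_x = np.outer(p - y, w)
    X_adv = X + eps * np.sign(grad_x)   # the AT-induced (perturbed) distribution
    # Outer step: gradient descent on the perturbed batch.
    p_adv = sigmoid(X_adv @ w)
    grad_w = X_adv.T @ (p_adv - y) / n
    w -= lr * grad_w

# Accuracy on the final perturbed batch (a crude proxy for robust accuracy).
robust_acc = np.mean((sigmoid(X_adv @ w) > 0.5) == (y > 0.5))
```

The paper's point concerns how `X_adv`'s distribution evolves across such iterations; the sketch only shows the alternating structure that produces it.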

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-tian25b,
  title     = {Adversarial Training May Induce Deteriorating Distributions},
  author    = {Tian, Runzhi and Mao, Yongyi},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {4185--4203},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/tian25b/tian25b.pdf},
  url       = {https://proceedings.mlr.press/v286/tian25b.html},
  abstract  = {The interactions between the update of model parameters and the update of perturbation operators complicate the dynamics of adversarial training (AT). This paper reveals a surprising behavior in AT, namely that the distribution induced by adversarial perturbations during AT becomes progressively more difficult to learn. We derived a generalization bound to theoretically attribute this behavior to the increasing of a quantity associated with the perturbation operator, namely, its local dispersion. We corroborate this explanation with concrete experimental validations and show that this deteriorating behavior of the induced distributions is correlated with robust overfitting of AT.}
}
Endnote
%0 Conference Paper
%T Adversarial Training May Induce Deteriorating Distributions
%A Runzhi Tian
%A Yongyi Mao
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-tian25b
%I PMLR
%P 4185--4203
%U https://proceedings.mlr.press/v286/tian25b.html
%V 286
%X The interactions between the update of model parameters and the update of perturbation operators complicate the dynamics of adversarial training (AT). This paper reveals a surprising behavior in AT, namely that the distribution induced by adversarial perturbations during AT becomes progressively more difficult to learn. We derived a generalization bound to theoretically attribute this behavior to the increasing of a quantity associated with the perturbation operator, namely, its local dispersion. We corroborate this explanation with concrete experimental validations and show that this deteriorating behavior of the induced distributions is correlated with robust overfitting of AT.
APA
Tian, R. & Mao, Y. (2025). Adversarial Training May Induce Deteriorating Distributions. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:4185-4203. Available from https://proceedings.mlr.press/v286/tian25b.html.