Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability

Rajdeep Haldar, Yue Xing, Qifan Song
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1090-1098, 2024.

Abstract

The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the $\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-haldar24a,
  title = {Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability},
  author = {Haldar, Rajdeep and Xing, Yue and Song, Qifan},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {1090--1098},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/haldar24a/haldar24a.pdf},
  url = {https://proceedings.mlr.press/v238/haldar24a.html},
  abstract = {The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the $\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.}
}
Endnote
%0 Conference Paper
%T Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability
%A Rajdeep Haldar
%A Yue Xing
%A Qifan Song
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-haldar24a
%I PMLR
%P 1090--1098
%U https://proceedings.mlr.press/v238/haldar24a.html
%V 238
%X The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the $\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.
APA
Haldar, R., Xing, Y. & Song, Q. (2024). Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1090-1098. Available from https://proceedings.mlr.press/v238/haldar24a.html.
