Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability

Rajdeep Haldar; Yue Xing; Qifan Song

Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability

Rajdeep Haldar, Yue Xing, Qifan Song

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1090-1098, 2024.

Abstract

The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the

$\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-haldar24a,
  title = 	 {Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability},
  author =       {Haldar, Rajdeep and Xing, Yue and Song, Qifan},
  booktitle = 	 {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1090--1098},
  year = 	 {2024},
  editor = 	 {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = 	 {238},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--04 May},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v238/haldar24a/haldar24a.pdf},
  url = 	 {https://proceedings.mlr.press/v238/haldar24a.html},
  abstract = 	 {The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the $\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.}
}

Endnote

%0 Conference Paper
%T Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability
%A Rajdeep Haldar
%A Yue Xing
%A Qifan Song
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li	
%F pmlr-v238-haldar24a
%I PMLR
%P 1090--1098
%U https://proceedings.mlr.press/v238/haldar24a.html
%V 238
%X The existence of adversarial attacks on machine learning models imperceptible to a human is still quite a mystery from a theoretical perspective. In this work, we introduce two notions of adversarial attacks: natural or on-manifold attacks, which are perceptible by a human/oracle, and unnatural or off-manifold attacks, which are not. We argue that the existence of the off-manifold attacks is a natural consequence of the dimension gap between the intrinsic and ambient dimensions of the data. For 2-layer ReLU networks, we prove that even though the dimension gap does not affect generalization performance on samples drawn from the observed data space, it makes the clean-trained model more vulnerable to adversarial perturbations in the off-manifold direction of the data space. Our main results provide an explicit relationship between the $\ell_2,\ell_{\infty}$ attack strength of the on/off-manifold attack and the dimension gap.

APA

Haldar, R., Xing, Y. & Song, Q.. (2024). Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1090-1098 Available from https://proceedings.mlr.press/v238/haldar24a.html.

Related Material

Download PDF