Can Adversarially Robust Learning Leverage Computational Hardness?

Saeed Mahloujifar, Mohammad Mahmoody
Proceedings of the 30th International Conference on Algorithmic Learning Theory, PMLR 98:581-609, 2019.

Abstract

Making learners robust to adversarial perturbation at test time (i.e., evasion attacks finding adversarial examples) or training time (i.e., data poisoning attacks) has emerged as a challenging task. It is known that in some cases \emph{sublinear} perturbations in the training phase or the testing phase can drastically decrease the quality of the predictions. These negative results, however, only prove the \emph{existence} of such successful adversarial perturbations. A natural question for these settings is whether or not we can make classifiers \emph{computationally} robust to \emph{polynomial-time} attacks. In this work, we prove some barriers against achieving such envisioned computational robustness for evasion attacks (for specific metric probability spaces) as well as poisoning attacks. In particular, we show that if the test instances come from a product distribution (e.g., uniform over $\{0,1\}^n$ or $[0,1]^n$, or isotropic $n$-variate Gaussian) and there is an initial constant error, then there exists a \emph{polynomial-time} attack that finds adversarial examples of Hamming distance $O(\sqrt n)$. For poisoning attacks, we prove that for any deterministic learning algorithm with sample complexity $m$ and any efficiently computable “predicate” defining some “bad” property $B$ for the produced hypothesis (e.g., failing on a particular test) that happens with an initial constant probability, there exists a \emph{polynomial-time} online poisoning attack that replaces $O(\sqrt m)$ of the training examples with other correctly labeled examples and increases the probability of the bad event $B$ to $\approx 1$. Both of our poisoning and evasion attacks are \emph{black-box} in how they access their corresponding components of the system (i.e., the hypothesis, the concept, and the learning algorithm).
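To make the evasion attack model concrete, the sketch below shows the kind of black-box, Hamming-budgeted attacker the abstract refers to: it may only query the hypothesis $h$ (and the concept $c$, to fix the correct label) and may flip at most roughly $\sqrt n$ coordinates of a test instance from $\{0,1\}^n$. This is a minimal illustrative sketch, not the paper's actual attack or analysis; the names greedy_evasion, hypothesis, concept, and budget are invented for this example.

# Illustrative sketch only (not the paper's algorithm): a generic black-box
# evasion attacker for binary instances under a Hamming-distance budget of
# about sqrt(n). The callables `hypothesis` (the trained classifier h) and
# `concept` (the ground-truth labeling c) are assumptions made for the example.
import math
import random

def greedy_evasion(x, hypothesis, concept, budget=None):
    """Search for x' close to x in Hamming distance with hypothesis(x') != concept(x)."""
    n = len(x)
    if budget is None:
        budget = math.ceil(math.sqrt(n))   # the O(sqrt(n)) perturbation regime
    true_label = concept(x)                # correct label of the original instance
    x_adv = list(x)
    flipped = set()
    for _ in range(budget):
        if hypothesis(x_adv) != true_label:
            return x_adv                   # misclassified within the budget
        # Try untouched coordinates in random order; keep the first flip that
        # changes the prediction, otherwise flip an arbitrary untouched bit.
        candidates = [i for i in range(n) if i not in flipped]
        if not candidates:
            break
        random.shuffle(candidates)
        chosen = candidates[0]
        for i in candidates:
            trial = list(x_adv)
            trial[i] ^= 1
            if hypothesis(trial) != true_label:
                chosen = i
                break
        x_adv[chosen] ^= 1
        flipped.add(chosen)
    return x_adv if hypothesis(x_adv) != true_label else None   # None: attack failed

This attacker makes at most about $n^{1.5}$ queries to $h$ and never exceeds the Hamming budget, matching the black-box, polynomial-time access model in the abstract. The paper's contribution is to show that, under the stated product-distribution and initial-constant-error assumptions, a polynomial-time attack with this budget provably succeeds, which a heuristic search like the one above does not by itself guarantee.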

Cite this Paper


BibTeX
@InProceedings{pmlr-v98-mahloujifar19a,
  title     = {Can Adversarially Robust Learning Leverage Computational Hardness?},
  author    = {Mahloujifar, Saeed and Mahmoody, Mohammad},
  booktitle = {Proceedings of the 30th International Conference on Algorithmic Learning Theory},
  pages     = {581--609},
  year      = {2019},
  editor    = {Garivier, Aurélien and Kale, Satyen},
  volume    = {98},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v98/mahloujifar19a/mahloujifar19a.pdf},
  url       = {https://proceedings.mlr.press/v98/mahloujifar19a.html}
}
APA
Mahloujifar, S. & Mahmoody, M. (2019). Can Adversarially Robust Learning Leverage Computational Hardness? Proceedings of the 30th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 98:581-609. Available from https://proceedings.mlr.press/v98/mahloujifar19a.html.
