Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks

Avi Schwarzschild; Micah Goldblum; Arjun Gupta; John P Dickerson; Tom Goldstein

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9389-9398, 2021.

Abstract

Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference. A recent survey of industry practitioners found that data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks. However, it remains unclear exactly how dangerous poisoning methods are and which ones are more effective considering that these methods, even ones with identical objectives, have not been tested in consistent or realistic settings. We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup. Moreover, we find that existing methods may not generalize to realistic settings. While these existing works serve as valuable prototypes for data poisoning, we apply rigorous tests to determine the extent to which we should fear them. In order to promote fair comparison in future work, we develop standardized benchmarks for data poisoning and backdoor attacks.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-schwarzschild21a,
  title     = {Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks},
  author    = {Schwarzschild, Avi and Goldblum, Micah and Gupta, Arjun and Dickerson, John P and Goldstein, Tom},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {9389--9398},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/schwarzschild21a/schwarzschild21a.pdf},
  url       = {https://proceedings.mlr.press/v139/schwarzschild21a.html},
  abstract  = {Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference. A recent survey of industry practitioners found that data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks. However, it remains unclear exactly how dangerous poisoning methods are and which ones are more effective considering that these methods, even ones with identical objectives, have not been tested in consistent or realistic settings. We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup. Moreover, we find that existing methods may not generalize to realistic settings. While these existing works serve as valuable prototypes for data poisoning, we apply rigorous tests to determine the extent to which we should fear them. In order to promote fair comparison in future work, we develop standardized benchmarks for data poisoning and backdoor attacks.}
}

Endnote

%0 Conference Paper
%T Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks
%A Avi Schwarzschild
%A Micah Goldblum
%A Arjun Gupta
%A John P Dickerson
%A Tom Goldstein
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-schwarzschild21a
%I PMLR
%P 9389--9398
%U https://proceedings.mlr.press/v139/schwarzschild21a.html
%V 139
%X Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference. A recent survey of industry practitioners found that data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks. However, it remains unclear exactly how dangerous poisoning methods are and which ones are more effective considering that these methods, even ones with identical objectives, have not been tested in consistent or realistic settings. We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup. Moreover, we find that existing methods may not generalize to realistic settings. While these existing works serve as valuable prototypes for data poisoning, we apply rigorous tests to determine the extent to which we should fear them. In order to promote fair comparison in future work, we develop standardized benchmarks for data poisoning and backdoor attacks.

APA

Schwarzschild, A., Goldblum, M., Gupta, A., Dickerson, J. P., & Goldstein, T. (2021). Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:9389-9398. Available from https://proceedings.mlr.press/v139/schwarzschild21a.html.
