Exploring Model Dynamics for Accumulative Poisoning Discovery

Jianing Zhu, Xiawei Guo, Jiangchao Yao, Chao Du, Li He, Shuo Yuan, Tongliang Liu, Liang Wang, Bo Han
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:42983-43004, 2023.

Abstract

Adversarial poisoning attacks pose serious threats to various machine learning applications. In particular, recent accumulative poisoning attacks show that it is possible to inflict irreparable harm on a model via a sequence of imperceptible attacks followed by a trigger batch. Because the data-level discrepancy is limited in real-time data streaming, current defensive methods handle poisoned and clean samples indiscriminately. In this paper, we take the perspective of model dynamics and propose a novel information measure, namely Memorization Discrepancy, to explore defenses based on model-level information. By implicitly transferring changes in the data manipulation to changes in the model outputs, Memorization Discrepancy can discover imperceptible poisoned samples through their dynamics, which differ distinctly from those of clean samples. We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks. Extensive experiments comprehensively characterize Memorization Discrepancy and verify its effectiveness. The code is publicly available at: https://github.com/tmlr-group/Memorization-Discrepancy.
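The following is a minimal PyTorch-style sketch of the idea described in the abstract: score an incoming sample by how much the current model's predictions on it diverge from those of an earlier checkpoint, and flag samples whose discrepancy is unusually large. The function names, the choice of KL divergence, and the thresholding rule are illustrative assumptions for exposition, not the paper's exact formulation; see the official repository for the authors' implementation.

```python
# Sketch of a memorization-discrepancy-style score (illustrative only):
# compare the current model's outputs with those of an earlier checkpoint
# on an incoming batch; poisoned samples are assumed to induce a larger
# divergence across training stages than typical clean samples.

import torch
import torch.nn.functional as F


def memorization_discrepancy_score(model_now, model_past, x):
    """Per-sample divergence between current and historical model outputs."""
    model_now.eval()
    model_past.eval()
    with torch.no_grad():
        log_p_now = F.log_softmax(model_now(x), dim=1)  # log-probs, current model
        p_past = F.softmax(model_past(x), dim=1)        # probs, earlier checkpoint
    # KL(past || now) per sample; sum over the class dimension.
    return F.kl_div(log_p_now, p_past, reduction="none").sum(dim=1)


def flag_suspicious(model_now, model_past, x, threshold):
    """Mark samples whose discrepancy exceeds a (hypothetical) threshold."""
    scores = memorization_discrepancy_score(model_now, model_past, x)
    return scores > threshold
```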

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-zhu23d,
  title     = {Exploring Model Dynamics for Accumulative Poisoning Discovery},
  author    = {Zhu, Jianing and Guo, Xiawei and Yao, Jiangchao and Du, Chao and He, Li and Yuan, Shuo and Liu, Tongliang and Wang, Liang and Han, Bo},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {42983--43004},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/zhu23d/zhu23d.pdf},
  url       = {https://proceedings.mlr.press/v202/zhu23d.html}
}
Endnote
%0 Conference Paper
%T Exploring Model Dynamics for Accumulative Poisoning Discovery
%A Jianing Zhu
%A Xiawei Guo
%A Jiangchao Yao
%A Chao Du
%A Li He
%A Shuo Yuan
%A Tongliang Liu
%A Liang Wang
%A Bo Han
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-zhu23d
%I PMLR
%P 42983--43004
%U https://proceedings.mlr.press/v202/zhu23d.html
%V 202
APA
Zhu, J., Guo, X., Yao, J., Du, C., He, L., Yuan, S., Liu, T., Wang, L. & Han, B. (2023). Exploring Model Dynamics for Accumulative Poisoning Discovery. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:42983-43004. Available from https://proceedings.mlr.press/v202/zhu23d.html.