Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security

Yangyang Shen, Xiao Tan, Dian Shen, Meng Wang, Beilun Wang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:54431-54456, 2025.

Abstract

Backdoor attacks pose a serious threat to deep neural networks (DNNs) by embedding concealed vulnerabilities through data poisoning. To counteract these attacks, training benign models from poisoned data has garnered considerable interest from researchers. High-performing defenses often rely on additional clean subsets/seeds, which is untenable due to increasing privacy concerns and data scarcity. In the absence of additional clean subsets/seeds, defenders resort to complex feature extraction and analysis, resulting in excessive overhead and compromised performance. To address these challenges, we identify that the key lies in sufficient utilization of both the easier-to-obtain target labels and clean hard samples. In this work, we propose a Bi-perspective Splitting Defense (BSD). BSD distinguishes clean samples using both semantic characteristics and loss statistics, through open set recognition-based splitting (OSS) and altruistic model-based data splitting (ALS), respectively. Through extensive experiments on benchmark datasets and against representative attacks, we empirically demonstrate that BSD surpasses existing defenses by over 20% in average Defense Effectiveness Rating (DER), achieving clean-data-free backdoor security.
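The paper's exact splitting procedures are not reproduced on this page, but the loss-statistics perspective echoes a well-known heuristic from poisoned-data training: backdoored samples are fit unusually fast, so the lowest-loss samples under a briefly trained model are the prime suspects. Below is a minimal PyTorch sketch of such a loss-based split, under that assumption; the function name, keep_ratio parameter, and toy data are illustrative, not from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

def loss_based_split(model, dataset, device="cpu", keep_ratio=0.5):
    # Rank every sample by its loss under a briefly trained model and
    # treat the fastest-learned (lowest-loss) fraction as suspected poison.
    model.eval()
    loader = torch.utils.data.DataLoader(dataset, batch_size=128)
    losses = []
    with torch.no_grad():
        for x, y in loader:
            logits = model(x.to(device))
            losses.append(F.cross_entropy(logits, y.to(device), reduction="none").cpu())
    losses = torch.cat(losses)
    order = torch.argsort(losses)              # ascending: lowest loss first
    n_suspect = int((1 - keep_ratio) * len(order))
    suspect = order[:n_suspect].tolist()       # suspected poisoned pool
    clean = order[n_suspect:].tolist()         # retained "clean" pool
    return clean, suspect

# Toy usage on random data; a real defense would run this on the poisoned
# training set with a model trained for a few warm-up epochs.
X = torch.randn(256, 10)
y = torch.randint(0, 3, (256,))
demo_set = torch.utils.data.TensorDataset(X, y)
clean_idx, suspect_idx = loss_based_split(nn.Linear(10, 3), demo_set)
print(len(clean_idx), len(suspect_idx))

In BSD this loss-statistics view (ALS) is complemented by a semantic view via open set recognition over the target class (OSS); the sketch above covers only the former.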

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-shen25d,
  title = {Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security},
  author = {Shen, Yangyang and Tan, Xiao and Shen, Dian and Wang, Meng and Wang, Beilun},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {54431--54456},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/shen25d/shen25d.pdf},
  url = {https://proceedings.mlr.press/v267/shen25d.html},
  abstract = {Backdoor attacks pose a serious threat to deep neural networks (DNNs) by embedding concealed vulnerabilities through data poisoning. To counteract these attacks, training benign models from poisoned data has garnered considerable interest from researchers. High-performing defenses often rely on additional clean subsets/seeds, which is untenable due to increasing privacy concerns and data scarcity. In the absence of additional clean subsets/seeds, defenders resort to complex feature extraction and analysis, resulting in excessive overhead and compromised performance. To address these challenges, we identify that the key lies in sufficient utilization of both the easier-to-obtain target labels and clean hard samples. In this work, we propose a Bi-perspective Splitting Defense (BSD). BSD distinguishes clean samples using both semantic characteristics and loss statistics, through open set recognition-based splitting (OSS) and altruistic model-based data splitting (ALS), respectively. Through extensive experiments on benchmark datasets and against representative attacks, we empirically demonstrate that BSD surpasses existing defenses by over 20% in average Defense Effectiveness Rating (DER), achieving clean-data-free backdoor security.}
}
Endnote
%0 Conference Paper
%T Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security
%A Yangyang Shen
%A Xiao Tan
%A Dian Shen
%A Meng Wang
%A Beilun Wang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-shen25d
%I PMLR
%P 54431--54456
%U https://proceedings.mlr.press/v267/shen25d.html
%V 267
%X Backdoor attacks pose a serious threat to deep neural networks (DNNs) by embedding concealed vulnerabilities through data poisoning. To counteract these attacks, training benign models from poisoned data has garnered considerable interest from researchers. High-performing defenses often rely on additional clean subsets/seeds, which is untenable due to increasing privacy concerns and data scarcity. In the absence of additional clean subsets/seeds, defenders resort to complex feature extraction and analysis, resulting in excessive overhead and compromised performance. To address these challenges, we identify that the key lies in sufficient utilization of both the easier-to-obtain target labels and clean hard samples. In this work, we propose a Bi-perspective Splitting Defense (BSD). BSD distinguishes clean samples using both semantic characteristics and loss statistics, through open set recognition-based splitting (OSS) and altruistic model-based data splitting (ALS), respectively. Through extensive experiments on benchmark datasets and against representative attacks, we empirically demonstrate that BSD surpasses existing defenses by over 20% in average Defense Effectiveness Rating (DER), achieving clean-data-free backdoor security.
APA
Shen, Y., Tan, X., Shen, D., Wang, M. & Wang, B. (2025). Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:54431-54456. Available from https://proceedings.mlr.press/v267/shen25d.html.
