Defending SVMs against poisoning attacks: the hardness and DBSCAN approach
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:268-278, 2021.
Adversarial machine learning has attracted a great amount of attention in recent years. Due to the great importance of support vector machines (SVM) in machine learning, we consider defending SVM against poisoning attacks in this paper. We study two commonly used strategies for defending: designing robust SVM algorithms and data sanitization. Though several robust SVM algorithms have been proposed before, most of them either are in lack of adversarial-resilience, or rely on strong assumptions about the data distribution or the attacker’s behavior. Moreover, the research on the hardness of designing a quality-guaranteed adversarially-resilient SVM algorithm is still quite limited. We are the first, to the best of our knowledge, to prove that even the simplest hard-margin one-class SVM with adversarial outliers problem is NP-complete, and has no fully PTAS unless P=NP. For data sanitization, we explain the effectiveness of DBSCAN (as a density-based outlier removal method) for defending against poisoning attacks. In particular, we link it to the intrinsic dimensionality by proving a sampling theorem in doubling metrics. In our empirical experiments, we systematically compare several defenses including the DBSCAN and robust SVM methods, and investigate the influences from the intrinsic dimensionality and poisoned fraction to their performances.