Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning

Zihan Zhou, Yang Zhou, Zijie Zhang, Lingjuan Lyu, Da Yan, Ruoming Jin, Dejing Dou
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:79595-79643, 2025.

Abstract

Machine unlearning (MU) aims to remove the influence of specific data points from trained models, enhancing compliance with privacy regulations. However, the vulnerability of basic MU models to malicious unlearning requests in adversarial learning environments has been largely overlooked. Existing adversarial MU attacks suffer from three key limitations: inflexibility due to pre-defined attack targets, inefficiency in handling multiple attack requests, and instability caused by non-convex loss functions. To address these challenges, we propose a Flexible, Efficient, and Stable Attack (DDPA). First, leveraging Carathéodory’s theorem, we introduce a convex polyhedral approximation to identify points in the loss landscape where convexity approximately holds, ensuring stable attack performance. Second, inspired by simplex theory and John’s theorem, we develop a regular simplex detection technique that maximizes coverage over the parameter space, improving attack flexibility and efficiency. We theoretically derive the proportion of the effective parameter space occupied by the constructed simplex. We evaluate the attack success rate of our DDPA method on real datasets against state-of-the-art machine unlearning attack methods. Our source code is available at https://github.com/zzz0134/DDPA.
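The abstract mentions a regular simplex and the proportion of a region it occupies. As a purely geometric illustration, and not the authors' construction or their derived proportion, the sketch below builds a regular simplex in R^n and computes how much of its circumscribed ball it fills. The function names, the edge length sqrt(2), and the choice of the circumscribed ball as the reference region are all assumptions made for this example; the paper's "effective parameter space" may be defined differently.

```python
import math
from itertools import combinations


def regular_simplex_vertices(n):
    """Return n+1 points in R^n whose pairwise distances all equal sqrt(2).

    Hypothetical helper: the n standard basis vectors plus one extra point
    c * (1, ..., 1), where c solves n*c^2 - 2*c - 1 = 0.
    """
    c = (1.0 - math.sqrt(n + 1)) / n
    vertices = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    vertices.append([c] * n)
    return vertices


def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def simplex_to_ball_volume_ratio(n):
    """Volume of a regular n-simplex divided by that of its circumscribed ball.

    Assumption: the ball stands in for the region of interest; this is not
    the proportion derived in the paper.
    """
    a = math.sqrt(2.0)  # edge length of the simplex built above
    simplex_volume = (a ** n / math.factorial(n)) * math.sqrt((n + 1) / 2 ** n)
    circumradius = a * math.sqrt(n / (2.0 * (n + 1)))
    ball_volume = math.pi ** (n / 2) / math.gamma(n / 2 + 1) * circumradius ** n
    return simplex_volume / ball_volume


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(n, simplex_to_ball_volume_ratio(n))
```

The ratio shrinks quickly as the dimension grows, which suggests why a coverage guarantee in a high-dimensional parameter space calls for a dedicated theoretical analysis rather than a direct volume argument.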

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zhou25al,
  title = {Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning},
  author = {Zhou, Zihan and Zhou, Yang and Zhang, Zijie and Lyu, Lingjuan and Yan, Da and Jin, Ruoming and Dou, Dejing},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {79595--79643},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhou25al/zhou25al.pdf},
  url = {https://proceedings.mlr.press/v267/zhou25al.html},
  abstract = {Machine unlearning (MU) aims to remove the influence of specific data points from trained models, enhancing compliance with privacy regulations. However, the vulnerability of basic MU models to malicious unlearning requests in adversarial learning environments has been largely overlooked. Existing adversarial MU attacks suffer from three key limitations: inflexibility due to pre-defined attack targets, inefficiency in handling multiple attack requests, and instability caused by non-convex loss functions. To address these challenges, we propose a Flexible, Efficient, and Stable Attack (DDPA). First, leveraging Carathéodory’s theorem, we introduce a convex polyhedral approximation to identify points in the loss landscape where convexity approximately holds, ensuring stable attack performance. Second, inspired by simplex theory and John’s theorem, we develop a regular simplex detection technique that maximizes coverage over the parameter space, improving attack flexibility and efficiency. We theoretically derive the proportion of the effective parameter space occupied by the constructed simplex. We evaluate the attack success rate of our DDPA method on real datasets against state-of-the-art machine unlearning attack methods. Our source code is available at https://github.com/zzz0134/DDPA.}
}
Endnote
%0 Conference Paper
%T Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning
%A Zihan Zhou
%A Yang Zhou
%A Zijie Zhang
%A Lingjuan Lyu
%A Da Yan
%A Ruoming Jin
%A Dejing Dou
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zhou25al
%I PMLR
%P 79595--79643
%U https://proceedings.mlr.press/v267/zhou25al.html
%V 267
%X Machine unlearning (MU) aims to remove the influence of specific data points from trained models, enhancing compliance with privacy regulations. However, the vulnerability of basic MU models to malicious unlearning requests in adversarial learning environments has been largely overlooked. Existing adversarial MU attacks suffer from three key limitations: inflexibility due to pre-defined attack targets, inefficiency in handling multiple attack requests, and instability caused by non-convex loss functions. To address these challenges, we propose a Flexible, Efficient, and Stable Attack (DDPA). First, leveraging Carathéodory’s theorem, we introduce a convex polyhedral approximation to identify points in the loss landscape where convexity approximately holds, ensuring stable attack performance. Second, inspired by simplex theory and John’s theorem, we develop a regular simplex detection technique that maximizes coverage over the parameter space, improving attack flexibility and efficiency. We theoretically derive the proportion of the effective parameter space occupied by the constructed simplex. We evaluate the attack success rate of our DDPA method on real datasets against state-of-the-art machine unlearning attack methods. Our source code is available at https://github.com/zzz0134/DDPA.
APA
Zhou, Z., Zhou, Y., Zhang, Z., Lyu, L., Yan, D., Jin, R., & Dou, D. (2025). Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:79595-79643. Available from https://proceedings.mlr.press/v267/zhou25al.html.
