One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment

Chaochao Chen, Jiaming Zhang, Yuyuan Li, Zhongxuan Han
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:7700-7711, 2024.

Abstract

The abundance of free internet data offers unprecedented opportunities for researchers and developers, but it also poses privacy risks. Utilizing data without explicit consent raises critical challenges in protecting personal information. Unlearnable examples have emerged as a feasible protection approach, which renders the data unlearnable, i.e., useless to third parties, by injecting imperceptible perturbations. However, these perturbations only exhibit unlearnable effects on either a particular dataset or label-consistent scenarios, thereby lacking broad applicability. To address both issues concurrently, we propose a universal perturbation generator that harnesses data with concept unlearnability, thereby broadening the scope of unlearnability beyond specific datasets or labels. Specifically, we leverage multi-modal pre-trained models to establish a connection between the data concepts in a shared embedding space. This connection enables the transformation of information from image data to text concepts. Consequently, we can align the text embeddings using a concept-wise discriminant loss and render the data unlearnable. Extensive experiments conducted on real-world datasets demonstrate the concept unlearnability, i.e., cross-dataset transferability and label-agnostic utility, of our proposed unlearnable examples, as well as their robustness against attacks.
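To make the pipeline described in the abstract more concrete, the sketch below shows one plausible instantiation, not the authors' released code: a frozen CLIP-style image/text encoder pair supplies the shared embedding space, a small convolutional generator produces a bounded universal perturbation, and a stand-in "concept-wise discriminant loss" pushes perturbed images away from the text embedding of their own concept. All module and parameter names here (ConceptPerturbationGenerator, epsilon, concept_text_emb, image_encoder, text_encoder) are illustrative assumptions.

```python
# Hypothetical sketch of a concept-level unlearnable-example generator.
# Assumes frozen CLIP-style encoders mapping images and concept prompts
# into a shared embedding space; the loss below is a plausible stand-in
# for a concept-wise discriminant objective, not the paper's exact loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConceptPerturbationGenerator(nn.Module):
    """Tiny conv generator producing an L_inf-bounded perturbation for any input image."""

    def __init__(self, epsilon: float = 8 / 255):
        super().__init__()
        self.epsilon = epsilon
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # tanh keeps the perturbation inside the imperceptibility budget
        return self.epsilon * torch.tanh(self.net(images))


def concept_discriminant_loss(image_emb, text_emb, concept_ids, temperature=0.07):
    """Discourage alignment between perturbed images and their own concept's text embedding."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (batch, num_concepts)
    # Minimising the negative cross-entropy maximises the usual classification loss,
    # i.e., it reduces each image's similarity to its matching concept text.
    return -F.cross_entropy(logits, concept_ids)


# --- usage sketch (frozen encoders and data assumed available) ------------------
# concept_text_emb: (num_concepts, d) text embeddings of concept prompts, precomputed once
# generator = ConceptPerturbationGenerator()
# optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)
# for images, concept_ids in dataloader:
#     delta = generator(images)
#     emb = image_encoder((images + delta).clamp(0, 1))
#     loss = concept_discriminant_loss(emb, concept_text_emb, concept_ids)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```

Because the generator conditions only on the input image and the loss is defined over concepts rather than dataset-specific labels, such a generator could in principle be reused across datasets, which is the property the abstract refers to as concept unlearnability.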

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-chen24bc,
  title     = {One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment},
  author    = {Chen, Chaochao and Zhang, Jiaming and Li, Yuyuan and Han, Zhongxuan},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {7700--7711},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/chen24bc/chen24bc.pdf},
  url       = {https://proceedings.mlr.press/v235/chen24bc.html},
  abstract  = {The abundance of free internet data offers unprecedented opportunities for researchers and developers, but it also poses privacy risks. Utilizing data without explicit consent raises critical challenges in protecting personal information. Unlearnable examples have emerged as a feasible protection approach, which renders the data unlearnable, i.e., useless to third parties, by injecting imperceptible perturbations. However, these perturbations only exhibit unlearnable effects on either a particular dataset or label-consistent scenarios, thereby lacking broad applicability. To address both issues concurrently, we propose a universal perturbation generator that harnesses data with concept unlearnability, thereby broadening the scope of unlearnability beyond specific datasets or labels. Specifically, we leverage multi-modal pre-trained models to establish a connection between the data concepts in a shared embedding space. This connection enables the transformation of information from image data to text concepts. Consequently, we can align the text embeddings using a concept-wise discriminant loss and render the data unlearnable. Extensive experiments conducted on real-world datasets demonstrate the concept unlearnability, i.e., cross-dataset transferability and label-agnostic utility, of our proposed unlearnable examples, as well as their robustness against attacks.}
}
Endnote
%0 Conference Paper
%T One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment
%A Chaochao Chen
%A Jiaming Zhang
%A Yuyuan Li
%A Zhongxuan Han
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-chen24bc
%I PMLR
%P 7700--7711
%U https://proceedings.mlr.press/v235/chen24bc.html
%V 235
%X The abundance of free internet data offers unprecedented opportunities for researchers and developers, but it also poses privacy risks. Utilizing data without explicit consent raises critical challenges in protecting personal information. Unlearnable examples have emerged as a feasible protection approach, which renders the data unlearnable, i.e., useless to third parties, by injecting imperceptible perturbations. However, these perturbations only exhibit unlearnable effects on either a particular dataset or label-consistent scenarios, thereby lacking broad applicability. To address both issues concurrently, we propose a universal perturbation generator that harnesses data with concept unlearnability, thereby broadening the scope of unlearnability beyond specific datasets or labels. Specifically, we leverage multi-modal pre-trained models to establish a connection between the data concepts in a shared embedding space. This connection enables the transformation of information from image data to text concepts. Consequently, we can align the text embeddings using a concept-wise discriminant loss and render the data unlearnable. Extensive experiments conducted on real-world datasets demonstrate the concept unlearnability, i.e., cross-dataset transferability and label-agnostic utility, of our proposed unlearnable examples, as well as their robustness against attacks.
APA
Chen, C., Zhang, J., Li, Y. & Han, Z. (2024). One for All: A Universal Generator for Concept Unlearnability via Multi-Modal Alignment. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:7700-7711. Available from https://proceedings.mlr.press/v235/chen24bc.html.
