Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy

Haoqi Wu, Wei Dai, Wang Li, Qiang Yan
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:67184-67201, 2025.

Abstract

Large Language Models (LLMs) have gained significant popularity due to their remarkable capabilities in text understanding and generation. However, despite their widespread deployment in inference services such as ChatGPT, concerns have arisen about the potential leakage of sensitive user data. Existing solutions primarily rely on privacy-enhancing technologies to mitigate such risks, but face a trade-off among efficiency, privacy, and utility. To narrow this gap, we propose Cape, a context-aware prompt perturbation mechanism based on differential privacy, to enable efficient inference with an improved privacy-utility trade-off. Concretely, we introduce a hybrid utility function that better captures token similarity. Additionally, we propose a bucketized sampling mechanism to handle the large sampling space, which might otherwise lead to long-tail phenomena. Extensive experiments across multiple datasets, along with ablation studies, demonstrate that Cape achieves a better privacy-utility trade-off compared to prior state-of-the-art works.
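To make the abstract's mechanism concrete, the following is a minimal illustrative sketch of exponential-mechanism token sampling with bucketized utilities, in the spirit of the bucketized sampling the abstract describes. It is not the paper's actual algorithm: the utility values, bucket count, sensitivity of 1, and function names here are all illustrative assumptions.

```python
import math
import random

def bucketize(utilities, num_buckets=10):
    """Group candidate-token indices into equal-width utility buckets.

    Bucketizing collapses a huge sampling space (e.g. a full vocabulary)
    into a few buckets, mitigating the long-tail effect where many
    low-utility tokens jointly absorb most of the probability mass.
    """
    lo, hi = min(utilities), max(utilities)
    width = (hi - lo) / num_buckets or 1.0  # guard: all utilities equal
    buckets = [[] for _ in range(num_buckets)]
    for i, u in enumerate(utilities):
        b = min(int((u - lo) / width), num_buckets - 1)
        buckets[b].append(i)
    return buckets, lo, width

def sample_token(utilities, epsilon, num_buckets=10, rng=random):
    """Exponential-mechanism sampling over utility buckets (illustrative).

    Each non-empty bucket is weighted by exp(epsilon * u_mid / 2), where
    u_mid is the bucket's midpoint utility and sensitivity is assumed to
    be 1; a replacement token is then drawn uniformly from the chosen
    bucket. Returns the index of the sampled candidate token.
    """
    buckets, lo, width = bucketize(utilities, num_buckets)
    weights, candidates = [], []
    for b, members in enumerate(buckets):
        if not members:
            continue
        u_mid = lo + (b + 0.5) * width
        weights.append(math.exp(epsilon * u_mid / 2.0))
        candidates.append(members)
    chosen_bucket = rng.choices(candidates, weights=weights, k=1)[0]
    return rng.choice(chosen_bucket)
```

With a large privacy budget the sampler concentrates on the highest-utility bucket, while a small budget spreads probability across buckets, which is the usual privacy-utility dial of the exponential mechanism.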

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25g,
  title     = {Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy},
  author    = {Wu, Haoqi and Dai, Wei and Li, Wang and Yan, Qiang},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {67184--67201},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25g/wu25g.pdf},
  url       = {https://proceedings.mlr.press/v267/wu25g.html},
  abstract  = {Large Language Models (LLMs) have gained significant popularity due to their remarkable capabilities in text understanding and generation. However, despite their widespread deployment in inference services such as ChatGPT, concerns have arisen about the potential leakage of sensitive user data. Existing solutions primarily rely on privacy-enhancing technologies to mitigate such risks, but face a trade-off among efficiency, privacy, and utility. To narrow this gap, we propose Cape, a context-aware prompt perturbation mechanism based on differential privacy, to enable efficient inference with an improved privacy-utility trade-off. Concretely, we introduce a hybrid utility function that better captures token similarity. Additionally, we propose a bucketized sampling mechanism to handle the large sampling space, which might otherwise lead to long-tail phenomena. Extensive experiments across multiple datasets, along with ablation studies, demonstrate that Cape achieves a better privacy-utility trade-off compared to prior state-of-the-art works.}
}
Endnote
%0 Conference Paper
%T Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy
%A Haoqi Wu
%A Wei Dai
%A Wang Li
%A Qiang Yan
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25g
%I PMLR
%P 67184--67201
%U https://proceedings.mlr.press/v267/wu25g.html
%V 267
%X Large Language Models (LLMs) have gained significant popularity due to their remarkable capabilities in text understanding and generation. However, despite their widespread deployment in inference services such as ChatGPT, concerns have arisen about the potential leakage of sensitive user data. Existing solutions primarily rely on privacy-enhancing technologies to mitigate such risks, but face a trade-off among efficiency, privacy, and utility. To narrow this gap, we propose Cape, a context-aware prompt perturbation mechanism based on differential privacy, to enable efficient inference with an improved privacy-utility trade-off. Concretely, we introduce a hybrid utility function that better captures token similarity. Additionally, we propose a bucketized sampling mechanism to handle the large sampling space, which might otherwise lead to long-tail phenomena. Extensive experiments across multiple datasets, along with ablation studies, demonstrate that Cape achieves a better privacy-utility trade-off compared to prior state-of-the-art works.
APA
Wu, H., Dai, W., Li, W., & Yan, Q. (2025). Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:67184-67201. Available from https://proceedings.mlr.press/v267/wu25g.html.