Differentially Private Post-Processing for Fair Regression

Ruicheng Xian, Qiaobo Li, Gautam Kamath, Han Zhao
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:54212-54235, 2024.

Abstract

This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns about their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism; then, their Wasserstein barycenter is computed; and finally, the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a trade-off between the statistical bias and variance induced by the choice of the number of bins in the histogram, where using fewer bins always favors fairness at the expense of error.
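To make the three steps concrete, below is a minimal Python (NumPy) sketch of the pipeline the abstract describes. It is an illustrative reconstruction, not the authors' code: the function name private_fair_postprocess, the [0, 1] score range, the Laplace noise scale of 1/eps (add/remove neighboring datasets, one histogram per group), and the equal barycenter weights are all assumptions. The sketch uses the fact that in one dimension the Wasserstein barycenter is obtained by averaging the groups' quantile functions.

import numpy as np

def private_fair_postprocess(scores_by_group, eps=1.0, n_bins=20, rng=None):
    """Hypothetical sketch of the abstract's three-step recipe.

    scores_by_group: dict mapping a group id to an array of regressor
    outputs, assumed to lie in [0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    mids = (edges[:-1] + edges[1:]) / 2

    # Step 1: private density estimation. Each sample changes one bin
    # count by 1, so Laplace noise of scale 1/eps makes each group's
    # histogram eps-differentially private (add/remove neighbors).
    densities = {}
    for g, scores in scores_by_group.items():
        counts, _ = np.histogram(scores, bins=edges)
        noisy = counts + rng.laplace(scale=1.0 / eps, size=n_bins)
        noisy = np.clip(noisy, 0.0, None)  # project back to nonnegative counts
        densities[g] = noisy / max(noisy.sum(), 1e-12)

    # Step 2: in 1-D, the Wasserstein barycenter is the average of the
    # groups' quantile functions (equal group weights assumed here).
    levels = np.linspace(0.0, 1.0, 513)
    quantile = {g: np.interp(levels, np.cumsum(p), mids)  # cumsum(p) is the CDF
                for g, p in densities.items()}
    barycenter_q = np.mean(list(quantile.values()), axis=0)

    # Step 3: the optimal transport map sends a score at quantile level
    # u of its group's distribution to the barycenter's u-th quantile.
    def remap(g, score):
        u = np.interp(score, mids, np.cumsum(densities[g]))
        return float(np.interp(u, levels, barycenter_q))

    return remap

Calling remap(g, score) sends every group's scores to the same barycenter distribution, which is what statistical parity asks for; shrinking n_bins coarsens the histograms and, as the abstract notes, tightens the fairness guarantee at the expense of error.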

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-xian24b,
  title = {Differentially Private Post-Processing for Fair Regression},
  author = {Xian, Ruicheng and Li, Qiaobo and Kamath, Gautam and Zhao, Han},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {54212--54235},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/xian24b/xian24b.pdf},
  url = {https://proceedings.mlr.press/v235/xian24b.html},
  abstract = {This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns about their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism; then, their Wasserstein barycenter is computed; and finally, the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a trade-off between the statistical bias and variance induced by the choice of the number of bins in the histogram, where using fewer bins always favors fairness at the expense of error.}
}
Endnote
%0 Conference Paper
%T Differentially Private Post-Processing for Fair Regression
%A Ruicheng Xian
%A Qiaobo Li
%A Gautam Kamath
%A Han Zhao
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-xian24b
%I PMLR
%P 54212--54235
%U https://proceedings.mlr.press/v235/xian24b.html
%V 235
%X This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns about their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism; then, their Wasserstein barycenter is computed; and finally, the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a trade-off between the statistical bias and variance induced by the choice of the number of bins in the histogram, where using fewer bins always favors fairness at the expense of error.
APA
Xian, R., Li, Q., Kamath, G., & Zhao, H. (2024). Differentially Private Post-Processing for Fair Regression. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:54212-54235. Available from https://proceedings.mlr.press/v235/xian24b.html.