Understanding and Mitigating Accuracy Disparity in Regression

Jianfeng Chi, Yuan Tian, Geoffrey J. Gordon, Han Zhao
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1866-1876, 2021.

Abstract

With the widespread deployment of large-scale prediction systems in high-stakes domains, e.g., face recognition, criminal justice, etc., disparity on prediction accuracy between different demographic subgroups has called for fundamental understanding on the source of such disparity and algorithmic intervention to mitigate it. In this paper, we study the accuracy disparity problem in regression. To begin with, we first propose an error decomposition theorem, which decomposes the accuracy disparity into the distance between marginal label distributions and the distance between conditional representations, to help explain why such accuracy disparity appears in practice. Motivated by this error decomposition and the general idea of distribution alignment with statistical distances, we then propose an algorithm to reduce this disparity, and analyze its game-theoretic optima of the proposed objective functions. To corroborate our theoretical findings, we also conduct experiments on five benchmark datasets. The experimental results suggest that our proposed algorithms can effectively mitigate accuracy disparity while maintaining the predictive power of the regression models.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-chi21a, title = {Understanding and Mitigating Accuracy Disparity in Regression}, author = {Chi, Jianfeng and Tian, Yuan and Gordon, Geoffrey J. and Zhao, Han}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {1866--1876}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/chi21a/chi21a.pdf}, url = {https://proceedings.mlr.press/v139/chi21a.html}, abstract = {With the widespread deployment of large-scale prediction systems in high-stakes domains, e.g., face recognition, criminal justice, etc., disparity on prediction accuracy between different demographic subgroups has called for fundamental understanding on the source of such disparity and algorithmic intervention to mitigate it. In this paper, we study the accuracy disparity problem in regression. To begin with, we first propose an error decomposition theorem, which decomposes the accuracy disparity into the distance between marginal label distributions and the distance between conditional representations, to help explain why such accuracy disparity appears in practice. Motivated by this error decomposition and the general idea of distribution alignment with statistical distances, we then propose an algorithm to reduce this disparity, and analyze its game-theoretic optima of the proposed objective functions. To corroborate our theoretical findings, we also conduct experiments on five benchmark datasets. The experimental results suggest that our proposed algorithms can effectively mitigate accuracy disparity while maintaining the predictive power of the regression models.} }
Endnote
%0 Conference Paper %T Understanding and Mitigating Accuracy Disparity in Regression %A Jianfeng Chi %A Yuan Tian %A Geoffrey J. Gordon %A Han Zhao %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-chi21a %I PMLR %P 1866--1876 %U https://proceedings.mlr.press/v139/chi21a.html %V 139 %X With the widespread deployment of large-scale prediction systems in high-stakes domains, e.g., face recognition, criminal justice, etc., disparity on prediction accuracy between different demographic subgroups has called for fundamental understanding on the source of such disparity and algorithmic intervention to mitigate it. In this paper, we study the accuracy disparity problem in regression. To begin with, we first propose an error decomposition theorem, which decomposes the accuracy disparity into the distance between marginal label distributions and the distance between conditional representations, to help explain why such accuracy disparity appears in practice. Motivated by this error decomposition and the general idea of distribution alignment with statistical distances, we then propose an algorithm to reduce this disparity, and analyze its game-theoretic optima of the proposed objective functions. To corroborate our theoretical findings, we also conduct experiments on five benchmark datasets. The experimental results suggest that our proposed algorithms can effectively mitigate accuracy disparity while maintaining the predictive power of the regression models.
APA
Chi, J., Tian, Y., Gordon, G.J. & Zhao, H.. (2021). Understanding and Mitigating Accuracy Disparity in Regression. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1866-1876 Available from https://proceedings.mlr.press/v139/chi21a.html.

Related Material