Delayed Impact of Fair Machine Learning

Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3150-3158, 2018.

Abstract

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect. We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably. Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.
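To make the abstract's "one-step feedback model" concrete, the toy sketch below simulates a single round of lending-style decisions for one group: applicants at or above a score threshold are approved, and the group's mean score then shifts depending on repayment. All numbers, names (`delta_mu`, `c_plus`, `c_minus`), and the specific distributions are illustrative assumptions, not values from the paper; the sketch only shows the qualitative phenomenon that a more lenient threshold (as a fairness constraint might induce in some regimes) can cause decline where a stricter one yields improvement.

```python
import numpy as np

# Hypothetical one-step feedback model (all numbers illustrative).
scores = np.array([300, 400, 500, 600, 700, 800])
pi = np.array([0.10, 0.20, 0.30, 0.20, 0.15, 0.05])    # group's score distribution
repay = np.array([0.15, 0.30, 0.50, 0.70, 0.85, 0.95])  # P(repay | score)
c_plus, c_minus = 75.0, -150.0  # score change on repayment / default

def delta_mu(threshold):
    """Expected one-step change in the group's mean score when every
    applicant at or above `threshold` is approved."""
    approved = scores >= threshold
    return float(np.sum(pi[approved] * (repay[approved] * c_plus
                                        + (1 - repay[approved]) * c_minus)))

# A conservative threshold improves the group's mean score ...
print(delta_mu(600))   # positive: improvement
# ... while a lenient one causes decline in this toy instance.
print(delta_mu(400))   # negative: decline
```

Whether a fairness-constrained selection rule lands in the improvement, stagnation, or decline regime is exactly the question the paper characterizes; this sketch only exhibits that both regimes are reachable by varying the threshold.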

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-liu18c,
  title     = {Delayed Impact of Fair Machine Learning},
  author    = {Liu, Lydia T. and Dean, Sarah and Rolf, Esther and Simchowitz, Max and Hardt, Moritz},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {3150--3158},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/liu18c/liu18c.pdf},
  url       = {https://proceedings.mlr.press/v80/liu18c.html},
}
EndNote
%0 Conference Paper
%T Delayed Impact of Fair Machine Learning
%A Lydia T. Liu
%A Sarah Dean
%A Esther Rolf
%A Max Simchowitz
%A Moritz Hardt
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-liu18c
%I PMLR
%P 3150--3158
%U https://proceedings.mlr.press/v80/liu18c.html
%V 80
APA
Liu, L.T., Dean, S., Rolf, E., Simchowitz, M. & Hardt, M. (2018). Delayed Impact of Fair Machine Learning. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:3150-3158. Available from https://proceedings.mlr.press/v80/liu18c.html.
