Fairness without Harm: Decoupled Classifiers with Preference Guarantees
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6373-6382, 2019.
In domains such as medicine, it can be acceptable for machine learning models to include sensitive attributes such as gender and ethnicity. In this work, we argue that when there is this kind of treatment disparity, then it should be in the best interest of each group. Drawing on ethical principles such as beneficence ("do the best") and non-maleficence ("do no harm"), we show how to use sensitive attributes to train decoupled classifiers that satisfy preference guarantees. These guarantees ensure the majority of individuals in each group prefer their assigned classifier to (i) a pooled model that ignores group membership (rationality), and (ii) the model assigned to any other group (envy-freeness). We introduce a recursive procedure that adaptively selects group attributes for decoupling, and present formal conditions to ensure preference guarantees in terms of generalization error. We validate the effectiveness of the procedure on real-world datasets, showing that it improves accuracy without violating preference guarantees on test data.