On the Vulnerability of Fairness Constrained Learning to Malicious Noise
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4096-4104, 2024.
Abstract
We consider the vulnerability of fairness-constrained learning to small amounts of malicious noise in the training data. [Konstantinov and Lampert, 2021] initiated the study of this question and presented negative results showing that there exist data distributions where, for several fairness constraints, any proper learner will exhibit high vulnerability when group sizes are imbalanced. Here, we present a more optimistic view, showing that if we allow randomized classifiers, then the landscape is much more nuanced. For example, for Demographic Parity we show we can incur only a $\Theta(\alpha)$ loss in accuracy, where $\alpha$ is the malicious noise rate, matching the best possible even without fairness constraints. For Equal Opportunity, we show we can incur an $O(\sqrt{\alpha})$ loss, and give a matching $\Omega(\sqrt{\alpha})$ lower bound. In contrast, [Konstantinov and Lampert, 2021] showed that for proper learners the loss in accuracy for both notions is $\Omega(1)$. The key technical novelty of our work is how randomization can bypass simple "tricks" an adversary can use to amplify its power. We also consider additional fairness notions including Equalized Odds and Calibration. For these fairness notions, the loss in accuracy clusters into three natural regimes: $O(\alpha)$, $O(\sqrt{\alpha})$, and $O(1)$. These results provide a more fine-grained view of the sensitivity of fairness-constrained learning to adversarial noise in training data.
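To make the Demographic Parity notion above concrete, here is a minimal sketch (hypothetical, not from the paper) of the DP gap for a randomized classifier. A randomized classifier outputs, for each example, a probability of predicting the positive label, so its positive rate within a group is simply the mean of those probabilities; the data and classifier choices below are purely illustrative.

```python
# Illustrative sketch (hypothetical, not from the paper): the Demographic
# Parity gap, |P(yhat = 1 | group 0) - P(yhat = 1 | group 1)|, where a
# randomized classifier's positive rate in a group is the mean of its
# per-example positive-prediction probabilities.

def demographic_parity_gap(pred_probs, groups):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    def rate(g):
        members = [p for p, a in zip(pred_probs, groups) if a == g]
        return sum(members) / len(members)
    return abs(rate(0) - rate(1))

# Toy data: group 1 is much smaller than group 0 (the imbalanced-group
# regime where proper learners were shown to be vulnerable).
groups = [0] * 90 + [1] * 10

# A deterministic (proper) classifier that always predicts 1 for group 0
# and 0 for group 1:
hard_preds = [1.0 if a == 0 else 0.0 for a in groups]

# A randomized classifier that equalizes positive rates by construction:
rand_preds = [0.5] * len(groups)

print(demographic_parity_gap(hard_preds, groups))  # -> 1.0 (maximal violation)
print(demographic_parity_gap(rand_preds, groups))  # -> 0.0 (DP satisfied)
```

The contrast illustrates why randomization helps: the deterministic classifier maximally violates DP, while the randomized one satisfies it exactly, and because its per-group rates are averages over coin flips, a small ($\alpha$-fraction) corruption of the training data can shift them only slightly.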