Fairness Trade-Offs and Partial Debiasing
Proceedings of The 14th Asian Conference on Machine
Learning, PMLR 189:112-136, 2023.
Abstract
Previous literature has shown that bias-mitigating
algorithms are sometimes prone to overfitting and
generalise poorly out of sample. This paper is
first and foremost concerned with establishing a
mathematical framework to tackle the specific issue
of generalisation. Throughout this work, we consider
fairness trade-offs and objectives that mix a
statistical loss over the whole sample with fairness
penalties on categories (which could stem from
different values of protected attributes),
encompassing partial debiasing; a schematic form of
such an objective is sketched below. We do so by
adopting two different but complementary viewpoints:
first, we consider a PAC-type setup and derive
probabilistic upper bounds involving sample-only
information; second, we leverage an asymptotic
framework to derive a closed-form limiting
distribution for the difference between the
empirical trade-off and the true trade-off. While
these results provide guarantees for learning
fairness metrics across categories, they also point
to the key (but asymmetric) role played by class
imbalance. To summarise, learning fairness without
having access to enough category-level samples is
hard, and a simple numerical experiment shows that
it can lead to spurious results.
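
The objective structure described above can be given a schematic
form. All notation here (theta, L-hat, Phi-hat, lambda, K) is
assumed for illustration and is not taken from the paper:

\[
  \min_{\theta} \;\; \widehat{L}(\theta)
  \;+\; \sum_{k=1}^{K} \lambda_k \, \widehat{\Phi}_k(\theta),
\]

where \(\widehat{L}\) is the statistical loss over the whole sample,
\(\widehat{\Phi}_k\) is a fairness penalty on category \(k\) (e.g. one
value of a protected attribute), and setting some \(\lambda_k = 0\)
recovers partial debiasing, in which only a subset of categories is
penalised.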
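
The closing remark about spurious results can likewise be
illustrated with a minimal simulation. This is a sketch in the
spirit of the abstract, not the paper's own experiment: it assumes a
hypothetical setting where two categories have identical true
accuracy, so the true fairness gap is zero, yet with few
minority-category samples the empirical gap is often large:

    import numpy as np

    rng = np.random.default_rng(0)

    def empirical_gap(n_majority, n_minority, p=0.5, n_trials=10_000):
        """Empirical |accuracy gap| between two categories whose true
        per-category accuracy p is identical, i.e. the true gap is 0."""
        acc_maj = rng.binomial(n_majority, p, size=n_trials) / n_majority
        acc_min = rng.binomial(n_minority, p, size=n_trials) / n_minority
        return np.abs(acc_maj - acc_min)

    # Shrinking the minority category inflates the measured gap,
    # even though the true gap is exactly zero in every case.
    for n_min in (10, 100, 1000):
        gaps = empirical_gap(n_majority=10_000, n_minority=n_min)
        print(f"minority n={n_min:>4}: mean |gap| = {gaps.mean():.3f}, "
              f"95th percentile = {np.quantile(gaps, 0.95):.3f}")

A per-category fairness estimate concentrates at rate roughly
1/sqrt(n_k), so the smallest category, not the overall sample size,
controls how reliably the trade-off can be learned; this is the
asymmetric role of class imbalance the abstract points to.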