To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models

Cyrus Cousins, I. Elizabeth Kumar, Suresh Venkatasubramanian
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4573-4581, 2024.

Abstract

In fair machine learning, one source of performance disparities between groups is overfitting to groups with relatively few training samples. We derive group-specific bounds on the generalization error of welfare-centric fair machine learning that benefit from the larger sample size of the majority group. We do this by considering group-specific Rademacher averages over a restricted hypothesis class, which contains the family of models likely to perform well with respect to a fair learning objective (e.g., a power-mean). Our simulations demonstrate these bounds improve over a naïve method, as expected by theory, with particularly significant improvement for smaller group sizes.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-cousins24a,
  title     = {To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models},
  author    = {Cousins, Cyrus and Kumar, I. Elizabeth and Venkatasubramanian, Suresh},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4573--4581},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/cousins24a/cousins24a.pdf},
  url       = {https://proceedings.mlr.press/v238/cousins24a.html},
  abstract  = {In fair machine learning, one source of performance disparities between groups is overfitting to groups with relatively few training samples. We derive group-specific bounds on the generalization error of welfare-centric fair machine learning that benefit from the larger sample size of the majority group. We do this by considering group-specific Rademacher averages over a restricted hypothesis class, which contains the family of models likely to perform well with respect to a fair learning objective (e.g., a power-mean). Our simulations demonstrate these bounds improve over a naïve method, as expected by theory, with particularly significant improvement for smaller group sizes.}
}
Endnote
%0 Conference Paper
%T To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models
%A Cyrus Cousins
%A I. Elizabeth Kumar
%A Suresh Venkatasubramanian
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-cousins24a
%I PMLR
%P 4573--4581
%U https://proceedings.mlr.press/v238/cousins24a.html
%V 238
%X In fair machine learning, one source of performance disparities between groups is overfitting to groups with relatively few training samples. We derive group-specific bounds on the generalization error of welfare-centric fair machine learning that benefit from the larger sample size of the majority group. We do this by considering group-specific Rademacher averages over a restricted hypothesis class, which contains the family of models likely to perform well with respect to a fair learning objective (e.g., a power-mean). Our simulations demonstrate these bounds improve over a naïve method, as expected by theory, with particularly significant improvement for smaller group sizes.
APA
Cousins, C., Kumar, I. E., & Venkatasubramanian, S. (2024). To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4573-4581. Available from https://proceedings.mlr.press/v238/cousins24a.html.
