Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective

Firas Laakom, Haobo Chen, Jürgen Schmidhuber, Yuheng Bu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:32078-32115, 2025.

Abstract

Despite substantial progress in promoting fairness in high-stakes applications using machine learning models, existing methods often modify the training process, such as through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on the Efron–Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of these bounds across diverse fairness-aware learning algorithms. Our framework offers valuable insights to guide the design of algorithms that improve fairness generalization.
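For orientation, the quantities involved can be sketched in LaTeX as follows. The notation here is illustrative only (it is not taken from the paper): the fairness generalization gap is the difference between the population fairness loss and its empirical counterpart, and the classical mutual-information bound of Xu and Raginsky (2017) is shown as the standard template from prior work that bounds of this type refine; the paper's own Efron–Stein-based MI and CMI bounds for fairness loss differ in form.

\[
\overline{\mathrm{gen}}(W, S) \;=\; \mathcal{F}_{\mu}(W) \;-\; \widehat{\mathcal{F}}_{S}(W),
\]
where $W$ is the learned hypothesis, $S = (Z_1, \dots, Z_n)$ the training sample drawn i.i.d. from $\mu$, $\mathcal{F}_{\mu}$ the fairness loss under the data distribution, and $\widehat{\mathcal{F}}_{S}$ its empirical estimate on $S$. For comparison, the classical bound for a $\sigma$-sub-Gaussian per-sample prediction loss reads
\[
\bigl|\mathbb{E}\!\left[L_{\mu}(W) - L_{S}(W)\right]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)}.
\]

Because fairness losses are typically group-level statistics rather than averages of i.i.d. per-sample terms, the classical template above does not apply directly; this is where the paper's Efron–Stein-based argument comes in to obtain MI and CMI bounds on the fairness gap.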

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-laakom25a, title = {Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective}, author = {Laakom, Firas and Chen, Haobo and Schmidhuber, J\"{u}rgen and Bu, Yuheng}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {32078--32115}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/laakom25a/laakom25a.pdf}, url = {https://proceedings.mlr.press/v267/laakom25a.html}, abstract = {Despite substantial progress in promoting fairness in high-stake applications using machine learning models, existing methods often modify the training process, such as through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on Efron–Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of these bounds across diverse fairness-aware learning algorithms. Our framework offers valuable insights to guide the design of algorithms improving fairness generalization.} }
Endnote
%0 Conference Paper %T Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective %A Firas Laakom %A Haobo Chen %A Jürgen Schmidhuber %A Yuheng Bu %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-laakom25a %I PMLR %P 32078--32115 %U https://proceedings.mlr.press/v267/laakom25a.html %V 267 %X Despite substantial progress in promoting fairness in high-stake applications using machine learning models, existing methods often modify the training process, such as through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on Efron–Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of these bounds across diverse fairness-aware learning algorithms. Our framework offers valuable insights to guide the design of algorithms improving fairness generalization.
APA
Laakom, F., Chen, H., Schmidhuber, J. & Bu, Y. (2025). Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:32078-32115. Available from https://proceedings.mlr.press/v267/laakom25a.html.
