Revisiting the Berkeley Admissions data: Statistical Tests for Causal Hypotheses

Sourbh Bhadane, Joris Marten Mooij, Philip Boeken, Onno Zoeter
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:271-295, 2025.

Abstract

Reasoning about fairness through correlation-based notions is rife with pitfalls. The 1973 University of California, Berkeley graduate school admissions case from Bickel et al. (1975) is a classic example of one such pitfall, namely Simpson’s paradox. The discrepancy in admission rates between male and female applicants, in the aggregate data over all departments, vanishes when admission rates per department are examined. We reason about the Berkeley graduate school admissions case through a causal lens. In the process, we introduce a statistical test for causal hypothesis testing based on Pearl’s instrumental-variable inequalities (Pearl, 1995). We compare different causal notions of fairness that are based on graphical, counterfactual and interventional queries on the causal model, and develop statistical tests for these notions that use only observational data. We study the logical relations between notions, and show that while notions may not be equivalent, their corresponding statistical tests coincide for the case at hand. We believe that a thorough case-based causal analysis helps develop a more principled understanding of both causal hypothesis testing and fairness.
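
The Simpson's paradox pattern mentioned in the abstract can be illustrated with a small sketch. It assumes the widely circulated six-department subset of the 1973 Berkeley data (the counts distributed as the `UCBAdmissions` dataset in R), which is not necessarily the data analysed in the paper. The function names are hypothetical, and the code only compares raw admission rates; it does not implement any of the paper's statistical tests.

```python
# Admitted/rejected counts per department and sex.
# Assumption: six-department subset as distributed in R's UCBAdmissions dataset.
COUNTS = {
    "A": {"male": (512, 313), "female": (89, 19)},
    "B": {"male": (353, 207), "female": (17, 8)},
    "C": {"male": (120, 205), "female": (202, 391)},
    "D": {"male": (138, 279), "female": (131, 244)},
    "E": {"male": (53, 138), "female": (94, 299)},
    "F": {"male": (22, 351), "female": (24, 317)},
}


def admission_rate(admitted, rejected):
    """Fraction of applicants admitted."""
    return admitted / (admitted + rejected)


def aggregate_rates(counts):
    """Admission rate per sex after pooling all departments."""
    totals = {}
    for by_sex in counts.values():
        for sex, (admitted, rejected) in by_sex.items():
            a, r = totals.get(sex, (0, 0))
            totals[sex] = (a + admitted, r + rejected)
    return {sex: admission_rate(a, r) for sex, (a, r) in totals.items()}


def per_department_rates(counts):
    """Admission rate per sex within each department."""
    return {
        dept: {sex: admission_rate(*c) for sex, c in by_sex.items()}
        for dept, by_sex in counts.items()
    }


if __name__ == "__main__":
    print("aggregate:", aggregate_rates(COUNTS))
    for dept, rates in per_department_rates(COUNTS).items():
        print(dept, rates)
```

On these counts the pooled admission rate is roughly 45% for male and 30% for female applicants, while within four of the six departments the female rate is the higher one. The gap in the pooled data largely reflects which departments applicants applied to rather than a uniform per-department difference.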

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-bhadane25a,
  title     = {Revisiting the Berkeley Admissions data: Statistical Tests for Causal Hypotheses},
  author    = {Bhadane, Sourbh and Mooij, Joris Marten and Boeken, Philip and Zoeter, Onno},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {271--295},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/bhadane25a/bhadane25a.pdf},
  url       = {https://proceedings.mlr.press/v286/bhadane25a.html}
}
Endnote
%0 Conference Paper
%T Revisiting the Berkeley Admissions data: Statistical Tests for Causal Hypotheses
%A Sourbh Bhadane
%A Joris Marten Mooij
%A Philip Boeken
%A Onno Zoeter
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-bhadane25a
%I PMLR
%P 271--295
%U https://proceedings.mlr.press/v286/bhadane25a.html
%V 286
APA
Bhadane, S., Mooij, J. M., Boeken, P. & Zoeter, O. (2025). Revisiting the Berkeley Admissions data: Statistical Tests for Causal Hypotheses. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:271-295. Available from https://proceedings.mlr.press/v286/bhadane25a.html.
