Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness

Vincent Jeanselme, Maria De-Arteaga, Zhe Zhang, Jessica Barrett, Brian Tom
Proceedings of the 2nd Machine Learning for Health symposium, PMLR 193:12-34, 2022.

Abstract

Biases have marked medical history, leading to unequal care affecting marginalised groups. The patterns of missingness in observational data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. Despite its potential impact, imputation is too often an overlooked preprocessing step. When explicitly considered, attention is placed on overall performance, ignoring how this preprocessing can reinforce group-specific inequities. Our work questions this choice by studying how imputation affects downstream algorithmic fairness. First, we provide a structured view of the relationship between clinical presence mechanisms and group-specific missingness patterns. Then, through simulations and real-world experiments, we demonstrate that the imputation choice influences marginalised group performance and that no imputation strategy consistently reduces disparities. Importantly, our results show that current practices may endanger health equity as similarly performing imputation strategies at the population level can affect marginalised groups differently. Finally, we propose recommendations for mitigating inequities that may stem from a neglected step of the machine learning pipeline.
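The abstract's central point, that imputation strategies with similar population-level behaviour can affect groups differently, can be illustrated with a toy sketch. This is not code from the paper; the data, missingness rates, and the two strategies (population-mean vs. group-conditional mean imputation) are illustrative assumptions only.

```python
import numpy as np

# Hypothetical toy data: one lab value with group-specific missingness.
# Group 1 (the smaller group) has a different underlying distribution
# and a higher missingness rate than group 0.
rng = np.random.default_rng(0)
group = np.array([0] * 80 + [1] * 20)
value = np.where(group == 0,
                 rng.normal(5.0, 1.0, 100),
                 rng.normal(8.0, 1.0, 100))
missing = np.where(group == 0,
                   rng.random(100) < 0.1,
                   rng.random(100) < 0.5)
observed = np.where(missing, np.nan, value)

# Strategy A: population-mean imputation (ignores group structure).
pop_mean = np.nanmean(observed)
imputed_pop = np.where(missing, pop_mean, observed)

# Strategy B: group-conditional mean imputation.
imputed_grp = observed.copy()
for g in (0, 1):
    mask = group == g
    imputed_grp[mask & missing] = np.nanmean(observed[mask])

# Mean absolute imputation error on the minority group's missing entries.
m1 = missing & (group == 1)
err_pop = np.abs(imputed_pop[m1] - value[m1]).mean()
err_grp = np.abs(imputed_grp[m1] - value[m1]).mean()
print(f"minority-group error: population-mean={err_pop:.2f}, "
      f"group-mean={err_grp:.2f}")
```

Because the population mean is dominated by the majority group, Strategy A imputes values far from the minority group's true distribution even though both strategies look reasonable in aggregate, which is the kind of group-specific effect the paper studies.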

Cite this Paper


BibTeX
@InProceedings{pmlr-v193-jeanselme22a,
  title     = {Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness},
  author    = {Jeanselme, Vincent and De-Arteaga, Maria and Zhang, Zhe and Barrett, Jessica and Tom, Brian},
  booktitle = {Proceedings of the 2nd Machine Learning for Health symposium},
  pages     = {12--34},
  year      = {2022},
  editor    = {Parziale, Antonio and Agrawal, Monica and Joshi, Shalmali and Chen, Irene Y. and Tang, Shengpu and Oala, Luis and Subbaswamy, Adarsh},
  volume    = {193},
  series    = {Proceedings of Machine Learning Research},
  month     = {28 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v193/jeanselme22a/jeanselme22a.pdf},
  url       = {https://proceedings.mlr.press/v193/jeanselme22a.html},
  abstract  = {Biases have marked medical history, leading to unequal care affecting marginalised groups. The patterns of missingness in observational data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. Despite its potential impact, imputation is too often an overlooked preprocessing step. When explicitly considered, attention is placed on overall performance, ignoring how this preprocessing can reinforce group-specific inequities. Our work questions this choice by studying how imputation affects downstream algorithmic fairness. First, we provide a structured view of the relationship between clinical presence mechanisms and group-specific missingness patterns. Then, through simulations and real-world experiments, we demonstrate that the imputation choice influences marginalised group performance and that no imputation strategy consistently reduces disparities. Importantly, our results show that current practices may endanger health equity as similarly performing imputation strategies at the population level can affect marginalised groups differently. Finally, we propose recommendations for mitigating inequities that may stem from a neglected step of the machine learning pipeline.}
}
Endnote
%0 Conference Paper
%T Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness
%A Vincent Jeanselme
%A Maria De-Arteaga
%A Zhe Zhang
%A Jessica Barrett
%A Brian Tom
%B Proceedings of the 2nd Machine Learning for Health symposium
%C Proceedings of Machine Learning Research
%D 2022
%E Antonio Parziale
%E Monica Agrawal
%E Shalmali Joshi
%E Irene Y. Chen
%E Shengpu Tang
%E Luis Oala
%E Adarsh Subbaswamy
%F pmlr-v193-jeanselme22a
%I PMLR
%P 12--34
%U https://proceedings.mlr.press/v193/jeanselme22a.html
%V 193
%X Biases have marked medical history, leading to unequal care affecting marginalised groups. The patterns of missingness in observational data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. Despite its potential impact, imputation is too often an overlooked preprocessing step. When explicitly considered, attention is placed on overall performance, ignoring how this preprocessing can reinforce group-specific inequities. Our work questions this choice by studying how imputation affects downstream algorithmic fairness. First, we provide a structured view of the relationship between clinical presence mechanisms and group-specific missingness patterns. Then, through simulations and real-world experiments, we demonstrate that the imputation choice influences marginalised group performance and that no imputation strategy consistently reduces disparities. Importantly, our results show that current practices may endanger health equity as similarly performing imputation strategies at the population level can affect marginalised groups differently. Finally, we propose recommendations for mitigating inequities that may stem from a neglected step of the machine learning pipeline.
APA
Jeanselme, V., De-Arteaga, M., Zhang, Z., Barrett, J. & Tom, B. (2022). Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness. Proceedings of the 2nd Machine Learning for Health symposium, in Proceedings of Machine Learning Research 193:12-34. Available from https://proceedings.mlr.press/v193/jeanselme22a.html.