Towards Equitable Kidney Tumor Segmentation: Bias Evaluation and Mitigation

Muhammad Muneeb Afzal, Muhammad Osama Khan, Shujaat Mirza
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:13-26, 2023.

Abstract

Kidney tumors, affecting over 400,000 individuals annually, require accurate segmentation for effective treatment and surgical planning. Yet manual segmentation is time-consuming, steering the medical community towards automated methods. While computer-aided diagnostic tools promise improvements, their transition into the real world mandates an understanding of their performance across diverse population subgroups. Our study is the first to investigate fairness in kidney and tumor segmentation, focusing on sensitive attributes such as sex and age. Our findings reveal performance bias across both attributes. In particular, despite a male-dominated training dataset, females showed superior segmentation performance. The 60-70 and above-70 age groups also deviated significantly from the average performance across all ages. To address these biases, we comprehensively explore bias mitigation strategies, encompassing pre-processing techniques (Resampling Algorithm and Stratified Batch Sampling) and in-processing methods (Fair Meta-learning and architectural adjustments). Specifically, Attention U-Net was identified as the optimal model for balancing fairness across both attributes while maintaining high segmentation performance. We present a crucial insight: the architecture itself can be a source of inherent bias, and careful selection of the network design can reduce it. Our assessment of U-Net variants challenges the prevailing paradigm of model selection predicated solely on segmentation performance, especially considering the profound implications biases can have for clinical outcomes.

Cite this Paper


BibTeX
@InProceedings{pmlr-v225-afzal23a,
  title = {Towards Equitable Kidney Tumor Segmentation: Bias Evaluation and Mitigation},
  author = {Afzal, Muhammad Muneeb and Khan, Muhammad Osama and Mirza, Shujaat},
  booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
  pages = {13--26},
  year = {2023},
  editor = {Hegselmann, Stefan and Parziale, Antonio and Shanmugam, Divya and Tang, Shengpu and Asiedu, Mercy Nyamewaa and Chang, Serina and Hartvigsen, Tom and Singh, Harvineet},
  volume = {225},
  series = {Proceedings of Machine Learning Research},
  month = {10 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v225/afzal23a/afzal23a.pdf},
  url = {https://proceedings.mlr.press/v225/afzal23a.html},
  abstract = {Kidney tumors, affecting over 400,000 individuals annually, require accurate segmentation for effective treatment and surgical planning. Yet, manual segmentation is time-consuming, steering the medical community towards automated methods. While computer-aided diagnostic tools promise improvements, their transition into the real world mandates an understanding of their performance across diverse population subgroups. Our study is the first to investigate fairness concerning kidney and tumor segmentation, particularly focusing on sensitive attributes like sex and age. Our findings show an existence of bias in performance across both attributes. In particular, despite a male-dominated training dataset, females showed superior segmentation performance. Age groups 60-70 and above 70 also deviated significantly from the average performance for all ages. To address these biases, we comprehensively explore bias mitigation strategies - encompassing pre-processing techniques (Resampling Algorithm and Stratified Batch Sampling) and in-processing methods (Fair Meta-learning and architectural adjustments). Specifically, Attention U-Net was identified as the optimal model for balancing fairness across both attributes while maintaining high segmentation performance. We present a crucial insight that the architecture itself could be a source of inherent biases, and careful selection of the network design can inherently reduce these biases. Our assessment of UNet variants challenges the prevailing paradigm of model selection predicated solely on segmentation performance, especially considering the profound implications biases can have in clinical outcomes.}
}
Endnote
%0 Conference Paper
%T Towards Equitable Kidney Tumor Segmentation: Bias Evaluation and Mitigation
%A Muhammad Muneeb Afzal
%A Muhammad Osama Khan
%A Shujaat Mirza
%B Proceedings of the 3rd Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2023
%E Stefan Hegselmann
%E Antonio Parziale
%E Divya Shanmugam
%E Shengpu Tang
%E Mercy Nyamewaa Asiedu
%E Serina Chang
%E Tom Hartvigsen
%E Harvineet Singh
%F pmlr-v225-afzal23a
%I PMLR
%P 13--26
%U https://proceedings.mlr.press/v225/afzal23a.html
%V 225
%X Kidney tumors, affecting over 400,000 individuals annually, require accurate segmentation for effective treatment and surgical planning. Yet, manual segmentation is time-consuming, steering the medical community towards automated methods. While computer-aided diagnostic tools promise improvements, their transition into the real world mandates an understanding of their performance across diverse population subgroups. Our study is the first to investigate fairness concerning kidney and tumor segmentation, particularly focusing on sensitive attributes like sex and age. Our findings show an existence of bias in performance across both attributes. In particular, despite a male-dominated training dataset, females showed superior segmentation performance. Age groups 60-70 and above 70 also deviated significantly from the average performance for all ages. To address these biases, we comprehensively explore bias mitigation strategies - encompassing pre-processing techniques (Resampling Algorithm and Stratified Batch Sampling) and in-processing methods (Fair Meta-learning and architectural adjustments). Specifically, Attention U-Net was identified as the optimal model for balancing fairness across both attributes while maintaining high segmentation performance. We present a crucial insight that the architecture itself could be a source of inherent biases, and careful selection of the network design can inherently reduce these biases. Our assessment of UNet variants challenges the prevailing paradigm of model selection predicated solely on segmentation performance, especially considering the profound implications biases can have in clinical outcomes.
APA
Afzal, M.M., Khan, M.O. & Mirza, S. (2023). Towards Equitable Kidney Tumor Segmentation: Bias Evaluation and Mitigation. Proceedings of the 3rd Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 225:13-26. Available from https://proceedings.mlr.press/v225/afzal23a.html.

Related Material