How fair is your graph? Exploring fairness concerns in neuroimaging studies
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:459-478, 2022.
Recent work in neuroimaging has demonstrated significant benefits of using population graphs to capture non-imaging information in the prediction of neurodegenerative and neurodevelopmental disorders. These non-imaging attributes may contain not only demographic information about the individuals, e.g. age or sex, but also the acquisition site, as imaging protocols and hardware might differ significantly across sites in large-scale studies. The effect of the latter is particularly prevalent in functional connectomics studies, where it remains unclear how to sufficiently homogenise fMRI signals across the different sites. In addition, recent studies have highlighted the need to investigate potential biases in classifiers devised using large-scale datasets, which might be imbalanced in terms of one or more sensitive attributes. Such biases can be exacerbated when these attributes are employed in a population graph to explicitly introduce inductive biases into the machine learning model, leading to disparate predictive performance across sub-populations. This study scrutinises such a system and aims to uncover potential biases of a semi-supervised classifier that relies on a population graph. We further explore the effect of the graph structure and stratification strategies, as well as methods to mitigate such biases and produce fairer predictions across the population.
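The disparate predictive performance described above can be quantified, for instance, as the gap in accuracy between sub-populations defined by a sensitive attribute (e.g. sex or acquisition site). The following minimal sketch illustrates one such measurement; the function name and toy data are illustrative and not taken from the study:

```python
import numpy as np

def subgroup_performance_gap(y_true, y_pred, groups):
    """Per-subgroup accuracy and the largest pairwise gap.

    A large gap indicates disparate predictive performance across
    sub-populations defined by a sensitive attribute.
    """
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float((y_true[mask] == y_pred[mask]).mean())
    gap = max(accs.values()) - min(accs.values())
    return accs, gap

# toy illustration (synthetic labels, not data from the study)
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])
groups = np.array(["A", "A", "A", "B", "B", "B"])
accs, gap = subgroup_performance_gap(y_true, y_pred, groups)
```

Analogous gaps can be computed for other metrics (e.g. true positive rate, yielding an equalised-odds-style comparison) when overall accuracy masks subgroup differences.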