Comparing Objective Functions for Segmentation and Detection of Microaneurysms in Retinal Images
Proceedings of the Third Conference on Medical Imaging with Deep Learning, PMLR 121:19-32, 2020.
Abstract
Retinal microaneurysms (MAs) are the earliest signs of diabetic retinopathy (DR), which is the leading cause of blindness among the working-age population in the Western world. Detection of MAs presents a particular challenge, as MA pixels account for less than 0.5% of the retinal image. In deep neural networks the learning process can be adversely affected by class imbalance, which introduces a bias towards the most well represented class. Recently, a number of objective functions have been proposed as alternatives to the standard Cross-entropy (CE) loss in efforts to combat this problem. In this work we investigate the influence of the network objective during optimization by comparing Residual U-nets trained for segmentation of MAs in retinal images using six different objective functions: weighted and unweighted CE, Dice loss, weighted and unweighted Focal loss, and Focal Tversky loss. We also perform a test with the CE objective using a more complex model. Three networks with different seeds are trained for each objective function using optimized hyper-parameter settings on a dataset of 382 images with pixel-level annotations for MAs. Instance-level MA detection performance is evaluated with the average free-response receiver operating characteristic (FROC) score, calculated as the mean sensitivity at seven average-false-positives-per-image thresholds, on 80 test images. Image-level MA detection performance and detection of low levels of DR are evaluated with bootstrapped AUC scores on the same images and a separate test set of 1287 images. Significance testing for image-level detection accuracy ($\alpha = 0.05$) is performed using Cochran's Q and McNemar's test. Segmentation performance is evaluated with the average pixel precision (AP) score. For instance-level detection and pixel segmentation we perform repeated-measures ANOVA with post-hoc tests.
Results: Losses based on CE perform significantly better than the Dice and Focal Tversky losses for instance-level detection and pixel segmentation. The highest FROC score of 0.5448 ($\pm$0.0096) and the highest AP of 0.4888 ($\pm$0.0196) are achieved using weighted CE. For all objectives excluding the Focal Tversky loss (AUC = 0.5) there is no significant difference in image-level detection accuracy on the 80-image test set. The highest AUC of 0.993 (95% CI: 0.980 - 1.0) is achieved using the Focal loss. For detection of mild DR on the set of 1287 images there is a significant difference between model objectives ($p = 2.87 \times 10^{-12}$). An AUC of 0.730 (95% CI: 0.683 - 0.745) is achieved using the complex model with CE. Using the Focal Tversky objective we fail to detect any MAs at both the instance and image level.
Conclusion: Our results suggest that it is important to benchmark new losses against the CE and Focal loss functions, as we achieve similar or better results in our tests using these objectives.
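For reference, the compared objectives are commonly written as follows. This is a sketch of the standard formulations; the exact class-weighting scheme and the values of $\alpha$, $\beta$ and $\gamma$ used in the paper are not given in the abstract and are assumptions here. With $p_i$ the predicted foreground probability and $g_i \in \{0, 1\}$ the ground-truth label for pixel $i$ out of $N$ pixels:

\begin{align*}
\mathcal{L}_{\mathrm{CE}} &= -\frac{1}{N}\sum_i \big[ w_1\, g_i \log p_i + w_0\, (1 - g_i) \log(1 - p_i) \big] \\
\mathcal{L}_{\mathrm{Focal}} &= -\frac{1}{N}\sum_i \big[ w_1\, g_i (1 - p_i)^{\gamma} \log p_i + w_0\, (1 - g_i)\, p_i^{\gamma} \log(1 - p_i) \big] \\
\mathcal{L}_{\mathrm{Dice}} &= 1 - \frac{2\sum_i p_i g_i + \epsilon}{\sum_i p_i + \sum_i g_i + \epsilon} \\
\mathrm{TI} &= \frac{\sum_i p_i g_i + \epsilon}{\sum_i p_i g_i + \alpha \sum_i p_i (1 - g_i) + \beta \sum_i (1 - p_i) g_i + \epsilon}, \qquad \mathcal{L}_{\mathrm{FT}} = (1 - \mathrm{TI})^{1/\gamma}
\end{align*}

The unweighted CE and Focal variants correspond to $w_0 = w_1 = 1$; the small constant $\epsilon$ avoids division by zero in the overlap-based losses.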