Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection

Paul F. Jaeger, Simon A. A. Kohl, Sebastian Bickelhaupt, Fabian Isensee, Tristan Anselm Kuder, Heinz-Peter Schlemmer, Klaus H. Maier-Hein
Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR 116:171-183, 2020.

Abstract

The task of localizing and categorizing objects in medical images often remains formulated as a semantic segmentation problem. This approach, however, only indirectly solves the coarse localization task by predicting pixel-level scores, requiring ad-hoc heuristics when mapping back to object-level scores. State-of-the-art object detectors, on the other hand, allow for individual object scoring in an end-to-end fashion, while ironically trading away the ability to exploit the full pixel-wise supervision signal. This can be particularly disadvantageous in the setting of medical image analysis, where data sets are notoriously small. In this paper, we propose Retina U-Net, a simple architecture that naturally fuses the Retina Net one-stage detector with the U-Net architecture widely used for semantic segmentation in medical images. The proposed architecture recaptures discarded supervision signals by complementing object detection with an auxiliary task in the form of semantic segmentation, without introducing the additional complexity of previously proposed two-stage detectors. We evaluate the importance of full segmentation supervision on two medical data sets, provide an in-depth analysis on a series of toy experiments, and show how the corresponding performance gain grows in the limit of small data sets. Retina U-Net yields strong detection performance otherwise only reached by its more complex two-stage counterparts. Our framework, including all methods implemented for operation on 2D and 3D images, is available at github.com/pfjaeger/medicaldetectiontoolkit.
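The core idea of the abstract — recapturing pixel-wise supervision by adding an auxiliary segmentation loss on top of a one-stage detector's losses — can be illustrated with a minimal sketch. The function names, the binary (foreground/background) simplification, and the `seg_weight` parameter below are illustrative assumptions, not the paper's exact formulation or weighting; the focal loss itself follows the standard RetinaNet definition.

```python
import numpy as np

EPS = 1e-7  # numerical floor to keep log() finite


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss (standard RetinaNet anchor classification loss).

    p: predicted foreground probabilities per anchor, y: binary anchor labels.
    Down-weights easy examples via the (1 - p_t)^gamma modulating factor.
    """
    p = np.clip(p, EPS, 1.0 - EPS)
    p_t = np.where(y == 1, p, 1.0 - p)          # prob. of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


def pixel_bce(p, mask):
    """Pixel-wise binary cross-entropy for the auxiliary segmentation head
    (a simplification; the paper's segmentation loss may differ)."""
    p = np.clip(p, EPS, 1.0 - EPS)
    return float(np.mean(-(mask * np.log(p) + (1 - mask) * np.log(1 - p))))


def retina_unet_loss(anchor_probs, anchor_labels, seg_probs, seg_mask,
                     seg_weight=1.0):
    """Combined objective: detection loss plus weighted auxiliary
    segmentation loss (box-regression terms omitted for brevity)."""
    return (focal_loss(anchor_probs, anchor_labels)
            + seg_weight * pixel_bce(seg_probs, seg_mask))
```

Near-perfect predictions on both heads drive the combined loss toward zero, while the auxiliary term keeps every labeled pixel, not just matched anchors, contributing gradient signal.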

Cite this Paper


BibTeX
@InProceedings{pmlr-v116-jaeger20a,
  title     = {{Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection}},
  author    = {Jaeger, Paul F. and Kohl, Simon A. A. and Bickelhaupt, Sebastian and Isensee, Fabian and Kuder, Tristan Anselm and Schlemmer, Heinz-Peter and Maier-Hein, Klaus H.},
  booktitle = {Proceedings of the Machine Learning for Health NeurIPS Workshop},
  pages     = {171--183},
  year      = {2020},
  editor    = {Dalca, Adrian V. and McDermott, Matthew B.A. and Alsentzer, Emily and Finlayson, Samuel G. and Oberst, Michael and Falck, Fabian and Beaulieu-Jones, Brett},
  volume    = {116},
  series    = {Proceedings of Machine Learning Research},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v116/jaeger20a/jaeger20a.pdf},
  url       = {https://proceedings.mlr.press/v116/jaeger20a.html},
  abstract  = {The task of localizing and categorizing objects in medical images often remains formulated as a semantic segmentation problem. This approach, however, only indirectly solves the coarse localization task by predicting pixel-level scores, requiring ad-hoc heuristics when mapping back to object-level scores. State-of-the-art object detectors on the other hand, allow for individual object scoring in an end-to-end fashion, while ironically trading in the ability to exploit the full pixel-wise supervision signal. This can be particularly disadvantageous in the setting of medical image analysis, where data sets are notoriously small. In this paper, we propose Retina U-Net, a simple architecture, which naturally fuses the Retina Net one-stage detector with the U-Net architecture widely used for semantic segmentation in medical images. The proposed architecture recaptures discarded supervision signals by complementing object detection with an auxiliary task in the form of semantic segmentation without introducing the additional complexity of previously proposed two-stage detectors. We evaluate the importance of full segmentation supervision on two medical data sets, provide an in-depth analysis on a series of toy experiments and show how the corresponding performance gain grows in the limit of small data sets. Retina U-Net yields strong detection performance only reached by its more complex two-staged counterparts. Our framework including all methods implemented for operation on 2D and 3D images is available at github.com/pfjaeger/medicaldetectiontoolkit.}
}
Endnote
%0 Conference Paper
%T Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection
%A Paul F. Jaeger
%A Simon A. A. Kohl
%A Sebastian Bickelhaupt
%A Fabian Isensee
%A Tristan Anselm Kuder
%A Heinz-Peter Schlemmer
%A Klaus H. Maier-Hein
%B Proceedings of the Machine Learning for Health NeurIPS Workshop
%C Proceedings of Machine Learning Research
%D 2020
%E Adrian V. Dalca
%E Matthew B.A. McDermott
%E Emily Alsentzer
%E Samuel G. Finlayson
%E Michael Oberst
%E Fabian Falck
%E Brett Beaulieu-Jones
%F pmlr-v116-jaeger20a
%I PMLR
%P 171--183
%U https://proceedings.mlr.press/v116/jaeger20a.html
%V 116
%X The task of localizing and categorizing objects in medical images often remains formulated as a semantic segmentation problem. This approach, however, only indirectly solves the coarse localization task by predicting pixel-level scores, requiring ad-hoc heuristics when mapping back to object-level scores. State-of-the-art object detectors on the other hand, allow for individual object scoring in an end-to-end fashion, while ironically trading in the ability to exploit the full pixel-wise supervision signal. This can be particularly disadvantageous in the setting of medical image analysis, where data sets are notoriously small. In this paper, we propose Retina U-Net, a simple architecture, which naturally fuses the Retina Net one-stage detector with the U-Net architecture widely used for semantic segmentation in medical images. The proposed architecture recaptures discarded supervision signals by complementing object detection with an auxiliary task in the form of semantic segmentation without introducing the additional complexity of previously proposed two-stage detectors. We evaluate the importance of full segmentation supervision on two medical data sets, provide an in-depth analysis on a series of toy experiments and show how the corresponding performance gain grows in the limit of small data sets. Retina U-Net yields strong detection performance only reached by its more complex two-staged counterparts. Our framework including all methods implemented for operation on 2D and 3D images is available at github.com/pfjaeger/medicaldetectiontoolkit.
APA
Jaeger, P.F., Kohl, S.A.A., Bickelhaupt, S., Isensee, F., Kuder, T.A., Schlemmer, H.P., & Maier-Hein, K.H. (2020). Retina U-Net: Embarrassingly Simple Exploitation of Segmentation Supervision for Medical Object Detection. Proceedings of the Machine Learning for Health NeurIPS Workshop, in Proceedings of Machine Learning Research 116:171-183. Available from https://proceedings.mlr.press/v116/jaeger20a.html.
