Efficient tool segmentation for endoscopic videos in the wild

Clara Tomasini, Iñigo Alonso, Luis Riazuelo, Ana C Murillo
Proceedings of The 5th International Conference on Medical Imaging with Deep Learning, PMLR 172:1218-1234, 2022.

Abstract

In recent years, deep learning methods have become the most effective approach for tool segmentation in endoscopic images, achieving the state of the art on the available public benchmarks. However, these methods present some challenges that hinder their direct deployment in real-world scenarios. This work explores how to solve two of the most common challenges: real-time and memory restrictions, and false positives in frames with no tools. To address the first, we show how to adapt an efficient general-purpose semantic segmentation model. For the second, we study the common issue of training only on images that contain at least one tool: when frames of endoscopic procedures without tools are processed, such models produce many false positives. To solve this, we propose adding an extra classification head that performs binary frame classification to identify frames with no tools present. Finally, we present a thorough comparison of this approach with the current state of the art on different benchmarks, including real medical practice recordings, demonstrating similar accuracy with much lower computational requirements.
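The role of the extra classification head can be illustrated with a minimal sketch. The abstract does not say how the frame-level prediction is combined with the segmentation output, so the gating rule below is an assumption: the head is taken to output a single probability that the frame contains a tool, and the predicted mask is cleared when that probability falls below a threshold. The names `gate_mask` and `count_tool_pixels` and the default threshold of 0.5 are hypothetical and do not come from the paper.

```python
from typing import List

# A mask is a 2D grid of class labels; 0 is assumed to mean "background".
Mask = List[List[int]]


def gate_mask(mask: Mask, tool_prob: float, threshold: float = 0.5) -> Mask:
    """Suppress the segmentation mask for frames classified as tool-free.

    mask:      per-pixel labels predicted by the segmentation branch
    tool_prob: frame-level probability that a tool is present
               (assumed output of the binary classification head)
    threshold: hypothetical decision threshold for "tool present"
    """
    if tool_prob >= threshold:
        # Frame is classified as containing a tool: keep a copy of the mask.
        return [row[:] for row in mask]
    # Frame is classified as tool-free: return an all-background mask,
    # which removes any false-positive pixels in that frame.
    return [[0] * len(row) for row in mask]


def count_tool_pixels(mask: Mask) -> int:
    """Count pixels labelled as anything other than background."""
    return sum(1 for row in mask for label in row if label != 0)


# A tool-free frame where the segmentation branch hallucinated two tool pixels.
noisy_mask = [[0, 1, 0], [0, 0, 1]]
gated = gate_mask(noisy_mask, tool_prob=0.1)
print(count_tool_pixels(noisy_mask), count_tool_pixels(gated))
```

Under this assumed rule, a correct frame-level decision removes every false positive in a tool-free frame, while frames classified as containing a tool pass through unchanged.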

Cite this Paper


BibTeX
@InProceedings{pmlr-v172-tomasini22a,
  title = {Efficient tool segmentation for endoscopic videos in the wild},
  author = {Tomasini, Clara and Alonso, I{\~n}igo and Riazuelo, Luis and Murillo, Ana C},
  booktitle = {Proceedings of The 5th International Conference on Medical Imaging with Deep Learning},
  pages = {1218--1234},
  year = {2022},
  editor = {Konukoglu, Ender and Menze, Bjoern and Venkataraman, Archana and Baumgartner, Christian and Dou, Qi and Albarqouni, Shadi},
  volume = {172},
  series = {Proceedings of Machine Learning Research},
  month = {06--08 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v172/tomasini22a/tomasini22a.pdf},
  url = {https://proceedings.mlr.press/v172/tomasini22a.html},
  abstract = {In recent years, deep learning methods have become the most effective approach for tool segmentation in endoscopic images, achieving the state of the art on the available public benchmarks. However, these methods present some challenges that hinder their direct deployment in real world scenarios. This work explores how to solve two of the most common challenges: real-time and memory restrictions and false positives in frames with no tools. To cope with the first case, we show how to adapt an efficient general purpose semantic segmentation model. Then, we study how to cope with the common issue of only training on images with at least one tool. Then, when images of endoscopic procedures without tools are processed, there are a lot of false positives. To solve this, we propose to add an extra classification head that performs binary frame classification, to identify frames with no tools present. Finally, we present a thorough comparison of this approach with current state of the art on different benchmarks, including real medical practice recordings, demonstrating similar accuracy with much lower computational requirements.}
}
APA
Tomasini, C., Alonso, I., Riazuelo, L. & Murillo, A.C. (2022). Efficient tool segmentation for endoscopic videos in the wild. Proceedings of The 5th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 172:1218-1234. Available from https://proceedings.mlr.press/v172/tomasini22a.html.
