Robust Semi-supervised Detection of Hands in Diverse Open Surgery Environments
Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219:736-753, 2023.
Abstract
Artificial intelligence has impacted many aspects of modern medical care but depends critically on data. Videos of medical procedures are a valuable resource for computer vision algorithms, but labeling them is costly and requires expert knowledge. This paper explores how to leverage low-quality, unlabeled videos scraped from the internet, in addition to a limited amount of labeled images, to improve object detection during surgical procedures. We establish the first benchmark for semi-supervised hand detection during open surgery and show that existing benchmarks in non-medical contexts are not indicative of performance differences on real-world medical applications, where data is noisy and poorly labeled. We propose an end-to-end trainable two-stage object detector that employs a consistency loss to learn from unlabeled images. The model is robust to missing labels, variance in hand morphology, and extreme domain shifts such as those encountered in open-source videos of surgeries scraped from YouTube. Our method can predict surgeons’ hands in surgical videos even when only a fraction of the hands are labeled in each frame of the labeled set. When unlabeled data is added, our method detects hands more accurately than existing end-to-end semi-supervised object detection algorithms.
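The abstract does not specify the form of the consistency loss, so the following is only a generic sketch of the idea, not the paper's implementation. It assumes the detector produces class logits for two augmented views of the same unlabeled image and penalizes disagreement between the resulting class probabilities. The function names (`consistency_loss`, `total_loss`), the mean squared error form, and the `weight` parameter are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(logits_a, logits_b):
    """Mean squared difference between the class probabilities of two views.

    Assumption: both logit vectors describe the same region of the same
    unlabeled image under two different augmentations.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)


def total_loss(supervised_loss, logits_a, logits_b, weight=1.0):
    """Supervised loss on labeled data plus a weighted consistency term.

    `weight` is a hypothetical hyperparameter balancing the two terms.
    """
    return supervised_loss + weight * consistency_loss(logits_a, logits_b)
```

In this sketch, identical predictions on both views contribute nothing to the loss, so the unlabeled images only influence training where the model's outputs change under augmentation.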