Detector-in-the-Loop Tracking: Active Memory Rectification for Stable Glottic Opening Localization

Huayu Wang, Bahaa Alattar, Cheng-Yen Yang, Hsiang-Wei Huang, Jung Heon Kim, Linda Shapiro, Nathan White, Jenq-Neng Hwang
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:3750-3763, 2026.

Abstract

Temporal stability in glottic opening localization remains challenging due to the complementary weaknesses of single-frame detectors and foundation-model trackers: the former lack temporal context, while the latter suffer from memory drift. Specifically, in video laryngoscopy, rapid tissue deformation, occlusions, and visual ambiguities in emergency settings require a robust, temporally aware solution that can prevent progressive tracking errors. We propose Closed-Loop Memory Correction (CL-MC), a detector-in-the-loop framework that supervises Segment Anything Model 2 (SAM2) through confidence-aligned state decisions and active memory rectification. High-confidence detections trigger semantic resets that overwrite corrupted tracker memory, effectively mitigating drift accumulation with a training-free foundation tracker in complex endoscopic scenes. On emergency intubation videos, CL-MC achieves state-of-the-art performance, significantly reducing drift and missing rate compared with SAM2 variants and open-loop methods. Our results establish memory correction as a crucial component for reliable clinical video tracking.
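
The closed-loop idea described above (a high-confidence detection overwrites the tracker's memory, otherwise the tracker keeps propagating) can be illustrated with a minimal sketch. This is not the authors' implementation and it does not call the SAM2 API: ToyTracker, Detection, closed_loop_step, and the 0.8 reset threshold are hypothetical names and values chosen only for illustration, and boxes stand in for whatever state a real tracker would keep in memory.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    """Hypothetical single-frame detector output."""
    box: Box
    confidence: float


@dataclass
class ToyTracker:
    """Stand-in for a memory-based tracker; NOT the SAM2 interface."""
    memory: List[Box] = field(default_factory=list)
    max_memory: int = 4

    def predict(self) -> Optional[Box]:
        # Toy propagation: repeat the most recent memory entry.
        return self.memory[-1] if self.memory else None

    def update(self, box: Box) -> None:
        # Append the new box and keep only the newest entries.
        self.memory.append(box)
        self.memory = self.memory[-self.max_memory:]

    def reset(self, box: Box) -> None:
        # Memory rectification: discard all history, keep one trusted box.
        self.memory = [box]


def closed_loop_step(
    tracker: ToyTracker,
    detection: Optional[Detection],
    reset_threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[Optional[Box], str]:
    """Return (box, state), where state is 'reset', 'track', or 'missing'."""
    if detection is not None and detection.confidence >= reset_threshold:
        tracker.reset(detection.box)
        return detection.box, "reset"
    predicted = tracker.predict()
    if predicted is None:
        return None, "missing"
    tracker.update(predicted)
    return predicted, "track"


# Four frames: confident, unconfident, no detection, confident again.
tracker = ToyTracker()
stream = [
    Detection((10, 10, 50, 50), 0.95),
    Detection((80, 80, 120, 120), 0.30),
    None,
    Detection((14, 12, 54, 52), 0.90),
]
states = [closed_loop_step(tracker, d)[1] for d in stream]
```

In this toy run the low-confidence detection on the second frame is ignored in favor of the tracker's own prediction, and the confident detection on the fourth frame replaces the accumulated memory. An open-loop variant would simply never call reset, which is the situation in which drift can accumulate.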

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-wang26g,
  title = {Detector-in-the-Loop Tracking: Active Memory Rectification for Stable Glottic Opening Localization},
  author = {Wang, Huayu and Alattar, Bahaa and Yang, Cheng-Yen and Huang, Hsiang-Wei and Kim, Jung Heon and Shapiro, Linda and White, Nathan and Hwang, Jenq-Neng},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages = {3750--3763},
  year = {2026},
  editor = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume = {315},
  series = {Proceedings of Machine Learning Research},
  month = {08--10 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/wang26g/wang26g.pdf},
  url = {https://proceedings.mlr.press/v315/wang26g.html},
  abstract = {Temporal stability in glottic opening localization remains challenging due to the complementary weaknesses of single-frame detectors and foundation-model trackers: the former lack temporal context, while the latter suffer from memory drift. Specifically, in video laryngoscopy, rapid tissue deformation, occlusions, and visual ambiguities in emergency settings require a robust, temporally aware solution that can prevent progressive tracking errors. We propose Closed-Loop Memory Correction (CL-MC), a detector-in-the-loop framework that supervises Segment Anything Model 2 (SAM2) through confidence-aligned state decisions and active memory rectification. High-confidence detections trigger semantic resets that overwrite corrupted tracker memory, effectively mitigating drift accumulation with a training-free foundation tracker in complex endoscopic scenes. On emergency intubation videos, CL-MC achieves state-of-the-art performance, significantly reducing drift and missing rate compared with SAM2 variants and open-loop methods. Our results establish memory correction as a crucial component for reliable clinical video tracking.}
}
EndNote
%0 Conference Paper
%T Detector-in-the-Loop Tracking: Active Memory Rectification for Stable Glottic Opening Localization
%A Huayu Wang
%A Bahaa Alattar
%A Cheng-Yen Yang
%A Hsiang-Wei Huang
%A Jung Heon Kim
%A Linda Shapiro
%A Nathan White
%A Jenq-Neng Hwang
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-wang26g
%I PMLR
%P 3750--3763
%U https://proceedings.mlr.press/v315/wang26g.html
%V 315
%X Temporal stability in glottic opening localization remains challenging due to the complementary weaknesses of single-frame detectors and foundation-model trackers: the former lack temporal context, while the latter suffer from memory drift. Specifically, in video laryngoscopy, rapid tissue deformation, occlusions, and visual ambiguities in emergency settings require a robust, temporally aware solution that can prevent progressive tracking errors. We propose Closed-Loop Memory Correction (CL-MC), a detector-in-the-loop framework that supervises Segment Anything Model 2 (SAM2) through confidence-aligned state decisions and active memory rectification. High-confidence detections trigger semantic resets that overwrite corrupted tracker memory, effectively mitigating drift accumulation with a training-free foundation tracker in complex endoscopic scenes. On emergency intubation videos, CL-MC achieves state-of-the-art performance, significantly reducing drift and missing rate compared with SAM2 variants and open-loop methods. Our results establish memory correction as a crucial component for reliable clinical video tracking.
APA
Wang, H., Alattar, B., Yang, C.-Y., Huang, H.-W., Kim, J. H., Shapiro, L., White, N., & Hwang, J.-N. (2026). Detector-in-the-Loop Tracking: Active Memory Rectification for Stable Glottic Opening Localization. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:3750-3763. Available from https://proceedings.mlr.press/v315/wang26g.html.
