MAIS: Memory-Attention for Interactive Segmentation

Mauricio Orbes, Oeslle Lucena, Sebastien Ourselin, M. Jorge Cardoso
Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:1258-1272, 2026.

Abstract

Interactive medical segmentation reduces annotation effort by refining predictions through user feedback. Vision Transformer (ViT)-based models, such as the Segment Anything Model (SAM), achieve state-of-the-art performance using user clicks and prior masks as prompts. However, existing methods treat interactions as independent events, leading to redundant corrections and limited refinement gains. We address this by introducing MAIS, a Memory-Attention mechanism for Interactive Segmentation that stores past user inputs and segmentation states, enabling temporal context integration. Our approach enhances ViT-based segmentation across diverse imaging modalities, achieving more efficient and accurate refinements.

Cite this Paper


BibTeX
@InProceedings{pmlr-v301-orbes26a,
  title = {MAIS: Memory-Attention for Interactive Segmentation},
  author = {Orbes, Mauricio and Lucena, Oeslle and Ourselin, Sebastien and Cardoso, M. Jorge},
  booktitle = {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = {1258--1272},
  year = {2026},
  editor = {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = {301},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v301/main/assets/orbes26a/orbes26a.pdf},
  url = {https://proceedings.mlr.press/v301/orbes26a.html},
  abstract = {Interactive medical segmentation reduces annotation effort by refining predictions through user feedback. Vision Transformer (ViT)-based models, such as the Segment Anything Model (SAM), achieve state-of-the-art performance using user clicks and prior masks as prompts. However, existing methods treat interactions as independent events, leading to redundant corrections and limited refinement gains. We address this by introducing MAIS, a Memory-Attention mechanism for Interactive Segmentation that stores past user inputs and segmentation states, enabling temporal context integration. Our approach enhances ViT-based segmentation across diverse imaging modalities, achieving more efficient and accurate refinements.}
}
Endnote
%0 Conference Paper
%T MAIS: Memory-Attention for Interactive Segmentation
%A Mauricio Orbes
%A Oeslle Lucena
%A Sebastien Ourselin
%A M. Jorge Cardoso
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-orbes26a
%I PMLR
%P 1258--1272
%U https://proceedings.mlr.press/v301/orbes26a.html
%V 301
%X Interactive medical segmentation reduces annotation effort by refining predictions through user feedback. Vision Transformer (ViT)-based models, such as the Segment Anything Model (SAM), achieve state-of-the-art performance using user clicks and prior masks as prompts. However, existing methods treat interactions as independent events, leading to redundant corrections and limited refinement gains. We address this by introducing MAIS, a Memory-Attention mechanism for Interactive Segmentation that stores past user inputs and segmentation states, enabling temporal context integration. Our approach enhances ViT-based segmentation across diverse imaging modalities, achieving more efficient and accurate refinements.
APA
Orbes, M., Lucena, O., Ourselin, S. & Cardoso, M.J. (2026). MAIS: Memory-Attention for Interactive Segmentation. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:1258-1272. Available from https://proceedings.mlr.press/v301/orbes26a.html.
