TTT-UNet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation

Rong Zhou, Zhengqing Yuan, Zhiling Yan, Weixiang Sun, Kai Zhang, Yiwei Li, Yanfang Ye, Xiang Li, Lichao Sun, Lifang He
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:3679-3703, 2026.

Abstract

Biomedical image segmentation is crucial for accurately diagnosing and analyzing various diseases. However, Convolutional Neural Networks (CNNs) and Transformers, the most commonly used architectures for this task, struggle to effectively capture long-range dependencies due to the inherent locality of CNNs and the computational complexity of Transformers. To address this limitation, we introduce TTT-UNet, a novel framework that integrates Test-Time Training (TTT) layers into the traditional U-Net architecture for biomedical image segmentation. TTT-UNet dynamically adjusts model parameters during the test time, enhancing the model’s ability to capture both local and long-range features. We evaluate TTT-UNet on multiple medical imaging datasets, including 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images. The results demonstrate that TTT-UNet consistently outperforms state-of-the-art CNN-based and Transformer-based segmentation models across all tasks. The code is available at

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-zhou26a,
  title = {TTT-UNet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation},
  author = {Zhou, Rong and Yuan, Zhengqing and Yan, Zhiling and Sun, Weixiang and Zhang, Kai and Li, Yiwei and Ye, Yanfang and Li, Xiang and Sun, Lichao and He, Lifang},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages = {3679--3703},
  year = {2026},
  editor = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume = {315},
  series = {Proceedings of Machine Learning Research},
  month = {08--10 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/zhou26a/zhou26a.pdf},
  url = {https://proceedings.mlr.press/v315/zhou26a.html},
  abstract = {Biomedical image segmentation is crucial for accurately diagnosing and analyzing various diseases. However, Convolutional Neural Networks (CNNs) and Transformers, the most commonly used architectures for this task, struggle to effectively capture long-range dependencies due to the inherent locality of CNNs and the computational complexity of Transformers. To address this limitation, we introduce TTT-UNet, a novel framework that integrates Test-Time Training (TTT) layers into the traditional U-Net architecture for biomedical image segmentation. TTT-UNet dynamically adjusts model parameters during the test time, enhancing the model’s ability to capture both local and long-range features. We evaluate TTT-UNet on multiple medical imaging datasets, including 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images. The results demonstrate that TTT-UNet consistently outperforms state-of-the-art CNN-based and Transformer-based segmentation models across all tasks. The code is available at }
}
Endnote
%0 Conference Paper
%T TTT-UNet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation
%A Rong Zhou
%A Zhengqing Yuan
%A Zhiling Yan
%A Weixiang Sun
%A Kai Zhang
%A Yiwei Li
%A Yanfang Ye
%A Xiang Li
%A Lichao Sun
%A Lifang He
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-zhou26a
%I PMLR
%P 3679--3703
%U https://proceedings.mlr.press/v315/zhou26a.html
%V 315
%X Biomedical image segmentation is crucial for accurately diagnosing and analyzing various diseases. However, Convolutional Neural Networks (CNNs) and Transformers, the most commonly used architectures for this task, struggle to effectively capture long-range dependencies due to the inherent locality of CNNs and the computational complexity of Transformers. To address this limitation, we introduce TTT-UNet, a novel framework that integrates Test-Time Training (TTT) layers into the traditional U-Net architecture for biomedical image segmentation. TTT-UNet dynamically adjusts model parameters during the test time, enhancing the model’s ability to capture both local and long-range features. We evaluate TTT-UNet on multiple medical imaging datasets, including 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images. The results demonstrate that TTT-UNet consistently outperforms state-of-the-art CNN-based and Transformer-based segmentation models across all tasks. The code is available at
APA
Zhou, R., Yuan, Z., Yan, Z., Sun, W., Zhang, K., Li, Y., Ye, Y., Li, X., Sun, L., & He, L. (2026). TTT-UNet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:3679-3703. Available from https://proceedings.mlr.press/v315/zhou26a.html.
