DTC-WSI: Dynamic Token Compression for Whole-Slide Images

Tawsifur Rahman, Aliasghar Tarkhan, Rama Chellappa, Alexander S. Baras
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:3846-3865, 2026.

Abstract

Whole-slide images (WSIs) contain tens of thousands of heterogeneous patches, making transformer-based multiple-instance learning (MIL) computationally expensive due to quadratic attention costs and substantial redundancy in tissue morphology. Existing token-reduction approaches for WSI analysis rely primarily on pruning, which discards information early in training and destabilizes optimization under weak supervision. We propose Dynamic Token Compression for Whole-Slide Images (DTC-WSI), a token-efficient MIL framework that performs progressive, importance-aware WSI compression. DTC-WSI integrates a lightweight saliency network with a multi-stage token compressor that combines bipartite similarity matching and soft differentiable pruning to gradually eliminate redundant or non-diagnostic patches. During training, soft gates enable stable gradient flow, while inference employs deterministic compression for substantial acceleration. This curriculum-style compression preserves discriminative morphology and dramatically reduces computational burden. Across four WSI benchmarks (TCGA-NSCLC, TCGA-BRCA, TCGA-RCC, PANDA), DTC-WSI achieves 5–10$\times$ token reduction, up to 5.3$\times$ faster inference, and 20–40% lower memory usage, while improving MIL classification accuracy by 2–4% over state-of-the-art baselines. Our results demonstrate that dynamic token compression is a powerful and scalable alternative to pruning, enabling efficient transformer-based WSI analysis while improving accuracy.
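The abstract names three mechanisms: soft gates during training, deterministic compression at inference, and bipartite similarity matching to merge redundant patches. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. It assumes saliency scores in [0, 1] are already available, and it uses an even/odd token split with cosine similarity for the matching step. The function names (`soft_gate`, `hard_keep`, `bipartite_merge`), the threshold, and the temperature are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def soft_gate(scores, threshold=0.5, temperature=0.1):
    """Training-time gate (assumed form): sigmoid((s - threshold) / temperature).

    Every token keeps a weight in (0, 1), so no token is cut off from gradients.
    """
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature)) for s in scores]


def hard_keep(scores, threshold=0.5):
    """Inference-time counterpart (assumed form): indices of tokens at or above the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def bipartite_merge(tokens, weights, r):
    """Merge up to r most similar (A, B) pairs, with A = even-indexed, B = odd-indexed tokens.

    Each A token is matched to its most similar B token; the r best matches are
    folded into their B partner as a weight-averaged vector. Returns (tokens, weights).
    """
    a_idx = range(0, len(tokens), 2)
    b_idx = range(1, len(tokens), 2)
    out_t = [list(t) for t in tokens]
    out_w = list(weights)
    if len(b_idx) == 0 or r <= 0:
        return out_t, out_w
    pairs = []
    for i in a_idx:
        j = max(b_idx, key=lambda k: cosine(tokens[i], tokens[k]))
        pairs.append((cosine(tokens[i], tokens[j]), i, j))
    pairs.sort(reverse=True)  # most similar pairs first
    dropped = set()
    for _, i, j in pairs[:r]:
        total = out_w[i] + out_w[j]
        if total <= 0:
            continue  # nothing meaningful to average
        out_t[j] = [(out_w[i] * x + out_w[j] * y) / total
                    for x, y in zip(out_t[i], out_t[j])]
        out_w[j] = total  # merged token carries the combined weight
        dropped.add(i)
    keep = [k for k in range(len(tokens)) if k not in dropped]
    return [out_t[k] for k in keep], [out_w[k] for k in keep]


if __name__ == "__main__":
    # Four toy patch embeddings: two near-duplicate pairs.
    tokens = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.02, 1.0]]
    scores = [0.9, 0.8, 0.2, 0.7]  # hypothetical saliency scores
    weights = soft_gate(scores)
    merged, merged_w = bipartite_merge(tokens, weights, r=1)
    print(len(tokens), "->", len(merged), "tokens after one merge stage")
```

Applying `bipartite_merge` repeatedly with a growing `r` would give a progressive, multi-stage schedule in the spirit the abstract describes. The actual schedule and gating function used by DTC-WSI are not stated on this page.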

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-rahman26b,
  title = {DTC-WSI: Dynamic Token Compression for Whole-Slide Images},
  author = {Rahman, Tawsifur and Tarkhan, Aliasghar and Chellappa, Rama and Baras, Alexander S.},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages = {3846--3865},
  year = {2026},
  editor = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume = {315},
  series = {Proceedings of Machine Learning Research},
  month = {08--10 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/rahman26b/rahman26b.pdf},
  url = {https://proceedings.mlr.press/v315/rahman26b.html},
  abstract = {Whole-slide images (WSIs) contain tens of thousands of heterogeneous patches, making transformer-based multiple-instance learning (MIL) computationally expensive due to quadratic attention costs and substantial redundancy in tissue morphology. Existing token-reduction approaches for WSI analysis rely primarily on pruning, which discards information early in training and destabilizes optimization under weak supervision. We propose Dynamic Token Compression for Whole-Slide Images (DTC-WSI), a token-efficient MIL framework that performs progressive, importance-aware WSI compression. DTC-WSI integrates a lightweight saliency network with a multi-stage token compressor that combines bipartite similarity matching and soft differentiable pruning to gradually eliminate redundant or non-diagnostic patches. During training, soft gates enable stable gradient flow, while inference employs deterministic compression for substantial acceleration. This curriculum-style compression preserves discriminative morphology and dramatically reduces computational burden. Across four WSI benchmarks (TCGA-NSCLC, TCGA-BRCA, TCGA-RCC, PANDA), DTC-WSI achieves 5–10$\times$ token reduction, up to 5.3$\times$ faster inference, and 20–40% lower memory usage, while improving MIL classification accuracy by 2–4% over state-of-the-art baselines. Our results demonstrate that dynamic token compression is a powerful and scalable alternative to pruning, enabling efficient transformer-based WSI analysis while improving accuracy.}
}
Endnote
%0 Conference Paper
%T DTC-WSI: Dynamic Token Compression for Whole-Slide Images
%A Tawsifur Rahman
%A Aliasghar Tarkhan
%A Rama Chellappa
%A Alexander S. Baras
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-rahman26b
%I PMLR
%P 3846--3865
%U https://proceedings.mlr.press/v315/rahman26b.html
%V 315
%X Whole-slide images (WSIs) contain tens of thousands of heterogeneous patches, making transformer-based multiple-instance learning (MIL) computationally expensive due to quadratic attention costs and substantial redundancy in tissue morphology. Existing token-reduction approaches for WSI analysis rely primarily on pruning, which discards information early in training and destabilizes optimization under weak supervision. We propose Dynamic Token Compression for Whole-Slide Images (DTC-WSI), a token-efficient MIL framework that performs progressive, importance-aware WSI compression. DTC-WSI integrates a lightweight saliency network with a multi-stage token compressor that combines bipartite similarity matching and soft differentiable pruning to gradually eliminate redundant or non-diagnostic patches. During training, soft gates enable stable gradient flow, while inference employs deterministic compression for substantial acceleration. This curriculum-style compression preserves discriminative morphology and dramatically reduces computational burden. Across four WSI benchmarks (TCGA-NSCLC, TCGA-BRCA, TCGA-RCC, PANDA), DTC-WSI achieves 5–10$\times$ token reduction, up to 5.3$\times$ faster inference, and 20–40% lower memory usage, while improving MIL classification accuracy by 2–4% over state-of-the-art baselines. Our results demonstrate that dynamic token compression is a powerful and scalable alternative to pruning, enabling efficient transformer-based WSI analysis while improving accuracy.
APA
Rahman, T., Tarkhan, A., Chellappa, R., & Baras, A. S. (2026). DTC-WSI: Dynamic Token Compression for Whole-Slide Images. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:3846-3865. Available from https://proceedings.mlr.press/v315/rahman26b.html.
