Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images

Zhengrui Guo, Qichen Sun, Jiabo Ma, Lishuang Feng, Jinzhuo Wang, Hao Chen
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:20941-20963, 2025.

Abstract

Whole slide image (WSI) analysis presents significant computational challenges due to the massive number of patches in gigapixel images. While transformer architectures excel at modeling long-range correlations through self-attention, their quadratic computational complexity makes them impractical for computational pathology applications. Existing solutions such as local-global or linear self-attention reduce computational costs but compromise the strong modeling capability of full self-attention. In this work, we propose Querent, a query-aware framework for dynamic long-context modeling that achieves a theoretically bounded approximation of full self-attention while delivering practical efficiency. Our method adaptively predicts which surrounding regions are most relevant for each patch, enabling focused attention that is computed only over potentially important contexts yet remains unrestricted in range. Through efficient region-wise metadata computation and importance estimation, our approach dramatically reduces computational overhead while preserving the global perception needed to model fine-grained patch correlations. In comprehensive experiments on biomarker prediction, gene mutation prediction, cancer subtyping, and survival analysis across more than 10 WSI datasets, our method outperforms state-of-the-art approaches. Code is available at https://github.com/dddavid4real/Querent.
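Querent's exact mechanism is specified in the paper and the linked repository; purely as an illustration of the query-aware idea sketched in the abstract, here is a minimal PyTorch sketch, assuming mean-pooled keys serve as the region-wise metadata and a per-query top-k region score serves as the importance estimate. The function name and the region_size/top_k parameters are hypothetical, not the authors' API.

```python
import torch
import torch.nn.functional as F

def query_aware_region_attention(q, k, v, region_size=64, top_k=4):
    """Hypothetical sketch: each query scores region-level summaries
    (mean-pooled keys) and attends only to patches in its top-k regions.

    q, k, v: (N, d) patch features; N assumed divisible by region_size.
    """
    N, d = q.shape
    R = N // region_size                            # number of regions
    k_r = k.view(R, region_size, d)
    v_r = v.view(R, region_size, d)

    # Region-wise metadata: one summary key per region (mean pooling).
    region_keys = k_r.mean(dim=1)                   # (R, d)

    # Importance estimation: score every region for every query patch.
    scores = q @ region_keys.T / d ** 0.5           # (N, R)
    top_idx = scores.topk(top_k, dim=-1).indices    # (N, top_k)

    # Gather keys/values of each query's selected regions, which may
    # lie anywhere in the slide (focused but unrestricted context).
    sel_k = k_r[top_idx].reshape(N, top_k * region_size, d)
    sel_v = v_r[top_idx].reshape(N, top_k * region_size, d)

    # Focused attention over the selected context only.
    attn = F.softmax((q.unsqueeze(1) @ sel_k.transpose(1, 2)) / d ** 0.5, dim=-1)
    return (attn @ sel_v).squeeze(1)                # (N, d)

# Toy usage: 1024 patches with 64-dim features.
q = torch.randn(1024, 64)
k = torch.randn(1024, 64)
v = torch.randn(1024, 64)
out = query_aware_region_attention(q, k, v)         # -> (1024, 64)
```

Each query here touches top_k * region_size keys instead of all N, which is what replaces the quadratic cost of full self-attention in this toy version; the paper's bounded-approximation guarantee applies to its own construction, not to this sketch.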

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-guo25j,
  title     = {Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images},
  author    = {Guo, Zhengrui and Sun, Qichen and Ma, Jiabo and Feng, Lishuang and Wang, Jinzhuo and Chen, Hao},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {20941--20963},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/guo25j/guo25j.pdf},
  url       = {https://proceedings.mlr.press/v267/guo25j.html}
}
Endnote
%0 Conference Paper
%T Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images
%A Zhengrui Guo
%A Qichen Sun
%A Jiabo Ma
%A Lishuang Feng
%A Jinzhuo Wang
%A Hao Chen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-guo25j
%I PMLR
%P 20941--20963
%U https://proceedings.mlr.press/v267/guo25j.html
%V 267
APA
Guo, Z., Sun, Q., Ma, J., Feng, L., Wang, J. & Chen, H. (2025). Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:20941-20963. Available from https://proceedings.mlr.press/v267/guo25j.html.