fmMAP: A Framework Reducing Site-Bias Batch Effect from Foundation Models in Pathology

Hai Cao Truong Nguyen; David Joon Ho

Proceedings of the MICCAI Workshop on Computational Pathology, PMLR 316:171-186, 2026.

Abstract

Foundation models (FMs) in pathology are general-purpose models that capture heterogeneous morphological patterns in pathology images by leveraging vast training datasets. Although FMs have demonstrated promising results in multiple downstream tasks such as classification and retrieval, confounding factors are also embedded in their features, potentially causing inaccurate decisions. For example, we observe a batch effect in which distinctive medical center signatures appear when features from FMs are clustered. In this work, we propose the Foundation Model-based Manifold Approximation Pipeline (fmMAP) to reduce the batch effect by adjusting features from FMs. Our framework employs supervised uniform manifold approximation and projection (UMAP) to transform features generated by FMs into an optimal space. In this transformed space, characteristics of features of interest (i.e., biological features) are highlighted while other confounding factors are reduced. Experimental results on eight recent FMs show that raw features from the FMs are not robust, whereas fmMAP makes the features robust for all FMs according to the robustness index. In addition, fmMAP reduces average balanced accuracy for site prediction and improves average balanced accuracy for tissue type classification, achieving more than 96% on publicly available datasets. We expect that the fmMAP framework will help FMs identify essential pathologic features that would enhance performance on downstream tasks. The code is available at https://github.com/davidholab/fmMAP.
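The abstract reports results as balanced accuracy for two tasks: site prediction (where a lower score suggests less site signal left in the features) and tissue type classification (where a higher score is better). As a reading aid, here is a minimal sketch of the standard definition of balanced accuracy, the mean of per-class recall. It is not the authors' evaluation code; the function name and the toy labels below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall, so every class counts equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not data from the paper).
tissue_true = ["tumor", "tumor", "tumor", "stroma"]
tissue_pred = ["tumor", "tumor", "stroma", "stroma"]
tissue_score = balanced_accuracy(tissue_true, tissue_pred)

site_true = ["A", "A", "B", "B"]
site_pred = ["A", "B", "A", "B"]
site_score = balanced_accuracy(site_true, site_pred)

print(f"tissue balanced accuracy: {tissue_score:.3f}")
print(f"site balanced accuracy:   {site_score:.3f}")
```

In this toy case the site score equals the two-class chance level of 0.5, which is the direction the abstract describes for site prediction after batch-effect reduction.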

Cite this Paper

BibTeX

@InProceedings{pmlr-v316-nguyen26a,
  title = 	 {fmMAP: A Framework Reducing Site-Bias Batch Effect from Foundation Models in Pathology},
  author =       {Nguyen, Hai Cao Truong and Ho, David Joon},
  booktitle = 	 {Proceedings of the MICCAI Workshop on Computational Pathology},
  pages = 	 {171--186},
  year = 	 {2026},
  editor = 	 {Studer, Linda and Ciompi, Francesco and Khalili, Nadieh and Faryna, Khrystyna and Yeong, Joe and Lau, Mai Chan and Chen, Hao and Liu, Ziyi and Brattoli, Biagio},
  volume = 	 {316},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {27 Sep},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v316/main/assets/nguyen26a/nguyen26a.pdf},
  url = 	 {https://proceedings.mlr.press/v316/nguyen26a.html},
  abstract = 	 {Foundation models (FMs) in pathology are general-purpose models that capture heterogeneous morphological patterns in pathology images by leveraging vast training datasets. Although FMs have demonstrated promising results in multiple downstream tasks such as classification and retrieval, confounding factors are also embedded in their features, potentially causing inaccurate decisions. For example, we observe a batch effect in which distinctive medical center signatures appear when features from FMs are clustered. In this work, we propose the Foundation Model-based Manifold Approximation Pipeline (fmMAP) to reduce the batch effect by adjusting features from FMs. Our framework employs supervised uniform manifold approximation and projection (UMAP) to transform features generated by FMs into an optimal space. In this transformed space, characteristics of features of interest (i.e., biological features) are highlighted while other confounding factors are reduced. Experimental results on eight recent FMs show that raw features from the FMs are not robust, whereas fmMAP makes the features robust for all FMs according to the robustness index. In addition, fmMAP reduces average balanced accuracy for site prediction and improves average balanced accuracy for tissue type classification, achieving more than 96% on publicly available datasets. We expect that the fmMAP framework will help FMs identify essential pathologic features that would enhance performance on downstream tasks. The code is available at https://github.com/davidholab/fmMAP.}
}

Endnote

%0 Conference Paper
%T fmMAP: A Framework Reducing Site-Bias Batch Effect from Foundation Models in Pathology
%A Hai Cao Truong Nguyen
%A David Joon Ho
%B Proceedings of the MICCAI Workshop on Computational Pathology
%C Proceedings of Machine Learning Research
%D 2026
%E Linda Studer
%E Francesco Ciompi
%E Nadieh Khalili
%E Khrystyna Faryna
%E Joe Yeong
%E Mai Chan Lau
%E Hao Chen
%E Ziyi Liu
%E Biagio Brattoli
%F pmlr-v316-nguyen26a
%I PMLR
%P 171--186
%U https://proceedings.mlr.press/v316/nguyen26a.html
%V 316
%X Foundation models (FMs) in pathology are general-purpose models that capture heterogeneous morphological patterns in pathology images by leveraging vast training datasets. Although FMs have demonstrated promising results in multiple downstream tasks such as classification and retrieval, confounding factors are also embedded in their features, potentially causing inaccurate decisions. For example, we observe a batch effect in which distinctive medical center signatures appear when features from FMs are clustered. In this work, we propose the Foundation Model-based Manifold Approximation Pipeline (fmMAP) to reduce the batch effect by adjusting features from FMs. Our framework employs supervised uniform manifold approximation and projection (UMAP) to transform features generated by FMs into an optimal space. In this transformed space, characteristics of features of interest (i.e., biological features) are highlighted while other confounding factors are reduced. Experimental results on eight recent FMs show that raw features from the FMs are not robust, whereas fmMAP makes the features robust for all FMs according to the robustness index. In addition, fmMAP reduces average balanced accuracy for site prediction and improves average balanced accuracy for tissue type classification, achieving more than 96% on publicly available datasets. We expect that the fmMAP framework will help FMs identify essential pathologic features that would enhance performance on downstream tasks. The code is available at https://github.com/davidholab/fmMAP.

APA

Nguyen, H.C.T. & Ho, D.J. (2026). fmMAP: A Framework Reducing Site-Bias Batch Effect from Foundation Models in Pathology. Proceedings of the MICCAI Workshop on Computational Pathology, in Proceedings of Machine Learning Research 316:171-186. Available from https://proceedings.mlr.press/v316/nguyen26a.html.
