Domain Adaptation using Silver Standard Labels for Ki-67 Scoring in Digital Pathology: A Step Closer to Widescale Deployment

Amanda Dy; Ngoc-Nhu Jennifer Nguyen; Seyed Hossein Mirjahanmardir; Dimitrios Androutsos; Melanie Dawe; Anthony Fyles; Wei Shi; Fei-Fei Liu; Susan Done; April Khademi

Medical Imaging with Deep Learning, PMLR 227:653-665, 2024.

Abstract

Deep learning systems have been proposed to improve the objectivity and efficiency of Ki-67 proliferation index (PI) scoring. The challenge is that deep learning techniques, while very accurate, suffer from reduced performance when applied to out-of-domain data. This is a critical challenge for clinical translation, as models are typically trained using data available to the vendor, which is not from the target domain. To address this challenge, this study proposes a domain adaptation pipeline that employs an unsupervised framework to generate silver standard (pseudo) labels in the target domain, which are used to augment the gold standard (GS) source domain data. Five training regimes were tested on two validated Ki-67 scoring architectures (UV-Net and piNET): (1) SS Only: trained on target silver standard (SS) labels, (2) GS Only: trained on source GS labels, (3) Mixed: trained on target SS and source GS labels, (4) GS+SS: trained on source GS labels and fine-tuned on target SS labels, and our proposed method (5) SS+GS: trained on target SS labels and fine-tuned on source GS labels. The SS+GS method yielded significantly ($p<0.05$) higher PI accuracy ($95.9\%$) and more consistent results compared to the GS Only model on target data. Analysis of t-SNE plots showed that features learned by the SS+GS models are more aligned for source and target data, which results in improved generalization. The proposed pipeline provides an efficient method for learning the target distribution without the need for manual annotations, which are time-consuming and costly to generate for medical images. This framework can be applied to any target site as a per-laboratory calibration method, for widescale deployment.
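
The five training regimes listed in the abstract differ only in which labelled sets are used and in what order. A minimal Python sketch of that ordering follows. It is an illustration, not the authors' code: the dataset tags, the `REGIMES` table, and the `describe` helper are hypothetical names, and it assumes that SS labels always come from the target domain (as defined in regime 1) and that GS labels always come from the source domain.

```python
from typing import Dict, List

# Hypothetical dataset tags (assumed for illustration, not names from the paper).
SOURCE_GS = "source_gold_standard"    # manually annotated source-domain data
TARGET_SS = "target_silver_standard"  # pseudo-labelled target-domain data

# Each regime is an ordered list of stages; each stage lists the datasets
# pooled together for that stage. Stage 0 is training, later stages are
# fine-tuning.
REGIMES: Dict[str, List[List[str]]] = {
    "SS Only": [[TARGET_SS]],
    "GS Only": [[SOURCE_GS]],
    "Mixed": [[TARGET_SS, SOURCE_GS]],
    "GS+SS": [[SOURCE_GS], [TARGET_SS]],
    "SS+GS": [[TARGET_SS], [SOURCE_GS]],  # the regime proposed in the abstract
}


def describe(regime: str) -> str:
    """Return a one-line, human-readable description of a regime's stages."""
    parts = []
    for index, datasets in enumerate(REGIMES[regime]):
        verb = "train" if index == 0 else "fine-tune"
        parts.append(f"{verb} on {' + '.join(datasets)}")
    return ", then ".join(parts)


if __name__ == "__main__":
    for name in REGIMES:
        print(f"{name}: {describe(name)}")
```

Writing the regimes as data makes the single difference between GS+SS and SS+GS explicit: the same two datasets are used, with the order of the training and fine-tuning stages swapped.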

Cite this Paper

BibTeX


@InProceedings{pmlr-v227-dy24a,
  title = 	 {Domain Adaptation using Silver Standard Labels for Ki-67 Scoring in Digital Pathology: A Step Closer to Widescale Deployment},
  author =       {Dy, Amanda and Nguyen, Ngoc-Nhu Jennifer and Mirjahanmardir, Seyed Hossein and Androutsos, Dimitrios and Dawe, Melanie and Fyles, Anthony and Shi, Wei and Liu, Fei-Fei and Done, Susan and Khademi, April},
  booktitle = 	 {Medical Imaging with Deep Learning},
  pages = 	 {653--665},
  year = 	 {2024},
  editor = 	 {Oguz, Ipek and Noble, Jack and Li, Xiaoxiao and Styner, Martin and Baumgartner, Christian and Rusu, Mirabela and Heinmann, Tobias and Kontos, Despina and Landman, Bennett and Dawant, Benoit},
  volume = 	 {227},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--12 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v227/dy24a/dy24a.pdf},
  url = 	 {https://proceedings.mlr.press/v227/dy24a.html},
  abstract = 	 {Deep learning systems have been proposed to improve the objectivity and efficiency of Ki-67 proliferation index (PI) scoring. The challenge is that deep learning techniques, while very accurate, suffer from reduced performance when applied to out-of-domain data. This is a critical challenge for clinical translation, as models are typically trained using data available to the vendor, which is not from the target domain. To address this challenge, this study proposes a domain adaptation pipeline that employs an unsupervised framework to generate silver standard (pseudo) labels in the target domain, which are used to augment the gold standard (GS) source domain data. Five training regimes were tested on two validated Ki-67 scoring architectures (UV-Net and piNET): (1) SS Only: trained on target silver standard (SS) labels, (2) GS Only: trained on source GS labels, (3) Mixed: trained on target SS and source GS labels, (4) GS+SS: trained on source GS labels and fine-tuned on target SS labels, and our proposed method (5) SS+GS: trained on target SS labels and fine-tuned on source GS labels. The SS+GS method yielded significantly ($p<0.05$) higher PI accuracy ($95.9\%$) and more consistent results compared to the GS Only model on target data. Analysis of t-SNE plots showed that features learned by the SS+GS models are more aligned for source and target data, which results in improved generalization. The proposed pipeline provides an efficient method for learning the target distribution without the need for manual annotations, which are time-consuming and costly to generate for medical images. This framework can be applied to any target site as a per-laboratory calibration method, for widescale deployment.}
}

Endnote

%0 Conference Paper
%T Domain Adaptation using Silver Standard Labels for Ki-67 Scoring in Digital Pathology: A Step Closer to Widescale Deployment
%A Amanda Dy
%A Ngoc-Nhu Jennifer Nguyen
%A Seyed Hossein Mirjahanmardir
%A Dimitrios Androutsos
%A Melanie Dawe
%A Anthony Fyles
%A Wei Shi
%A Fei-Fei Liu
%A Susan Done
%A April Khademi
%B Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ipek Oguz
%E Jack Noble
%E Xiaoxiao Li
%E Martin Styner
%E Christian Baumgartner
%E Mirabela Rusu
%E Tobias Heinmann
%E Despina Kontos
%E Bennett Landman
%E Benoit Dawant
%F pmlr-v227-dy24a
%I PMLR
%P 653--665
%U https://proceedings.mlr.press/v227/dy24a.html
%V 227
%X Deep learning systems have been proposed to improve the objectivity and efficiency of Ki-67 proliferation index (PI) scoring. The challenge is that deep learning techniques, while very accurate, suffer from reduced performance when applied to out-of-domain data. This is a critical challenge for clinical translation, as models are typically trained using data available to the vendor, which is not from the target domain. To address this challenge, this study proposes a domain adaptation pipeline that employs an unsupervised framework to generate silver standard (pseudo) labels in the target domain, which are used to augment the gold standard (GS) source domain data. Five training regimes were tested on two validated Ki-67 scoring architectures (UV-Net and piNET): (1) SS Only: trained on target silver standard (SS) labels, (2) GS Only: trained on source GS labels, (3) Mixed: trained on target SS and source GS labels, (4) GS+SS: trained on source GS labels and fine-tuned on target SS labels, and our proposed method (5) SS+GS: trained on target SS labels and fine-tuned on source GS labels. The SS+GS method yielded significantly ($p<0.05$) higher PI accuracy ($95.9\%$) and more consistent results compared to the GS Only model on target data. Analysis of t-SNE plots showed that features learned by the SS+GS models are more aligned for source and target data, which results in improved generalization. The proposed pipeline provides an efficient method for learning the target distribution without the need for manual annotations, which are time-consuming and costly to generate for medical images. This framework can be applied to any target site as a per-laboratory calibration method, for widescale deployment.

APA


Dy, A., Nguyen, N.J., Mirjahanmardir, S.H., Androutsos, D., Dawe, M., Fyles, A., Shi, W., Liu, F., Done, S. & Khademi, A. (2024). Domain Adaptation using Silver Standard Labels for Ki-67 Scoring in Digital Pathology: A Step Closer to Widescale Deployment. Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 227:653-665. Available from https://proceedings.mlr.press/v227/dy24a.html.
