Tailoring automated data augmentation to H&E-stained histopathology

Khrystyna Faryna, Jeroen van der Laak, Geert Litjens
Proceedings of the Fourth Conference on Medical Imaging with Deep Learning, PMLR 143:168-178, 2021.

Abstract

Convolutional neural networks (CNNs) are sensitive to domain shifts, which can result in poor generalization. In medical imaging, data acquisition conditions differ among institutions, which leads to variations in image properties and thus domain shift. Stain variation in histopathological slides is a prominent example. Data augmentation is one way to make CNNs robust to varying forms of domain shift, but requires extensive hyperparameter tuning. Due to the large search space, this is cumbersome and often leads to sub-optimal generalization performance. In this work, we focus on automated and computationally efficient data augmentation policy selection for histopathological slides. Building upon the RandAugment framework, we introduce several domain-specific modifications relevant to histopathological images, increasing generalizability. We test these modifications on H&E-stained histopathology slides from the Camelyon17 dataset. Our proposed framework outperforms the state-of-the-art manually engineered data augmentation strategy, achieving an area under the ROC curve of 0.964 compared to 0.958.

Cite this Paper


BibTeX
@InProceedings{pmlr-v143-faryna21a,
  title = {Tailoring automated data augmentation to H&E-stained histopathology},
  author = {Faryna, Khrystyna and van der Laak, Jeroen and Litjens, Geert},
  booktitle = {Proceedings of the Fourth Conference on Medical Imaging with Deep Learning},
  pages = {168--178},
  year = {2021},
  editor = {Heinrich, Mattias and Dou, Qi and de Bruijne, Marleen and Lellmann, Jan and Schläfer, Alexander and Ernst, Floris},
  volume = {143},
  series = {Proceedings of Machine Learning Research},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v143/faryna21a/faryna21a.pdf},
  url = {https://proceedings.mlr.press/v143/faryna21a.html},
  abstract = {Convolutional neural networks (CNN) are sensitive to domain shifts, which can result in poor generalization. In medical imaging, data acquisition conditions differ among institutions, which leads to variations in image properties and thus domain shift. Stain variation in histopathological slides is a prominent example. Data augmentation is one way to make CNNs robust to varying forms of domain shift, but requires extensive hyperparameter tuning. Due to the large search space, this is cumbersome and often leads to sub-optimal generalization performance. In this work, we focus on automated and computationally efficient data augmentation policy selection for histopathological slides. Building upon the RandAugment framework, we introduce several domain-specific modifications relevant to histopathological images, increasing generalizability. We test these modifications on H&E-stained histopathology slides from Camelyon17 dataset. Our proposed framework outperforms the state-of-the-art manually engineered data augmentation strategy, achieving an area under the ROC curve of 0.964 compared to 0.958, respectively.}
}
Endnote
%0 Conference Paper
%T Tailoring automated data augmentation to H&E-stained histopathology
%A Khrystyna Faryna
%A Jeroen van der Laak
%A Geert Litjens
%B Proceedings of the Fourth Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Mattias Heinrich
%E Qi Dou
%E Marleen de Bruijne
%E Jan Lellmann
%E Alexander Schläfer
%E Floris Ernst
%F pmlr-v143-faryna21a
%I PMLR
%P 168--178
%U https://proceedings.mlr.press/v143/faryna21a.html
%V 143
%X Convolutional neural networks (CNN) are sensitive to domain shifts, which can result in poor generalization. In medical imaging, data acquisition conditions differ among institutions, which leads to variations in image properties and thus domain shift. Stain variation in histopathological slides is a prominent example. Data augmentation is one way to make CNNs robust to varying forms of domain shift, but requires extensive hyperparameter tuning. Due to the large search space, this is cumbersome and often leads to sub-optimal generalization performance. In this work, we focus on automated and computationally efficient data augmentation policy selection for histopathological slides. Building upon the RandAugment framework, we introduce several domain-specific modifications relevant to histopathological images, increasing generalizability. We test these modifications on H&E-stained histopathology slides from Camelyon17 dataset. Our proposed framework outperforms the state-of-the-art manually engineered data augmentation strategy, achieving an area under the ROC curve of 0.964 compared to 0.958, respectively.
APA
Faryna, K., van der Laak, J. & Litjens, G. (2021). Tailoring automated data augmentation to H&E-stained histopathology. Proceedings of the Fourth Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 143:168-178. Available from https://proceedings.mlr.press/v143/faryna21a.html.
