CNN and Deep Sets for End-to-End Whole Slide Image Representation Learning

Sobhan Hemati, Shivam Kalra, Cameron Meaney, Morteza Babaie, Ali Ghodsi, Hamid Tizhoosh
Proceedings of the Fourth Conference on Medical Imaging with Deep Learning, PMLR 143:301-311, 2021.

Abstract

Digital pathology has enabled us to capture, store, and analyze scanned biopsy samples as digital images. Recent advances in deep learning are contributing to computational pathology to improve diagnosis and treatment. However, challenges inherent to whole slide images (WSIs) make it difficult to employ deep learning in digital pathology. In particular, the computational bottlenecks induced by gigapixel WSIs make it difficult to use deep learning for end-to-end image representation. To mitigate this challenge, many patch-based approaches have been proposed. Although patching WSIs enables us to use deep learning, we end up with a bag-of-patches (set) representation, which makes downstream tasks non-trivial. Moreover, with a set representation per WSI, it is not clear how to compute the similarity between two WSIs (sets) for tasks such as image search and matching. To address this challenge, we propose a neural network based on a Convolutional Neural Network (CNN) and Deep Sets to learn one permutation-invariant vector representation per WSI in an end-to-end manner. Using the labels available at the WSI level, namely primary site and cancer subtype, we train the proposed network in a multi-label setting to encode both primary site and diagnosis. Since every primary site has its own specific cancer subtypes, we propose to use the predicted primary-site label to recognize the cancer subtype. The proposed architecture is used for transfer learning of WSIs and validated on two different tasks, i.e., search and classification. The results show that the proposed architecture yields WSI representations with better retrieval performance and shorter search times than Yottixel, a recently developed search engine for pathology images. Further, the model achieves performance competitive with the state of the art in lung cancer classification.
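To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of the approach described in the abstract: CNN patch features are pooled with a permutation-invariant Deep Sets operator into a single vector per WSI, which then feeds two classification heads (primary site and cancer subtype). The module names, layer sizes, class counts, use of mean pooling, and the random stand-in features are all illustrative assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn


class DeepSetsWSI(nn.Module):
    """Sketch of a Deep Sets aggregator over CNN patch features (assumed sizes)."""

    def __init__(self, feat_dim=512, embed_dim=256, n_sites=5, n_subtypes=12):
        super().__init__()
        # phi: applied independently to each patch feature vector
        self.phi = nn.Sequential(
            nn.Linear(feat_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim), nn.ReLU(),
        )
        # rho: applied to the pooled, order-invariant slide representation
        self.rho = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.ReLU())
        self.site_head = nn.Linear(embed_dim, n_sites)
        self.subtype_head = nn.Linear(embed_dim, n_subtypes)

    def forward(self, patch_feats):
        # patch_feats: (n_patches, feat_dim) CNN features for one WSI.
        # Pooling over the patch axis makes the slide vector invariant to
        # patch order and to the number of patches.
        pooled = self.phi(patch_feats).mean(dim=0)          # (embed_dim,)
        wsi_vec = self.rho(pooled)                          # one vector per WSI
        return wsi_vec, self.site_head(wsi_vec), self.subtype_head(wsi_vec)


# Usage with random stand-in patch features. In a multi-label training setup,
# one would combine losses on both heads; at inference, the predicted primary
# site could be used to restrict (mask) the admissible subtype logits.
model = DeepSetsWSI()
feats = torch.randn(1000, 512)                              # 1000 patches, 512-dim each
wsi_vec, site_logits, subtype_logits = model(feats)

The single wsi_vec is what would be used for downstream search (e.g., nearest-neighbor retrieval between slides), since it gives one fixed-length, permutation-invariant embedding per WSI instead of a variable-size bag of patches.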

Cite this Paper


BibTeX
@InProceedings{pmlr-v143-hemati21a,
  title     = {{CNN} and Deep Sets for End-to-End Whole Slide Image Representation Learning},
  author    = {Hemati, Sobhan and Kalra, Shivam and Meaney, Cameron and Babaie, Morteza and Ghodsi, Ali and Tizhoosh, Hamid},
  booktitle = {Proceedings of the Fourth Conference on Medical Imaging with Deep Learning},
  pages     = {301--311},
  year      = {2021},
  editor    = {Heinrich, Mattias and Dou, Qi and de Bruijne, Marleen and Lellmann, Jan and Schläfer, Alexander and Ernst, Floris},
  volume    = {143},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v143/hemati21a/hemati21a.pdf},
  url       = {https://proceedings.mlr.press/v143/hemati21a.html}
}
Endnote
%0 Conference Paper
%T CNN and Deep Sets for End-to-End Whole Slide Image Representation Learning
%A Sobhan Hemati
%A Shivam Kalra
%A Cameron Meaney
%A Morteza Babaie
%A Ali Ghodsi
%A Hamid Tizhoosh
%B Proceedings of the Fourth Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Mattias Heinrich
%E Qi Dou
%E Marleen de Bruijne
%E Jan Lellmann
%E Alexander Schläfer
%E Floris Ernst
%F pmlr-v143-hemati21a
%I PMLR
%P 301--311
%U https://proceedings.mlr.press/v143/hemati21a.html
%V 143
APA
Hemati, S., Kalra, S., Meaney, C., Babaie, M., Ghodsi, A., & Tizhoosh, H. (2021). CNN and Deep Sets for End-to-End Whole Slide Image Representation Learning. Proceedings of the Fourth Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 143:301-311. Available from https://proceedings.mlr.press/v143/hemati21a.html.