LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

Vivek Shankar, Xiaoli Yang, Vrishab Krishna, Brent Tan, Oscar Silva, Rebecca Rojansky, Andrew Ng, Fabiola Valvert, Edward Briercheck, David Weinstock, Yasodha Natkunam, Sebastian Fernandez-Pol, Pranav Rajpurkar
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:528-558, 2023.

Abstract

The accurate classification of lymphoma subtypes using hematoxylin and eosin (H&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML, an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H&E-stained tissue microarray cores, segment nuclei and cells, compute features encompassing morphology, texture, and architecture, and train gradient-boosted models to make diagnostic predictions. LymphoML's interpretable models, developed on a limited volume of H&E-stained tissue, achieve non-inferior diagnostic accuracy to pathologists using whole-slide images and outperform black-box deep learning on a dataset of 670 cases from Guatemala spanning 8 lymphoma subtypes. Using SHapley Additive exPlanation (SHAP) analysis, we assess the impact of each feature on model prediction and find that nuclear shape features are most discriminative for DLBCL (F1-score: 78.7%) and classical Hodgkin lymphoma (F1-score: 74.5%). Finally, we provide the first demonstration that a model combining features from H&E-stained tissue with features from a standardized panel of 6 immunostains results in a similar diagnostic accuracy (85.3%) to a 46-stain panel (86.1%).
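The abstract reports per-subtype F1-scores. As a reminder of what that metric measures, here is a minimal sketch of a one-vs-rest F1 computation in plain Python. The function name `per_class_f1`, the label strings, and the toy predictions are illustrative assumptions; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest.

    F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this label
        else:
            fp[pred] += 1    # predicted label was wrong
            fn[truth] += 1   # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels, made up for illustration only (not data from the study).
y_true = ["DLBCL", "DLBCL", "DLBCL", "cHL", "cHL", "other"]
y_pred = ["DLBCL", "DLBCL", "cHL", "cHL", "other", "other"]
scores = per_class_f1(y_true, y_pred)
for label, f1 in scores.items():
    print(f"{label}: {f1:.3f}")
```

Because F1 is computed per label, a subtype with few cases can score very differently from the overall accuracy, which is why the abstract quotes the two kinds of figures separately.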

Cite this Paper


BibTeX
@InProceedings{pmlr-v225-shankar23a,
  title = {LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype},
  author = {Shankar, Vivek and Yang, Xiaoli and Krishna, Vrishab and Tan, Brent and Silva, Oscar and Rojansky, Rebecca and Ng, Andrew and Valvert, Fabiola and Briercheck, Edward and Weinstock, David and Natkunam, Yasodha and Fernandez-Pol, Sebastian and Rajpurkar, Pranav},
  booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
  pages = {528--558},
  year = {2023},
  editor = {Hegselmann, Stefan and Parziale, Antonio and Shanmugam, Divya and Tang, Shengpu and Asiedu, Mercy Nyamewaa and Chang, Serina and Hartvigsen, Tom and Singh, Harvineet},
  volume = {225},
  series = {Proceedings of Machine Learning Research},
  month = {10 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v225/shankar23a/shankar23a.pdf},
  url = {https://proceedings.mlr.press/v225/shankar23a.html},
  abstract = {The accurate classification of lymphoma subtypes using hematoxylin and eosin (H\&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML, an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H\&E-stained tissue microarray cores, segment nuclei and cells, compute features encompassing morphology, texture, and architecture, and train gradient-boosted models to make diagnostic predictions. LymphoML's interpretable models, developed on a limited volume of H\&E-stained tissue, achieve non-inferior diagnostic accuracy to pathologists using whole-slide images and outperform black-box deep learning on a dataset of 670 cases from Guatemala spanning 8 lymphoma subtypes. Using SHapley Additive exPlanation (SHAP) analysis, we assess the impact of each feature on model prediction and find that nuclear shape features are most discriminative for DLBCL (F1-score: 78.7\%) and classical Hodgkin lymphoma (F1-score: 74.5\%). Finally, we provide the first demonstration that a model combining features from H\&E-stained tissue with features from a standardized panel of 6 immunostains results in a similar diagnostic accuracy (85.3\%) to a 46-stain panel (86.1\%).}
}
Endnote
%0 Conference Paper
%T LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype
%A Vivek Shankar
%A Xiaoli Yang
%A Vrishab Krishna
%A Brent Tan
%A Oscar Silva
%A Rebecca Rojansky
%A Andrew Ng
%A Fabiola Valvert
%A Edward Briercheck
%A David Weinstock
%A Yasodha Natkunam
%A Sebastian Fernandez-Pol
%A Pranav Rajpurkar
%B Proceedings of the 3rd Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2023
%E Stefan Hegselmann
%E Antonio Parziale
%E Divya Shanmugam
%E Shengpu Tang
%E Mercy Nyamewaa Asiedu
%E Serina Chang
%E Tom Hartvigsen
%E Harvineet Singh
%F pmlr-v225-shankar23a
%I PMLR
%P 528--558
%U https://proceedings.mlr.press/v225/shankar23a.html
%V 225
%X The accurate classification of lymphoma subtypes using hematoxylin and eosin (H&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML, an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H&E-stained tissue microarray cores, segment nuclei and cells, compute features encompassing morphology, texture, and architecture, and train gradient-boosted models to make diagnostic predictions. LymphoML's interpretable models, developed on a limited volume of H&E-stained tissue, achieve non-inferior diagnostic accuracy to pathologists using whole-slide images and outperform black-box deep learning on a dataset of 670 cases from Guatemala spanning 8 lymphoma subtypes. Using SHapley Additive exPlanation (SHAP) analysis, we assess the impact of each feature on model prediction and find that nuclear shape features are most discriminative for DLBCL (F1-score: 78.7%) and classical Hodgkin lymphoma (F1-score: 74.5%). Finally, we provide the first demonstration that a model combining features from H&E-stained tissue with features from a standardized panel of 6 immunostains results in a similar diagnostic accuracy (85.3%) to a 46-stain panel (86.1%).
APA
Shankar, V., Yang, X., Krishna, V., Tan, B., Silva, O., Rojansky, R., Ng, A., Valvert, F., Briercheck, E., Weinstock, D., Natkunam, Y., Fernandez-Pol, S., & Rajpurkar, P. (2023). LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype. Proceedings of the 3rd Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 225:528-558. Available from https://proceedings.mlr.press/v225/shankar23a.html.
