Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization

Shalmali Joshi, Suriya Gunasekar, David Sontag, Joydeep Ghosh
Proceedings of the 1st Machine Learning for Healthcare Conference, PMLR 56:17-41, 2016.

Abstract

This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain constraints from weak supervision to obtain sparse latent factors that are grounded to a fixed set of chronic conditions. The proposed grounding mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts shows that the proposed model learns clinically interpretable phenotypes, which are also shown to have competitive performance on a 30-day mortality prediction task. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.
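The idea of grounding latent factors through constraints can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a plain Frobenius-loss NMF with Lee-Seung multiplicative updates, and it models weak supervision as a hypothetical binary `mask` that says which conditions each patient may load on. The function name `support_constrained_nmf` and all parameters are illustrative assumptions.

```python
import numpy as np


def support_constrained_nmf(X, mask, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (patients x features) by W @ H with W, H >= 0.

    mask (patients x conditions) is an assumed weak-supervision signal:
    mask[i, k] == 0 forbids patient i from loading on condition k.
    Illustrative sketch only, not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = mask.shape[1]

    # Entries of W outside the allowed support start at zero.
    W = rng.random((n, k)) * mask
    H = rng.random((k, d))

    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative, and an
        # entry that is exactly zero stays zero, so the support holds.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        W *= mask  # redundant numerically; makes the constraint explicit

    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((6, 5))  # toy non-negative "patient x term" matrix
    mask = np.array(
        [[1, 0], [0, 1], [1, 1], [1, 0], [0, 1], [1, 1]], dtype=float
    )
    W, H = support_constrained_nmf(X, mask)
    print(np.round(W, 3))
```

Because each column of `W` is tied to one named condition through `mask`, the corresponding row of `H` can be read as that condition's phenotype, which is the kind of one-to-one mapping the abstract describes.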

Cite this Paper


BibTeX
@InProceedings{pmlr-v56-Joshi16,
  title = {Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization},
  author = {Joshi, Shalmali and Gunasekar, Suriya and Sontag, David and Ghosh, Joydeep},
  booktitle = {Proceedings of the 1st Machine Learning for Healthcare Conference},
  pages = {17--41},
  year = {2016},
  editor = {Doshi-Velez, Finale and Fackler, Jim and Kale, David and Wallace, Byron and Wiens, Jenna},
  volume = {56},
  series = {Proceedings of Machine Learning Research},
  address = {Northeastern University, Boston, MA, USA},
  month = {18--19 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v56/Joshi16.pdf},
  url = {https://proceedings.mlr.press/v56/Joshi16.html},
  abstract = {This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain constraints from weak supervision to obtain sparse latent factors that are grounded to a fixed set of chronic conditions. The proposed grounding mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts shows that the proposed model learns clinically interpretable phenotypes, which are also shown to have competitive performance on a 30-day mortality prediction task. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.}
}
Endnote
%0 Conference Paper
%T Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization
%A Shalmali Joshi
%A Suriya Gunasekar
%A David Sontag
%A Joydeep Ghosh
%B Proceedings of the 1st Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2016
%E Finale Doshi-Velez
%E Jim Fackler
%E David Kale
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v56-Joshi16
%I PMLR
%P 17--41
%U https://proceedings.mlr.press/v56/Joshi16.html
%V 56
%X This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain constraints from weak supervision to obtain sparse latent factors that are grounded to a fixed set of chronic conditions. The proposed grounding mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts shows that the proposed model learns clinically interpretable phenotypes, which are also shown to have competitive performance on a 30-day mortality prediction task. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.
RIS
TY - CPAPER
TI - Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization
AU - Shalmali Joshi
AU - Suriya Gunasekar
AU - David Sontag
AU - Joydeep Ghosh
BT - Proceedings of the 1st Machine Learning for Healthcare Conference
DA - 2016/12/10
ED - Finale Doshi-Velez
ED - Jim Fackler
ED - David Kale
ED - Byron Wallace
ED - Jenna Wiens
ID - pmlr-v56-Joshi16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 56
SP - 17
EP - 41
L1 - http://proceedings.mlr.press/v56/Joshi16.pdf
UR - https://proceedings.mlr.press/v56/Joshi16.html
AB - This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain constraints from weak supervision to obtain sparse latent factors that are grounded to a fixed set of chronic conditions. The proposed grounding mechanism ensures a one-to-one identifiable and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts shows that the proposed model learns clinically interpretable phenotypes, which are also shown to have competitive performance on a 30-day mortality prediction task. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions.
ER -
APA
Joshi, S., Gunasekar, S., Sontag, D. & Ghosh, J. (2016). Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization. Proceedings of the 1st Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 56:17-41. Available from https://proceedings.mlr.press/v56/Joshi16.html.
