Phenotype Inference with Semi-Supervised Mixed Membership Models

Victor A. Rodriguez, Adler Perotte
Proceedings of the 4th Machine Learning for Healthcare Conference, PMLR 106:304-324, 2019.

Abstract

Disease phenotyping algorithms are designed to sift through clinical data stores to identify patients with specific diseases. Supervised phenotyping methods require significant quantities of expert-labeled data, while unsupervised methods may learn spurious or non-disease phenotypes. To address these limitations, we propose the Semi-Supervised Mixed Membership Model (SS3M), a probabilistic graphical model for learning disease phenotypes from partially labeled clinical data. We show SS3M can generate interpretable, disease-specific phenotypes which capture the clinical features of the disease concepts specified by the labels provided to the model. Furthermore, SS3M phenotypes demonstrate competitive predictive performance relative to commonly used baselines.

Cite this Paper


BibTeX
@InProceedings{pmlr-v106-rodriguez19a,
  title = {Phenotype Inference with Semi-Supervised Mixed Membership Models},
  author = {Rodriguez, Victor A. and Perotte, Adler},
  booktitle = {Proceedings of the 4th Machine Learning for Healthcare Conference},
  pages = {304--324},
  year = {2019},
  editor = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume = {106},
  series = {Proceedings of Machine Learning Research},
  month = {09--10 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v106/rodriguez19a/rodriguez19a.pdf},
  url = {https://proceedings.mlr.press/v106/rodriguez19a.html},
  abstract = {Disease phenotyping algorithms are designed to sift through clinical data stores to identify patients with specific diseases. Supervised phenotyping methods require significant quantities of expert-labeled data, while unsupervised methods may learn spurious or non-disease phenotypes. To address these limitations, we propose the Semi-Supervised Mixed Membership Model (SS3M), a probabilistic graphical model for learning disease phenotypes from partially labeled clinical data. We show SS3M can generate interpretable, disease-specific phenotypes which capture the clinical features of the disease concepts specified by the labels provided to the model. Furthermore, SS3M phenotypes demonstrate competitive predictive performance relative to commonly used baselines.}
}
Endnote
%0 Conference Paper
%T Phenotype Inference with Semi-Supervised Mixed Membership Models
%A Victor A. Rodriguez
%A Adler Perotte
%B Proceedings of the 4th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2019
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v106-rodriguez19a
%I PMLR
%P 304--324
%U https://proceedings.mlr.press/v106/rodriguez19a.html
%V 106
%X Disease phenotyping algorithms are designed to sift through clinical data stores to identify patients with specific diseases. Supervised phenotyping methods require significant quantities of expert-labeled data, while unsupervised methods may learn spurious or non-disease phenotypes. To address these limitations, we propose the Semi-Supervised Mixed Membership Model (SS3M), a probabilistic graphical model for learning disease phenotypes from partially labeled clinical data. We show SS3M can generate interpretable, disease-specific phenotypes which capture the clinical features of the disease concepts specified by the labels provided to the model. Furthermore, SS3M phenotypes demonstrate competitive predictive performance relative to commonly used baselines.
APA
Rodriguez, V.A. & Perotte, A. (2019). Phenotype Inference with Semi-Supervised Mixed Membership Models. Proceedings of the 4th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 106:304-324. Available from https://proceedings.mlr.press/v106/rodriguez19a.html.
