Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests

Narges Razavian; Jake Marcus; David Sontag

Proceedings of the 1st Machine Learning for Healthcare Conference, PMLR 56:73-100, 2016.

Abstract

Disparate areas of machine learning have benefited from models that can take raw data with little preprocessing as input and learn rich representations of that raw data in order to perform well on a given prediction task. We evaluate this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient’s health state widely available in clinical data, to predict disease onsets. In particular, we train a Long Short-Term Memory (LSTM) recurrent neural network and two novel convolutional neural networks for multi-task prediction of disease onset for 133 conditions based on 18 common lab tests measured over time in a cohort of 298K patients derived from 8 years of administrative claims data. We compare the neural networks to a logistic regression with several hand-engineered, clinically relevant features. We find that the representation-based learning approaches significantly outperform this baseline. We believe that our work suggests a new avenue for patient risk stratification based solely on lab results.
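As a rough illustration of the multi-task setup the abstract describes, the sketch below maps a matrix of 18 lab tests over time to 133 per-condition probabilities through one shared representation. It is not the authors' architecture. The filter count, kernel width, random weights, and the function names `init_params`, `shared_representation`, and `predict` are all assumptions made for illustration. The input is also assumed to be a dense, already normalized matrix, which real lab data would not be.

```python
import math
import random

N_LABS = 18     # from the abstract: 18 common lab tests
N_TASKS = 133   # from the abstract: 133 conditions
N_FILTERS = 8   # assumption: number of temporal filters
KERNEL = 3      # assumption: filter width in time steps


def init_params(seed=0):
    """Random, untrained weights; only the shapes matter for this sketch."""
    rng = random.Random(seed)
    # Each filter spans all labs over KERNEL consecutive time steps.
    filters = [[[rng.gauss(0, 0.1) for _ in range(KERNEL)]
                for _ in range(N_LABS)]
               for _ in range(N_FILTERS)]
    # One row per condition: N_FILTERS weights followed by a bias term.
    heads = [[rng.gauss(0, 0.1) for _ in range(N_FILTERS + 1)]
             for _ in range(N_TASKS)]
    return filters, heads


def shared_representation(labs, filters):
    """labs: N_LABS rows, each a list of T measurements (T >= KERNEL)."""
    n_steps = len(labs[0])
    rep = []
    for f in filters:
        best = 0.0  # starting at 0 gives ReLU followed by max-pooling over time
        for t in range(n_steps - KERNEL + 1):
            act = sum(f[i][k] * labs[i][t + k]
                      for i in range(N_LABS) for k in range(KERNEL))
            best = max(best, act)
        rep.append(best)
    return rep


def predict(labs, params):
    """Return one onset probability per condition from the shared features."""
    filters, heads = params
    rep = shared_representation(labs, filters)
    probs = []
    for w in heads:
        # zip stops at len(rep), so the trailing bias w[-1] is added separately.
        z = w[-1] + sum(wi * ri for wi, ri in zip(w, rep))
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs
```

Calling `predict(labs, init_params())` on an 18-row input returns a list of 133 values between 0 and 1. The point of the sketch is only that every condition shares the same learned features and differs in its final output row.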

Cite this Paper

BibTeX


@InProceedings{pmlr-v56-Razavian16,
  title = 	 {Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests},
  author = 	 {Razavian, Narges and Marcus, Jake and Sontag, David},
  booktitle = 	 {Proceedings of the 1st Machine Learning for Healthcare Conference},
  pages = 	 {73--100},
  year = 	 {2016},
  editor = 	 {Doshi-Velez, Finale and Fackler, Jim and Kale, David and Wallace, Byron and Wiens, Jenna},
  volume = 	 {56},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Northeastern University, Boston, MA, USA},
  month = 	 {18--19 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v56/Razavian16.pdf},
  url = 	 {https://proceedings.mlr.press/v56/Razavian16.html},
  abstract = 	 {Disparate areas of machine learning have benefited from models that can take raw data with little preprocessing as input and learn rich representations of that raw data in order to perform well on a given prediction task. We evaluate this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient’s health state widely available in clinical data, to predict disease onsets. In particular, we train a Long Short-Term Memory (LSTM) recurrent neural network and two novel convolutional neural networks for multi-task prediction of disease onset for 133 conditions based on 18 common lab tests measured over time in a cohort of 298K patients derived from 8 years of administrative claims data. We compare the neural networks to a logistic regression with several hand-engineered, clinically relevant features. We find that the representation-based learning approaches significantly outperform this baseline. We believe that our work suggests a new avenue for patient risk stratification based solely on lab results.}
}

Endnote

%0 Conference Paper
%T Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests
%A Narges Razavian
%A Jake Marcus
%A David Sontag
%B Proceedings of the 1st Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2016
%E Finale Doshi-Velez
%E Jim Fackler
%E David Kale
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v56-Razavian16
%I PMLR
%P 73--100
%U https://proceedings.mlr.press/v56/Razavian16.html
%V 56
%X Disparate areas of machine learning have benefited from models that can take raw data with little preprocessing as input and learn rich representations of that raw data in order to perform well on a given prediction task. We evaluate this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient’s health state widely available in clinical data, to predict disease onsets. In particular, we train a Long Short-Term Memory (LSTM) recurrent neural network and two novel convolutional neural networks for multi-task prediction of disease onset for 133 conditions based on 18 common lab tests measured over time in a cohort of 298K patients derived from 8 years of administrative claims data. We compare the neural networks to a logistic regression with several hand-engineered, clinically relevant features. We find that the representation-based learning approaches significantly outperform this baseline. We believe that our work suggests a new avenue for patient risk stratification based solely on lab results.

RIS


TY  - CPAPER
TI  - Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests
AU  - Narges Razavian
AU  - Jake Marcus
AU  - David Sontag
BT  - Proceedings of the 1st Machine Learning for Healthcare Conference
DA  - 2016/12/10
ED  - Finale Doshi-Velez
ED  - Jim Fackler
ED  - David Kale
ED  - Byron Wallace
ED  - Jenna Wiens
ID  - pmlr-v56-Razavian16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 56
SP  - 73
EP  - 100
L1  - http://proceedings.mlr.press/v56/Razavian16.pdf
UR  - https://proceedings.mlr.press/v56/Razavian16.html
AB  - Disparate areas of machine learning have benefited from models that can take raw data with little preprocessing as input and learn rich representations of that raw data in order to perform well on a given prediction task. We evaluate this approach in healthcare by using longitudinal measurements of lab tests, one of the more raw signals of a patient’s health state widely available in clinical data, to predict disease onsets. In particular, we train a Long Short-Term Memory (LSTM) recurrent neural network and two novel convolutional neural networks for multi-task prediction of disease onset for 133 conditions based on 18 common lab tests measured over time in a cohort of 298K patients derived from 8 years of administrative claims data. We compare the neural networks to a logistic regression with several hand-engineered, clinically relevant features. We find that the representation-based learning approaches significantly outperform this baseline. We believe that our work suggests a new avenue for patient risk stratification based solely on lab results.
ER  -

APA


Razavian, N., Marcus, J. & Sontag, D. (2016). Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests. Proceedings of the 1st Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 56:73-100. Available from https://proceedings.mlr.press/v56/Razavian16.html.
