TransEHR: Self-Supervised Transformer for Clinical Time Series Data

Yanbo Xu, Shangqing Xu, Manav Ramprassad, Alexey Tumanov, Chao Zhang
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:623-635, 2023.

Abstract

Deep neural networks, including the Transformer architecture, have achieved remarkable performance on various time series tasks. However, their effectiveness on clinical time series data is hindered by two specific challenges: 1) sparse event sequences collected asynchronously with multivariate time series, and 2) limited availability of labeled data. To address these challenges, we propose TransEHR, a self-supervised Transformer model designed to efficiently encode multi-sourced asynchronous sequential data such as structured Electronic Health Records (EHRs). We introduce three pretext tasks for pre-training the Transformer model on large amounts of unlabeled structured EHR data, followed by fine-tuning on downstream prediction tasks using the limited labeled data. Through extensive experiments on three real-world health datasets, we demonstrate that our model achieves state-of-the-art performance on benchmark clinical tasks, including in-hospital mortality classification, phenotyping, and length-of-stay prediction. Our findings highlight the efficacy of TransEHR in addressing the challenges of clinical time series data, contributing to advancements in healthcare analytics. Our code is available at https://github.com/SigmaTsing/TransEHR.git.
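To make the two-stage training procedure concrete, below is a minimal PyTorch sketch of the pretrain-then-finetune pattern the abstract describes. This is not the authors' implementation (see the linked repository): the toy encoder, the masked-reconstruction pretext task (one plausible choice; the paper defines three pretext tasks), and all shapes and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of self-supervised pre-training followed by supervised
# fine-tuning. All names, shapes, and the pretext task are assumptions for
# illustration, not the TransEHR implementation.
import torch
import torch.nn as nn

class EHRTransformer(nn.Module):
    """Toy Transformer encoder over per-time-step feature vectors."""
    def __init__(self, n_features=76, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):                   # x: (batch, time, features)
        return self.encoder(self.embed(x))  # (batch, time, d_model)

encoder = EHRTransformer()

# --- Stage 1: self-supervised pre-training on unlabeled EHR sequences ---
# Illustrative pretext task: mask random time steps and reconstruct them.
recon_head = nn.Linear(64, 76)
opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(recon_head.parameters()), lr=1e-3)
unlabeled = torch.randn(32, 48, 76)         # fake batch: 48 hourly steps
mask = torch.rand(32, 48, 1) < 0.15         # hide ~15% of time steps
pred = recon_head(encoder(unlabeled * ~mask))
opt.zero_grad()
loss = ((pred - unlabeled) ** 2 * mask).mean()  # loss only on masked steps
loss.backward()
opt.step()

# --- Stage 2: fine-tune on the small labeled set (e.g. mortality) ---
clf_head = nn.Linear(64, 1)
opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(clf_head.parameters()), lr=1e-4)
labeled = torch.randn(8, 48, 76)
labels = torch.randint(0, 2, (8, 1)).float()
logits = clf_head(encoder(labeled).mean(dim=1))  # mean-pool over time
opt.zero_grad()
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
opt.step()
```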

Cite this Paper

BibTeX
@InProceedings{pmlr-v225-xu23a,
  title = {TransEHR: Self-Supervised Transformer for Clinical Time Series Data},
  author = {Xu, Yanbo and Xu, Shangqing and Ramprassad, Manav and Tumanov, Alexey and Zhang, Chao},
  booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
  pages = {623--635},
  year = {2023},
  editor = {Hegselmann, Stefan and Parziale, Antonio and Shanmugam, Divya and Tang, Shengpu and Asiedu, Mercy Nyamewaa and Chang, Serina and Hartvigsen, Tom and Singh, Harvineet},
  volume = {225},
  series = {Proceedings of Machine Learning Research},
  month = {10 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v225/xu23a/xu23a.pdf},
  url = {https://proceedings.mlr.press/v225/xu23a.html},
  abstract = {Deep neural networks, including the Transformer architecture, have achieved remarkable performance on various time series tasks. However, their effectiveness on clinical time series data is hindered by two specific challenges: 1) sparse event sequences collected asynchronously with multivariate time series, and 2) limited availability of labeled data. To address these challenges, we propose TransEHR, a self-supervised Transformer model designed to efficiently encode multi-sourced asynchronous sequential data such as structured Electronic Health Records (EHRs). We introduce three pretext tasks for pre-training the Transformer model on large amounts of unlabeled structured EHR data, followed by fine-tuning on downstream prediction tasks using the limited labeled data. Through extensive experiments on three real-world health datasets, we demonstrate that our model achieves state-of-the-art performance on benchmark clinical tasks, including in-hospital mortality classification, phenotyping, and length-of-stay prediction. Our findings highlight the efficacy of TransEHR in addressing the challenges of clinical time series data, contributing to advancements in healthcare analytics. Our code is available at https://github.com/SigmaTsing/TransEHR.git.}
}
Endnote
%0 Conference Paper
%T TransEHR: Self-Supervised Transformer for Clinical Time Series Data
%A Yanbo Xu
%A Shangqing Xu
%A Manav Ramprassad
%A Alexey Tumanov
%A Chao Zhang
%B Proceedings of the 3rd Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2023
%E Stefan Hegselmann
%E Antonio Parziale
%E Divya Shanmugam
%E Shengpu Tang
%E Mercy Nyamewaa Asiedu
%E Serina Chang
%E Tom Hartvigsen
%E Harvineet Singh
%F pmlr-v225-xu23a
%I PMLR
%P 623--635
%U https://proceedings.mlr.press/v225/xu23a.html
%V 225
%X Deep neural networks, including the Transformer architecture, have achieved remarkable performance on various time series tasks. However, their effectiveness on clinical time series data is hindered by two specific challenges: 1) sparse event sequences collected asynchronously with multivariate time series, and 2) limited availability of labeled data. To address these challenges, we propose TransEHR, a self-supervised Transformer model designed to efficiently encode multi-sourced asynchronous sequential data such as structured Electronic Health Records (EHRs). We introduce three pretext tasks for pre-training the Transformer model on large amounts of unlabeled structured EHR data, followed by fine-tuning on downstream prediction tasks using the limited labeled data. Through extensive experiments on three real-world health datasets, we demonstrate that our model achieves state-of-the-art performance on benchmark clinical tasks, including in-hospital mortality classification, phenotyping, and length-of-stay prediction. Our findings highlight the efficacy of TransEHR in addressing the challenges of clinical time series data, contributing to advancements in healthcare analytics. Our code is available at https://github.com/SigmaTsing/TransEHR.git.
APA
Xu, Y., Xu, S., Ramprassad, M., Tumanov, A. & Zhang, C. (2023). TransEHR: Self-Supervised Transformer for Clinical Time Series Data. Proceedings of the 3rd Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 225:623-635. Available from https://proceedings.mlr.press/v225/xu23a.html.