Supervised Electrocardiogram(ECG) Features Outperform Knowledge-based And Unsupervised Features In Individualized Survival Prediction

Yousef Nademi, Sunil V Kalmady, Weijie Sun, Shi-ang Qi, Abram Hindle, Padma Kaul, Russell Greiner
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:368-384, 2023.

Abstract

An electrocardiogram (ECG) provides crucial information about an individual’s health status. Researchers utilize ECG data to develop learners for a variety of tasks, ranging from diagnosing ECG abnormalities to estimating time to death – here modeled as individual survival distributions (ISDs). The way the ECG is represented is important for creating an effective learner. While many traditional ECG-based prediction models rely on hand-crafted features, such as heart rate, this study aims to achieve a better representation. The effectiveness of various ECG-based feature extraction methods for prediction of ISDs, either supervised or unsupervised, has not been explored previously. The study uses a large ECG dataset from 244,077 patients with over 1.6 million 12-lead ECGs, each labeled with the patient’s disease (one or more International Classification of Diseases (ICD) codes). We explored extracting high-level features from ECG traces using various approaches, then trained models that used these ECG features (along with age and sex), across a range of training sizes, to estimate patient-specific ISDs. The results showed that the supervised feature extractor method produced ECG features that can estimate ISD curves better than ECG features obtained from unsupervised or knowledge-based methods. Supervised ECG features required fewer training instances (as few as 500) to learn ISD models that performed better than the baseline model that only used age and sex. On the other hand, unsupervised and knowledge-based ECG features required over 5,000 training samples to produce ISD models that performed better than the baseline. The study’s findings may assist researchers in selecting the most appropriate approach for extracting high-level features from ECG signals to estimate patient-specific ISD curves.

Cite this Paper


BibTeX
@InProceedings{pmlr-v225-nademi23a,
  title = {Supervised Electrocardiogram(ECG) Features Outperform Knowledge-based And Unsupervised Features In Individualized Survival Prediction},
  author = {Nademi, Yousef and Kalmady, Sunil V and Sun, Weijie and Qi, Shi-ang and Hindle, Abram and Kaul, Padma and Greiner, Russell},
  booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
  pages = {368--384},
  year = {2023},
  editor = {Hegselmann, Stefan and Parziale, Antonio and Shanmugam, Divya and Tang, Shengpu and Asiedu, Mercy Nyamewaa and Chang, Serina and Hartvigsen, Tom and Singh, Harvineet},
  volume = {225},
  series = {Proceedings of Machine Learning Research},
  month = {10 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v225/nademi23a/nademi23a.pdf},
  url = {https://proceedings.mlr.press/v225/nademi23a.html},
  abstract = {An electrocardiogram (ECG) provides crucial information about an individual’s health status. Researchers utilize ECG data to develop learners for a variety of tasks, ranging from diagnosing ECG abnormalities to estimating time to death – here modeled as individual survival distributions (ISDs). The way the ECG is represented is important for creating an effective learner. While many traditional ECG-based prediction models rely on hand-crafted features, such as heart rate, this study aims to achieve a better representation. The effectiveness of various ECG-based feature extraction methods for prediction of ISDs, either supervised or unsupervised, has not been explored previously. The study uses a large ECG dataset from 244,077 patients with over 1.6 million 12-lead ECGs, each labeled with the patient{’}s disease {–} one or more International Classification of Diseases (ICD) codes. We explored extracting high-level features from ECG traces using various approaches, then trained models that used these ECG features (along with age and sex), across a range of training sizes, to estimate patient-specific ISDs. The results showed that the supervised feature extractor method produced ECG features that can estimate ISD curves better than ECG features obtained from unsupervised or knowledge-based methods. Supervised ECG features required fewer training instances (as few as 500) to learn ISD models that performed better than the baseline model that only used age and sex. On the other hand, unsupervised and knowledge-based ECG features required over 5,000 training samples to produce ISD models that performed better than the baseline. The study’s findings may assist researchers in selecting the most appropriate approach for extracting high-level features from ECG signals to estimate patient-specific ISD curves.}
}
Endnote
%0 Conference Paper
%T Supervised Electrocardiogram(ECG) Features Outperform Knowledge-based And Unsupervised Features In Individualized Survival Prediction
%A Yousef Nademi
%A Sunil V Kalmady
%A Weijie Sun
%A Shi-ang Qi
%A Abram Hindle
%A Padma Kaul
%A Russell Greiner
%B Proceedings of the 3rd Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2023
%E Stefan Hegselmann
%E Antonio Parziale
%E Divya Shanmugam
%E Shengpu Tang
%E Mercy Nyamewaa Asiedu
%E Serina Chang
%E Tom Hartvigsen
%E Harvineet Singh
%F pmlr-v225-nademi23a
%I PMLR
%P 368--384
%U https://proceedings.mlr.press/v225/nademi23a.html
%V 225
%X An electrocardiogram (ECG) provides crucial information about an individual’s health status. Researchers utilize ECG data to develop learners for a variety of tasks, ranging from diagnosing ECG abnormalities to estimating time to death – here modeled as individual survival distributions (ISDs). The way the ECG is represented is important for creating an effective learner. While many traditional ECG-based prediction models rely on hand-crafted features, such as heart rate, this study aims to achieve a better representation. The effectiveness of various ECG-based feature extraction methods for prediction of ISDs, either supervised or unsupervised, has not been explored previously. The study uses a large ECG dataset from 244,077 patients with over 1.6 million 12-lead ECGs, each labeled with the patient’s disease (one or more International Classification of Diseases (ICD) codes). We explored extracting high-level features from ECG traces using various approaches, then trained models that used these ECG features (along with age and sex), across a range of training sizes, to estimate patient-specific ISDs. The results showed that the supervised feature extractor method produced ECG features that can estimate ISD curves better than ECG features obtained from unsupervised or knowledge-based methods. Supervised ECG features required fewer training instances (as few as 500) to learn ISD models that performed better than the baseline model that only used age and sex. On the other hand, unsupervised and knowledge-based ECG features required over 5,000 training samples to produce ISD models that performed better than the baseline. The study’s findings may assist researchers in selecting the most appropriate approach for extracting high-level features from ECG signals to estimate patient-specific ISD curves.
APA
Nademi, Y., Kalmady, S. V., Sun, W., Qi, S., Hindle, A., Kaul, P., & Greiner, R. (2023). Supervised Electrocardiogram(ECG) Features Outperform Knowledge-based And Unsupervised Features In Individualized Survival Prediction. Proceedings of the 3rd Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 225:368-384. Available from https://proceedings.mlr.press/v225/nademi23a.html.
