Self-Supervised Learning of ECG and PPG Signals for Multi-Modal Health Monitoring

SiChang Liu, Ning Wang, ZongMin Wang
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:350-358, 2025.

Abstract

Self-supervised multimodal time-series analysis faces critical challenges including cross-domain temporal shifts, sensor noise, and inter-subject variability, which degrade disease classification performance. Existing methods often depend on labeled data or explicit target domain alignment, limiting their clinical practicality. We propose TSTA-Net, a novel framework that integrates: (1) a residual spatiotemporal transformer (STN) to dynamically correct sensor shifts and motion artifacts, (2) a dual-branch Transformer for capturing long-range dependencies, and (3) hierarchical contrastive learning for spatiotemporal alignment of ECG and PPG signals. This integrated approach addresses both temporal dynamics and spatial inconsistencies through joint optimization. On atrial fibrillation detection, TSTA-Net achieves a 9.3% higher F1-score than state-of-the-art self-supervised methods, with ablation studies verifying that the spatiotemporal alignment mechanism contributes 68% of the performance gain. The lightweight framework (<1M parameters) reduces annotation dependency while enabling real-time arrhythmia screening on wearable devices, advancing self-supervised learning for practical healthcare applications.
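
The abstract names contrastive learning for aligning ECG and PPG signals but does not give the loss. As a rough illustration of the general idea only, the sketch below implements a generic symmetric InfoNCE objective over paired embeddings. It is an assumption for illustration, not TSTA-Net's actual hierarchical loss; the function names `l2_normalize` and `info_nce`, the temperature of 0.1, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(ecg_embeddings, ppg_embeddings, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of paired ECG/PPG embeddings.

    Assumption: row i of each batch comes from the same time window, so
    (ecg[i], ppg[i]) is the positive pair and all other pairings are negatives.
    """
    ecg = [l2_normalize(v) for v in ecg_embeddings]
    ppg = [l2_normalize(v) for v in ppg_embeddings]
    # Cosine similarity matrix, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(e, p)) / temperature for p in ppg]
        for e in ecg
    ]

    def cross_entropy_diag(rows):
        # Mean of -log softmax(row)[i], with the diagonal entry as the target.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_sum = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_sum - row[i]
        return total / len(rows)

    # Average the ECG-to-PPG and PPG-to-ECG directions.
    columns = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(columns))


# Toy 2-D embeddings (hypothetical): matched pairs vs. swapped pairs.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With these toy inputs, the matched batch yields a loss near zero and the swapped batch a much larger one, which is the behavior a cross-modal alignment objective of this kind is meant to reward.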

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-liu25d,
  title = {Self-Supervised Learning of ECG and PPG Signals for Multi-Modal Health Monitoring},
  author = {Liu, SiChang and Wang, Ning and Wang, ZongMin},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages = {350--358},
  year = {2025},
  editor = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume = {278},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/liu25d/liu25d.pdf},
  url = {https://proceedings.mlr.press/v278/liu25d.html},
  abstract = {Self-supervised multimodal time-series analysis faces critical challenges including cross-domain temporal shifts, sensor noise, and inter-subject variability, which degrade disease classification performance. Existing methods often depend on labeled data or explicit target domain alignment, limiting their clinical practicality. We propose TSTA-Net, a novel framework that integrates: (1) a residual spatiotemporal transformer (STN) to dynamically correct sensor shifts and motion artifacts, (2) a dual-branch Transformer for capturing long-range dependencies, and (3) hierarchical contrastive learning for spatiotemporal alignment of ECG and PPG signals. This integrated approach addresses both temporal dynamics and spatial inconsistencies through joint optimization. On atrial fibrillation detection, TSTA-Net achieves a 9.3% higher F1-score than state-of-the-art self-supervised methods, with ablation studies verifying that the spatiotemporal alignment mechanism contributes 68% of the performance gain. The lightweight framework ($<$1M parameters) reduces annotation dependency while enabling real-time arrhythmia screening on wearable devices, advancing self-supervised learning for practical healthcare applications.}
}
Endnote
%0 Conference Paper
%T Self-Supervised Learning of ECG and PPG Signals for Multi-Modal Health Monitoring
%A SiChang Liu
%A Ning Wang
%A ZongMin Wang
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-liu25d
%I PMLR
%P 350--358
%U https://proceedings.mlr.press/v278/liu25d.html
%V 278
%X Self-supervised multimodal time-series analysis faces critical challenges including cross-domain temporal shifts, sensor noise, and inter-subject variability, which degrade disease classification performance. Existing methods often depend on labeled data or explicit target domain alignment, limiting their clinical practicality. We propose TSTA-Net, a novel framework that integrates: (1) a residual spatiotemporal transformer (STN) to dynamically correct sensor shifts and motion artifacts, (2) a dual-branch Transformer for capturing long-range dependencies, and (3) hierarchical contrastive learning for spatiotemporal alignment of ECG and PPG signals. This integrated approach addresses both temporal dynamics and spatial inconsistencies through joint optimization. On atrial fibrillation detection, TSTA-Net achieves a 9.3% higher F1-score than state-of-the-art self-supervised methods, with ablation studies verifying that the spatiotemporal alignment mechanism contributes 68% of the performance gain. The lightweight framework (<1M parameters) reduces annotation dependency while enabling real-time arrhythmia screening on wearable devices, advancing self-supervised learning for practical healthcare applications.
APA
Liu, S., Wang, N. & Wang, Z. (2025). Self-Supervised Learning of ECG and PPG Signals for Multi-Modal Health Monitoring. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:350-358. Available from https://proceedings.mlr.press/v278/liu25d.html.
