PhysioJEPA: Joint Embedding Representations of Physiological Signals for Real Time Risk Estimation in the Intensive Care Unit

Benjamin Fox, Dung Hoang, Joy Jiang, Pushkala Jayaraman, Ankit Parekh, Girish N. Nadkarni, Ankit Sakhuja
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:120-135, 2026.

Abstract

Self-supervised learning of multi-modal, high-frequency physiological signals is largely unexplored, despite its potential for critical care applications. We present PhysioJEPA, a Joint Embedding Predictive Architecture (JEPA) designed for multi-modal physiological signals from critical care bedside monitoring devices. PhysioJEPA learns representations from 30-minute segments of physiological signals from three channels: arterial blood pressure, electrocardiography lead II, and photoplethysmography. Trained on over 10.7 million minutes of data from 4,282 intensive care unit stays (N=2,631 patients) in the Medical Information Mart for Intensive Care-III (MIMIC-III) Waveform Database, the learned, frozen representations of PhysioJEPA can be used to estimate 5-minute risk of hypotension (AUROC = 0.83 [Confidence Interval or CI 0.83–0.84]) and shock index (AUROC = 0.95 [0.95–0.96]), with comparable performance to a self-supervised Patch Time Series Transformer framework (AUROC = 0.87 [0.86–0.87] and 0.96 [0.96–0.96]), better performance compared to another JEPA physiological signal model, ECG-JEPA (AUROC = 0.73 [0.72–0.74] and 0.92 [0.92–0.93]), and better performance compared to a supervised convolutional model (AUROC = 0.78 [0.78–0.78] and 0.95 [0.95–0.95]). Notably, it can generalize to an independent healthcare system (AUROC = 0.78 [0.78–0.78] and 0.92 [0.92–0.93]) better than all comparison models. These results suggest that self-supervised JEPA representation learning is a promising approach for multi-modal bedside monitoring signal data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v297-fox26a,
  title = {{PhysioJEPA}: Joint Embedding Representations of Physiological Signals for Real Time Risk Estimation in the Intensive Care Unit},
  author = {Fox, Benjamin and Hoang, Dung and Jiang, Joy and Jayaraman, Pushkala and Parekh, Ankit and Nadkarni, Girish N. and Sakhuja, Ankit},
  booktitle = {Proceedings of the Fifth Machine Learning for Health Symposium},
  pages = {120--135},
  year = {2026},
  editor = {Argaw, Peniel and Zhang, Haoran and Jabbour, Sarah and Chandak, Payal and Ji, Jerry and Mukherjee, Sumit and Salaudeen, Olawale and Chang, Trenton and Healey, Elizabeth and Gröger, Fabian and Adibi, Amin and Hegselmann, Stefan and Wild, Benjamin and Noori, Ayush},
  volume = {297},
  series = {Proceedings of Machine Learning Research},
  month = {13--14 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v297/main/assets/fox26a/fox26a.pdf},
  url = {https://proceedings.mlr.press/v297/fox26a.html},
  abstract = {Self-supervised learning of multi-modal, high-frequency physiological signals is largely unexplored, despite its potential for critical care applications. We present PhysioJEPA, a Joint Embedding Predictive Architecture (JEPA) designed for multi-modal physiological signals from critical care bedside monitoring devices. PhysioJEPA learns representations from 30-minute segments of physiological signals from three channels: arterial blood pressure, electrocardiography lead II, and photoplethysmography. Trained on over 10.7 million minutes of data from 4,282 intensive care unit stays (N=2,631 patients) in the Medical Information Mart for Intensive Care-III (MIMIC-III) Waveform Database, the learned, frozen representations of PhysioJEPA can be used to estimate 5-minute risk of hypotension (AUROC = 0.83 [Confidence Interval or CI 0.83–0.84]) and shock index (AUROC = 0.95 [0.95–0.96]), with comparable performance to a self-supervised Patch Time Series Transformer framework (AUROC = 0.87 [0.86–0.87] and 0.96 [0.96–0.96]), better performance compared to another JEPA physiological signal model, ECG-JEPA (AUROC = 0.73 [0.72–0.74] and 0.92 [0.92–0.93]), and better performance compared to a supervised convolutional model (AUROC = 0.78 [0.78–0.78] and 0.95 [0.95–0.95]). Notably, it can generalize to an independent healthcare system (AUROC = 0.78 [0.78–0.78] and 0.92 [0.92–0.93]) better than all comparison models. These results suggest that self-supervised JEPA representation learning is a promising approach for multi-modal bedside monitoring signal data.}
}
Endnote
%0 Conference Paper
%T PhysioJEPA: Joint Embedding Representations of Physiological Signals for Real Time Risk Estimation in the Intensive Care Unit
%A Benjamin Fox
%A Dung Hoang
%A Joy Jiang
%A Pushkala Jayaraman
%A Ankit Parekh
%A Girish N. Nadkarni
%A Ankit Sakhuja
%B Proceedings of the Fifth Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2026
%E Peniel Argaw
%E Haoran Zhang
%E Sarah Jabbour
%E Payal Chandak
%E Jerry Ji
%E Sumit Mukherjee
%E Olawale Salaudeen
%E Trenton Chang
%E Elizabeth Healey
%E Fabian Gröger
%E Amin Adibi
%E Stefan Hegselmann
%E Benjamin Wild
%E Ayush Noori
%F pmlr-v297-fox26a
%I PMLR
%P 120--135
%U https://proceedings.mlr.press/v297/fox26a.html
%V 297
%X Self-supervised learning of multi-modal, high-frequency physiological signals is largely unexplored, despite its potential for critical care applications. We present PhysioJEPA, a Joint Embedding Predictive Architecture (JEPA) designed for multi-modal physiological signals from critical care bedside monitoring devices. PhysioJEPA learns representations from 30-minute segments of physiological signals from three channels: arterial blood pressure, electrocardiography lead II, and photoplethysmography. Trained on over 10.7 million minutes of data from 4,282 intensive care unit stays (N=2,631 patients) in the Medical Information Mart for Intensive Care-III (MIMIC-III) Waveform Database, the learned, frozen representations of PhysioJEPA can be used to estimate 5-minute risk of hypotension (AUROC = 0.83 [Confidence Interval or CI 0.83–0.84]) and shock index (AUROC = 0.95 [0.95–0.96]), with comparable performance to a self-supervised Patch Time Series Transformer framework (AUROC = 0.87 [0.86–0.87] and 0.96 [0.96–0.96]), better performance compared to another JEPA physiological signal model, ECG-JEPA (AUROC = 0.73 [0.72–0.74] and 0.92 [0.92–0.93]), and better performance compared to a supervised convolutional model (AUROC = 0.78 [0.78–0.78] and 0.95 [0.95–0.95]). Notably, it can generalize to an independent healthcare system (AUROC = 0.78 [0.78–0.78] and 0.92 [0.92–0.93]) better than all comparison models. These results suggest that self-supervised JEPA representation learning is a promising approach for multi-modal bedside monitoring signal data.
APA
Fox, B., Hoang, D., Jiang, J., Jayaraman, P., Parekh, A., Nadkarni, G. N., & Sakhuja, A. (2026). PhysioJEPA: Joint Embedding Representations of Physiological Signals for Real Time Risk Estimation in the Intensive Care Unit. Proceedings of the Fifth Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 297:120-135. Available from https://proceedings.mlr.press/v297/fox26a.html.
