Uncovering Trajectory and Topological Signatures in Multimodal Pediatric Sleep Embeddings
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:1392-1411, 2026.
Abstract
While generative models have shown promise in pediatric sleep analysis, the latent structure of their multimodal embeddings remains poorly understood. This work investigates the session-wide diagnostic information contained in sequences of 30-second pediatric polysomnography (PSG) epochs embedded by a multimodal masked autoencoder. We test whether augmenting the embeddings with (i) PHATE-derived per-epoch coordinates and whole-night movement descriptors, (ii) persistent homology summaries of the embedding cloud, and (iii) electronic health record (EHR) features yields task-relevant signals. Simple linear and MLP models, chosen for interpretability rather than state-of-the-art performance, show that geometric, topological, and clinical features each provide complementary gains. For binary prediction tasks, feature importance is task-dependent, and more expressive late-fusion models generally perform better, with AUPRC improving 0.26→0.34 for desaturation, 0.31→0.48 for EEG arousal, 0.09→0.22 for hypopnea, and 0.05→0.14 for apnea. We also report Brier score and Expected Calibration Error, where the full fusion model yields the best calibration across all four binary tasks. Our study shows that latent geometry/topology and EHR data offer complementary, interpretable signals beyond the raw embeddings, improving calibration and robustness under extreme class imbalance.
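The calibration metrics the abstract reports, Brier score and Expected Calibration Error (ECE), have standard definitions that are easy to state in code. The sketch below is illustrative only (not the authors' implementation); the toy probabilities and labels are hypothetical, and the ECE uses equal-width confidence bins, one common convention.

```python
# Illustrative sketch of the two calibration metrics mentioned in the
# abstract; not the paper's code. Pure-Python, binary labels in {0, 1}.

def brier_score(probs, labels):
    # Mean squared error between predicted probability and the 0/1 outcome;
    # lower is better, 0 is perfect.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

def expected_calibration_error(probs, labels, n_bins=10):
    # Partition predictions into equal-width confidence bins, then average
    # the |empirical frequency - mean confidence| gap per bin, weighted by
    # bin occupancy. Lower is better-calibrated.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)   # mean predicted probability
        freq = sum(y for _, y in b) / len(b)   # empirical positive rate
        ece += (len(b) / len(probs)) * abs(conf - freq)
    return ece

# Toy example with hypothetical per-epoch event probabilities.
probs = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]
labels = [1, 1, 0, 0, 0, 1]
print(round(brier_score(probs, labels), 4))               # → 0.18
print(round(expected_calibration_error(probs, labels), 4))
```

Under severe class imbalance (as for apnea, where AUPRC starts at 0.05), such calibration metrics complement AUPRC: a model can rank events well yet still assign poorly calibrated probabilities.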