xMADD: A Unified Diffusion Framework for Conditioned Synthesis of Medical Images and Waveforms
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:588-604, 2026.
Abstract
Diffusion models have shown remarkable success in generating high-quality perceptual data, but their use for controlled generation in biomedicine remains limited. We introduce xMADD (cross-Modal cross-Attention Denoising Diffusion), a conditional diffusion framework for producing diverse, high-resolution medical data, including cardiac MRI, brain MRI, and ECG waveforms, guided by clinical phenotypes, demographics, and multimodal signals. By incorporating cross-attention over conditional embeddings, xMADD enables explicit control over the generated outputs. Compared to existing generative approaches, xMADD achieves superior image fidelity and training stability, while accurately reflecting the conditioning phenotypes across modalities. Our results highlight the potential of controlled diffusion-based generation to expand biomedical datasets and facilitate data-sharing without exposing sensitive patient information.
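The conditioning mechanism named in the abstract can be illustrated with a minimal sketch: image (or waveform) tokens act as queries that attend over a small set of condition embeddings (e.g., phenotype and demographic tokens), so each denoising feature is modulated by the clinical conditioning signal. This is a generic single-head cross-attention in NumPy under assumed toy shapes, not the paper's actual architecture; all names (`cross_attention`, the weight matrices) are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(image_tokens, cond_tokens, Wq, Wk, Wv):
    """Image tokens (queries) attend to conditioning embeddings (keys/values)."""
    q = image_tokens @ Wq                       # (n_img, d)
    k = cond_tokens @ Wk                        # (n_cond, d)
    v = cond_tokens @ Wv                        # (n_cond, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_img, n_cond) similarity
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v                          # condition-modulated features

# Toy example: 16 patch tokens, 3 condition tokens (hypothetical shapes).
rng = np.random.default_rng(0)
d = 8
img = rng.standard_normal((16, d))
cond = rng.standard_normal((3, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = cross_attention(img, cond, Wq, Wk, Wv)
print(out.shape)  # (16, 8)
```

In a denoising U-Net or transformer, such a block would sit inside each residual stage, letting every spatial or temporal position draw on the clinical conditioning vector during each reverse-diffusion step.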