A Systematic Comparison of Data Representations for Transformer-Based ECG Arrhythmia Classification
Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, PMLR 317:37-45, 2026.
Abstract
Automated electrocardiogram (ECG) classification plays a key role in detecting cardiac arrhythmias efficiently and objectively. Despite major advances in deep learning, there remains no consensus on whether one-dimensional (1D) temporal or two-dimensional (2D) time–frequency representations yield superior diagnostic accuracy. This study presents a controlled comparison between Vision Transformer (ViT) architectures trained on raw 1D ECG sequences and Short-Time Fourier Transform (STFT)-based 2D spectrograms using the CPSC2018 dataset. Both models share comparable architectures and parameter counts to isolate the effect of signal representation. The 1D-ViT achieved the highest overall accuracy (96.5%) and F1-score (96.5%), confirming that preserving temporal continuity is critical for arrhythmia discrimination. The 2D-ViT achieved lower accuracy (92.6%) due to temporal information loss, though it maintained competitive calibration (AUC 98.6%) and generalization. A bidirectional fusion model combining both encoders through cross-attention exhibited complementary behavior but did not surpass the 1D baseline. These findings indicate that while spectro-temporal information can enhance interpretability and stability, temporal-domain fidelity remains the dominant factor for reliable ECG classification.
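The time-frequency representation compared above can be illustrated with a minimal sketch. This is not the paper's code: the sampling rate matches CPSC2018 (500 Hz), but the window length, hop size, and the synthetic input signal are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's implementation): converting a 1D ECG
# segment into a log-magnitude STFT spectrogram for a 2D model.
# Window/hop sizes are illustrative; CPSC2018 records are sampled at 500 Hz.
import numpy as np

fs = 500                           # sampling rate in Hz
ecg = np.sin(2 * np.pi * 1.2 * np.arange(10 * fs) / fs)  # toy 10 s surrogate signal

win, hop = 128, 32                 # STFT window length and hop, in samples
window = np.hanning(win)

# Slide a Hann window over the signal and take the real FFT of each frame.
# Stacking the frames yields a (frequency x time) image; the coarse hop is
# exactly the temporal-resolution loss the 1D model avoids.
frames = [ecg[i:i + win] * window for i in range(0, len(ecg) - win + 1, hop)]
spec = np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T  # shape: (win//2 + 1, n_frames)

print(spec.shape)  # → (65, 153)
```

Each column of `spec` summarizes a 256 ms window of signal, so fine-grained morphology such as QRS onset timing is blurred across frames, which is consistent with the accuracy gap the abstract attributes to temporal information loss.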