When Attention Fails: Pitfalls of Attention-based Model Interpretability for High-dimensional Clinical Time-Series
Proceedings of the sixth Conference on Health, Inference, and Learning, PMLR 287:289-305, 2025.
Abstract
Attention-based deep learning models are widely used for clinical time-series analysis, largely due to their perceived ability to enhance model interpretability. However, the reliability, faithfulness, and consistency of attention mechanisms as an interpretability tool for high-dimensional clinical time-series data require further investigation. We conducted a comprehensive evaluation of the consistency and faithfulness of attention mechanisms in deep learning models applied to high-dimensional clinical time-series data. Specifically, we trained 1000 variants of an attention-based LSTM architecture, differing only in their random initialization, and analyzed the consistency of attention scores on mortality prediction and patient severity group classification tasks. Our findings revealed significant inconsistencies in attention scores for individual samples across the 1000 model variants. Visual inspection of attention weight distributions showed that the attention mechanism did not consistently focus on the same feature-time pairs, challenging the assumption that attention yields faithful and reliable model interpretations. The observed inconsistencies in per-sample attention weights suggest that attention mechanisms are unreliable as an interpretability tool for clinical decision-making tasks involving high-dimensional time-series data. While attention mechanisms may improve model performance metrics, they often fail to produce clinically meaningful and consistent interpretations, limiting their utility in healthcare settings where transparency is critical for informed decision-making.
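To make the evaluation protocol concrete, the sketch below illustrates the kind of consistency check the abstract describes: several attention-based LSTM variants that differ only in their random seed are trained, and each sample's attention weights are compared across variants with a rank correlation. This is not the authors' code; the model size, the synthetic data, the number of variants (the paper uses 1000), and the attention formulation (here over time steps only, rather than feature-time pairs) are illustrative assumptions.

```python
# Minimal sketch of a per-sample attention-consistency check across random seeds.
# All names, sizes, and the synthetic data are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from scipy.stats import spearmanr

class AttnLSTM(nn.Module):
    """LSTM encoder with a simple attention layer over time steps."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)   # scores each time step
        self.head = nn.Linear(hidden, 1)   # binary outcome (e.g. mortality)

    def forward(self, x):
        h, _ = self.lstm(x)                                   # (B, T, H)
        a = torch.softmax(self.attn(h).squeeze(-1), dim=1)    # (B, T) attention weights
        ctx = (a.unsqueeze(-1) * h).sum(dim=1)                # attention-weighted context
        return self.head(ctx).squeeze(-1), a

def train_variant(seed, x, y, epochs=30):
    """Train one model variant from a given random initialization and return its attention."""
    torch.manual_seed(seed)
    model = AttnLSTM(x.shape[-1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        logits, _ = model(x)
        loss = nn.functional.binary_cross_entropy_with_logits(logits, y)
        loss.backward()
        opt.step()
    with torch.no_grad():
        _, attn = model(x)
    return attn.numpy()                                       # (n_samples, T)

# Synthetic stand-in for a high-dimensional clinical time series:
# 64 patients, 24 time steps, 10 features.
rng = np.random.default_rng(0)
x = torch.tensor(rng.normal(size=(64, 24, 10)), dtype=torch.float32)
y = (x[:, -1, 0] > 0).float()

# Per-sample attention weights from several random initializations.
attn_by_seed = [train_variant(s, x, y) for s in range(5)]

# Consistency: rank correlation of each sample's attention profile between two variants.
rhos = [spearmanr(attn_by_seed[0][i], attn_by_seed[1][i]).correlation
        for i in range(x.shape[0])]
print(f"median per-sample Spearman rho between two variants: {np.median(rhos):.2f}")
```

A low or widely varying per-sample correlation across seeds would indicate the kind of inconsistency the abstract reports, i.e. different random initializations attending to different parts of the same input.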