Multimodal Classification of Alzheimer’s Disease by Combining Facial and Eye-Tracking Data

Shih-Han Chou, Miini Teng, Harshinee Sriram, Chuyuan Li, Giuseppe Carenini, Cristina Conati, Thalia S. Field, Hyeju Jang, Gabriel Murray
Proceedings of the 4th Machine Learning for Health Symposium, PMLR 259:219-232, 2025.

Abstract

In recent years, there has been growing interest in developing a non-invasive tool for detecting Alzheimer’s Disease (AD). Previous studies have shown that a single modality such as speech or eye-tracking (ET) data can be effective for distinguishing AD patients from healthy individuals. However, understanding the role of other modalities, and especially the integration of facial analysis with ET for enhancing dementia classification, remains under-explored. In this paper, we investigate whether we can leverage facial patterns in AD patients by building on EMOTION-FAN, a deep learning model initially developed for recognizing seven distinct human emotions and fine-tuned here for our facial analysis tasks. We also explore the efficacy of leveraging multimodal information by combining the results from the facial and ET data through a late fusion technique. Specifically, our approach uses a neural classifier, VTNet, that learns from raw ET data, alongside the fine-tuned EMOTION-FAN model that learns from the facial data. Experimental results show that the facial data yields better results than the ET data. Notably, we obtain higher scores when both modalities are combined, providing strong evidence that integrating multimodal data benefits performance on this task.
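
For illustration only: the abstract states that the unimodal outputs are combined via late fusion, but does not specify the fusion rule. The sketch below shows one common late-fusion scheme, a weighted average of per-modality class probabilities, and is not the authors' implementation. The function name late_fusion, the tensors facial_logits and et_logits (stand-ins for outputs of the fine-tuned EMOTION-FAN and VTNet classifiers), and the parameter facial_weight are all hypothetical.

import torch
import torch.nn.functional as F

def late_fusion(facial_logits: torch.Tensor,
                et_logits: torch.Tensor,
                facial_weight: float = 0.5) -> torch.Tensor:
    # facial_logits, et_logits: (batch, 2) outputs of the two unimodal
    # classifiers; facial_weight sets the relative trust in the facial
    # branch (the paper's actual weighting is not assumed here).
    p_face = F.softmax(facial_logits, dim=-1)
    p_et = F.softmax(et_logits, dim=-1)
    p_fused = facial_weight * p_face + (1.0 - facial_weight) * p_et
    return p_fused.argmax(dim=-1)  # illustrative labels: 0 = healthy, 1 = AD

# Example: fuse predictions for a batch of 4 participants.
facial_logits = torch.randn(4, 2)
et_logits = torch.randn(4, 2)
print(late_fusion(facial_logits, et_logits))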

Cite this Paper


BibTeX
@InProceedings{pmlr-v259-chou25a,
  title = {Multimodal Classification of Alzheimer’s Disease by Combining Facial and Eye-Tracking Data},
  author = {Chou, Shih-Han and Teng, Miini and Sriram, Harshinee and Li, Chuyuan and Carenini, Giuseppe and Conati, Cristina and Field, Thalia S. and Jang, Hyeju and Murray, Gabriel},
  booktitle = {Proceedings of the 4th Machine Learning for Health Symposium},
  pages = {219--232},
  year = {2025},
  editor = {Hegselmann, Stefan and Zhou, Helen and Healey, Elizabeth and Chang, Trenton and Ellington, Caleb and Mhasawade, Vishwali and Tonekaboni, Sana and Argaw, Peniel and Zhang, Haoran},
  volume = {259},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v259/main/assets/chou25a/chou25a.pdf},
  url = {https://proceedings.mlr.press/v259/chou25a.html},
  abstract = {In recent years, there has been growing interest in developing a non-invasive tool for detecting Alzheimer’s Disease (AD). Previous studies have shown that a single modality such as speech or eye-tracking (ET) data can be effective for distinguishing AD patients from healthy individuals. However, understanding the role of other modalities, and especially the integration of facial analysis with ET for enhancing dementia classification, remains under-explored. In this paper, we investigate whether we can leverage facial patterns in AD patients by building on EMOTION-FAN, a deep learning model initially developed for recognizing seven distinct human emotions and fine-tuned here for our facial analysis tasks. We also explore the efficacy of leveraging multimodal information by combining the results from the facial and ET data through a late fusion technique. Specifically, our approach uses a neural classifier, VTNet, that learns from raw ET data, alongside the fine-tuned EMOTION-FAN model that learns from the facial data. Experimental results show that the facial data yields better results than the ET data. Notably, we obtain higher scores when both modalities are combined, providing strong evidence that integrating multimodal data benefits performance on this task.}
}
Endnote
%0 Conference Paper
%T Multimodal Classification of Alzheimer’s Disease by Combining Facial and Eye-Tracking Data
%A Shih-Han Chou
%A Miini Teng
%A Harshinee Sriram
%A Chuyuan Li
%A Giuseppe Carenini
%A Cristina Conati
%A Thalia S. Field
%A Hyeju Jang
%A Gabriel Murray
%B Proceedings of the 4th Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2025
%E Stefan Hegselmann
%E Helen Zhou
%E Elizabeth Healey
%E Trenton Chang
%E Caleb Ellington
%E Vishwali Mhasawade
%E Sana Tonekaboni
%E Peniel Argaw
%E Haoran Zhang
%F pmlr-v259-chou25a
%I PMLR
%P 219--232
%U https://proceedings.mlr.press/v259/chou25a.html
%V 259
%X In recent years, there has been growing interest in developing a non-invasive tool for detecting Alzheimer’s Disease (AD). Previous studies have shown that a single modality such as speech or eye-tracking (ET) data can be effective for distinguishing AD patients from healthy individuals. However, understanding the role of other modalities, and especially the integration of facial analysis with ET for enhancing dementia classification, remains under-explored. In this paper, we investigate whether we can leverage facial patterns in AD patients by building on EMOTION-FAN, a deep learning model initially developed for recognizing seven distinct human emotions and fine-tuned here for our facial analysis tasks. We also explore the efficacy of leveraging multimodal information by combining the results from the facial and ET data through a late fusion technique. Specifically, our approach uses a neural classifier, VTNet, that learns from raw ET data, alongside the fine-tuned EMOTION-FAN model that learns from the facial data. Experimental results show that the facial data yields better results than the ET data. Notably, we obtain higher scores when both modalities are combined, providing strong evidence that integrating multimodal data benefits performance on this task.
APA
Chou, S., Teng, M., Sriram, H., Li, C., Carenini, G., Conati, C., Field, T.S., Jang, H. & Murray, G. (2025). Multimodal Classification of Alzheimer’s Disease by Combining Facial and Eye-Tracking Data. Proceedings of the 4th Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 259:219-232. Available from https://proceedings.mlr.press/v259/chou25a.html.
