OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition

Zheng Lian, Haiyang Sun, Licai Sun, Haoyu Chen, Lan Chen, Hao Gu, Zhuofan Wen, Shun Chen, Zhang Siyuan, Hailiang Yao, Bin Liu, Rui Liu, Shan Liang, Ya Li, Jiangyan Yi, Jianhua Tao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:37015-37050, 2025.

Abstract

Multimodal Emotion Recognition (MER) is a critical research area that seeks to decode human emotions from diverse data modalities. However, existing machine learning methods predominantly rely on predefined emotion taxonomies, which fail to capture the inherent complexity, subtlety, and multi-appraisal nature of human emotional experiences, as demonstrated by studies in psychology and cognitive science. To overcome this limitation, we advocate for introducing the concept of open vocabulary into MER. This paradigm shift aims to enable models to predict emotions beyond a fixed label space, accommodating a flexible set of categories to better reflect the nuanced spectrum of human emotions. To achieve this, we propose a novel paradigm: Open-Vocabulary MER (OV-MER), which enables emotion prediction without being confined to predefined spaces. However, constructing a dataset that encompasses the full range of emotions for OV-MER is practically infeasible; hence, we present a comprehensive solution including a newly curated database, novel evaluation metrics, and a preliminary benchmark. By advancing MER from basic emotions to more nuanced and diverse emotional states, we hope this work can inspire the next generation of MER, enhancing its generalizability and applicability in real-world scenarios. Code and dataset are available at: https://github.com/zeroQiaoba/AffectGPT.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-lian25b,
  title = {{OV}-{MER}: Towards Open-Vocabulary Multimodal Emotion Recognition},
  author = {Lian, Zheng and Sun, Haiyang and Sun, Licai and Chen, Haoyu and Chen, Lan and Gu, Hao and Wen, Zhuofan and Chen, Shun and Siyuan, Zhang and Yao, Hailiang and Liu, Bin and Liu, Rui and Liang, Shan and Li, Ya and Yi, Jiangyan and Tao, Jianhua},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {37015--37050},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/lian25b/lian25b.pdf},
  url = {https://proceedings.mlr.press/v267/lian25b.html},
  abstract = {Multimodal Emotion Recognition (MER) is a critical research area that seeks to decode human emotions from diverse data modalities. However, existing machine learning methods predominantly rely on predefined emotion taxonomies, which fail to capture the inherent complexity, subtlety, and multi-appraisal nature of human emotional experiences, as demonstrated by studies in psychology and cognitive science. To overcome this limitation, we advocate for introducing the concept of open vocabulary into MER. This paradigm shift aims to enable models to predict emotions beyond a fixed label space, accommodating a flexible set of categories to better reflect the nuanced spectrum of human emotions. To achieve this, we propose a novel paradigm: Open-Vocabulary MER (OV-MER), which enables emotion prediction without being confined to predefined spaces. However, constructing a dataset that encompasses the full range of emotions for OV-MER is practically infeasible; hence, we present a comprehensive solution including a newly curated database, novel evaluation metrics, and a preliminary benchmark. By advancing MER from basic emotions to more nuanced and diverse emotional states, we hope this work can inspire the next generation of MER, enhancing its generalizability and applicability in real-world scenarios. Code and dataset are available at: https://github.com/zeroQiaoba/AffectGPT.}
}
Endnote
%0 Conference Paper
%T OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition
%A Zheng Lian
%A Haiyang Sun
%A Licai Sun
%A Haoyu Chen
%A Lan Chen
%A Hao Gu
%A Zhuofan Wen
%A Shun Chen
%A Zhang Siyuan
%A Hailiang Yao
%A Bin Liu
%A Rui Liu
%A Shan Liang
%A Ya Li
%A Jiangyan Yi
%A Jianhua Tao
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-lian25b
%I PMLR
%P 37015--37050
%U https://proceedings.mlr.press/v267/lian25b.html
%V 267
%X Multimodal Emotion Recognition (MER) is a critical research area that seeks to decode human emotions from diverse data modalities. However, existing machine learning methods predominantly rely on predefined emotion taxonomies, which fail to capture the inherent complexity, subtlety, and multi-appraisal nature of human emotional experiences, as demonstrated by studies in psychology and cognitive science. To overcome this limitation, we advocate for introducing the concept of open vocabulary into MER. This paradigm shift aims to enable models to predict emotions beyond a fixed label space, accommodating a flexible set of categories to better reflect the nuanced spectrum of human emotions. To achieve this, we propose a novel paradigm: Open-Vocabulary MER (OV-MER), which enables emotion prediction without being confined to predefined spaces. However, constructing a dataset that encompasses the full range of emotions for OV-MER is practically infeasible; hence, we present a comprehensive solution including a newly curated database, novel evaluation metrics, and a preliminary benchmark. By advancing MER from basic emotions to more nuanced and diverse emotional states, we hope this work can inspire the next generation of MER, enhancing its generalizability and applicability in real-world scenarios. Code and dataset are available at: https://github.com/zeroQiaoba/AffectGPT.
APA
Lian, Z., Sun, H., Sun, L., Chen, H., Chen, L., Gu, H., Wen, Z., Chen, S., Siyuan, Z., Yao, H., Liu, B., Liu, R., Liang, S., Li, Y., Yi, J., & Tao, J. (2025). OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:37015-37050. Available from https://proceedings.mlr.press/v267/lian25b.html.
