Prototypical Model with Information-Theoretic Loss Functions for Generalized Zero-Shot Learning

Chunlin Ji, Zhan Xiong, Meiying Zhang, Huiwen Yang, Feng Chen, Hanchun Shen
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:566-581, 2024.

Abstract

Generalized zero-shot learning (GZSL) remains a technical challenge for deep learning. To preserve the semantic relation between source and target classes when training only on data from source classes, we quantify knowledge transfer from an information-theoretic viewpoint. We use the prototypical model and format the variables of concern as a probability vector. Taking advantage of the probability-vector representation, information measurements can be evaluated efficiently in simple closed forms. We propose two information-theoretic loss functions: a mutual information loss to bridge seen data and target classes, and an uncertainty-aware entropy constraint loss to prevent overfitting when using seen data to learn the embedding of target classes. Simulations show that, as a deterministic model, the proposed method obtains state-of-the-art results on GZSL benchmark datasets. We achieve 21%–64% improvements over the baseline model, the deep calibration network (DCN), and demonstrate that a deterministic model can perform as well as generative ones. Furthermore, the proposed method is compatible with generative models and can noticeably improve their performance.
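The abstract's key point is that once each sample is represented as a probability vector over class prototypes, information measures have simple closed forms. The sketch below illustrates that idea only in outline: the squared-distance softmax, the temperature, and the batch-level mutual-information estimate are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def prototype_probs(x, prototypes, temperature=1.0):
    """Probability vector over classes via a softmax on negative
    squared distances to class prototypes (an assumed, common choice;
    the paper's exact embedding and distance may differ)."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    logits = -d2 / temperature
    logits -= logits.max()          # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector, in closed form."""
    return -np.sum(p * np.log(p + eps))

def mutual_information(P):
    """Closed-form batch estimate I(X; Y) = H(E[p]) - E[H(p)],
    where each row of P is a sample's class-probability vector.
    High values mean confident but diverse predictions."""
    return entropy(P.mean(axis=0)) - np.mean([entropy(p) for p in P])
```

With probability vectors in hand, both the mutual-information objective (maximized) and an entropy constraint (penalizing over-confident assignments to unseen classes) reduce to the cheap vector operations above, with no sampling or density estimation.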

Cite this Paper


BibTeX
@InProceedings{pmlr-v222-ji24c,
  title     = {Prototypical Model with Information-Theoretic Loss Functions for Generalized Zero-Shot Learning},
  author    = {Ji, Chunlin and Xiong, Zhan and Zhang, Meiying and Yang, Huiwen and Chen, Feng and Shen, Hanchun},
  booktitle = {Proceedings of the 15th Asian Conference on Machine Learning},
  pages     = {566--581},
  year      = {2024},
  editor    = {Yanıkoğlu, Berrin and Buntine, Wray},
  volume    = {222},
  series    = {Proceedings of Machine Learning Research},
  month     = {11--14 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v222/ji24c/ji24c.pdf},
  url       = {https://proceedings.mlr.press/v222/ji24c.html},
  abstract  = {Generalized zero-shot learning (GZSL) is still a technical challenge of deep learning. To preserve the semantic relation between source and target classes when only trained with data from source classes, we address the quantification of the knowledge transfer from an information-theoretic viewpoint. We use the prototypical model and format the variables of concern as a probability vector. Taking advantage of the probability vector representation, information measurements can be effectively evaluated with simple closed forms. We propose two information-theoretic loss functions: a mutual information loss to bridge seen data and target classes; an uncertainty-aware entropy constraint loss to prevent overfitting when using seen data to learn the embedding of target classes. Simulation shows that, as a deterministic model, our proposed method obtains state-of-the-art results on GZSL benchmark datasets. We achieve 21%-64% improvements over the baseline model, deep calibration network (DCN), and demonstrate that a deterministic model can perform as well as generative ones. Furthermore, the proposed method is compatible with generative models and can noticeably improve their performance.}
}
Endnote
%0 Conference Paper
%T Prototypical Model with Information-Theoretic Loss Functions for Generalized Zero-Shot Learning
%A Chunlin Ji
%A Zhan Xiong
%A Meiying Zhang
%A Huiwen Yang
%A Feng Chen
%A Hanchun Shen
%B Proceedings of the 15th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Berrin Yanıkoğlu
%E Wray Buntine
%F pmlr-v222-ji24c
%I PMLR
%P 566--581
%U https://proceedings.mlr.press/v222/ji24c.html
%V 222
%X Generalized zero-shot learning (GZSL) is still a technical challenge of deep learning. To preserve the semantic relation between source and target classes when only trained with data from source classes, we address the quantification of the knowledge transfer from an information-theoretic viewpoint. We use the prototypical model and format the variables of concern as a probability vector. Taking advantage of the probability vector representation, information measurements can be effectively evaluated with simple closed forms. We propose two information-theoretic loss functions: a mutual information loss to bridge seen data and target classes; an uncertainty-aware entropy constraint loss to prevent overfitting when using seen data to learn the embedding of target classes. Simulation shows that, as a deterministic model, our proposed method obtains state-of-the-art results on GZSL benchmark datasets. We achieve 21%-64% improvements over the baseline model, deep calibration network (DCN), and demonstrate that a deterministic model can perform as well as generative ones. Furthermore, the proposed method is compatible with generative models and can noticeably improve their performance.
APA
Ji, C., Xiong, Z., Zhang, M., Yang, H., Chen, F. & Shen, H. (2024). Prototypical Model with Information-Theoretic Loss Functions for Generalized Zero-Shot Learning. Proceedings of the 15th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 222:566-581. Available from https://proceedings.mlr.press/v222/ji24c.html.