Prototypical Model with Information-Theoretic Loss Functions for Generalized Zero-Shot Learning
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:566-581, 2024.
Abstract
Generalized zero-shot learning (GZSL) remains a technical challenge in deep learning. To preserve the semantic relation between source and target classes when training only on data from source classes, we address the quantification of knowledge transfer from an information-theoretic viewpoint. We use the prototypical model and represent the variables of concern as probability vectors. This representation allows information measures to be evaluated efficiently in simple closed forms. We propose two information-theoretic loss functions: a mutual information loss to bridge seen data and target classes, and an uncertainty-aware entropy constraint loss to prevent overfitting when using seen data to learn the embedding of target classes. Experiments show that, despite being deterministic, our proposed method obtains state-of-the-art results on GZSL benchmark datasets. We achieve 21%–64% improvements over the baseline model, the deep calibration network (DCN), and demonstrate that a deterministic model can perform as well as generative ones. Furthermore, the proposed method is compatible with generative models and can noticeably improve their performance.
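The abstract does not spell out the exact form of the losses, so the following is a minimal sketch of how information measures admit closed forms once predictions are probability vectors. It assumes a prototypical classifier (softmax over negative embedding-prototype distances) and the standard batch-marginal estimator I(X; Y) ≈ H(E_x[p(y|x)]) − E_x[H(p(y|x))]; the function names and the quadratic entropy-target penalty are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def class_probabilities(embeddings, prototypes, tau=1.0):
    """Prototypical probability vectors: softmax over negative squared
    Euclidean distances between sample embeddings and class prototypes.
    (Illustrative; the paper's exact compatibility score may differ.)"""
    d = torch.cdist(embeddings, prototypes) ** 2   # (batch, num_classes)
    return F.softmax(-d / tau, dim=-1)

def entropy(p, eps=1e-8):
    """Shannon entropy of each row of a batch of probability vectors."""
    return -(p * (p + eps).log()).sum(dim=-1)

def mutual_information_loss(p):
    """Negative mutual information between inputs and predicted classes,
    estimated in closed form from a batch of probability vectors:
        I(X; Y) ~= H(mean_x p(y|x)) - mean_x H(p(y|x)).
    Minimizing this loss maximizes I, encouraging predictions that are
    individually confident yet diverse across the batch."""
    marginal = p.mean(dim=0)                       # batch estimate of p(y)
    return entropy(p).mean() - entropy(marginal)

def entropy_constraint_loss(p, target_entropy):
    """Assumed form of an uncertainty-aware entropy constraint: penalize
    per-sample predictive entropy for drifting from a target uncertainty
    level, discouraging overconfident fits to seen-class data."""
    return (entropy(p) - target_entropy).pow(2).mean()

# Toy usage: 4 samples in an 8-dim embedding space, 5 target classes.
x, protos = torch.randn(4, 8), torch.randn(5, 8)
p = class_probabilities(x, protos)
loss = mutual_information_loss(p) + 0.1 * entropy_constraint_loss(p, 0.5)
```

Because every quantity is a differentiable function of the probability vectors, both terms can be added directly to a standard classification objective with no sampling or density estimation, which is what makes the closed-form evaluation attractive for a deterministic model.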