[edit]
Max-Margin Min-Entropy Models
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:779-787, 2012.
Abstract
We propose a new family of latent variable models called max-margin min-entropy (M3E) models, which define a distribution over the output and the hidden variables conditioned on the input. Given an input, an M3E model predicts the output with the smallest corresponding Renyi entropy of generalized distribution. This is equivalent to minimizing a score that consists of two terms: (i) the negative log-likelihood of the output, ensuring that the output has a high probability; and (ii) a measure of uncertainty over the distribution of the hidden variables conditioned on the input and the output, ensuring that there is little confusion in the values of the hidden variables. Given a training dataset, the parameters of an M3E model are learned by maximizing the margin between the Renyi entropies of the ground-truth output and all other incorrect outputs. Training an M3E can be viewed as minimizing an upper bound on a user-defined loss, and includes, as a special case, the latent support vector machine framework. We demonstrate the efficacy of M3E models on two standard machine learning applications, discriminative motif finding and image classification, using publicly available datasets.