Structural Maxent Models
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:391-399, 2015.
We present a new class of density estimation models, Structural Maxent models, with feature functions selected from possibly very complex families. The design of our models is motivated by data-dependent convergence bounds and benefits from new data-dependent learning bounds expressed in terms of the Rademacher complexities of the sub-families composing the family of features considered. We prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation and report the results of several experiments demonstrating that its performance compares favorably to that of existing regularized Maxent. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models making use of possibly complex feature families. We also prove a duality theorem for these models which shows the connection between these models and existing binary and multi-class deep boosting algorithms.