Structural Maxent Models

Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Umar Syed
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:391-399, 2015.

Abstract

We present a new class of density estimation models, Structural Maxent models, with feature functions selected from possibly very complex families. The design of our models is motivated by data-dependent convergence bounds and benefits from new data-dependent learning bounds expressed in terms of the Rademacher complexities of the sub-families composing the family of features considered. We prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation, and report the results of several experiments demonstrating that its performance compares favorably to that of existing regularized Maxent. We similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models making use of possibly complex feature families. We also prove a duality theorem for these models, which shows the connection between them and existing binary and multi-class deep boosting algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-cortes15,
  title = {Structural Maxent Models},
  author = {Cortes, Corinna and Kuznetsov, Vitaly and Mohri, Mehryar and Syed, Umar},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages = {391--399},
  year = {2015},
  editor = {Bach, Francis and Blei, David},
  volume = {37},
  series = {Proceedings of Machine Learning Research},
  address = {Lille, France},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v37/cortes15.pdf},
  url = {http://proceedings.mlr.press/v37/cortes15.html},
  abstract = {We present a new class of density estimation models, Structural Maxent models, with feature functions selected from possibly very complex families. The design of our models is motivated by data-dependent convergence bounds and benefits from new data-dependent learning bounds expressed in terms of the Rademacher complexities of the sub-families composing the family of features considered. We prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation and report the results of several experiments demonstrating that its performance compares favorably to that of existing regularized Maxent. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models making use of possibly complex feature families. We also prove a duality theorem for these models which shows the connection between these models and existing binary and multi-class deep boosting algorithms.}
}
Endnote
%0 Conference Paper
%T Structural Maxent Models
%A Corinna Cortes
%A Vitaly Kuznetsov
%A Mehryar Mohri
%A Umar Syed
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-cortes15
%I PMLR
%P 391--399
%U http://proceedings.mlr.press/v37/cortes15.html
%V 37
%X We present a new class of density estimation models, Structural Maxent models, with feature functions selected from possibly very complex families. The design of our models is motivated by data-dependent convergence bounds and benefits from new data-dependent learning bounds expressed in terms of the Rademacher complexities of the sub-families composing the family of features considered. We prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation and report the results of several experiments demonstrating that its performance compares favorably to that of existing regularized Maxent. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models making use of possibly complex feature families. We also prove a duality theorem for these models which shows the connection between these models and existing binary and multi-class deep boosting algorithms.
RIS
TY - CPAPER
TI - Structural Maxent Models
AU - Corinna Cortes
AU - Vitaly Kuznetsov
AU - Mehryar Mohri
AU - Umar Syed
BT - Proceedings of the 32nd International Conference on Machine Learning
DA - 2015/06/01
ED - Francis Bach
ED - David Blei
ID - pmlr-v37-cortes15
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 37
SP - 391
EP - 399
L1 - http://proceedings.mlr.press/v37/cortes15.pdf
UR - http://proceedings.mlr.press/v37/cortes15.html
AB - We present a new class of density estimation models, Structural Maxent models, with feature functions selected from possibly very complex families. The design of our models is motivated by data-dependent convergence bounds and benefits from new data-dependent learning bounds expressed in terms of the Rademacher complexities of the sub-families composing the family of features considered. We prove a duality theorem, which we use to derive our Structural Maxent algorithm. We give a full description of our algorithm, including the details of its derivation and report the results of several experiments demonstrating that its performance compares favorably to that of existing regularized Maxent. We further similarly define conditional Structural Maxent models for multi-class classification problems. These are conditional probability models making use of possibly complex feature families. We also prove a duality theorem for these models which shows the connection between these models and existing binary and multi-class deep boosting algorithms.
ER -
APA
Cortes, C., Kuznetsov, V., Mohri, M. & Syed, U. (2015). Structural Maxent Models. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:391-399. Available from http://proceedings.mlr.press/v37/cortes15.html
