Structured Output Learning with High Order Loss Functions
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1212-1220, 2012.
Often when modeling structured domains, it is desirable to leverage information that is not naturally expressed as simply a label. Examples include knowledge about the evaluation measure that will be used at test time, and partial (weak) label information. When the additional information has structure that factorizes according to small subsets of variables (i.e., is \emphlow order, or \emphdecomposable), several approaches can be used to incorporate it into a learning procedure. Our focus in this work is the more challenging case, where the additional information does not factorize according to low order graphical model structure; we call this the \emphhigh order case. We propose to formalize various forms of this additional information as high order loss functions, which may have complex interactions over large subsets of variables. We then address the computational challenges inherent in learning according to such loss functions, particularly focusing on the loss-augmented inference problem that arises in large margin learning; we show that learning with high order loss functions is often practical, giving strong empirical results, with one popular and several novel high-order loss functions, in several settings.