Consistent Structured Prediction with Max-Min Margin Markov Networks

Alex Nowak; Francis Bach; Alessandro Rudi

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7381-7391, 2020.

Abstract

Max-margin methods for binary classification such as the support vector machine (SVM) have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$), or more generally structural SVMs. Unfortunately, these methods are statistically inconsistent when the relationship between inputs and labels is far from deterministic. We overcome such limitations by defining the learning problem in terms of a “max-min” margin formulation, naming the resulting method max-min margin Markov networks ($M^4N$). We prove consistency and finite sample generalization bounds for $M^4N$ and provide an explicit algorithm to compute the estimator. The algorithm achieves a generalization error of $O(1/\sqrt{n})$ for a total cost of $O(n)$ projection-oracle calls (which have at most the same cost as the max-oracle from $M^3N$). Experiments on multi-class classification, ordinal regression, sequence prediction and ranking demonstrate the effectiveness of the proposed method.

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-nowak20a,
  title = 	 {Consistent Structured Prediction with Max-Min Margin {M}arkov Networks},
  author =       {Nowak, Alex and Bach, Francis and Rudi, Alessandro},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {7381--7391},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/nowak20a/nowak20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/nowak20a.html},
  abstract = 	 {Max-margin methods for binary classification such as the support vector machine (SVM) have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$), or more generally structural SVMs. Unfortunately, these methods are statistically inconsistent when the relationship between inputs and labels is far from deterministic. We overcome such limitations by defining the learning problem in terms of a “max-min” margin formulation, naming the resulting method max-min margin Markov networks ($M^4N$). We prove consistency and finite sample generalization bounds for $M^4N$ and provide an explicit algorithm to compute the estimator. The algorithm achieves a generalization error of $O(1/\sqrt{n})$ for a total cost of $O(n)$ projection-oracle calls (which have at most the same cost as the max-oracle from $M^3N$). Experiments on multi-class classification, ordinal regression, sequence prediction and ranking demonstrate the effectiveness of the proposed method.}
}

Endnote

%0 Conference Paper
%T Consistent Structured Prediction with Max-Min Margin Markov Networks
%A Alex Nowak
%A Francis Bach
%A Alessandro Rudi
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-nowak20a
%I PMLR
%P 7381--7391
%U https://proceedings.mlr.press/v119/nowak20a.html
%V 119
%X Max-margin methods for binary classification such as the support vector machine (SVM) have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$), or more generally structural SVMs. Unfortunately, these methods are statistically inconsistent when the relationship between inputs and labels is far from deterministic. We overcome such limitations by defining the learning problem in terms of a “max-min” margin formulation, naming the resulting method max-min margin Markov networks ($M^4N$). We prove consistency and finite sample generalization bounds for $M^4N$ and provide an explicit algorithm to compute the estimator. The algorithm achieves a generalization error of $O(1/\sqrt{n})$ for a total cost of $O(n)$ projection-oracle calls (which have at most the same cost as the max-oracle from $M^3N$). Experiments on multi-class classification, ordinal regression, sequence prediction and ranking demonstrate the effectiveness of the proposed method.

APA

Nowak, A., Bach, F. & Rudi, A. (2020). Consistent Structured Prediction with Max-Min Margin Markov Networks. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7381-7391. Available from https://proceedings.mlr.press/v119/nowak20a.html.
