Gaussian Margin Machines

Koby Crammer, Mehryar Mohri, Fernando Pereira
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:105-112, 2009.

Abstract

We introduce Gaussian Margin Machines (GMMs), which maintain a Gaussian distribution over weight vectors for binary classification. The learning algorithm for these machines seeks the least informative distribution that will classify the training data correctly with high probability. One formulation can be expressed as a convex constrained optimization problem whose solution can be represented linearly in terms of training instances and their inner and outer products, supporting kernelization. The algorithm has a natural PAC-Bayesian generalization bound. A preliminary evaluation on handwriting recognition data shows that our algorithm improves over SVMs for the same task.

We maintain a distribution over alternative weight vectors rather than committing to a single specific one. However, these distributions are not derived by Bayes' rule; instead, they represent our knowledge of the weights given the constraints imposed by the training examples. Specifically, we use a Gaussian distribution over weight vectors whose mean and covariance parameters are learned from the training data. The learning algorithm seeks a distribution with small Kullback-Leibler (KL) divergence from a fixed isotropic distribution, such that each training example is correctly classified by a strict majority of the weight vectors. Conceptually, this is a probabilistic large-margin principle, rather than the geometric large-margin principle of SVMs. The learning problem for GMMs can be expressed as a convex constrained optimization problem, and its optimal solution can be represented linearly in terms of the training instances, supporting kernelization.
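
For concreteness, here is a minimal sketch of the kind of program the abstract describes. The notation is ours, not the paper's: μ and Σ are the mean and covariance of the Gaussian over weight vectors, N(0, aI) is the fixed isotropic reference distribution, (x_i, y_i) with y_i in {-1, +1} are the n training examples, and η > 1/2 is the required classification probability (the "strict majority"); the paper's exact constraints and its convexification may differ in detail.

% Sketch in placeholder notation (requires amsmath): least-informative Gaussian
% over weight vectors, subject to each training example being classified
% correctly with probability at least eta.
\begin{align*}
  \min_{\mu,\ \Sigma \succ 0} \quad
    & \mathrm{KL}\big(\mathcal{N}(\mu, \Sigma) \,\big\|\, \mathcal{N}(0, aI)\big) \\
  \text{s.t.} \quad
    & \Pr_{w \sim \mathcal{N}(\mu, \Sigma)}\big[\, y_i \,(w \cdot x_i) \ge 0 \,\big] \;\ge\; \eta,
      \qquad i = 1, \dots, n .
\end{align*}
% Because y_i (w . x_i) is Gaussian with mean y_i (mu . x_i) and variance
% x_i^T Sigma x_i, each chance constraint above is equivalent to the
% deterministic margin constraint
\begin{equation*}
  y_i \,(\mu \cdot x_i) \;\ge\; \Phi^{-1}(\eta)\, \sqrt{x_i^{\top} \Sigma\, x_i},
\end{equation*}
% where Phi is the standard normal CDF. For eta > 1/2 the required margin on
% the mean weight vector grows with the predictive uncertainty on x_i, which
% is the sense in which the principle is a probabilistic large margin.

Minimizing KL divergence to a fixed isotropic Gaussian is one reading of "least informative"; the paper itself gives the precise formulation, its convexity analysis, and the kernelized representation of the solution.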

Cite this Paper


BibTeX
@InProceedings{pmlr-v5-crammer09a,
  title     = {Gaussian Margin Machines},
  author    = {Crammer, Koby and Mohri, Mehryar and Pereira, Fernando},
  booktitle = {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages     = {105--112},
  year      = {2009},
  editor    = {van Dyk, David and Welling, Max},
  volume    = {5},
  series    = {Proceedings of Machine Learning Research},
  address   = {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v5/crammer09a/crammer09a.pdf},
  url       = {https://proceedings.mlr.press/v5/crammer09a.html},
  abstract  = {We introduce Gaussian Margin Machines (GMMs), which maintain a Gaussian distribution over weight vectors for binary classification. The learning algorithm for these machines seeks the least informative distribution that will classify the training data correctly with high probability. One formulation can be expressed as a convex constrained optimization problem whose solution can be represented linearly in terms of training instances and their inner and outer products, supporting kernelization. The algorithm has a natural PAC-Bayesian generalization bound. A preliminary evaluation on handwriting recognition data shows that our algorithm improves over SVMs for the same task.}
}
Endnote
%0 Conference Paper
%T Gaussian Margin Machines
%A Koby Crammer
%A Mehryar Mohri
%A Fernando Pereira
%B Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2009
%E David van Dyk
%E Max Welling
%F pmlr-v5-crammer09a
%I PMLR
%P 105--112
%U https://proceedings.mlr.press/v5/crammer09a.html
%V 5
%X We introduce Gaussian Margin Machines (GMMs), which maintain a Gaussian distribution over weight vectors for binary classification. The learning algorithm for these machines seeks the least informative distribution that will classify the training data correctly with high probability. One formulation can be expressed as a convex constrained optimization problem whose solution can be represented linearly in terms of training instances and their inner and outer products, supporting kernelization. The algorithm has a natural PAC-Bayesian generalization bound. A preliminary evaluation on handwriting recognition data shows that our algorithm improves over SVMs for the same task.
RIS
TY - CPAPER
TI - Gaussian Margin Machines
AU - Koby Crammer
AU - Mehryar Mohri
AU - Fernando Pereira
BT - Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
DA - 2009/04/15
ED - David van Dyk
ED - Max Welling
ID - pmlr-v5-crammer09a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 5
SP - 105
EP - 112
L1 - http://proceedings.mlr.press/v5/crammer09a/crammer09a.pdf
UR - https://proceedings.mlr.press/v5/crammer09a.html
AB - We introduce Gaussian Margin Machines (GMMs), which maintain a Gaussian distribution over weight vectors for binary classification. The learning algorithm for these machines seeks the least informative distribution that will classify the training data correctly with high probability. One formulation can be expressed as a convex constrained optimization problem whose solution can be represented linearly in terms of training instances and their inner and outer products, supporting kernelization. The algorithm has a natural PAC-Bayesian generalization bound. A preliminary evaluation on handwriting recognition data shows that our algorithm improves over SVMs for the same task.
ER -
APA
Crammer, K., Mohri, M. & Pereira, F. (2009). Gaussian Margin Machines. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:105-112. Available from https://proceedings.mlr.press/v5/crammer09a.html.
