Kernel Learning by Unconstrained Optimization

Fuxin Li; Yunshan Fu; Yu-Hong Dai; Cristian Sminchisescu; Jue Wang

Kernel Learning by Unconstrained Optimization

Fuxin Li, Yunshan Fu, Yu-Hong Dai, Cristian Sminchisescu, Jue Wang

Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:328-335, 2009.

Abstract

We study the problem of learning a kernel matrix from an apriori kernel and training data. In this context, we propose a new unconstrained convex optimization formulation, with an arbitrary convex second-order differentiable loss function on kernel entries and a LogDet divergence term for regularization. Since the number of variables is of order $O(n^2)$, the computational cost of standard Newton and quasi-Newton methods for learning a kernel matrix is prohibitive. Here an operator form Hessian is used to develop an $O(n^3)$ trust-region inexact Newton method, where the Newton direction is computed using several conjugate gradient steps on the Hessian operator equation. On the uspst dataset, our algorithm can handle 2 million optimization variables within one hour. Experiments are shown for both linear (Mahalanobis) metric learning and for kernel learning. The convergence rate, speed and performance of several loss functions and algorithms are discussed.

Cite this Paper

BibTeX

@InProceedings{pmlr-v5-li09a,
  title = 	 {Kernel Learning by Unconstrained Optimization},
  author = 	 {Li, Fuxin and Fu, Yunshan and Dai, Yu-Hong and Sminchisescu, Cristian and Wang, Jue},
  booktitle = 	 {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {328--335},
  year = 	 {2009},
  editor = 	 {van Dyk, David and Welling, Max},
  volume = 	 {5},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month = 	 {16--18 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v5/li09a/li09a.pdf},
  url = 	 {https://proceedings.mlr.press/v5/li09a.html},
  abstract = 	 {We study the problem of learning a kernel matrix from an  apriori kernel and training data. In this context, we propose a new unconstrained convex optimization formulation, with an  arbitrary convex second-order differentiable loss function on kernel  entries and a LogDet divergence term for regularization. Since the  number of variables is of order $O(n^2)$, the computational cost of  standard Newton and quasi-Newton methods for learning a kernel  matrix is prohibitive. Here an operator form Hessian is  used to develop an $O(n^3)$ trust-region inexact Newton method,  where the Newton direction is computed using several conjugate  gradient steps on the Hessian operator equation. On the uspst  dataset, our algorithm can handle 2 million optimization variables  within one hour. Experiments are shown for both linear (Mahalanobis)  metric learning and for kernel learning. The convergence rate, speed  and performance of several loss functions and algorithms are  discussed.}
}

Endnote

%0 Conference Paper
%T Kernel Learning by Unconstrained Optimization
%A Fuxin Li
%A Yunshan Fu
%A Yu-Hong Dai
%A Cristian Sminchisescu
%A Jue Wang
%B Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2009
%E David van Dyk
%E Max Welling	
%F pmlr-v5-li09a
%I PMLR
%P 328--335
%U https://proceedings.mlr.press/v5/li09a.html
%V 5
%X We study the problem of learning a kernel matrix from an  apriori kernel and training data. In this context, we propose a new unconstrained convex optimization formulation, with an  arbitrary convex second-order differentiable loss function on kernel  entries and a LogDet divergence term for regularization. Since the  number of variables is of order $O(n^2)$, the computational cost of  standard Newton and quasi-Newton methods for learning a kernel  matrix is prohibitive. Here an operator form Hessian is  used to develop an $O(n^3)$ trust-region inexact Newton method,  where the Newton direction is computed using several conjugate  gradient steps on the Hessian operator equation. On the uspst  dataset, our algorithm can handle 2 million optimization variables  within one hour. Experiments are shown for both linear (Mahalanobis)  metric learning and for kernel learning. The convergence rate, speed  and performance of several loss functions and algorithms are  discussed.

RIS

TY  - CPAPER
TI  - Kernel Learning by Unconstrained Optimization
AU  - Fuxin Li
AU  - Yunshan Fu
AU  - Yu-Hong Dai
AU  - Cristian Sminchisescu
AU  - Jue Wang
BT  - Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
DA  - 2009/04/15
ED  - David van Dyk
ED  - Max Welling	
ID  - pmlr-v5-li09a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 5
SP  - 328
EP  - 335
L1  - http://proceedings.mlr.press/v5/li09a/li09a.pdf
UR  - https://proceedings.mlr.press/v5/li09a.html
AB  - We study the problem of learning a kernel matrix from an  apriori kernel and training data. In this context, we propose a new unconstrained convex optimization formulation, with an  arbitrary convex second-order differentiable loss function on kernel  entries and a LogDet divergence term for regularization. Since the  number of variables is of order $O(n^2)$, the computational cost of  standard Newton and quasi-Newton methods for learning a kernel  matrix is prohibitive. Here an operator form Hessian is  used to develop an $O(n^3)$ trust-region inexact Newton method,  where the Newton direction is computed using several conjugate  gradient steps on the Hessian operator equation. On the uspst  dataset, our algorithm can handle 2 million optimization variables  within one hour. Experiments are shown for both linear (Mahalanobis)  metric learning and for kernel learning. The convergence rate, speed  and performance of several loss functions and algorithms are  discussed.
ER  -

APA

Li, F., Fu, Y., Dai, Y., Sminchisescu, C. & Wang, J.. (2009). Kernel Learning by Unconstrained Optimization. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:328-335 Available from https://proceedings.mlr.press/v5/li09a.html.

Related Material

Download PDF