[edit]
An Information Geometry Approach for Distance Metric Learning
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:591-598, 2009.
Abstract
Metric learning is an important problem in machine learning and pattern recognition. In this paper, we propose a framework for metric learning based on information geometry. The key idea is to construct two kernel matrices for the given training data: one is based on the distance metric and the other is based on the assigned class labels. Inspired by the idea of information geometry, we relate these two kernel matrices to two Gaussian distributions, and the difference between the two kernel matrices is then computed by the Kullback-Leibler (KL) divergence between the two Gaussian distributions. The optimal distance metric is then found by minimizing the divergence between the two distributions. Based on this idea, we present two metric learning algorithms, one for linear distance metric and the other for nonlinear distance with the introduction of a kernel function. Unlike many existing algorithms for metric learning that require solving a non-trivial optimization problem and therefore are computationally expensive when the data dimension is high, the proposed algorithms have a closed-form solution and are computationally more efficient. Extensive experiments with data classification and face recognition show that the proposed algorithms are comparable to or better than the state-of-the-art algorithms for metric learning.