Linear-time training of nonlinear low-dimensional embeddings
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:968-977, 2014.
Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the \mathcalO(N \log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.