Linear-time training of nonlinear low-dimensional embeddings

Max Vladymyrov, Miguel Carreira-Perpinan
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:968-977, 2014.

Abstract

Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic in the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study, in theory and experiment, the effect of approximate gradients on the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy of the FMM over iterations leads to faster training. When combined with standard optimizers such as gradient descent or L-BFGS, the resulting algorithm beats the O(N log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours' runtime.
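The recipe the abstract describes, treating the repulsive pairwise sums in the gradient as an N-body problem, approximating them cheaply, and tightening the approximation as the optimizer converges, can be sketched in a few lines. The sketch below is not the paper's fast multipole method: it uses random subsampling of interaction partners as a crude stand-in for the FMM, purely to illustrate how an increasing-accuracy gradient plugs into plain gradient descent on the elastic embedding objective. The function names, the 10%-to-100% accuracy schedule, and all constants are our own illustrative choices, not from the paper.

import numpy as np

def ee_exact_grad(X, Wp, lam):
    """Reference exact O(N^2) gradient of the elastic embedding objective
    E(X) = sum_nm Wp_nm ||x_n - x_m||^2 + lam * sum_nm exp(-||x_n - x_m||^2)."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)  # (N, N) squared distances
    K = np.exp(-D2)
    np.fill_diagonal(K, 0.0)
    Lp = np.diag(Wp.sum(axis=1)) - Wp   # graph Laplacian of the attractive weights
    Lk = np.diag(K.sum(axis=1)) - K     # Laplacian of the Gaussian kernel (repulsive term)
    return 4.0 * (Lp - lam * Lk) @ X

def approx_repulsive_grad(X, lam, n_partners, rng):
    """Stand-in for the FMM: estimate the repulsive N-body sum from a random
    subset of interaction partners, rescaled to the full sum. Accuracy grows
    with n_partners; a real FMM would instead control it via expansion order."""
    N = X.shape[0]
    idx = rng.choice(N, size=n_partners, replace=False)
    diff = X[:, None, :] - X[idx][None, :, :]   # (N, S, d)
    K = np.exp(-np.sum(diff ** 2, axis=2))      # (N, S)
    # self-interactions contribute nothing: diff is zero on those entries
    return -4.0 * lam * (N / n_partners) * np.einsum('ns,nsd->nd', K, diff)

def train_ee(X0, Wp, lam=1.0, iters=300, step=1e-3, seed=0):
    """Gradient descent with an accuracy schedule: coarse gradients early,
    near-exact gradients late, mirroring the increasing-accuracy idea."""
    rng = np.random.default_rng(seed)
    X, N = X0.copy(), X0.shape[0]
    Lp = np.diag(Wp.sum(axis=1)) - Wp   # attractive part; sparse in practice
    for t in range(iters):
        frac = 0.1 + 0.9 * t / max(iters - 1, 1)   # 10% of partners -> 100%
        n_partners = max(1, int(frac * N))
        G = 4.0 * Lp @ X + approx_repulsive_grad(X, lam, n_partners, rng)
        X -= step * G
    return X

# Usage: 200 points from two noisy 5-D clusters, ~5-NN binary affinities.
rng = np.random.default_rng(1)
Y = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])
D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
Wp = (D2 < np.sort(D2, axis=1)[:, [6]]).astype(float)  # keep ~5 nearest neighbors
Wp = (Wp + Wp.T) / 2                                   # symmetrize the affinities
X = train_ee(rng.normal(0, 1e-4, (200, 2)), Wp, lam=1.0)

Subsampling only mimics the cost/accuracy trade-off; it does not give the FMM's deterministic error bound, which is what lets the paper relate the expected error to the curvature of the objective.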

Cite this Paper


BibTeX
@InProceedings{pmlr-v33-vladymyrov14,
  title     = {{Linear-time training of nonlinear low-dimensional embeddings}},
  author    = {Max Vladymyrov and Miguel Carreira-Perpinan},
  booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages     = {968--977},
  year      = {2014},
  editor    = {Samuel Kaski and Jukka Corander},
  volume    = {33},
  series    = {Proceedings of Machine Learning Research},
  address   = {Reykjavik, Iceland},
  month     = {22--25 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v33/vladymyrov14.pdf},
  url       = {http://proceedings.mlr.press/v33/vladymyrov14.html},
  abstract  = {Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic in the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study, in theory and experiment, the effect of approximate gradients on the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy of the FMM over iterations leads to faster training. When combined with standard optimizers such as gradient descent or L-BFGS, the resulting algorithm beats the $\mathcal{O}(N \log N)$ Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours' runtime.}
}
Endnote
%0 Conference Paper
%T Linear-time training of nonlinear low-dimensional embeddings
%A Max Vladymyrov
%A Miguel Carreira-Perpinan
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-vladymyrov14
%I PMLR
%J Proceedings of Machine Learning Research
%P 968--977
%U http://proceedings.mlr.press/v33/vladymyrov14.html
%V 33
%W PMLR
%X Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic in the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study, in theory and experiment, the effect of approximate gradients on the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy of the FMM over iterations leads to faster training. When combined with standard optimizers such as gradient descent or L-BFGS, the resulting algorithm beats the O(N log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours' runtime.
RIS
TY - CPAPER
TI - Linear-time training of nonlinear low-dimensional embeddings
AU - Max Vladymyrov
AU - Miguel Carreira-Perpinan
BT - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
PY - 2014/04/02
DA - 2014/04/02
ED - Samuel Kaski
ED - Jukka Corander
ID - pmlr-v33-vladymyrov14
PB - PMLR
SP - 968
DP - PMLR
EP - 977
L1 - http://proceedings.mlr.press/v33/vladymyrov14.pdf
UR - http://proceedings.mlr.press/v33/vladymyrov14.html
AB - Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic in the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study, in theory and experiment, the effect of approximate gradients on the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy of the FMM over iterations leads to faster training. When combined with standard optimizers such as gradient descent or L-BFGS, the resulting algorithm beats the O(N log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours' runtime.
ER -
APA
Vladymyrov, M. & Carreira-Perpinan, M. (2014). Linear-time training of nonlinear low-dimensional embeddings. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in PMLR 33:968-977.
