Linear-time training of nonlinear low-dimensional embeddings

Max Vladymyrov, Miguel Carreira-Perpinan
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:968-977, 2014.

Abstract

Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the $\mathcal{O}(N \log N)$ Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.
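
The gradient being approximated is an N-body sum over all point pairs. As a rough illustration only (not the authors' implementation), the NumPy sketch below computes the exact O(N^2) gradient of the elastic embedding objective with unit repulsive weights; the function name and toy setup are hypothetical.

import numpy as np

def ee_gradient_exact(X, Wp, lam=1.0):
    # Exact O(N^2) gradient of the elastic embedding objective
    #   E(X) = sum_nm Wp_nm ||x_n - x_m||^2 + lam * sum_nm exp(-||x_n - x_m||^2)
    # (repulsive weights taken as 1 here for simplicity), i.e.
    #   dE/dx_n = 4 * sum_m (Wp_nm - lam * exp(-||x_n - x_m||^2)) (x_n - x_m).
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)  # all pairwise squared distances
    K = np.exp(-sq)                                            # dense Gaussian interactions (the N-body part)
    np.fill_diagonal(K, 0.0)
    W = Wp - lam * K                                           # combined attractive/repulsive weights
    d = W.sum(axis=1)
    return 4.0 * (d[:, None] * X - W @ X)

# Toy usage (hypothetical data): sparse k-NN attractive weights built from
# high-dimensional inputs, gradient evaluated at a random 2-D embedding.
rng = np.random.default_rng(0)
Y = rng.standard_normal((300, 10))                            # high-dimensional inputs
D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
Wp = (D2 < np.sort(D2, axis=1)[:, [10]]).astype(float)        # crude k-NN graph (k = 10)
Wp = np.maximum(Wp, Wp.T)
np.fill_diagonal(Wp, 0.0)
X = rng.standard_normal((300, 2))                             # current 2-D embedding
G = ee_gradient_exact(X, Wp, lam=10.0)                        # gradient for gradient descent / L-BFGS

In this exact version the repulsive part of the gradient reduces to the all-pairs Gaussian sums K.sum(axis=1) and K @ X; those are the quantities the paper approximates with an FMM in linear time, with the approximation accuracy tightened over iterations before the gradient is handed to gradient descent or L-BFGS.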

Cite this Paper


BibTeX
@InProceedings{pmlr-v33-vladymyrov14,
  title     = {{Linear-time training of nonlinear low-dimensional embeddings}},
  author    = {Vladymyrov, Max and Carreira-Perpinan, Miguel},
  booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages     = {968--977},
  year      = {2014},
  editor    = {Kaski, Samuel and Corander, Jukka},
  volume    = {33},
  series    = {Proceedings of Machine Learning Research},
  address   = {Reykjavik, Iceland},
  month     = {22--25 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v33/vladymyrov14.pdf},
  url       = {https://proceedings.mlr.press/v33/vladymyrov14.html},
  abstract  = {Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the $\mathcal{O}(N \log N)$ Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.}
}
Endnote
%0 Conference Paper
%T Linear-time training of nonlinear low-dimensional embeddings
%A Max Vladymyrov
%A Miguel Carreira-Perpinan
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-vladymyrov14
%I PMLR
%P 968--977
%U https://proceedings.mlr.press/v33/vladymyrov14.html
%V 33
%X Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the $\mathcal{O}(N \log N)$ Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.
RIS
TY - CPAPER
TI - Linear-time training of nonlinear low-dimensional embeddings
AU - Max Vladymyrov
AU - Miguel Carreira-Perpinan
BT - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA - 2014/04/02
ED - Samuel Kaski
ED - Jukka Corander
ID - pmlr-v33-vladymyrov14
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 33
SP - 968
EP - 977
L1 - http://proceedings.mlr.press/v33/vladymyrov14.pdf
UR - https://proceedings.mlr.press/v33/vladymyrov14.html
AB - Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the $\mathcal{O}(N \log N)$ Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.
ER -
APA
Vladymyrov, M. & Carreira-Perpinan, M. (2014). Linear-time training of nonlinear low-dimensional embeddings. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:968-977. Available from https://proceedings.mlr.press/v33/vladymyrov14.html.