Linear-time training of nonlinear low-dimensional embeddings

Max Vladymyrov; Miguel Carreira-Perpinan

Linear-time training of nonlinear low-dimensional embeddings

Max Vladymyrov, Miguel Carreira-Perpinan

Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:968-977, 2014.

Abstract

Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the \mathcalO(N \log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.

Cite this Paper

BibTeX

@InProceedings{pmlr-v33-vladymyrov14,
  title = 	 {{Linear-time training of nonlinear low-dimensional embeddings}},
  author = 	 {Vladymyrov, Max and Carreira-Perpinan, Miguel},
  booktitle = 	 {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {968--977},
  year = 	 {2014},
  editor = 	 {Kaski, Samuel and Corander, Jukka},
  volume = 	 {33},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Reykjavik, Iceland},
  month = 	 {22--25 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v33/vladymyrov14.pdf},
  url = 	 {https://proceedings.mlr.press/v33/vladymyrov14.html},
  abstract = 	 {Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the \mathcalO(N \log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.}
}

Endnote

%0 Conference Paper
%T Linear-time training of nonlinear low-dimensional embeddings
%A Max Vladymyrov
%A Miguel Carreira-Perpinan
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander	
%F pmlr-v33-vladymyrov14
%I PMLR
%P 968--977
%U https://proceedings.mlr.press/v33/vladymyrov14.html
%V 33
%X Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the \mathcalO(N \log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.

RIS

TY  - CPAPER
TI  - Linear-time training of nonlinear low-dimensional embeddings
AU  - Max Vladymyrov
AU  - Miguel Carreira-Perpinan
BT  - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA  - 2014/04/02
ED  - Samuel Kaski
ED  - Jukka Corander	
ID  - pmlr-v33-vladymyrov14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 33
SP  - 968
EP  - 977
L1  - http://proceedings.mlr.press/v33/vladymyrov14.pdf
UR  - https://proceedings.mlr.press/v33/vladymyrov14.html
AB  - Nonlinear embeddings such as stochastic neighbor embedding or the elastic embedding achieve better results than spectral methods but require an expensive, nonconvex optimization, where the objective function and gradient are quadratic on the sample size. We address this bottleneck by formulating the optimization as an N-body problem and using fast multipole methods (FMMs) to approximate the gradient in linear time. We study the effect, in theory and experiment, of approximating gradients in the optimization and show that the expected error is related to the mean curvature of the objective function, and that gradually increasing the accuracy level in the FMM over iterations leads to a faster training. When combined with standard optimizers, such as gradient descent or L-BFGS, the resulting algorithm beats the \mathcalO(N \log N) Barnes-Hut method and achieves reasonable embeddings for one million points in around three hours’ runtime.
ER  -

APA

Vladymyrov, M. & Carreira-Perpinan, M.. (2014). Linear-time training of nonlinear low-dimensional embeddings. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:968-977 Available from https://proceedings.mlr.press/v33/vladymyrov14.html.

Linear-time training of nonlinear low-dimensional embeddings

Abstract

Cite this Paper

Related Material