Simplex Random Features

Isaac Reid, Krzysztof Marcin Choromanski, Valerii Likhosherstov, Adrian Weller
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:28864-28888, 2023.

Abstract

We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors. We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels among the class of weight-independent geometrically-coupled positive random feature (PRF) mechanisms, substantially outperforming the previously most accurate Orthogonal Random Features (ORFs) at no observable extra cost. We present a more computationally expensive SimRFs+ variant, which we prove is asymptotically optimal in the broader family of weight-dependent geometrical coupling schemes (which permit correlations between random vector directions and norms). In extensive empirical studies, we show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-reid23a, title = {Simplex Random Features}, author = {Reid, Isaac and Choromanski, Krzysztof Marcin and Likhosherstov, Valerii and Weller, Adrian}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {28864--28888}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/reid23a/reid23a.pdf}, url = {https://proceedings.mlr.press/v202/reid23a.html}, abstract = {We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors. We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels among the class of weight-independent geometrically-coupled positive random feature (PRF) mechanisms, substantially outperforming the previously most accurate Orthogonal Random Features (ORFs) at no observable extra cost. We present a more computationally expensive SimRFs+ variant, which we prove is asymptotically optimal in the broader family of weight-dependent geometrical coupling schemes (which permit correlations between random vector directions and norms). In extensive empirical studies, we show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.} }
Endnote
%0 Conference Paper %T Simplex Random Features %A Isaac Reid %A Krzysztof Marcin Choromanski %A Valerii Likhosherstov %A Adrian Weller %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-reid23a %I PMLR %P 28864--28888 %U https://proceedings.mlr.press/v202/reid23a.html %V 202 %X We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors. We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels among the class of weight-independent geometrically-coupled positive random feature (PRF) mechanisms, substantially outperforming the previously most accurate Orthogonal Random Features (ORFs) at no observable extra cost. We present a more computationally expensive SimRFs+ variant, which we prove is asymptotically optimal in the broader family of weight-dependent geometrical coupling schemes (which permit correlations between random vector directions and norms). In extensive empirical studies, we show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.
APA
Reid, I., Choromanski, K.M., Likhosherstov, V. & Weller, A.. (2023). Simplex Random Features. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:28864-28888 Available from https://proceedings.mlr.press/v202/reid23a.html.

Related Material