CountSketches, Feature Hashing and the Median of Three
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6011-6020, 2021.
Abstract
In this paper, we revisit the classic CountSketch method, which is a sparse, random projection that transforms a (high-dimensional) Euclidean vector v to a vector of dimension (2t-1)s, where t, s > 0 are integer parameters. It is known that a CountSketch allows estimating coordinates of v with variance bounded by \|v\|_2^2/s. For t > 1, the estimator takes the median of 2t-1 independent estimates, and the probability that the estimate is off by more than 2\|v\|_2/\sqrt{s} is exponentially small in t. This suggests choosing t to be logarithmic in a desired inverse failure probability. However, implementations of CountSketch often use a small, constant t. Previous work only predicts a constant factor improvement in this setting. Our main contribution is a new analysis of CountSketch, showing an improvement in variance to O(\min\{\|v\|_1^2/s^2, \|v\|_2^2/s\}) when t > 1. That is, the variance decreases proportionally to s^{-2}, asymptotically for large enough s.
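The estimator described in the abstract can be sketched in a few lines. The following is a minimal, hypothetical implementation (not the authors' code) of the classic CountSketch with 2t-1 rows of s buckets each, where a coordinate v_i is estimated as the median of the 2t-1 single-row estimates g_j(i) * C[j, h_j(i)]; the hash and sign functions are simulated with pre-drawn random tables, which is an assumption made for brevity.

```python
import numpy as np

def countsketch(v, s, t, rng):
    """Build a CountSketch of v with 2t-1 rows of s buckets each.

    Returns the sketch matrix C together with the bucket hashes h
    and random signs g (simulated here as dense random tables).
    """
    d = len(v)
    rows = 2 * t - 1
    h = rng.integers(0, s, size=(rows, d))       # bucket hash h_j(i) in [s]
    g = rng.choice([-1.0, 1.0], size=(rows, d))  # random sign g_j(i)
    C = np.zeros((rows, s))
    for j in range(rows):
        # C[j, h_j(i)] += g_j(i) * v_i for every coordinate i
        np.add.at(C[j], h[j], g[j] * v)
    return C, h, g

def estimate(C, h, g, i):
    """Estimate v_i as the median of the 2t-1 unbiased row estimates."""
    rows = C.shape[0]
    return float(np.median([g[j, i] * C[j, h[j, i]] for j in range(rows)]))
```

With t = 1 this reduces to the single-row estimator with variance at most \|v\|_2^2/s; for t > 1 the median of the 2t-1 rows is what the paper's new analysis shows can drive the variance down to O(\min\{\|v\|_1^2/s^2, \|v\|_2^2/s\}).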