Probability Inequalities for Kernel Embeddings in Sampling without Replacement
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:66-74, 2016.
The \emph{kernel embedding} of distributions is a popular machine learning technique for manipulating probability distributions and an integral part of numerous applications. Its empirical counterpart is an estimate computed from a finite dataset of samples from the distribution under consideration. However, for large-scale learning problems the empirical kernel embedding becomes infeasible to compute, and approximate, constant-time solutions are necessary. Instead of the full dataset, a random subset of smaller size can be used to calculate the empirical kernel embedding, a scheme known as \emph{sampling without replacement}. In this work we generalize the results of (Serfling 1974) to quantify the difference between these two estimates. We derive probability inequalities for the kernel embedding and more general inequalities for Banach-space-valued martingales in the setting of sampling without replacement.
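The two estimates being compared can be illustrated with a minimal sketch (not the paper's construction): the empirical kernel mean embedding $\hat\mu(t) = \frac{1}{n}\sum_i k(x_i, t)$ computed once from a full dataset and once from a uniform random subset drawn without replacement. The RBF kernel, sample sizes, and evaluation grid below are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), computed pairwise
    d = x[:, None, :] - y[None, :, :]
    return np.exp(-gamma * np.sum(d * d, axis=-1))

def empirical_embedding(sample, eval_points, gamma=1.0):
    # Empirical kernel mean embedding evaluated at eval_points:
    # mu_hat(t) = (1/n) * sum_i k(x_i, t)
    return rbf_kernel(sample, eval_points, gamma).mean(axis=0)

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 1))          # full dataset (n large)
eval_points = np.linspace(-3.0, 3.0, 7)[:, None]

# Embedding from the full dataset (costly when n is large)
mu_full = empirical_embedding(data, eval_points)

# Embedding from a subset of size m << n, sampled without replacement
subset = data[rng.choice(len(data), size=500, replace=False)]
mu_sub = empirical_embedding(subset, eval_points)

# The deviation between the two estimates is what the paper's
# probability inequalities control.
deviation = np.max(np.abs(mu_full - mu_sub))
print(deviation)
```

Here the deviation is small because the subsample average concentrates around the full-sample average; the paper's results bound such deviations with high probability in the general Banach-space-valued setting.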