The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1036-1046, 2020.
This paper introduces a new algorithm for the fundamental problem of generating a random integer from a discrete probability distribution using a source of independent and unbiased random coin flips. We prove that this algorithm, which we call the Fast Loaded Dice Roller (FLDR), is highly efficient in both space and time: (i) the size of the sampler is guaranteed to be linear in the number of bits needed to encode the input distribution; and (ii) the expected number of bits of entropy it consumes per sample is at most 6 bits more than the information-theoretically optimal rate. We present fast implementations of the linear-time preprocessing and near-optimal sampling algorithms using unsigned integer arithmetic. Empirical evaluations on a broad set of probability distributions establish that FLDR is 2x-10x faster in both preprocessing and sampling than multiple baseline algorithms, including the widely-used alias and interval samplers. It also uses up to 10000x less space than the information-theoretically optimal sampler, at the expense of less than 1.5x runtime overhead.