The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions

Feras Saad, Cameron Freer, Martin Rinard, Vikash Mansinghka
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1036-1046, 2020.

Abstract

This paper introduces a new algorithm for the fundamental problem of generating a random integer from a discrete probability distribution using a source of independent and unbiased random coin flips. We prove that this algorithm, which we call the Fast Loaded Dice Roller (FLDR), is highly efficient in both space and time: (i) the size of the sampler is guaranteed to be linear in the number of bits needed to encode the input distribution; and (ii) the expected number of bits of entropy it consumes per sample is at most 6 bits more than the information-theoretically optimal rate. We present fast implementations of the linear-time preprocessing and near-optimal sampling algorithms using unsigned integer arithmetic. Empirical evaluations on a broad set of probability distributions establish that FLDR is 2x-10x faster in both preprocessing and sampling than multiple baseline algorithms, including the widely-used alias and interval samplers. It also uses up to 10000x less space than the information-theoretically optimal sampler, at the expense of less than 1.5x runtime overhead.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-saad20a, title = {The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions}, author = {Saad, Feras and Freer, Cameron and Rinard, Martin and Mansinghka, Vikash}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {1036--1046}, year = {2020}, editor = {Chiappa, Silvia and Calandra, Roberto}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/saad20a/saad20a.pdf}, url = {https://proceedings.mlr.press/v108/saad20a.html}, abstract = {This paper introduces a new algorithm for the fundamental problem of generating a random integer from a discrete probability distribution using a source of independent and unbiased random coin flips. We prove that this algorithm, which we call the Fast Loaded Dice Roller (FLDR), is highly efficient in both space and time: (i) the size of the sampler is guaranteed to be linear in the number of bits needed to encode the input distribution; and (ii) the expected number of bits of entropy it consumes per sample is at most 6 bits more than the information-theoretically optimal rate. We present fast implementations of the linear-time preprocessing and near-optimal sampling algorithms using unsigned integer arithmetic. Empirical evaluations on a broad set of probability distributions establish that FLDR is 2x-10x faster in both preprocessing and sampling than multiple baseline algorithms, including the widely-used alias and interval samplers. It also uses up to 10000x less space than the information-theoretically optimal sampler, at the expense of less than 1.5x runtime overhead.} }
Endnote
%0 Conference Paper %T The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions %A Feras Saad %A Cameron Freer %A Martin Rinard %A Vikash Mansinghka %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-saad20a %I PMLR %P 1036--1046 %U https://proceedings.mlr.press/v108/saad20a.html %V 108 %X This paper introduces a new algorithm for the fundamental problem of generating a random integer from a discrete probability distribution using a source of independent and unbiased random coin flips. We prove that this algorithm, which we call the Fast Loaded Dice Roller (FLDR), is highly efficient in both space and time: (i) the size of the sampler is guaranteed to be linear in the number of bits needed to encode the input distribution; and (ii) the expected number of bits of entropy it consumes per sample is at most 6 bits more than the information-theoretically optimal rate. We present fast implementations of the linear-time preprocessing and near-optimal sampling algorithms using unsigned integer arithmetic. Empirical evaluations on a broad set of probability distributions establish that FLDR is 2x-10x faster in both preprocessing and sampling than multiple baseline algorithms, including the widely-used alias and interval samplers. It also uses up to 10000x less space than the information-theoretically optimal sampler, at the expense of less than 1.5x runtime overhead.
APA
Saad, F., Freer, C., Rinard, M. & Mansinghka, V.. (2020). The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1036-1046 Available from https://proceedings.mlr.press/v108/saad20a.html.

Related Material