Adaptive Monte Carlo via Bandit Allocation

James Neufeld, Andras Gyorgy, Csaba Szepesvari, Dale Schuurmans
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1944-1952, 2014.

Abstract

We consider the problem of sequentially choosing between a set of unbiased Monte Carlo estimators to minimize the mean-squared-error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.
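As a rough illustration of the reduction described in the abstract, the sketch below treats each unbiased Monte Carlo estimator as an arm of a stochastic bandit, pulls the arm whose optimistic (lower-confidence) variance estimate is smallest, and combines the per-arm sample means by inverse-variance weighting. The particular index rule and combination step here are illustrative assumptions for this sketch only, not the allocation strategies analyzed in the paper.

import numpy as np

def bandit_mc_estimate(estimators, budget):
    """Allocate a fixed sampling budget across unbiased Monte Carlo estimators.

    Each element of `estimators` is a zero-argument callable returning one
    unbiased sample of the same unknown quantity. Low-variance arms are pulled
    more often via an optimistic (UCB-style) index on the empirical variance,
    and the per-arm sample means are combined by inverse-variance weighting.
    """
    K = len(estimators)
    counts = np.zeros(K)
    sums = np.zeros(K)
    sq_sums = np.zeros(K)

    def pull(k):
        x = estimators[k]()
        counts[k] += 1
        sums[k] += x
        sq_sums[k] += x * x

    # Initialization: two pulls per arm so every empirical variance is defined.
    for k in range(K):
        pull(k)
        pull(k)

    for t in range(2 * K, budget):
        means = sums / counts
        variances = np.maximum(sq_sums / counts - means ** 2, 1e-12)
        bonus = np.sqrt(2.0 * np.log(t) / counts)   # exploration bonus
        pull(int(np.argmin(variances - bonus)))     # optimistic low-variance arm

    means = sums / counts
    variances = np.maximum(sq_sums / counts - means ** 2, 1e-12)
    weights = counts / variances                    # inverse-variance weights
    return float(np.sum(weights * means) / np.sum(weights))

# Example: two unbiased estimators of the same mean (1.0) with unequal variance.
estimators = [lambda: np.random.normal(1.0, 0.1),
              lambda: np.random.normal(1.0, 2.0)]
print(bandit_mc_estimate(estimators, budget=5000))

In this toy run most of the budget is spent on the low-variance estimator, which is the behaviour the bandit reduction aims to guarantee up to a regret term.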

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-neufeld14,
  title     = {Adaptive Monte Carlo via Bandit Allocation},
  author    = {Neufeld, James and Gyorgy, Andras and Szepesvari, Csaba and Schuurmans, Dale},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages     = {1944--1952},
  year      = {2014},
  editor    = {Xing, Eric P. and Jebara, Tony},
  volume    = {32},
  number    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {Beijing, China},
  month     = {22--24 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v32/neufeld14.pdf},
  url       = {https://proceedings.mlr.press/v32/neufeld14.html},
  abstract  = {We consider the problem of sequentially choosing between a set of unbiased Monte Carlo estimators to minimize the mean-squared-error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.}
}
Endnote
%0 Conference Paper
%T Adaptive Monte Carlo via Bandit Allocation
%A James Neufeld
%A Andras Gyorgy
%A Csaba Szepesvari
%A Dale Schuurmans
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-neufeld14
%I PMLR
%P 1944--1952
%U https://proceedings.mlr.press/v32/neufeld14.html
%V 32
%N 2
%X We consider the problem of sequentially choosing between a set of unbiased Monte Carlo estimators to minimize the mean-squared-error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.
RIS
TY  - CPAPER
TI  - Adaptive Monte Carlo via Bandit Allocation
AU  - James Neufeld
AU  - Andras Gyorgy
AU  - Csaba Szepesvari
AU  - Dale Schuurmans
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-neufeld14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1944
EP  - 1952
L1  - http://proceedings.mlr.press/v32/neufeld14.pdf
UR  - https://proceedings.mlr.press/v32/neufeld14.html
AB  - We consider the problem of sequentially choosing between a set of unbiased Monte Carlo estimators to minimize the mean-squared-error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.
ER  -
APA
Neufeld, J., Gyorgy, A., Szepesvari, C., & Schuurmans, D. (2014). Adaptive Monte Carlo via Bandit Allocation. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1944-1952. Available from https://proceedings.mlr.press/v32/neufeld14.html.
