Estimating Maximum Expected Value through Gaussian Approximation

Carlo D’Eramo, Marcello Restelli, Alessandro Nuara
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1032-1040, 2016.

Abstract

This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-deramo16, title = {Estimating Maximum Expected Value through Gaussian Approximation}, author = {D’Eramo, Carlo and Restelli, Marcello and Nuara, Alessandro}, booktitle = {Proceedings of The 33rd International Conference on Machine Learning}, pages = {1032--1040}, year = {2016}, editor = {Balcan, Maria Florina and Weinberger, Kilian Q.}, volume = {48}, series = {Proceedings of Machine Learning Research}, address = {New York, New York, USA}, month = {20--22 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v48/deramo16.pdf}, url = {https://proceedings.mlr.press/v48/deramo16.html}, abstract = {This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.} }
Endnote
%0 Conference Paper %T Estimating Maximum Expected Value through Gaussian Approximation %A Carlo D’Eramo %A Marcello Restelli %A Alessandro Nuara %B Proceedings of The 33rd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2016 %E Maria Florina Balcan %E Kilian Q. Weinberger %F pmlr-v48-deramo16 %I PMLR %P 1032--1040 %U https://proceedings.mlr.press/v48/deramo16.html %V 48 %X This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.
RIS
TY - CPAPER TI - Estimating Maximum Expected Value through Gaussian Approximation AU - Carlo D’Eramo AU - Marcello Restelli AU - Alessandro Nuara BT - Proceedings of The 33rd International Conference on Machine Learning DA - 2016/06/11 ED - Maria Florina Balcan ED - Kilian Q. Weinberger ID - pmlr-v48-deramo16 PB - PMLR DP - Proceedings of Machine Learning Research VL - 48 SP - 1032 EP - 1040 L1 - http://proceedings.mlr.press/v48/deramo16.pdf UR - https://proceedings.mlr.press/v48/deramo16.html AB - This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems. ER -
APA
D’Eramo, C., Restelli, M. & Nuara, A.. (2016). Estimating Maximum Expected Value through Gaussian Approximation. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1032-1040 Available from https://proceedings.mlr.press/v48/deramo16.html.

Related Material