Estimating Maximum Expected Value through Gaussian Approximation

Carlo D’Eramo; Marcello Restelli; Alessandro Nuara

Estimating Maximum Expected Value through Gaussian Approximation

Carlo D’Eramo, Marcello Restelli, Alessandro Nuara

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1032-1040, 2016.

Abstract

This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-deramo16,
  title = 	 {Estimating Maximum Expected Value through Gaussian Approximation},
  author = 	 {D’Eramo, Carlo and Restelli, Marcello and Nuara, Alessandro},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1032--1040},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/deramo16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/deramo16.html},
  abstract = 	 {This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.}
}

Endnote

%0 Conference Paper
%T Estimating Maximum Expected Value through Gaussian Approximation
%A Carlo D’Eramo
%A Marcello Restelli
%A Alessandro Nuara
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-deramo16
%I PMLR
%P 1032--1040
%U https://proceedings.mlr.press/v48/deramo16.html
%V 48
%X This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.

RIS


TY  - CPAPER
TI  - Estimating Maximum Expected Value through Gaussian Approximation
AU  - Carlo D’Eramo
AU  - Marcello Restelli
AU  - Alessandro Nuara
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-deramo16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1032
EP  - 1040
L1  - http://proceedings.mlr.press/v48/deramo16.pdf
UR  - https://proceedings.mlr.press/v48/deramo16.html
AB  - This paper is about the estimation of the maximum expected value of a set of independent random variables. The performance of several learning algorithms (e.g., Q-learning) is affected by the accuracy of such estimation. Unfortunately, no unbiased estimator exists. The usual approach of taking the maximum of the sample means leads to large overestimates that may significantly harm the performance of the learning algorithm. Recent works have shown that the cross validation estimator—which is negatively biased—outperforms the maximum estimator in many sequential decision-making scenarios. On the other hand, the relative performance of the two estimators is highly problem-dependent. In this paper, we propose a new estimator for the maximum expected value, based on a weighted average of the sample means, where the weights are computed using Gaussian approximations for the distributions of the sample means. We compare the proposed estimator with the other state-of-the-art methods both theoretically, by deriving upper bounds to the bias and the variance of the estimator, and empirically, by testing the performance on different sequential learning problems.
ER  -

APA


D’Eramo, C., Restelli, M. & Nuara, A.. (2016). Estimating Maximum Expected Value through Gaussian Approximation. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1032-1040 Available from https://proceedings.mlr.press/v48/deramo16.html.

Estimating Maximum Expected Value through Gaussian Approximation

Abstract

Cite this Paper

Related Material