Maximum Likelihood vs. Sequential Normalized Maximum Likelihood in On-line Density Estimation

Wojciech Kotłowski, Peter Grünwald
Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:457-476, 2011.

Abstract

The paper considers sequential prediction of individual sequences with log loss (online density estimation) using an exponential family of distributions. We first analyze the regret of the maximum likelihood (“follow the leader”) strategy. We find that this strategy is (1) suboptimal and (2) requires an additional assumption about boundedness of the data sequence. We then show that both problems can be addressed by adding the currently predicted outcome to the calculation of the maximum likelihood, followed by normalization of the distribution. The strategy obtained in this way is known in the literature as the sequential normalized maximum likelihood or last-step minimax strategy. We show for the first time that for general exponential families, the regret is bounded by the familiar $(k/2) \log n$ and thus optimal up to $O(1)$. We also show the relationship to the Bayes strategy with Jeffreys’ prior.
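
As an illustration only, the following is a minimal sketch of the sequential normalized maximum likelihood idea for the Bernoulli model, which is one simple exponential family with $k = 1$. It is not taken from the paper: the function names (`max_log_lik`, `snml_prob_one`, `snml_regret`) are hypothetical, natural logarithms are assumed, and the general exponential-family analysis is not reproduced. Each candidate next outcome is appended to the past data, the likelihood is maximized over the parameter, and the two maximized likelihoods are normalized to give a predictive distribution.

```python
import math


def max_log_lik(ones: int, total: int) -> float:
    """Maximized Bernoulli log-likelihood for `ones` ones among `total` outcomes.

    Uses the convention 0 * log(0) = 0, so an empty or constant sequence gives 0.
    """
    zeros = total - ones
    ll = 0.0
    if ones > 0:
        ll += ones * math.log(ones / total)
    if zeros > 0:
        ll += zeros * math.log(zeros / total)
    return ll


def snml_prob_one(ones: int, total: int) -> float:
    """SNML predictive probability that the next outcome is 1.

    The candidate outcome is added to the data before maximizing the
    likelihood; the two maximized likelihoods are then normalized.
    """
    ll_if_one = max_log_lik(ones + 1, total + 1)
    ll_if_zero = max_log_lik(ones, total + 1)
    # Equivalent to exp(ll_if_one) / (exp(ll_if_one) + exp(ll_if_zero)),
    # written in a form that avoids underflow for long sequences.
    return 1.0 / (1.0 + math.exp(ll_if_zero - ll_if_one))


def snml_regret(xs: list[int]) -> float:
    """Cumulative log loss of SNML minus that of the best fixed parameter in hindsight."""
    ones = 0
    loss = 0.0
    for t, x in enumerate(xs):
        p1 = snml_prob_one(ones, t)
        loss -= math.log(p1 if x == 1 else 1.0 - p1)
        ones += x
    # The best fixed parameter incurs loss -max_log_lik(ones, n).
    return loss + max_log_lik(ones, len(xs))


if __name__ == "__main__":
    n = 1000
    xs = [i % 2 for i in range(n)]  # alternating 0, 1, 0, 1, ...
    print("SNML regret:", snml_regret(xs))
    print("(1/2) log n:", 0.5 * math.log(n))
```

On a sequence like the alternating one above, the printed regret can be compared with $(1/2) \log n$; the difference should stay bounded as `n` grows, in line with the bound stated in the abstract for $k = 1$.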

Cite this Paper


BibTeX
@InProceedings{pmlr-v19-kotlowski11a,
  title = {Maximum Likelihood vs. Sequential Normalized Maximum Likelihood in On-line Density Estimation},
  author = {Kotłowski, Wojciech and Grünwald, Peter},
  booktitle = {Proceedings of the 24th Annual Conference on Learning Theory},
  pages = {457--476},
  year = {2011},
  editor = {Kakade, Sham M. and von Luxburg, Ulrike},
  volume = {19},
  series = {Proceedings of Machine Learning Research},
  address = {Budapest, Hungary},
  month = {09--11 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v19/kotlowski11a/kotlowski11a.pdf},
  url = {https://proceedings.mlr.press/v19/kotlowski11a.html},
  abstract = {The paper considers sequential prediction of individual sequences with log loss (online density estimation) using an exponential family of distributions. We first analyze the regret of the maximum likelihood (“follow the leader”) strategy. We find that this strategy is (1) suboptimal and (2) requires an additional assumption about boundedness of the data sequence. We then show that both problems can be addressed by adding the currently predicted outcome to the calculation of the maximum likelihood, followed by normalization of the distribution. The strategy obtained in this way is known in the literature as the sequential normalized maximum likelihood or last-step minimax strategy. We show for the first time that for general exponential families, the regret is bounded by the familiar $(k/2) \log n$ and thus optimal up to $O(1)$. We also show the relationship to the Bayes strategy with Jeffreys’ prior.}
}
APA
Kotłowski, W. & Grünwald, P. (2011). Maximum Likelihood vs. Sequential Normalized Maximum Likelihood in On-line Density Estimation. Proceedings of the 24th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 19:457-476. Available from https://proceedings.mlr.press/v19/kotlowski11a.html.
