Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes

Yanjun Han, Tianze Jiang, Yihong Wu
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2270-2307, 2024.

Abstract

Consider the problem of predicting the next symbol given a sample path of length $n$ whose joint distribution belongs to a distribution class that may have long-term memory. The goal is to compete with the conditional predictor that knows the true model. For both hidden Markov models (HMMs) and renewal processes, we determine the optimal prediction risk in Kullback-Leibler divergence up to universal constant factors. Extending existing results on finite-order Markov models (Han et al., 2023) and drawing ideas from universal compression, we propose an estimator whose prediction risk is bounded by the redundancy of the distribution class plus a memory term that accounts for the long-range dependency of the model. Notably, for HMMs with bounded state and observation spaces, a polynomial-time estimator based on dynamic programming is shown to achieve the optimal prediction risk $\Theta(\frac{\log n}{n})$; prior to this work, the only known guarantee of this kind was the $O(\frac{1}{\log n})$ bound obtained via Markov approximation (Sharan et al., 2018). Matching minimax lower bounds are obtained by relating the prediction risk to redundancy and mutual information via a reduction argument.
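To make the benchmark precise, a natural formalization of the minimax prediction risk described above (following the finite-order Markov setting of Han et al. (2023); the exact normalization here is an assumption, not quoted from the paper) is

$$ \mathsf{Risk}_n(\mathcal{P}) \;=\; \inf_{\hat Q}\, \sup_{P \in \mathcal{P}}\; \mathbb{E}_{X^n \sim P}\Big[ D_{\mathrm{KL}}\big( P(\cdot \mid X^n) \,\big\|\, \hat Q(\cdot \mid X^n) \big) \Big], $$

where $X^n = (X_1, \dots, X_n)$ is the observed sample path, $P(\cdot \mid X^n)$ is the conditional law of $X_{n+1}$ under the true model, and the infimum runs over all predictors $\hat Q$ mapping $X^n$ to a distribution over the next symbol. The HMM result states that, over HMMs with bounded state and observation spaces, this risk is $\Theta(\frac{\log n}{n})$.

For intuition on the dynamic-programming ingredient: when the HMM parameters are known, the oracle predictor $P(X_{n+1} = \cdot \mid X^n)$ is computable by the standard forward recursion in $O(n S^2)$ time for $S$ hidden states. The sketch below (Python, with hypothetical names T, E, pi for the transition, emission, and initial-state parameters) illustrates that recursion; it is not the paper's estimator, which has no access to the true parameters, but it shows the kind of conditional-probability computation the polynomial-time achievability result builds on.

import numpy as np

def next_symbol_distribution(obs, T, E, pi):
    """Oracle next-symbol predictor for a *known* HMM via the forward recursion.

    obs : observed symbols x_1, ..., x_n (integers in [0, M))
    T   : (S, S) state transition matrix, T[i, j] = P(Z_{t+1} = j | Z_t = i)
    E   : (S, M) emission matrix,         E[i, x] = P(X_t = x | Z_t = i)
    pi  : (S,)   initial state distribution

    Returns the length-M vector P(X_{n+1} = . | X^n = obs).
    """
    # alpha holds P(Z_t = . | x_1, ..., x_t); renormalize each step for stability.
    alpha = pi * E[:, obs[0]]
    alpha /= alpha.sum()
    for x in obs[1:]:
        alpha = (alpha @ T) * E[:, x]
        alpha /= alpha.sum()
    # Propagate one step and mix emissions: P(X_{n+1} = . | X^n).
    return (alpha @ T) @ E

# Example (hypothetical 2-state HMM over a binary alphabet):
# T  = np.array([[0.9, 0.1], [0.2, 0.8]])
# E  = np.array([[0.7, 0.3], [0.1, 0.9]])
# pi = np.array([0.5, 0.5])
# next_symbol_distribution([0, 1, 1, 0], T, E, pi)  # length-2 probability vector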

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-han24a,
  title     = {Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes},
  author    = {Han, Yanjun and Jiang, Tianze and Wu, Yihong},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {2270--2307},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/han24a/han24a.pdf},
  url       = {https://proceedings.mlr.press/v247/han24a.html}
}
APA
Han, Y., Jiang, T. & Wu, Y. (2024). Prediction from compression for models with infinite memory, with applications to hidden Markov and renewal processes. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:2270-2307. Available from https://proceedings.mlr.press/v247/han24a.html.
