Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

Jiacheng Guo, Zihao Li, Huazheng Wang, Mengdi Wang, Zhuoran Yang, Xuezhou Zhang
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:11967-11997, 2023.

Abstract

In this paper, we study representation learning in partially observable Markov Decision Processes (POMDPs), where the agent learns a decoder function that maps a series of high-dimensional raw observations to a compact representation and uses it for more efficient exploration and planning. We focus our attention on the sub-classes of $\gamma$-observable and decodable POMDPs, for which it has been shown that statistically tractable learning is possible, but there has not been any computationally efficient algorithm. We first present an algorithm for decodable PMMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) to perform representation learning and achieve efficient sample complexity, while only calling supervised learning computational oracles. We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-guo23d, title = {Provably Efficient Representation Learning with Tractable Planning in Low-Rank {POMDP}}, author = {Guo, Jiacheng and Li, Zihao and Wang, Huazheng and Wang, Mengdi and Yang, Zhuoran and Zhang, Xuezhou}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {11967--11997}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/guo23d/guo23d.pdf}, url = {https://proceedings.mlr.press/v202/guo23d.html}, abstract = {In this paper, we study representation learning in partially observable Markov Decision Processes (POMDPs), where the agent learns a decoder function that maps a series of high-dimensional raw observations to a compact representation and uses it for more efficient exploration and planning. We focus our attention on the sub-classes of $\gamma$-observable and decodable POMDPs, for which it has been shown that statistically tractable learning is possible, but there has not been any computationally efficient algorithm. We first present an algorithm for decodable PMMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) to perform representation learning and achieve efficient sample complexity, while only calling supervised learning computational oracles. We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.} }
Endnote
%0 Conference Paper %T Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP %A Jiacheng Guo %A Zihao Li %A Huazheng Wang %A Mengdi Wang %A Zhuoran Yang %A Xuezhou Zhang %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-guo23d %I PMLR %P 11967--11997 %U https://proceedings.mlr.press/v202/guo23d.html %V 202 %X In this paper, we study representation learning in partially observable Markov Decision Processes (POMDPs), where the agent learns a decoder function that maps a series of high-dimensional raw observations to a compact representation and uses it for more efficient exploration and planning. We focus our attention on the sub-classes of $\gamma$-observable and decodable POMDPs, for which it has been shown that statistically tractable learning is possible, but there has not been any computationally efficient algorithm. We first present an algorithm for decodable PMMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) to perform representation learning and achieve efficient sample complexity, while only calling supervised learning computational oracles. We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
APA
Guo, J., Li, Z., Wang, H., Wang, M., Yang, Z. & Zhang, X.. (2023). Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:11967-11997 Available from https://proceedings.mlr.press/v202/guo23d.html.

Related Material