Hierarchical POMDP controller optimization by likelihood maximization

Marc Toussaint, Laurent Charlin, Pascal Poupart
Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, PMLR R6:562-570, 2008.

Abstract

Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.

Cite this Paper


BibTeX
@InProceedings{pmlr-vR6-toussaint08a,
  title = {Hierarchical POMDP controller optimization by likelihood maximization},
  author = {Toussaint, Marc and Charlin, Laurent and Poupart, Pascal},
  booktitle = {Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence},
  pages = {562--570},
  year = {2008},
  editor = {McAllester, David A. and Myllymäki, Petri},
  volume = {R6},
  series = {Proceedings of Machine Learning Research},
  month = {09--12 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/r6/main/assets/toussaint08a/toussaint08a.pdf},
  url = {https://proceedings.mlr.press/r6/toussaint08a.html},
  abstract = {Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.},
  note = {Reissued by PMLR on 09 October 2024.}
}
Endnote
%0 Conference Paper
%T Hierarchical POMDP controller optimization by likelihood maximization
%A Marc Toussaint
%A Laurent Charlin
%A Pascal Poupart
%B Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2008
%E David A. McAllester
%E Petri Myllymäki
%F pmlr-vR6-toussaint08a
%I PMLR
%P 562--570
%U https://proceedings.mlr.press/r6/toussaint08a.html
%V R6
%X Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
%Z Reissued by PMLR on 09 October 2024.
APA
Toussaint, M., Charlin, L., & Poupart, P. (2008). Hierarchical POMDP controller optimization by likelihood maximization. Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research R6:562-570. Available from https://proceedings.mlr.press/r6/toussaint08a.html. Reissued by PMLR on 09 October 2024.

Related Material