Hierarchical POMDP controller optimization by likelihood maximization

Marc Toussaint; Laurent Charlin; Pascal Poupart

Hierarchical POMDP controller optimization by likelihood maximization

Marc Toussaint, Laurent Charlin, Pascal Poupart

Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, PMLR R6:562-570, 2008.

Abstract

Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.

Cite this Paper

BibTeX

@InProceedings{pmlr-vR6-toussaint08a,
  title = 	 {Hierarchical POMDP controller optimization by likelihood maximization},
  author =       {Toussaint, Marc and Charlin, Laurent and Poupart, Pascal},
  booktitle = 	 {Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {562--570},
  year = 	 {2008},
  editor = 	 {McAllester, David A. and Myllymäki, Petri},
  volume = 	 {R6},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--12 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/r6/main/assets/toussaint08a/toussaint08a.pdf},
  url = 	 {https://proceedings.mlr.press/r6/toussaint08a.html},
  abstract = 	 {Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.},
  note =         {Reissued by PMLR on 09 October 2024.}
}

Endnote

%0 Conference Paper
%T Hierarchical POMDP controller optimization by likelihood maximization
%A Marc Toussaint
%A Laurent Charlin
%A Pascal Poupart
%B Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2008
%E David A. McAllester
%E Petri Myllymäki	
%F pmlr-vR6-toussaint08a
%I PMLR
%P 562--570
%U https://proceedings.mlr.press/r6/toussaint08a.html
%V R6
%X Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. [18] developed a method to solve planning problems by maximum-likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
%Z Reissued by PMLR on 09 October 2024.

APA

Toussaint, M., Charlin, L. & Poupart, P.. (2008). Hierarchical POMDP controller optimization by likelihood maximization. Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research R6:562-570 Available from https://proceedings.mlr.press/r6/toussaint08a.html. Reissued by PMLR on 09 October 2024.

Related Material

Download PDF