Compositional Learning-based Planning for Vision POMDPs

Sampada Deglurkar, Michael H Lim, Johnathan Tucker, Zachary N Sunberg, Aleksandra Faust, Claire Tomlin
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:469-482, 2023.

Abstract

The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the high-dimensional image observations prevalent in real-world applications, and often require lengthy online training with environment interaction. In this work, we propose Visual Tree Search (VTS), a compositional learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. The deep generative observation models evaluate the likelihood of, and predict, future image observations within a Monte Carlo tree search planner. We show that VTS is robust to different types of image noise that were not present during training and can adapt to different reward structures without the need to re-train. This new approach significantly and stably outperforms several state-of-the-art vision POMDP baselines while using a fraction of the training time.
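To make the core idea concrete, the following is a minimal, hypothetical sketch of how a learned observation model can supply likelihood weights for a particle-filter belief update inside an online planner. All names, models, and dynamics here are illustrative placeholders, not the authors' implementation: `observation_likelihood` stands in for a deep generative model p(o | s), and `transition` for the environment dynamics.

```python
import math
import random

def observation_likelihood(state, image_obs):
    # Stand-in for a learned generative observation model p(o | s).
    # Here a toy Gaussian density around the scalar state; in VTS this
    # role would be played by a deep generative model over images.
    return math.exp(-0.5 * (image_obs - state) ** 2)

def transition(state, action):
    # Toy stochastic dynamics placeholder.
    return state + action + random.gauss(0.0, 0.1)

def belief_update(particles, action, image_obs):
    """One step of a weighted particle-filter belief update:
    propagate particles through the dynamics, weight each by the
    likelihood of the received observation, then resample."""
    propagated = [transition(s, action) for s in particles]
    weights = [observation_likelihood(s, image_obs) for s in propagated]
    total = sum(weights) or 1.0
    probs = [w / total for w in weights]
    # Resample in proportion to observation likelihood, so particles
    # consistent with the image observation survive.
    return random.choices(propagated, weights=probs, k=len(particles))
```

In a tree-search planner, an update of this kind would be applied at each observation node to maintain the belief used for action selection; this sketch only illustrates the weighting step, not the full search.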

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-deglurkar23a,
  title = {Compositional Learning-based Planning for Vision POMDPs},
  author = {Deglurkar, Sampada and Lim, Michael H and Tucker, Johnathan and Sunberg, Zachary N and Faust, Aleksandra and Tomlin, Claire},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = {469--482},
  year = {2023},
  editor = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = {211},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Jun},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v211/deglurkar23a/deglurkar23a.pdf},
  url = {https://proceedings.mlr.press/v211/deglurkar23a.html},
  abstract = {The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle high-dimensional image observations prevalent in real world applications, and often require lengthy online training that requires interaction with the environment. In this work, we propose Visual Tree Search (VTS), a compositional learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. The deep generative observation models evaluate the likelihood of and predict future image observations in a Monte Carlo tree search planner. We show that VTS is robust to different types of image noises that were not present during training and can adapt to different reward structures without the need to re-train. This new approach significantly and stably outperforms several baseline state-of-the-art vision POMDP algorithms while using a fraction of the training time.}
}
Endnote
%0 Conference Paper
%T Compositional Learning-based Planning for Vision POMDPs
%A Sampada Deglurkar
%A Michael H Lim
%A Johnathan Tucker
%A Zachary N Sunberg
%A Aleksandra Faust
%A Claire Tomlin
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-deglurkar23a
%I PMLR
%P 469--482
%U https://proceedings.mlr.press/v211/deglurkar23a.html
%V 211
%X The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle high-dimensional image observations prevalent in real world applications, and often require lengthy online training that requires interaction with the environment. In this work, we propose Visual Tree Search (VTS), a compositional learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. The deep generative observation models evaluate the likelihood of and predict future image observations in a Monte Carlo tree search planner. We show that VTS is robust to different types of image noises that were not present during training and can adapt to different reward structures without the need to re-train. This new approach significantly and stably outperforms several baseline state-of-the-art vision POMDP algorithms while using a fraction of the training time.
APA
Deglurkar, S., Lim, M.H., Tucker, J., Sunberg, Z.N., Faust, A., & Tomlin, C. (2023). Compositional Learning-based Planning for Vision POMDPs. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:469-482. Available from https://proceedings.mlr.press/v211/deglurkar23a.html.