Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction

Yunhyeok Kwak; Inwoo Hwang; Dooyoung Kim; Sanghack Lee; Byoung-Tak Zhang

Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2076-2093, 2024.

Abstract

Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under a vast combinatorial action space, especially when an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions to improve the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures the sub-actions relevant to the transition in the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without a known environment model. During tree traversal, our method constructs the state-conditioned action abstraction for each node on the fly, reducing the search space by skipping the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from an expansive action space.
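
To make the idea of discarding redundant sub-actions concrete, the following is a minimal sketch, not the paper's implementation. It assumes a small factored action space and a hand-written boolean relevance mask; in the paper the relevant sub-actions are inferred by a learned auxiliary network, which is not modeled here. The function name `abstract_actions` and its parameters are hypothetical. The sketch only shows how concrete actions that agree on every relevant sub-action collapse into one abstract action, so a search would need to consider one group instead of every member.

```python
from itertools import product


def abstract_actions(domains, relevance_mask):
    """Group factored actions that agree on every relevant sub-action.

    domains: one sequence of possible values per sub-action.
    relevance_mask: one bool per sub-action. This is a hypothetical,
        hand-written stand-in for a state-conditioned mask; True means
        the sub-action is assumed to affect the transition in the
        current state.
    Returns a dict mapping each abstract action (the tuple of relevant
    sub-action values) to the list of concrete actions it covers.
    """
    groups = {}
    for action in product(*domains):
        # Keep only the values of sub-actions marked as relevant.
        key = tuple(v for v, keep in zip(action, relevance_mask) if keep)
        groups.setdefault(key, []).append(action)
    return groups


# Three sub-actions with three values each: 27 concrete actions.
domains = [range(3), range(3), range(3)]
# Assume the middle sub-action is irrelevant in the current state.
mask = [True, False, True]

groups = abstract_actions(domains, mask)
print(len(groups))     # 9 abstract actions instead of 27 concrete ones
print(groups[(0, 2)])  # [(0, 0, 2), (0, 1, 2), (0, 2, 2)]
```

In this toy setting, marking one of three sub-actions as irrelevant shrinks the branching factor at that node from 27 to 9; a different state would in general come with a different mask and therefore a different grouping.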

Cite this Paper

BibTeX


@InProceedings{pmlr-v244-kwak24a,
  title = 	 {Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction},
  author =       {Kwak, Yunhyeok and Hwang, Inwoo and Kim, Dooyoung and Lee, Sanghack and Zhang, Byoung-Tak},
  booktitle = 	 {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {2076--2093},
  year = 	 {2024},
  editor = 	 {Kiyavash, Negar and Mooij, Joris M.},
  volume = 	 {244},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v244/main/assets/kwak24a/kwak24a.pdf},
  url = 	 {https://proceedings.mlr.press/v244/kwak24a.html},
  abstract = 	 {Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions for improving the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition on the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without the known environment model. During the tree traversal, our method constructs the state-conditioned action abstraction for each node on-the-fly, reducing the search space by discarding the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from expansive action space.}
}

Endnote

%0 Conference Paper
%T Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction
%A Yunhyeok Kwak
%A Inwoo Hwang
%A Dooyoung Kim
%A Sanghack Lee
%A Byoung-Tak Zhang
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-kwak24a
%I PMLR
%P 2076--2093
%U https://proceedings.mlr.press/v244/kwak24a.html
%V 244
%X Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions for improving the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition on the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without the known environment model. During the tree traversal, our method constructs the state-conditioned action abstraction for each node on-the-fly, reducing the search space by discarding the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from expansive action space.

APA


Kwak, Y., Hwang, I., Kim, D., Lee, S., & Zhang, B. (2024). Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:2076-2093. Available from https://proceedings.mlr.press/v244/kwak24a.html.
