Mastering Board Games by External and Internal Planning with Language Models

John Schultz, Jakub Adamek, Matej Jusup, Marc Lanctot, Michael Kaisers, Sarah Perrin, Daniel Hennes, Jeremy Shar, Cannada A. Lewis, Anian Ruoss, Tom Zahavy, Petar Veličković, Laurel Prince, Satinder Singh, Eric Malmi, Nenad Tomasev
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:53581-53644, 2025.

Abstract

Advancing planning and reasoning capabilities of Large Language Models (LLMs) is one of the key prerequisites towards unlocking their potential for performing reliably in complex and impactful domains. In this paper, we aim to demonstrate this across board games (Chess, Fischer Random / Chess960, Connect Four, and Hex), and we show that search-based planning can yield significant improvements in LLM game-playing strength. We introduce, compare and contrast two major approaches: In external search, the model guides Monte Carlo Tree Search (MCTS) rollouts and evaluations without calls to an external game engine, and in internal search, the model is trained to generate in-context a linearized tree of search and a resulting final choice. Both build on a language model pre-trained on relevant domain knowledge, reliably capturing the transition and value functions in the respective environments, with minimal hallucinations. We evaluate our LLM search implementations against game-specific state-of-the-art engines, showcasing substantial improvements in strength over the base model, and reaching Grandmaster-level performance in chess while operating closer to the human search budget. Our proposed approach, combining search with domain knowledge, is not specific to board games, hinting at more general future applications.
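As an illustrative sketch only (not the paper's implementation), the external-search idea — MCTS in which a model, rather than a game engine, supplies move priors and value estimates — can be outlined as follows. `StubModel`, the toy race-to-10 game, and all function names here are hypothetical stand-ins for the paper's pre-trained language model and board-game environments:

```python
import math

class StubModel:
    """Stand-in for the LLM: given a state, return move priors and a
    scalar value estimate in [-1, 1] from the mover's perspective."""
    def predict(self, state):
        moves = legal_moves(state)
        priors = {m: 1.0 / len(moves) for m in moves}  # uniform priors
        return priors, heuristic_value(state)

# Toy two-player domain: players alternately add 1 or 2 to a running
# total; whoever reaches exactly 10 wins. State = (total, player).
def legal_moves(state):
    total, _player = state
    return [m for m in (1, 2) if total + m <= 10]

def apply_move(state, move):
    total, player = state
    return (total + move, -player)

def is_terminal(state):
    return state[0] == 10 or not legal_moves(state)

def heuristic_value(state):
    # Player to move has lost if the opponent just reached 10.
    return -1.0 if state[0] == 10 else 0.0

class Node:
    def __init__(self, state, prior):
        self.state, self.prior = state, prior
        self.children = {}  # move -> Node
        self.visits, self.value_sum = 0, 0.0

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def mcts_choose(root_state, model, simulations=200, c_puct=1.5):
    root = Node(root_state, prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: follow a PUCT-style rule down to a leaf.
        while node.children:
            parent = node
            _move, node = max(
                parent.children.items(),
                key=lambda kv: -kv[1].q() + c_puct * kv[1].prior
                    * math.sqrt(parent.visits) / (1 + kv[1].visits))
            path.append(node)
        # Expansion + evaluation: the model supplies priors and a value.
        if is_terminal(node.state):
            value = heuristic_value(node.state)
        else:
            priors, value = model.predict(node.state)
            for m, p in priors.items():
                node.children[m] = Node(apply_move(node.state, m), p)
        # Backup: negamax sign flip along the visited path.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    # Final choice: the most-visited root move.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

For example, from total 8 the immediately winning move is to add 2, and the search above converges on it; the paper's actual setting replaces the stub with an LLM that also predicts state transitions, which is what removes the need for an external engine.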

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-schultz25a,
  title     = {Mastering Board Games by External and Internal Planning with Language Models},
  author    = {Schultz, John and Adamek, Jakub and Jusup, Matej and Lanctot, Marc and Kaisers, Michael and Perrin, Sarah and Hennes, Daniel and Shar, Jeremy and Lewis, Cannada A. and Ruoss, Anian and Zahavy, Tom and Veli\v{c}kovi\'{c}, Petar and Prince, Laurel and Singh, Satinder and Malmi, Eric and Tomasev, Nenad},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {53581--53644},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/schultz25a/schultz25a.pdf},
  url       = {https://proceedings.mlr.press/v267/schultz25a.html},
  abstract  = {Advancing planning and reasoning capabilities of Large Language Models (LLMs) is one of the key prerequisites towards unlocking their potential for performing reliably in complex and impactful domains. In this paper, we aim to demonstrate this across board games (Chess, Fischer Random / Chess960, Connect Four, and Hex), and we show that search-based planning can yield significant improvements in LLM game-playing strength. We introduce, compare and contrast two major approaches: In external search, the model guides Monte Carlo Tree Search (MCTS) rollouts and evaluations without calls to an external game engine, and in internal search, the model is trained to generate in-context a linearized tree of search and a resulting final choice. Both build on a language model pre-trained on relevant domain knowledge, reliably capturing the transition and value functions in the respective environments, with minimal hallucinations. We evaluate our LLM search implementations against game-specific state-of-the-art engines, showcasing substantial improvements in strength over the base model, and reaching Grandmaster-level performance in chess while operating closer to the human search budget. Our proposed approach, combining search with domain knowledge, is not specific to board games, hinting at more general future applications.}
}
Endnote
%0 Conference Paper
%T Mastering Board Games by External and Internal Planning with Language Models
%A John Schultz
%A Jakub Adamek
%A Matej Jusup
%A Marc Lanctot
%A Michael Kaisers
%A Sarah Perrin
%A Daniel Hennes
%A Jeremy Shar
%A Cannada A. Lewis
%A Anian Ruoss
%A Tom Zahavy
%A Petar Veličković
%A Laurel Prince
%A Satinder Singh
%A Eric Malmi
%A Nenad Tomasev
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-schultz25a
%I PMLR
%P 53581--53644
%U https://proceedings.mlr.press/v267/schultz25a.html
%V 267
%X Advancing planning and reasoning capabilities of Large Language Models (LLMs) is one of the key prerequisites towards unlocking their potential for performing reliably in complex and impactful domains. In this paper, we aim to demonstrate this across board games (Chess, Fischer Random / Chess960, Connect Four, and Hex), and we show that search-based planning can yield significant improvements in LLM game-playing strength. We introduce, compare and contrast two major approaches: In external search, the model guides Monte Carlo Tree Search (MCTS) rollouts and evaluations without calls to an external game engine, and in internal search, the model is trained to generate in-context a linearized tree of search and a resulting final choice. Both build on a language model pre-trained on relevant domain knowledge, reliably capturing the transition and value functions in the respective environments, with minimal hallucinations. We evaluate our LLM search implementations against game-specific state-of-the-art engines, showcasing substantial improvements in strength over the base model, and reaching Grandmaster-level performance in chess while operating closer to the human search budget. Our proposed approach, combining search with domain knowledge, is not specific to board games, hinting at more general future applications.
APA
Schultz, J., Adamek, J., Jusup, M., Lanctot, M., Kaisers, M., Perrin, S., Hennes, D., Shar, J., Lewis, C.A., Ruoss, A., Zahavy, T., Veličković, P., Prince, L., Singh, S., Malmi, E. & Tomasev, N. (2025). Mastering Board Games by External and Internal Planning with Language Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:53581-53644. Available from https://proceedings.mlr.press/v267/schultz25a.html.