DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

Kaixuan Xu, Jiajun Chai, Sicheng Li, Yuqian Fu, Yuanheng Zhu, Dongbin Zhao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:69078-69095, 2025.

Abstract

Diplomacy is a complex multiplayer game that requires both cooperation and competition, posing significant challenges for AI systems. Traditional methods rely on equilibrium search to generate extensive game data for training, which demands substantial computational resources. Large Language Models (LLMs) offer a promising alternative, leveraging pre-trained knowledge to achieve strong performance with relatively small-scale fine-tuning. However, applying LLMs to Diplomacy remains challenging due to the exponential growth of possible action combinations and the intricate strategic interactions among players. To address this challenge, we propose DipLLM, a fine-tuned LLM-based agent that learns equilibrium policies for Diplomacy. DipLLM employs an autoregressive factorization framework to simplify the complex task of multi-unit action assignment into a sequence of unit-level decisions. By defining an equilibrium policy within this framework as the learning objective, we fine-tune the model using only 1.5% of the data required by the state-of-the-art Cicero model, surpassing its performance. Our results demonstrate the potential of fine-tuned LLMs for tackling complex strategic decision-making in multiplayer games.
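
The autoregressive factorization named in the abstract is not spelled out on this page; as a rough illustration, it corresponds to decomposing the joint policy pi(a_1, ..., a_N | s) over all N units into a product of per-unit conditionals pi(a_i | s, a_1, ..., a_{i-1}), so each unit's order is chosen with the previously chosen orders in context. The sketch below shows this decomposition under stated assumptions; all names (select_joint_action, legal_orders, choose) are hypothetical and are not the paper's actual interface.

    # Illustrative sketch of unit-level autoregressive action selection.
    # All names here are hypothetical; the paper's real interface and
    # prompt format are not described on this page.
    from typing import Callable, List

    def select_joint_action(
        state_text: str,
        units: List[str],
        legal_orders: Callable[[str, str], List[str]],  # (state, unit) -> candidate orders
        choose: Callable[[str, List[str]], str],        # model picks one order given a prompt
    ) -> List[str]:
        """Factorize pi(a_1..a_N | s) into per-unit conditionals
        pi(a_i | s, a_1..a_{i-1}): each unit's order is selected
        conditioned on the state and all orders chosen so far."""
        chosen: List[str] = []
        for unit in units:
            prompt = f"State: {state_text}\nOrders so far: {chosen}\nUnit: {unit}"
            chosen.append(choose(prompt, legal_orders(state_text, unit)))
        return chosen

This reduces one decision over an exponentially large joint action space to N sequential unit-level decisions, which matches the token-by-token generation an LLM performs natively.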

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-xu25b,
  title = {{D}ip{LLM}: Fine-Tuning {LLM} for Strategic Decision-making in Diplomacy},
  author = {Xu, Kaixuan and Chai, Jiajun and Li, Sicheng and Fu, Yuqian and Zhu, Yuanheng and Zhao, Dongbin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {69078--69095},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/xu25b/xu25b.pdf},
  url = {https://proceedings.mlr.press/v267/xu25b.html},
  abstract = {Diplomacy is a complex multiplayer game that requires both cooperation and competition, posing significant challenges for AI systems. Traditional methods rely on equilibrium search to generate extensive game data for training, which demands substantial computational resources. Large Language Models (LLMs) offer a promising alternative, leveraging pre-trained knowledge to achieve strong performance with relatively small-scale fine-tuning. However, applying LLMs to Diplomacy remains challenging due to the exponential growth of possible action combinations and the intricate strategic interactions among players. To address this challenge, we propose DipLLM, a fine-tuned LLM-based agent that learns equilibrium policies for Diplomacy. DipLLM employs an autoregressive factorization framework to simplify the complex task of multi-unit action assignment into a sequence of unit-level decisions. By defining an equilibrium policy within this framework as the learning objective, we fine-tune the model using only 1.5% of the data required by the state-of-the-art Cicero model, surpassing its performance. Our results demonstrate the potential of fine-tuned LLMs for tackling complex strategic decision-making in multiplayer games.}
}
Endnote
%0 Conference Paper
%T DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy
%A Kaixuan Xu
%A Jiajun Chai
%A Sicheng Li
%A Yuqian Fu
%A Yuanheng Zhu
%A Dongbin Zhao
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-xu25b
%I PMLR
%P 69078--69095
%U https://proceedings.mlr.press/v267/xu25b.html
%V 267
%X Diplomacy is a complex multiplayer game that requires both cooperation and competition, posing significant challenges for AI systems. Traditional methods rely on equilibrium search to generate extensive game data for training, which demands substantial computational resources. Large Language Models (LLMs) offer a promising alternative, leveraging pre-trained knowledge to achieve strong performance with relatively small-scale fine-tuning. However, applying LLMs to Diplomacy remains challenging due to the exponential growth of possible action combinations and the intricate strategic interactions among players. To address this challenge, we propose DipLLM, a fine-tuned LLM-based agent that learns equilibrium policies for Diplomacy. DipLLM employs an autoregressive factorization framework to simplify the complex task of multi-unit action assignment into a sequence of unit-level decisions. By defining an equilibrium policy within this framework as the learning objective, we fine-tune the model using only 1.5% of the data required by the state-of-the-art Cicero model, surpassing its performance. Our results demonstrate the potential of fine-tuned LLMs for tackling complex strategic decision-making in multiplayer games.
APA
Xu, K., Chai, J., Li, S., Fu, Y., Zhu, Y. & Zhao, D. (2025). DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:69078-69095. Available from https://proceedings.mlr.press/v267/xu25b.html.