LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:36743-36764, 2025.
Abstract
Training with diverse teammates is key to learning generalizable agents. Typical approaches generate diverse teammates through techniques such as randomization, regularization terms, or reduced policy compatibility. However, teammates produced this way lack semantic information, resulting in inefficient teammate generation and poor agent adaptability. To tackle these challenges, we propose Semantically Diverse Teammate Generation (SemDiv), a novel framework leveraging the capabilities of large language models (LLMs) to discover and learn diverse coordination behaviors at the semantic level. In each iteration, SemDiv first generates a novel coordination behavior described in natural language, then translates it into a reward function to train a teammate policy. Once the policy is verified to be meaningful, novel, and aligned with the behavior, the agents train a policy to coordinate with it. Through this iterative process, SemDiv efficiently generates a diverse set of semantically grounded teammates, enabling agents to develop specialized policies and to select the most suitable ones through language-based reasoning when adapting to unseen teammates. Experiments show that SemDiv generates teammates covering a wide range of coordination behaviors, including those unreachable by baseline methods. Evaluation across four MARL environments, each with five unseen representative teammates, demonstrates SemDiv's superior coordination and adaptability. Our code is available at https://github.com/lilh76/SemDiv.
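To make the iterative process concrete, the sketch below outlines one SemDiv generation loop exactly as the abstract describes it: propose a behavior in natural language, translate it to a reward function, train and verify a teammate policy, then train a paired coordination policy. All names here (`propose_behavior`, `behavior_to_reward`, `train_policy`, `verify_policy`) are illustrative placeholders, not the authors' actual interface; consult the linked repository for the real implementation.

```python
from typing import Callable, List, Tuple

# Stubs standing in for the paper's actual components (assumed, not from the source).
def train_policy(env, reward_fn: Callable, partner=None):
    """Stub: train a policy in `env` under `reward_fn` (e.g. with a standard MARL trainer)."""
    raise NotImplementedError

def verify_policy(policy, behavior: str, pool) -> bool:
    """Stub: check the trained policy is meaningful, novel, and aligned with `behavior`."""
    raise NotImplementedError

def semdiv_loop(llm, env, n_iterations: int) -> List[Tuple[str, object, object]]:
    behaviors: List[str] = []   # archive of behavior descriptions in natural language
    pool: List[Tuple[str, object, object]] = []  # (behavior, teammate policy, coordination policy)

    for _ in range(n_iterations):
        # 1. The LLM proposes a coordination behavior not yet in the archive.
        behavior: str = llm.propose_behavior(existing=behaviors)

        # 2. The LLM translates the description into an executable reward function.
        reward_fn: Callable = llm.behavior_to_reward(behavior)

        # 3. Train a teammate policy against the generated reward.
        teammate = train_policy(env, reward_fn)

        # 4. Keep the teammate only if it passes the verification step.
        if not verify_policy(teammate, behavior, pool):
            continue

        # 5. Train a coordination policy on the task reward, paired with the new teammate.
        ego = train_policy(env, env.task_reward, partner=teammate)

        behaviors.append(behavior)
        pool.append((behavior, teammate, ego))

    # At deployment, the stored behavior descriptions let the agent pick the
    # most suitable policy for an unseen teammate via language-based reasoning.
    return pool
```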