Language-Guided Traffic Simulation via Scene-Level Diffusion

Ziyuan Zhong; Davis Rempe; Yuxiao Chen; Boris Ivanovic; Yulong Cao; Danfei Xu; Marco Pavone; Baishakhi Ray

Proceedings of The 7th Conference on Robot Learning, PMLR 229:144-177, 2023.

Abstract

Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development. However, current approaches for controlling learning-based traffic models require significant domain expertise and are difficult for practitioners to use. To remedy this, we present CTG++, a scene-level conditional diffusion model that can be guided by language instructions. Developing this requires tackling two challenges: the need for a realistic and controllable traffic model backbone, and an effective method to interface with a traffic model using language. To address these challenges, we first propose a scene-level diffusion model equipped with a spatio-temporal transformer backbone, which generates realistic and controllable traffic. We then harness a large language model (LLM) to convert a user’s query into a loss function, guiding the diffusion model towards query-compliant generation. Through comprehensive evaluation, we demonstrate the effectiveness of our proposed method in generating realistic, query-compliant traffic simulations.

Cite this Paper

BibTeX


@InProceedings{pmlr-v229-zhong23a,
  title     = {Language-Guided Traffic Simulation via Scene-Level Diffusion},
  author    = {Zhong, Ziyuan and Rempe, Davis and Chen, Yuxiao and Ivanovic, Boris and Cao, Yulong and Xu, Danfei and Pavone, Marco and Ray, Baishakhi},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {144--177},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/zhong23a/zhong23a.pdf},
  url       = {https://proceedings.mlr.press/v229/zhong23a.html},
  abstract  = {Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development. However, current approaches for controlling learning-based traffic models require significant domain expertise and are difficult for practitioners to use. To remedy this, we present CTG++, a scene-level conditional diffusion model that can be guided by language instructions. Developing this requires tackling two challenges: the need for a realistic and controllable traffic model backbone, and an effective method to interface with a traffic model using language. To address these challenges, we first propose a scene-level diffusion model equipped with a spatio-temporal transformer backbone, which generates realistic and controllable traffic. We then harness a large language model (LLM) to convert a user’s query into a loss function, guiding the diffusion model towards query-compliant generation. Through comprehensive evaluation, we demonstrate the effectiveness of our proposed method in generating realistic, query-compliant traffic simulations.}
}

Endnote

%0 Conference Paper
%T Language-Guided Traffic Simulation via Scene-Level Diffusion
%A Ziyuan Zhong
%A Davis Rempe
%A Yuxiao Chen
%A Boris Ivanovic
%A Yulong Cao
%A Danfei Xu
%A Marco Pavone
%A Baishakhi Ray
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-zhong23a
%I PMLR
%P 144--177
%U https://proceedings.mlr.press/v229/zhong23a.html
%V 229
%X Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development. However, current approaches for controlling learning-based traffic models require significant domain expertise and are difficult for practitioners to use. To remedy this, we present CTG++, a scene-level conditional diffusion model that can be guided by language instructions. Developing this requires tackling two challenges: the need for a realistic and controllable traffic model backbone, and an effective method to interface with a traffic model using language. To address these challenges, we first propose a scene-level diffusion model equipped with a spatio-temporal transformer backbone, which generates realistic and controllable traffic. We then harness a large language model (LLM) to convert a user’s query into a loss function, guiding the diffusion model towards query-compliant generation. Through comprehensive evaluation, we demonstrate the effectiveness of our proposed method in generating realistic, query-compliant traffic simulations.

APA


Zhong, Z., Rempe, D., Chen, Y., Ivanovic, B., Cao, Y., Xu, D., Pavone, M. & Ray, B. (2023). Language-Guided Traffic Simulation via Scene-Level Diffusion. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:144-177. Available from https://proceedings.mlr.press/v229/zhong23a.html.
