BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions

Hee Jae Kim, Zekai Yin, Lei Lai, Jason Lee, Eshed Ohn-Bar
Proceedings of The 9th Conference on Robot Learning, PMLR 305:1940-1952, 2025.

Abstract

Modeling the nuanced, multimodal nature of human driving remains a core challenge for autonomous systems, as existing methods often fail to capture the diversity of plausible behaviors in complex real-world scenarios. In this work, we introduce a novel benchmark and end-to-end planner for modeling realistic multimodality in autonomous driving decisions. We propose a Gaussian Mixture Model (GMM)-based diffusion model designed to explicitly capture human-like, multimodal driving decisions in diverse contexts. Our model achieves state-of-the-art performance on current benchmarks, but reveals weaknesses in standard evaluation practices, which rely on single ground-truth trajectories or coarse closed-loop metrics while often penalizing diverse yet plausible alternatives. To address this limitation, we further develop a human-in-the-loop simulation benchmark that enables finer-grained evaluations and measures multimodal realism in challenging driving settings. Our code, models, and benchmark data will be released to promote more accurate and human-aware evaluation of autonomous driving models.
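
The abstract's central idea, that human driving decisions are multimodal and that a Gaussian mixture can represent several plausible futures at once, can be illustrated with a small sketch. The Python snippet below is not the paper's model (the abstract gives no implementation details, and the actual method couples a GMM with a diffusion process); it is a hypothetical, minimal example of a GMM over future ego trajectories, where each mixture component stands for one behavior mode and sampling produces diverse yet plausible alternatives. All names and parameters (H, K, the mixture weights) are illustrative assumptions.

    import numpy as np

    # Hypothetical illustration: a Gaussian mixture over future ego
    # trajectories. Each component represents one plausible driving
    # "mode" (e.g., yield vs. overtake). This is NOT the paper's model;
    # it only sketches why a GMM can express multimodal decisions where
    # a single ground-truth trajectory cannot.

    rng = np.random.default_rng(0)

    H = 8   # prediction horizon in waypoints (assumed for illustration)
    K = 3   # number of mixture components / behavior modes (assumed)

    # Mixture parameters; in a learned planner these would come from a
    # network head rather than being fixed by hand.
    weights = np.array([0.5, 0.3, 0.2])    # P(mode k), sums to 1
    means = rng.normal(size=(K, H, 2))     # per-mode waypoint means (x, y)
    scales = np.full((K, H, 2), 0.1)       # per-mode diagonal std devs

    def sample_trajectory():
        """Draw one trajectory: pick a mode, then add per-waypoint noise."""
        k = rng.choice(K, p=weights)
        return means[k] + scales[k] * rng.normal(size=(H, 2))

    # Repeated sampling yields trajectories clustered around distinct
    # modes, i.e., the multimodality that single-ground-truth metrics
    # penalize as error.
    samples = np.stack([sample_trajectory() for _ in range(100)])
    print(samples.shape)  # (100, 8, 2)

Under a standard displacement metric against one recorded trajectory, only samples near that single trajectory score well, even when the other modes are equally plausible; this is the evaluation gap the paper's human-in-the-loop benchmark is designed to expose.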

Cite this Paper

BibTeX
@InProceedings{pmlr-v305-kim25b,
  title     = {BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions},
  author    = {Kim, Hee Jae and Yin, Zekai and Lai, Lei and Lee, Jason and Ohn-Bar, Eshed},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {1940--1952},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/kim25b/kim25b.pdf},
  url       = {https://proceedings.mlr.press/v305/kim25b.html}
}
Endnote
%0 Conference Paper
%T BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions
%A Hee Jae Kim
%A Zekai Yin
%A Lei Lai
%A Jason Lee
%A Eshed Ohn-Bar
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-kim25b
%I PMLR
%P 1940--1952
%U https://proceedings.mlr.press/v305/kim25b.html
%V 305
APA
Kim, H.J., Yin, Z., Lai, L., Lee, J. & Ohn-Bar, E. (2025). BranchOut: Capturing Realistic Multimodality in Autonomous Driving Decisions. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:1940-1952. Available from https://proceedings.mlr.press/v305/kim25b.html.
