Uncertainty-Aware Decision Transformer for Stochastic Driving Environments

Zenan Li, Fan Nie, Qiao Sun, Fang Da, Hang Zhao
Proceedings of The 8th Conference on Robot Learning, PMLR 270:364-386, 2025.

Abstract

Offline Reinforcement Learning (RL) enables policy learning without active interactions, making it especially appealing for self-driving tasks. Recent successes of Transformers inspire casting offline RL as sequence modeling, which, however, fails in stochastic environments with incorrect assumptions that identical actions can consistently achieve the same goal. In this paper, we introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in stochastic driving environments without introducing additional transition or complex generative models. Specifically, UNREST estimates uncertainties by conditional mutual information between transitions and returns. Discovering 'uncertainty accumulation' and 'temporal locality' properties of driving environments, we replace the global returns in decision transformers with truncated returns less affected by environments to learn from actual outcomes of actions rather than environment transitions. We also dynamically evaluate uncertainty at inference for cautious planning. Extensive experiments demonstrate UNREST's superior performance in various driving scenarios and the power of our uncertainty estimation strategy.
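To make the core idea concrete: a standard Decision Transformer conditions each action on the global return-to-go (the sum of all future rewards), while the abstract describes replacing it with a truncated return over a limited horizon. The sketch below illustrates that difference under simple assumptions; the function name, the fixed-horizon truncation rule, and the example rewards are illustrative only and do not come from the paper itself.

```python
import numpy as np

def returns_to_go(rewards, horizon=None):
    """Illustrative (not the paper's exact) return-to-go computation.

    horizon=None sums all remaining rewards, as in a standard Decision
    Transformer; a finite horizon truncates the sum to the next `horizon`
    steps, sketching the 'truncated return' idea from the abstract.
    """
    T = len(rewards)
    rtg = np.zeros(T)
    for t in range(T):
        end = T if horizon is None else min(T, t + horizon)
        rtg[t] = sum(rewards[t:end])
    return rtg

rewards = [1.0, 0.0, 2.0, 1.0, 3.0]
print(returns_to_go(rewards))             # global returns-to-go
print(returns_to_go(rewards, horizon=2))  # truncated to the next 2 steps
```

Under this sketch, the truncated targets depend only on near-term outcomes, which the abstract argues are less dominated by environment stochasticity than the full global return.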

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-li25b,
  title     = {Uncertainty-Aware Decision Transformer for Stochastic Driving Environments},
  author    = {Li, Zenan and Nie, Fan and Sun, Qiao and Da, Fang and Zhao, Hang},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {364--386},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/li25b/li25b.pdf},
  url       = {https://proceedings.mlr.press/v270/li25b.html},
  abstract  = {Offline Reinforcement Learning (RL) enables policy learning without active interactions, making it especially appealing for self-driving tasks. Recent successes of Transformers inspire casting offline RL as sequence modeling, which, however, fails in stochastic environments with incorrect assumptions that identical actions can consistently achieve the same goal. In this paper, we introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in stochastic driving environments without introducing additional transition or complex generative models. Specifically, UNREST estimates uncertainties by conditional mutual information between transitions and returns. Discovering 'uncertainty accumulation' and 'temporal locality' properties of driving environments, we replace the global returns in decision transformers with truncated returns less affected by environments to learn from actual outcomes of actions rather than environment transitions. We also dynamically evaluate uncertainty at inference for cautious planning. Extensive experiments demonstrate UNREST's superior performance in various driving scenarios and the power of our uncertainty estimation strategy.}
}
Endnote
%0 Conference Paper
%T Uncertainty-Aware Decision Transformer for Stochastic Driving Environments
%A Zenan Li
%A Fan Nie
%A Qiao Sun
%A Fang Da
%A Hang Zhao
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-li25b
%I PMLR
%P 364--386
%U https://proceedings.mlr.press/v270/li25b.html
%V 270
%X Offline Reinforcement Learning (RL) enables policy learning without active interactions, making it especially appealing for self-driving tasks. Recent successes of Transformers inspire casting offline RL as sequence modeling, which, however, fails in stochastic environments with incorrect assumptions that identical actions can consistently achieve the same goal. In this paper, we introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in stochastic driving environments without introducing additional transition or complex generative models. Specifically, UNREST estimates uncertainties by conditional mutual information between transitions and returns. Discovering 'uncertainty accumulation' and 'temporal locality' properties of driving environments, we replace the global returns in decision transformers with truncated returns less affected by environments to learn from actual outcomes of actions rather than environment transitions. We also dynamically evaluate uncertainty at inference for cautious planning. Extensive experiments demonstrate UNREST's superior performance in various driving scenarios and the power of our uncertainty estimation strategy.
APA
Li, Z., Nie, F., Sun, Q., Da, F. & Zhao, H. (2025). Uncertainty-Aware Decision Transformer for Stochastic Driving Environments. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:364-386. Available from https://proceedings.mlr.press/v270/li25b.html.