Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach

Xiaosong Jia, Li Chen, Penghao Wu, Jia Zeng, Junchi Yan, Hongyang Li, Yu Qiao
Proceedings of The 6th Conference on Robot Learning, PMLR 205:910-920, 2023.

Abstract

Trajectory prediction is one of the fundamental tasks in autonomous driving: it aims to predict the future positions of the agents around the ego vehicle so that a safe yet efficient driving plan can be generated by the downstream planning module. Recently, deep learning based methods have come to dominate the field. State-of-the-art (SOTA) methods usually follow an encoder-decoder paradigm. Specifically, the encoder extracts information from the agents’ history states and the HD map and provides a representation vector for each agent. Taking these vectors as input, the decoder predicts multi-step future positions for each agent, usually with a single multi-layer perceptron (MLP) that directly outputs a T×2 tensor. Though models adopting the MLP decoder have dominated the leaderboards of multiple datasets, the elephant in the room is that the temporal correlation among future time-steps is ignored, since there is no direct relation among the output neurons of an MLP. In this work, we examine this design choice and investigate several ways to inject a temporal inductive bias into the generation of future trajectories on top of a SOTA encoder. We find that simply using an autoregressive RNN to generate future positions leads to a significant performance drop, even with techniques such as a history highway and teacher forcing. Instead, taking the scratch trajectories generated by the MLP as input, an additional refinement module built on structures with a temporal prior, such as an RNN or a 1D-CNN, remarkably boosts accuracy. Furthermore, we examine several objective functions that emphasize the temporal priors. By combining the aforementioned techniques to introduce the temporal prior, we improve the top-ranked method’s performance by a large margin and achieve SOTA results on the Waymo Open Motion Challenge.
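To make the coarse-to-fine idea in the abstract concrete, the sketch below shows one plausible shape of such a decoder in PyTorch: an MLP produces a coarse T×2 trajectory from a per-agent encoder feature, and a temporal module (a GRU here; a 1D-CNN over the time axis would play the same role) refines it with per-step residual offsets. All module names, layer sizes, and the horizon are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class CoarseToFineDecoder(nn.Module):
    """Minimal sketch of a coarse (MLP) + fine (temporal refinement) decoder.

    Sizes and structure are assumptions for illustration only.
    """

    def __init__(self, feat_dim: int = 256, horizon: int = 80):
        super().__init__()
        self.horizon = horizon
        # Coarse stage: a single MLP maps the per-agent feature to a T x 2 trajectory.
        self.coarse_mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, horizon * 2),
        )
        # Fine stage: a temporal module consumes the coarse trajectory and
        # predicts per-step residual offsets, injecting the temporal prior.
        self.temporal = nn.GRU(input_size=2, hidden_size=128, batch_first=True)
        self.residual_head = nn.Linear(128, 2)

    def forward(self, agent_feat: torch.Tensor) -> torch.Tensor:
        # agent_feat: (B, feat_dim) per-agent representation from the encoder.
        coarse = self.coarse_mlp(agent_feat).view(-1, self.horizon, 2)  # (B, T, 2)
        hidden, _ = self.temporal(coarse)                               # (B, T, 128)
        refined = coarse + self.residual_head(hidden)                   # (B, T, 2)
        return refined

# Usage sketch:
# feats = torch.randn(4, 256)
# traj = CoarseToFineDecoder()(feats)  # -> shape (4, 80, 2)
```

Predicting residuals on top of the coarse trajectory, rather than regenerating positions autoregressively, matches the abstract's finding that refinement works better than a plain autoregressive RNN decoder.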

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-jia23a,
  title     = {Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach},
  author    = {Jia, Xiaosong and Chen, Li and Wu, Penghao and Zeng, Jia and Yan, Junchi and Li, Hongyang and Qiao, Yu},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {910--920},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/jia23a/jia23a.pdf},
  url       = {https://proceedings.mlr.press/v205/jia23a.html},
  abstract  = {Trajectory prediction is one of the fundamental tasks in autonomous driving: it aims to predict the future positions of the agents around the ego vehicle so that a safe yet efficient driving plan can be generated by the downstream planning module. Recently, deep learning based methods have come to dominate the field. State-of-the-art (SOTA) methods usually follow an encoder-decoder paradigm. Specifically, the encoder extracts information from the agents' history states and the HD map and provides a representation vector for each agent. Taking these vectors as input, the decoder predicts multi-step future positions for each agent, usually with a single multi-layer perceptron (MLP) that directly outputs a $T \times 2$ tensor. Though models adopting the MLP decoder have dominated the leaderboards of multiple datasets, the elephant in the room is that the temporal correlation among future time-steps is ignored, since there is no direct relation among the output neurons of an MLP. In this work, we examine this design choice and investigate several ways to inject a temporal inductive bias into the generation of future trajectories on top of a SOTA encoder. We find that simply using an autoregressive RNN to generate future positions leads to a significant performance drop, even with techniques such as a history highway and teacher forcing. Instead, taking the scratch trajectories generated by the MLP as input, an additional refinement module built on structures with a temporal prior, such as an RNN or a 1D-CNN, remarkably boosts accuracy. Furthermore, we examine several objective functions that emphasize the temporal priors. By combining the aforementioned techniques to introduce the temporal prior, we improve the top-ranked method's performance by a large margin and achieve SOTA results on the Waymo Open Motion Challenge.}
}
Endnote
%0 Conference Paper
%T Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach
%A Xiaosong Jia
%A Li Chen
%A Penghao Wu
%A Jia Zeng
%A Junchi Yan
%A Hongyang Li
%A Yu Qiao
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-jia23a
%I PMLR
%P 910--920
%U https://proceedings.mlr.press/v205/jia23a.html
%V 205
%X Trajectory prediction is one of the fundamental tasks in autonomous driving: it aims to predict the future positions of the agents around the ego vehicle so that a safe yet efficient driving plan can be generated by the downstream planning module. Recently, deep learning based methods have come to dominate the field. State-of-the-art (SOTA) methods usually follow an encoder-decoder paradigm. Specifically, the encoder extracts information from the agents’ history states and the HD map and provides a representation vector for each agent. Taking these vectors as input, the decoder predicts multi-step future positions for each agent, usually with a single multi-layer perceptron (MLP) that directly outputs a T×2 tensor. Though models adopting the MLP decoder have dominated the leaderboards of multiple datasets, the elephant in the room is that the temporal correlation among future time-steps is ignored, since there is no direct relation among the output neurons of an MLP. In this work, we examine this design choice and investigate several ways to inject a temporal inductive bias into the generation of future trajectories on top of a SOTA encoder. We find that simply using an autoregressive RNN to generate future positions leads to a significant performance drop, even with techniques such as a history highway and teacher forcing. Instead, taking the scratch trajectories generated by the MLP as input, an additional refinement module built on structures with a temporal prior, such as an RNN or a 1D-CNN, remarkably boosts accuracy. Furthermore, we examine several objective functions that emphasize the temporal priors. By combining the aforementioned techniques to introduce the temporal prior, we improve the top-ranked method’s performance by a large margin and achieve SOTA results on the Waymo Open Motion Challenge.
APA
Jia, X., Chen, L., Wu, P., Zeng, J., Yan, J., Li, H. & Qiao, Y. (2023). Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:910-920. Available from https://proceedings.mlr.press/v205/jia23a.html.