Pedestrian Cross Forecasting with Hybrid Feature Fusion

Meng Dong
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:327-342, 2024.

Abstract

Forecasting the crossing intention of pedestrians is an essential task for the safe driving of Autonomous Vehicles (AVs) in the real world. Pedestrians’ behaviors are usually influenced by their surroundings in traffic scenes. Recent works use vision-based neural networks to extract key information from images for prediction. However, the driving environment contains much more critical information, such as the social and scene interactions in the driving area, the location of the target pedestrian and their distance from the ego car, and the motion of all targets. Properly exploring and utilizing these implicit interactions will promote the development of Autonomous Vehicles. In this work, two novel attributes are introduced: the pedestrian’s location on the road or sidewalk, and the relative distance from the target pedestrian to the ego car, both derived from the semantic map and depth map combined with bounding boxes. A hybrid multi-modal prediction network is proposed to capture the interactions between all the features and predict pedestrian crossing intention. Evaluated on two public pedestrian crossing datasets, PIE and JAAD, the proposed hybrid framework outperforms the state of the art by about 3% in accuracy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v222-dong24a,
  title = {Pedestrian Cross Forecasting with Hybrid Feature Fusion},
  author = {Dong, Meng},
  booktitle = {Proceedings of the 15th Asian Conference on Machine Learning},
  pages = {327--342},
  year = {2024},
  editor = {Yanıkoğlu, Berrin and Buntine, Wray},
  volume = {222},
  series = {Proceedings of Machine Learning Research},
  month = {11--14 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v222/dong24a/dong24a.pdf},
  url = {https://proceedings.mlr.press/v222/dong24a.html},
  abstract = {Forecasting the crossing intention of pedestrians is an essential task for the safe driving of Autonomous Vehicles (AVs) in the real world. Pedestrians’ behaviors are usually influenced by their surroundings in traffic scenes. Recent works use vision-based neural networks to extract key information from images for prediction. However, the driving environment contains much more critical information, such as the social and scene interactions in the driving area, the location of the target pedestrian and their distance from the ego car, and the motion of all targets. Properly exploring and utilizing these implicit interactions will promote the development of Autonomous Vehicles. In this work, two novel attributes are introduced: the pedestrian’s location on the road or sidewalk, and the relative distance from the target pedestrian to the ego car, both derived from the semantic map and depth map combined with bounding boxes. A hybrid multi-modal prediction network is proposed to capture the interactions between all the features and predict pedestrian crossing intention. Evaluated on two public pedestrian crossing datasets, PIE and JAAD, the proposed hybrid framework outperforms the state of the art by about 3% in accuracy.}
}
Endnote
%0 Conference Paper
%T Pedestrian Cross Forecasting with Hybrid Feature Fusion
%A Meng Dong
%B Proceedings of the 15th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Berrin Yanıkoğlu
%E Wray Buntine
%F pmlr-v222-dong24a
%I PMLR
%P 327--342
%U https://proceedings.mlr.press/v222/dong24a.html
%V 222
%X Forecasting the crossing intention of pedestrians is an essential task for the safe driving of Autonomous Vehicles (AVs) in the real world. Pedestrians’ behaviors are usually influenced by their surroundings in traffic scenes. Recent works use vision-based neural networks to extract key information from images for prediction. However, the driving environment contains much more critical information, such as the social and scene interactions in the driving area, the location of the target pedestrian and their distance from the ego car, and the motion of all targets. Properly exploring and utilizing these implicit interactions will promote the development of Autonomous Vehicles. In this work, two novel attributes are introduced: the pedestrian’s location on the road or sidewalk, and the relative distance from the target pedestrian to the ego car, both derived from the semantic map and depth map combined with bounding boxes. A hybrid multi-modal prediction network is proposed to capture the interactions between all the features and predict pedestrian crossing intention. Evaluated on two public pedestrian crossing datasets, PIE and JAAD, the proposed hybrid framework outperforms the state of the art by about 3% in accuracy.
APA
Dong, M. (2024). Pedestrian Cross Forecasting with Hybrid Feature Fusion. Proceedings of the 15th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 222:327-342. Available from https://proceedings.mlr.press/v222/dong24a.html.
