Pedestrian Cross Forecasting with Hybrid Feature Fusion
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:327-342, 2024.
Abstract
Forecasting the crossing intention of pedestrians is an essential task for the safe driving of Autonomous Vehicles (AVs) in the real world. Pedestrians’ behaviors are usually influenced by their surroundings in traffic scenes. Recent vision-based neural networks extract key information from images to perform prediction. However, there is much other critical information in the driving environment, such as the social and scene interactions in the driving area, the location of the target pedestrian and its distance from the ego car, and the motion of all targets. Properly exploring and exploiting these implicit interactions will promote the development of AVs. In this work, two novel attributes are introduced: the pedestrian’s location on the road or sidewalk, and the relative distance from the target pedestrian to the ego car. Both are derived from the semantic map and depth map combined with bounding boxes. A hybrid multi-modal prediction network is proposed to capture the interactions between all the features and predict pedestrian crossing intention. Evaluated on two public pedestrian crossing datasets, PIE and JAAD, the proposed hybrid framework outperforms the state of the art by about 3% in accuracy.
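As a rough illustration of how the two attributes could be read off a semantic map and a depth map given a bounding box, consider the minimal sketch below. It is not the authors' implementation: the class ids `ROAD_ID` and `SIDEWALK_ID`, the function names, the `foot_rows` strip under the box, and the use of the median depth are all assumptions made for the example.

```python
import numpy as np

# Hypothetical class ids; real segmentation label maps differ by dataset.
ROAD_ID = 0
SIDEWALK_ID = 1


def location_attribute(semantic_map, box, foot_rows=3):
    """Classify the ground under a pedestrian as 'road', 'sidewalk', or 'other'.

    semantic_map: 2D integer array of per-pixel class ids (rows, cols).
    box: (x1, y1, x2, y2) pixel coordinates, with x2 and y2 exclusive.
    foot_rows: assumed height of the strip inspected at the box's bottom edge.
    """
    x1, y1, x2, y2 = box
    height = semantic_map.shape[0]
    # Inspect a thin strip at the bottom edge of the box, where the feet
    # are assumed to touch the ground, clipped to the image.
    top = min(y2, height - 1)
    bottom = min(y2 + foot_rows, height)
    strip = semantic_map[top:bottom, x1:x2]
    road = np.count_nonzero(strip == ROAD_ID)
    sidewalk = np.count_nonzero(strip == SIDEWALK_ID)
    if road == 0 and sidewalk == 0:
        return "other"
    return "road" if road >= sidewalk else "sidewalk"


def relative_distance(depth_map, box):
    """Estimate pedestrian-to-ego distance as the median depth inside the box."""
    x1, y1, x2, y2 = box
    patch = depth_map[y1:y2, x1:x2]
    return float(np.median(patch))
```

The median is used here only because it is less sensitive than the mean to background pixels that fall inside the box; either attribute could then be fed to a prediction network alongside the other features.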