CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion

Tobias Fischer, Yung-Hsu Yang, Suryansh Kumar, Min Sun, Fisher Yu
Proceedings of The 6th Conference on Robot Learning, PMLR 205:2294-2305, 2023.

Abstract

To track the 3D locations and trajectories of other traffic participants at any given time, modern autonomous vehicles are equipped with multiple cameras that cover the vehicle's full surroundings. Yet, camera-based 3D object tracking methods prioritize optimizing the single-camera setup and resort to post-hoc fusion in a multi-camera setup. In this paper, we propose a method for panoramic 3D object tracking, called CC-3DT, that associates and models object trajectories both temporally and across views, and improves the overall tracking consistency. In particular, our method fuses 3D detections from multiple cameras before association, significantly reducing identity switches and improving motion modeling. Our experiments on large-scale driving datasets show that fusion before association leads to a large margin of improvement over post-hoc fusion. We set a new state of the art among all camera-based methods on the competitive NuScenes 3D tracking benchmark, improving average multi-object tracking accuracy (AMOTA) by 12.6% and outperforming previously published methods by 6.5% in AMOTA with the same 3D detector.
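The key mechanism the abstract describes, fusing 3D detections from all cameras into a common reference frame before association rather than merging per-camera tracks afterwards, can be illustrated with a short sketch. This is not the authors' implementation; the names Detection, to_vehicle_frame, fuse_then_associate, and the associate callback are hypothetical, and the heading (yaw) handling is simplified.

    # Minimal sketch of fuse-before-associate. All names are illustrative,
    # not CC-3DT's actual API.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Detection:
        box3d: np.ndarray       # (x, y, z, w, l, h, yaw), initially in camera coordinates
        score: float
        embedding: np.ndarray   # appearance feature used for association

    def to_vehicle_frame(det: Detection, cam_to_vehicle: np.ndarray) -> Detection:
        """Re-express a camera-frame 3D box center in the shared vehicle frame.
        (A full implementation would also rotate the yaw angle; omitted here.)"""
        center_h = cam_to_vehicle @ np.append(det.box3d[:3], 1.0)  # homogeneous transform
        box = det.box3d.copy()
        box[:3] = center_h[:3]
        return Detection(box, det.score, det.embedding)

    def fuse_then_associate(per_camera_dets, extrinsics, tracks, associate):
        # 1) Fuse: bring every camera's detections into one common frame first ...
        fused = [to_vehicle_frame(d, extrinsics[cam])
                 for cam, dets in per_camera_dets.items()
                 for d in dets]
        # 2) ... then run a single association step against existing tracks,
        #    instead of tracking per camera and merging identities post hoc.
        return associate(fused, tracks)

Because association sees one fused detection set per timestep, an object moving from one camera's view into another's is matched to the same track directly, which is what reduces identity switches.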

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-fischer23a,
  title     = {CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion},
  author    = {Fischer, Tobias and Yang, Yung-Hsu and Kumar, Suryansh and Sun, Min and Yu, Fisher},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {2294--2305},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/fischer23a/fischer23a.pdf},
  url       = {https://proceedings.mlr.press/v205/fischer23a.html}
}
APA
Fischer, T., Yang, Y., Kumar, S., Sun, M. & Yu, F. (2023). CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:2294-2305. Available from https://proceedings.mlr.press/v205/fischer23a.html.
