CooPercept: Cooperative Perception for 3D Object Detection of Autonomous Vehicles
Abstract
1. Introduction
- To meet the requirements of cooperative perception in terms of spatial coverage and stringent computation time, we propose a lightweight and effective sequential LiDAR–camera fusion method based on image semantic segmentation. The model can be deployed on multiple autonomous vehicles, and the fused data are shared among vehicles for cooperative object detection (see the first sketch after this list).
- To overcome the aforementioned deficiencies of existing works, we develop a cooperative 3D object detection method based on feature-level data fusion that integrates processed point cloud features with image semantic information from multiple vehicles. By fusing voxel features, our method identifies more potential objects than post-detection fusion, while the volume of transmitted and computed data is significantly smaller than in raw data-level fusion methods (see the second sketch after this list).
- We conduct comprehensive experiments; the results show that the proposed approach improves the detection precision and robustness of connected autonomous vehicles at acceptable computational overhead and communication cost. By augmenting the point cloud with image semantics, the proposed method combines the advantages of LiDAR and camera data. The detection performance of the cooperative perception scheme based on voxel feature fusion compares favorably with that of raw-data fusion schemes, while the volume of data transmitted for feature fusion is far smaller than that of the raw data.
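The sequential LiDAR–camera fusion step can be pictured as decorating each LiDAR point with the semantic scores of the image pixel it projects onto. The following is a minimal sketch of that idea under assumed inputs (a calibrated camera–LiDAR pair and a 2D segmentation network); the function name, array shapes, and calibration conventions are illustrative, not the paper's implementation.

```python
import numpy as np

def paint_points(points, seg_scores, lidar_to_cam, cam_intrinsics):
    """Append per-pixel semantic scores to LiDAR points (illustrative sketch).

    points:         (N, 4) x, y, z, intensity in the LiDAR frame.
    seg_scores:     (H, W, C) class scores from a 2D semantic segmentation net.
    lidar_to_cam:   (4, 4) extrinsic transform, LiDAR frame -> camera frame.
    cam_intrinsics: (3, 3) camera projection matrix.
    """
    H, W, _ = seg_scores.shape
    # Move points into the camera frame (homogeneous coordinates).
    xyz1 = np.hstack([points[:, :3], np.ones((len(points), 1))])
    cam_pts = (lidar_to_cam @ xyz1.T).T[:, :3]
    # Keep only points in front of the camera.
    front = cam_pts[:, 2] > 0
    points, cam_pts = points[front], cam_pts[front]
    # Project onto the image plane and round to pixel indices.
    uvw = (cam_intrinsics @ cam_pts.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    # Discard points that project outside the image.
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Concatenate each remaining point with the scores of the pixel it hits.
    return np.hstack([points[inside], seg_scores[v[inside], u[inside]]])
```

The segmentation scores give the downstream 3D detector appearance cues that are hard to recover from geometry alone, at the cost of only a few extra channels per point.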
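Cross-CAV voxel feature fusion then merges spatially aligned voxel feature grids from several vehicles into a single grid before detection. The sketch below assumes the grids have already been warped into the ego vehicle's frame and uses an element-wise max as the fusion operator (one common choice for feature-level fusion, not necessarily the paper's); all names and shapes are illustrative.

```python
import numpy as np

def fuse_voxel_features(feature_maps, masks):
    """Element-wise max fusion of spatially aligned voxel feature grids.

    feature_maps: list of (D, H, W, F) float arrays, one per CAV.
    masks:        list of (D, H, W) boolean arrays marking non-empty voxels.
    """
    stacked = np.stack(feature_maps)        # (V, D, H, W, F)
    valid = np.stack(masks)[..., None]      # (V, D, H, W, 1)
    # Suppress empty voxels so they never win the max.
    stacked = np.where(valid, stacked, -np.inf)
    fused = stacked.max(axis=0)
    # Voxels empty in every CAV fall back to zero features.
    return np.where(np.isinf(fused), 0.0, fused)

# Toy usage: three vehicles sharing 2 x 4 x 4 grids with 8-dim features.
maps = [np.random.rand(2, 4, 4, 8) for _ in range(3)]
masks = [np.random.rand(2, 4, 4) > 0.5 for _ in range(3)]
fused = fuse_voxel_features(maps, masks)    # shape (2, 4, 4, 8)
```

Taking the per-voxel maximum keeps the strongest response regardless of which vehicle observed it, which is how fused features can reveal objects that are occluded in any single vehicle's view.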
2. System Model and Fusion Schemes
2.1. Self-Data Processing
2.1.1. LiDAR–Camera Fusion
2.1.2. Voxel Feature Encoding
2.1.3. Compression
2.2. Cross-CAV Fusion
2.2.1. Temporal and Spatial Alignment
2.2.2. Voxel Feature Fusion
2.3. Output Network
3. Experiments
3.1. Dataset
3.2. Test Scenarios
3.3. Experiment Setup
3.4. Baselines
3.5. Evaluation of CooPercept
3.5.1. Comparison with Benchmarks
3.5.2. Fusion Robustness
3.5.3. Transmission and Computation
3.6. Qualitative Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Detection precision (%) of CooPercept and the baselines for near and far objects at IoU thresholds of 0.5 and 0.7 in the three test scenarios:

| Scenario | IoU | F-PointNet Near | F-PointNet Far | F-ConvNet Near | F-ConvNet Far | LiDAR Fusion Near | LiDAR Fusion Far | Output Fusion Near | Output Fusion Far | CooPercept-a Near | CooPercept-a Far | CooPercept-b Near | CooPercept-b Far | CooPercept Near | CooPercept Far |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Road Intersections | 0.5 | 72.24 | 33.73 | 74.46 | 32.87 | 81.25 | 60.34 | 72.57 | 52.66 | 78.46 | 57.46 | 76.85 | 43.33 | 80.50 | 62.72 |
| Road Intersections | 0.7 | 63.75 | 24.66 | 66.53 | 28.75 | 72.64 | 53.50 | 66.46 | 44.74 | 67.19 | 50.94 | 70.64 | 35.46 | 73.67 | 56.28 |
| Multi-lane Roads | 0.5 | 68.06 | 40.14 | 70.97 | 37.13 | 83.54 | 62.44 | 69.60 | 55.35 | 77.07 | 55.84 | 72.32 | 40.38 | 78.15 | 68.73 |
| Multi-lane Roads | 0.7 | 60.28 | 32.36 | 63.86 | 30.02 | 77.72 | 59.06 | 60.41 | 44.65 | 69.52 | 52.48 | 65.13 | 37.52 | 76.14 | 60.78 |
| Rain | 0.5 | 50.06 | 22.14 | 51.97 | 24.13 | 62.54 | 50.44 | 52.60 | 30.35 | 54.07 | 33.21 | 59.62 | 30.50 | 66.33 | 58.19 |
| Rain | 0.7 | 35.28 | 16.36 | 32.86 | 17.02 | 53.77 | 37.06 | 36.41 | 27.65 | 45.52 | 30.15 | 47.48 | 22.25 | 61.70 | 50.12 |