Enhanced Object Detection in Autonomous Vehicles through LiDAR—Camera Sensor Fusion
Abstract
1. Introduction
- The design of a LiDAR—camera fusion strategy for object detection is presented. First, the 3D object detection boxes from the LiDAR point cloud are projected onto the image plane using the joint calibration results. A target-box IoU matching strategy based on center-point distance probability then associates each projected 2D point cloud box with a camera detection box, and Dempster–Shafer (D–S) evidence theory fuses the class confidences to produce the final detection result; a minimal illustrative sketch of this matching-and-fusion step is given after this list.
- To address identity (ID) switches, which occur when a target is occluded, the DeepSORT algorithm is improved by adding an unscented Kalman filter that more accurately predicts the nonlinear motion state of the target. The IoU matching module also incorporates target motion information to improve matching accuracy during data association.
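The sketch below illustrates, in Python, the general shape of the matching-and-fusion step: projected LiDAR boxes and camera boxes are associated with the Hungarian algorithm using an IoU score weighted by a center-point distance probability, and the matched class confidences are combined with Dempster's rule over the frame {pedestrian, car}. The Gaussian distance weighting, the sigma parameter, the matching threshold, and the mass-function layout are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: the Gaussian distance weighting, the matching threshold,
# and the mass-function layout are assumptions, not the paper's exact formulation.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def center_distance_prob(a, b, sigma=50.0):
    """Gaussian weight on the distance between box centers (sigma in pixels, assumed)."""
    ca = np.array([(a[0] + a[2]) / 2.0, (a[1] + a[3]) / 2.0])
    cb = np.array([(b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0])
    return float(np.exp(-np.sum((ca - cb) ** 2) / (2.0 * sigma ** 2)))

def match_boxes(lidar_boxes, cam_boxes, min_score=0.3):
    """Hungarian matching on an IoU score weighted by center-point distance probability."""
    score = np.zeros((len(lidar_boxes), len(cam_boxes)))
    for i, lb in enumerate(lidar_boxes):
        for j, cb in enumerate(cam_boxes):
            score[i, j] = iou(lb, cb) * center_distance_prob(lb, cb)
    rows, cols = linear_sum_assignment(-score)        # maximize the combined score
    return [(i, j) for i, j in zip(rows, cols) if score[i, j] >= min_score]

def dempster_fuse(m1, m2):
    """Dempster's rule over {'pedestrian', 'car'} with 'unknown' as the full frame."""
    classes = ("pedestrian", "car")
    conflict = sum(m1[a] * m2[b] for a in classes for b in classes if a != b)
    norm = 1.0 - conflict
    fused = {c: (m1[c] * m2[c] + m1[c] * m2["unknown"] + m1["unknown"] * m2[c]) / norm
             for c in classes}
    fused["unknown"] = m1["unknown"] * m2["unknown"] / norm
    return fused

# Example: agreeing camera and LiDAR evidence reinforces the 'car' hypothesis
# and shrinks the residual uncertainty.
cam = {"pedestrian": 0.06, "car": 0.90, "unknown": 0.04}
lid = {"pedestrian": 0.05, "car": 0.92, "unknown": 0.03}
print(dempster_fuse(cam, lid))
```

In this layout, the mass assigned to the full frame ("unknown") plays the role of the per-sensor uncertainty, so agreement between camera and LiDAR drives the fused uncertainty down, consistent with the lower fused uncertainty reported in the tables of Section 4.1.4.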
2. Methods
2.1. Overall Framework
2.2. Fusion Detection
2.2.1. Target Box IoU Matching Strategy
2.2.2. D–S Theory for Class Confidence Fusion
2.3. Improved DeepSORT for Object Tracking
2.3.1. Unscented Kalman Filter State Estimation
2.3.2. Improving Data Association in IoU Matching Modules
3. Experimental Preparation and Data Introduction
4. Experimental Results and Analysis
4.1. Experimental Analysis of Moving Object Detection
4.1.1. Evaluation Indicators
4.1.2. Experiment Using Cameras
4.1.3. Experiment Using LiDAR
4.1.4. LiDAR—Camera Fusion
4.2. Experimental Analysis of Moving Object Tracking
4.2.1. Evaluation Indicators
4.2.2. Experiment for Pedestrians
4.2.3. Experiment for Cars
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Badue, C.; Guidolini, R.; Carneiro, R.V.; Azevedo, P.; Cardoso, V.B.; Forechi, A.; Jesus, L.; Berriel, R.; Paixao, T.M.; Mutz, F.; et al. Self-driving cars: A survey. Expert Syst. Appl. 2021, 165, 113816. [Google Scholar] [CrossRef]
- Bishop, R. Intelligent vehicle applications worldwide. IEEE Intell. Syst. Their Appl. 2000, 15, 78–81. [Google Scholar] [CrossRef]
- Lan, Y.; Huang, J.; Chen, X. Environmental perception for information and immune control algorithm of miniature intelligent vehicle. Int. J. Control Autom. 2017, 10, 221–232. [Google Scholar] [CrossRef]
- Mozaffari, S.; Al-Jarrah, O.Y.; Dianati, M.; Jennings, P.; Mouzakitis, A. Deep Learning-Based Vehicle Behavior Prediction for Autonomous Driving Applications: A Review. IEEE Trans. Intell. Transp. Syst. 2020, 23, 33–47. [Google Scholar] [CrossRef]
- Mehra, A.; Mandal, M.; Narang, P.; Chamola, V. ReViewNet: A Fast and Resource Optimized Network for Enabling Safe Autonomous Driving in Hazy Weather Conditions. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4256–4266. [Google Scholar] [CrossRef]
- Liu, X.; Baiocchi, O. A comparison of the definitions for smart sensors, smart objects and Things in IoT. In Proceedings of the 2016 IEEE 7th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, Canada, 13–15 October 2016. [Google Scholar]
- Fayyad, J.; Jaradat, M.A.; Gruyer, D.; Najjaran, H. Deep Learning Sensor Fusion for Autonomous Vehicle Perception and Localization: A Review. Sensors 2020, 20, 4220. [Google Scholar] [CrossRef] [PubMed]
- Chen, X.D.; Zhang, J.C.; Pang, W.S.; Ai, D.H.; Wang, Y.; Cai, H.Y. Key Technology and Application Algorithm of Intelligent Driving Vehicle LiDAR. Opto-Electron. Eng. 2019, 46, 190182. [Google Scholar] [CrossRef]
- Fan, J.; Huang, Y.; Shan, J.; Zhang, S.; Zhu, F. Extrinsic calibration between a camera and a 2D laser rangefinder using a photogrammetric control field. Sensors 2019, 19, 2030. [Google Scholar] [CrossRef]
- Vivet, D.; Debord, A.; Pagès, G. PAVO: A Parallax based Bi-Monocular VO Approach for Autonomous Navigation in Various Environments. In Proceedings of the DISP Conference, St Hugh College, Oxford, UK, 29–30 April 2019. [Google Scholar]
- Mishra, S.; Osteen, P.R.; Pandey, G.; Saripalli, S. Experimental Evaluation of 3D-LIDAR Camera Extrinsic Calibration. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 9020–9026. [Google Scholar] [CrossRef]
- Kanezaki, A.; Suzuki, T.; Harada, T.; Kuniyoshi, Y. Fast object detection for robots in a cluttered indoor environment using integral 3D feature table. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 4026–4033. [Google Scholar] [CrossRef]
- Jeong, J.; Cho, Y.; Kim, A. The road is enough! Extrinsic calibration of non-overlapping stereo camera and LiDAR using road information. IEEE Robot. Autom. Lett. 2019, 4, 2831–2838. [Google Scholar] [CrossRef]
- Lv, X.; Wang, B.; Ye, D.; Wang, S. LCCNet: LiDAR and Camera Self-Calibration using Cost Volume Network. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, 19–25 June 2021; pp. 2888–2895. [Google Scholar] [CrossRef]
- Wu, X.; Zhang, C.; Liu, Y. Calibrank: Effective Lidar-Camera Extrinsic Calibration by Multi-Modal Learning to Rank. In Proceedings of the IEEE International Conference on Image Processing, Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 3189–3193. [Google Scholar]
- Gong, X.; Lin, Y.; Liu, J. Extrinsic calibration of a 3D LIDAR and a camera using a trihedron. Opt. Lasers Eng. 2013, 51, 394–401. [Google Scholar] [CrossRef]
- Li, M.L.; Dai, B.; Li, Z.; He, H.B. High-precision Calibration of Placement Parameters between a Ground 3D Laser Scanner and an External Digital Camera. Opt. Precis. Eng. 2016, 24, 2158–2166. [Google Scholar] [CrossRef]
- Cao, M.W.; Qian, Y.Q.; Wang, B.; Wang, X.; Yu, X.Y. Joint Calibration of Panoramic Camera and LiDAR Based on Supervised Learning. arXiv 2018, arXiv:1709.029261. [Google Scholar]
- Yoo, J.H.; Kim, Y.; Kim, J.; Choi, J.W. 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection. arXiv 2020, arXiv:2004.12636v2. [Google Scholar]
- Shahian Jahromi, B.; Tulabandhula, T.; Cetin, S. Real-Time Hybrid Multi-Sensor Fusion Framework for Perception in Autonomous Vehicles. Sensors 2019, 19, 4357. [Google Scholar] [CrossRef]
- Wu, Q.; Li, X.; Wang, K.; Bilal, H. Regional feature fusion for on-road detection of objects using camera and 3D-LiDAR in high-speed autonomous vehicles. Soft Comput. 2023, 27, 18195–18213. [Google Scholar] [CrossRef]
- Arikumar, K.S.; Deepak Kumar, A.; Gadekallu, T.R.; Prathiba, S.B.; Tamilarasi, K. Real-Time 3D Object Detection and Classification in Autonomous Driving Environment Using 3D LiDAR and Camera Sensors. Electronics 2022, 11, 4203. [Google Scholar] [CrossRef]
- Chen, X.; Ma, H.; Wan, J.; Li, B.; Xia, T. Multi-view 3d object detection network for autonomous driving. In Proceedings of the CVPR, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Kim, J.; Kim, J.; Cho, J. An advanced object classification strategy using YOLO through camera and LiDAR sensor fusion. In Proceedings of the 2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS), Gold Coast, Australia, 16–18 December 2019. [Google Scholar]
- Wang, Y.; Liu, X.; Zhao, Q.; He, H.; Yao, Z. Target Detection for Construction Machinery Based on Deep Learning and Multi-source Data Fusion. IEEE Sens. J. 2023, 23, 11070–11081. [Google Scholar] [CrossRef]
- Xu, D.; Anguelov, D.; Jain, A. PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation. arXiv 2018, arXiv:1711.10871v2. [Google Scholar]
- Wang, J.; Liu, F. Temporal evidence combination method for multi-sensor target recognition based on DS theory and IFS. J. Syst. Eng. Electron. 2017, 28, 1114–1125. [Google Scholar] [CrossRef]
- Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple online and realtime tracking. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468. [Google Scholar]
- Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3645–3649. [Google Scholar]
- Wang, X.; Fu, C.; Li, Z.; Lai, Y.; He, J. DeepFusionMOT: A 3D multi-object tracking framework based on camera-LiDAR fusion with deep association. IEEE Robot. Autom. Lett. 2022, 7, 8260–8267. [Google Scholar] [CrossRef]
- Wang, L.; Zhang, X.; Qin, W.; Li, X.; Gao, J.; Yang, L.; Liu, H. Camo-mot: Combined appearance-motion optimization for 3d multi-object tracking with camera-lidar fusion. IEEE Trans. Intell. Transp. Syst. 2023, 24, 11981–11996. [Google Scholar] [CrossRef]
- Chen, M.; Ren, Y.; Ou, M. Adaptive Robust Path Tracking Control for Autonomous Vehicles Considering Multi-Dimensional System Uncertainty. World Electr. Veh. J. 2023, 14, 11. [Google Scholar] [CrossRef]
- Hosseinzadeh, M.; Sinopoli, B.; Bobick, A.F. Toward Safe and Efficient Human–Robot Interaction via Behavior-Driven Danger Signaling. IEEE Trans. Control Syst. Technol. 2024, 32, 1. [Google Scholar] [CrossRef]
- Zhao, J.; Xu, H.; Liu, H.; Wu, J.; Zheng, Y.; Wu, D. Detection and tracking of pedestrians and vehicles using roadside LiDAR sensors. Transp. Res. Part C Emerg. Technol. 2019, 100, 68–87. [Google Scholar] [CrossRef]
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The kitti dataset. Int. J. Robot. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef]
Pseudo-Code: Unscented Kalman Filtering

```
Initialization:
    Select the number and spread of the sigma points
    Assign a weight W[i] to each sigma point
    Initialize the state vector X and the covariance matrix P

For each time step t:

    Prediction step:
        Generate sigma points X_sigma[i] from X and P
        For each sigma point:
            X_sigma[i] = f(X_sigma[i])                        // nonlinear motion model
        X_pred = sum(W[i] * X_sigma[i])                       // predicted state mean
        P_pred = sum(W[i] * (X_sigma[i] - X_pred) * (X_sigma[i] - X_pred)') + Q
                                                              // predicted covariance (Q: process noise)

    Update step:
        For each sigma point:
            Z_sigma[i] = h(X_sigma[i])                        // nonlinear measurement model
        Z_pred = sum(W[i] * Z_sigma[i])                       // predicted measurement mean
        S = sum(W[i] * (Z_sigma[i] - Z_pred) * (Z_sigma[i] - Z_pred)') + R
                                                              // innovation covariance (R: measurement noise)
        P_xz = sum(W[i] * (X_sigma[i] - X_pred) * (Z_sigma[i] - Z_pred)')
                                                              // state-measurement cross-covariance
        K = P_xz * inv(S)                                     // Kalman gain (no Jacobian H is required)
        X = X_pred + K * (Z - Z_pred)                         // state update
        P = P_pred - K * S * K'                               // covariance update

    Use the updated X and P in the next iteration
```
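For reference, the NumPy sketch below implements one predict/update cycle of the unscented Kalman filter described by the pseudo-code. The constant-velocity state layout, the position-only measurement, the noise matrices, and the sigma-point parameters are illustrative assumptions, not the configuration used by the improved DeepSORT tracker in this paper.

```python
# Minimal NumPy sketch of one UKF predict/update cycle, assuming a 2D
# constant-velocity state x = [px, py, vx, vy] and a position-only measurement.
import numpy as np

def sigma_points(x, P, alpha=0.5, beta=2.0, kappa=0.0):
    """Merwe-scaled sigma points and their mean/covariance weights."""
    n = x.size
    lam = alpha ** 2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)        # matrix square root of the scaled covariance
    pts = np.vstack([x, x + L.T, x - L.T])       # 2n + 1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    return pts, wm, wc

def unscented_transform(pts, wm, wc, noise):
    """Weighted mean and covariance of transformed sigma points."""
    mean = wm @ pts
    dev = pts - mean
    cov = dev.T @ (wc[:, None] * dev) + noise
    return mean, cov, dev

def ukf_step(x, P, z, fx, hx, Q, R):
    """One prediction + update cycle for (possibly nonlinear) fx and hx."""
    pts, wm, wc = sigma_points(x, P)
    pts_f = np.array([fx(p) for p in pts])                     # propagate through motion model
    x_pred, P_pred, dx = unscented_transform(pts_f, wm, wc, Q)
    pts_h = np.array([hx(p) for p in pts_f])                   # propagate through measurement model
    z_pred, S, dz = unscented_transform(pts_h, wm, wc, R)
    P_xz = dx.T @ (wc[:, None] * dz)                           # cross-covariance (replaces H)
    K = P_xz @ np.linalg.inv(S)                                # Kalman gain
    return x_pred + K @ (z - z_pred), P_pred - K @ S @ K.T

# Example with a linear constant-velocity model; any nonlinear fx/hx works the same way.
dt = 0.1
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
fx = lambda s: F @ s                 # motion model
hx = lambda s: s[:2]                 # measure position only
x, P = np.zeros(4), np.eye(4)
Q, R = 0.01 * np.eye(4), 0.25 * np.eye(2)
x, P = ukf_step(x, P, np.array([1.0, 0.8]), fx, hx, Q, R)
print(x)
```

Because the Kalman gain is formed from the sigma-point cross-covariance rather than a Jacobian, no linearization of the motion or measurement model is needed, which is what makes the filter suitable for the nonlinear target motion addressed in Section 2.3.1.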
Item | Parameter |
---|---|
Model | Q20 |
Maximum resolution | 1920 × 1800 |
Pixels | 4 MP |
Frame rate | 30 FPS |
Item | Parameter |
---|---|
Number of lines | 16 |
Frame rate | 10 Hz |
Laser wavelength | 905 nm |
Range capability | 150 m |
Accuracy | ±2.0 cm |
HFOV | 360° |
VFOV | 30° |
Horizontal resolution | 0.4° |
Vertical resolution | 2.0° |
Item | Parameter |
---|---|
Operating system | Ubuntu 18.04 |
CPU | Intel(R) Core i5-12400 |
Memory | 16 GB |
GPU | NVIDIA GeForce RTX 3060 |
Graphics memory | 12 GB |
CUDA version | CUDA 11.1 + cuDNN 8.6.0 |
Development language | Python 3.8 |
Deep learning framework version | PyTorch 1.12 |
Scene | Category | Precision (%) | FP Rate (%) | Miss Rate (%) |
---|---|---|---|---|
Day | Car | 92.73 | 0.83 | 1.04 |
Day | Pedestrian | 91.24 | 1.48 | 2.97 |
Night | Car | 89.54 | 3.06 | 3.18 |
Night | Pedestrian | 86.38 | 7.83 | 4.26 |
Object | Sensor | Probability of Pedestrian | Probability of Car | Uncertainty |
---|---|---|---|---|
Car | Camera | 0.061 | 0.903 | 0.036 |
Car | LiDAR | 0.048 | 0.925 | 0.024 |
Car | LiDAR—camera fusion | 0.012 | 0.973 | 0.004 |
Pedestrian | Camera | 0.934 | 0.042 | 0.021 |
Pedestrian | LiDAR | 0.837 | 0.117 | 0.047 |
Pedestrian | LiDAR—camera fusion | 0.954 | 0.020 | 0.005 |
Object | Sensor | Probability of Pedestrian | Probability of Car | Uncertainty |
---|---|---|---|---|
Car | Camera | 0.126 | 0.832 | 0.038 |
Car | LiDAR | 0.062 | 0.915 | 0.025 |
Car | LiDAR—camera fusion | 0.003 | 0.941 | 0.005 |
Pedestrian | Camera | 0.893 | 0.042 | 0.031 |
Pedestrian | LiDAR | 0.834 | 0.123 | 0.051 |
Pedestrian | LiDAR—camera fusion | 0.925 | 0.023 | 0.007 |
Methods | MOTA | MOTP | HOTA | IDF1 |
---|---|---|---|---|
SORT | 0.49 | 0.62 | 0.46 | 0.51 |
ByteTrack | 0.60 | 0.71 | 0.52 | 0.57 |
DeepSORT | 0.56 | 0.74 | 0.53 | 0.59 |
Improved DeepSORT | 0.66 | 0.79 | 0.61 | 0.72 |
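For context on the table above, the sketch below computes MOTA, MOTP, and IDF1 from aggregate counts using their standard definitions; the example counts are invented for illustration and do not come from this paper's experiments. HOTA is left out because it aggregates detection and association accuracy over localization thresholds rather than from simple counts.

```python
# Standard definitions of the tracking metrics reported above, computed from
# aggregate counts. Example counts are made up for illustration only.
def mota(fn: int, fp: int, id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / number of ground-truth objects."""
    return 1.0 - (fn + fp + id_switches) / num_gt

def motp(total_overlap: float, num_matches: int) -> float:
    """MOTP as the mean localization score (e.g., IoU) over matched detections."""
    return total_overlap / num_matches

def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)

print(mota(fn=120, fp=80, id_switches=15, num_gt=1000))   # 0.785
print(idf1(idtp=820, idfp=140, idfn=180))                  # ~0.837
```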
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cite as: Dai, Z.; Guan, Z.; Chen, Q.; Xu, Y.; Sun, F. Enhanced Object Detection in Autonomous Vehicles through LiDAR—Camera Sensor Fusion. World Electr. Veh. J. 2024, 15, 297. https://doi.org/10.3390/wevj15070297