Evaluation of 3D Vulnerable Objects’ Detection Using a Multi-Sensors System for Autonomous Vehicles
Abstract
1. Introduction
- Self-localization.
- Environment recognition.
- Motion prediction.
- Decision making.
- Trajectory generation.
- Ego-vehicle control.
1.1. Automation Levels
1.2. Autonomous Vehicles’ Sensory Systems
1.3. Related Work
1.3.1. Images Acquired by Cameras
1.3.2. Point Clouds Acquired by LiDARs
- Projection of the point cloud onto a 2D plane so that 2D detection frameworks can be applied to the projected images and 3D localization can be recovered from them (a minimal sketch of such a projection follows this item).
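As an illustration of this projection strategy, the sketch below rasterizes a 3D point cloud into a bird's-eye-view occupancy image on which a 2D detector could then run. The grid extents, cell size, and function name are illustrative assumptions, not values or code taken from the works cited here.

```python
import numpy as np

def project_to_bev(points, x_range=(0.0, 30.0), y_range=(-15.0, 15.0), cell=0.1):
    """points: (N, 3) array of (x, y, z) in metres. Returns a 2D occupancy image
    on which a 2D detector could be run; extents and cell size are illustrative."""
    h = int((x_range[1] - x_range[0]) / cell)
    w = int((y_range[1] - y_range[0]) / cell)
    grid = np.zeros((h, w), dtype=np.uint8)
    # Keep only points inside the chosen ground-plane extents.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    rows = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / cell).astype(int)
    grid[rows, cols] = 255   # mark occupied cells
    return grid
```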
1.3.3. Sensor Fusion
1.4. Paper Organization
2. Real-Time Object Detection
2.1. Deep-Learning-Based Object Detection
- Variable weather and lighting conditions.
- Reflective objects.
- Diverse object sizes.
- Occlusion and truncation of obstacles.
2.2. Overlapping Detection
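A common way to decide whether two detections overlap is the intersection-over-union (IoU) of their bounding boxes. The sketch below assumes axis-aligned 2D boxes in pixel coordinates and an illustrative 0.5 threshold; it is a generic formulation, not a verbatim reproduction of the overlap handling used in this work.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def is_overlapping(box_a, box_b, threshold=0.5):
    """Treat two detections as overlapping when their IoU reaches an illustrative threshold."""
    return iou(box_a, box_b) >= threshold
```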
3. LiDAR Data Processing
3.1. Hokuyo UTM-30LX
- Range: 30 m.
- Scanning angle: 270°.
- Angular resolution: 0.25°.
- Light source: 870 nm semiconductor laser (Laser Class 1).
- Supply voltage: 12 VDC ± 10%.
- Supply current: 1 A maximum (0.7 A typical).
- Power consumption: less than 8 W.
3.2. Conversion of Radial Measurements into Perpendicular Measurements
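Assuming the standard polar-to-Cartesian conversion (the axis convention used in this work may differ), each radial measurement can be converted into a perpendicular (forward) distance and a lateral offset using the scanning angle and angular resolution listed in Section 3.1. The sketch below is illustrative only; the function name is an assumption.

```python
import math

# UTM-30LX scan geometry from the datasheet values listed in Section 3.1.
SCAN_ANGLE_DEG = 270.0     # total scanning angle
ANGULAR_RES_DEG = 0.25     # angular resolution
NUM_BEAMS = int(SCAN_ANGLE_DEG / ANGULAR_RES_DEG) + 1  # 1081 measurements per scan

def beam_to_xy(index, distance_m):
    """Convert one radial measurement (beam index, range in metres) to (x, y),
    where x is the perpendicular (forward) distance and y the lateral offset."""
    assert 0 <= index < NUM_BEAMS
    angle_deg = -SCAN_ANGLE_DEG / 2.0 + index * ANGULAR_RES_DEG   # -135 deg ... +135 deg
    angle_rad = math.radians(angle_deg)
    return distance_m * math.cos(angle_rad), distance_m * math.sin(angle_rad)
```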
3.3. Linearization and Smoothing
3.4. Grouping of LiDAR Measurements into Clusters with Unique IDs
- Minimum cluster size: to avoid creating numerous unneeded mini-clusters that represent only sub-regions of objects, different minimum cluster sizes were tested; the smaller the minimum size, the more false clusters were created.
- Distance-difference threshold: a threshold on the difference between consecutive measurements that marks the boundary between consecutive clusters (see the sketch after this list).
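A minimal sketch of this grouping step is given below: it walks the scan once, starts a new cluster whenever the difference between consecutive measurements exceeds the threshold, and discards clusters smaller than the minimum size. The threshold and minimum size shown are illustrative defaults, not the tuned values used in this work.

```python
def cluster_scan(distances, diff_threshold=0.3, min_cluster_size=5):
    """Group consecutive range readings (metres) into clusters of beam indices.
    diff_threshold and min_cluster_size are illustrative, not the tuned values."""
    clusters, current = [], [0]
    for i in range(1, len(distances)):
        if abs(distances[i] - distances[i - 1]) <= diff_threshold:
            current.append(i)              # small jump: same cluster
        else:
            if len(current) >= min_cluster_size:
                clusters.append(current)   # keep only sufficiently large clusters
            current = [i]                  # large jump: start a new cluster
    if len(current) >= min_cluster_size:
        clusters.append(current)
    # Assign a unique ID to each surviving cluster.
    return [(cluster_id, indices) for cluster_id, indices in enumerate(clusters)]
```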
4. Camera and LiDAR Fusion
4.1. Sensor Placement
4.2. Mapping between Image and LiDAR Coordinates
- Two-dimensional bounding boxes drawn over image pixels.
- Object classes.
- A pixel x-coordinate (xPixel).
- Width of the frame (FrameWidth).
- Horizontal field of view of the camera (HFOV).
- The first hypotenuse runs from the camera to the edge of the image and forms an angle (θ) with (ℓ).
- The second hypotenuse runs from the camera to (xPixel) and forms an angle (ϕ) with (ℓ); the sketch after this list shows how (ϕ) can be recovered from these quantities.
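Under the assumption that (ℓ) is the perpendicular from the camera to the image plane and that (θ) equals half the horizontal field of view at the image edge, the angle (ϕ) of any pixel column follows from basic pinhole-camera trigonometry. The sketch below is a generic derivation under these assumptions rather than a verbatim reproduction of the mapping used here.

```python
import math

def pixel_to_angle(x_pixel, frame_width, hfov_deg):
    """Map a pixel x-coordinate to the horizontal angle (phi) relative to (l)."""
    theta = math.radians(hfov_deg) / 2.0                 # angle at the image edge
    l = (frame_width / 2.0) / math.tan(theta)            # length of the perpendicular (l)
    phi = math.atan((x_pixel - frame_width / 2.0) / l)   # angle of the pixel column
    return math.degrees(phi)

# Example: a 1280-pixel-wide frame with a 90° HFOV.
# pixel_to_angle(640, 1280, 90)  -> 0.0   (centre pixel lies on (l))
# pixel_to_angle(1280, 1280, 90) -> 45.0  (image edge lies at HFOV / 2)
```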
4.3. Complementary Camera and LiDAR Fusion
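As an illustration of what a complementary fusion step can look like, the sketch below keeps the class label and 2D bounding box from the camera detector and attaches a depth taken from the LiDAR cluster whose centre angle falls inside the box's angular span, using the pixel-to-angle mapping from Section 4.2. The data layout and helper names are assumptions made for this example only.

```python
import math

def _pixel_to_angle(x_pixel, frame_width, hfov_deg):
    # Same pinhole assumption as the Section 4.2 sketch.
    l = (frame_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((x_pixel - frame_width / 2.0) / l))

def fuse_detections(detections, clusters, frame_width, hfov_deg):
    """detections: dicts with 'label' and 'box' = (x1, y1, x2, y2) in pixels.
    clusters: dicts with 'angle_deg' (cluster centre) and 'range_m' (distance)."""
    fused = []
    for det in detections:
        x1, _, x2, _ = det['box']
        a1 = _pixel_to_angle(x1, frame_width, hfov_deg)
        a2 = _pixel_to_angle(x2, frame_width, hfov_deg)
        # Keep LiDAR clusters whose centre angle falls inside the box's angular span.
        matches = [c for c in clusters if a1 <= c['angle_deg'] <= a2]
        depth = min(c['range_m'] for c in matches) if matches else None
        fused.append({'label': det['label'], 'box': det['box'], 'depth_m': depth})
    return fused
```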
5. Results
5.1. Real-Time Object Detection
5.2. Processing of LiDAR Measurements
5.3. Adding a Third Dimension to Visual Bounding Boxes
6. Discussion
6.1. Conclusions
6.2. Limitations and Future Work
- The use of multiple cameras to cover a wider horizontal field of view without introducing significant image distortion.
- Since the KITTI dataset only has daytime driving data, we suggest evaluating the real-time image-based object detection module on the Waymo Open Dataset.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Department for Transport. The Pathway to Driverless Cars: Summary Report and Action Plan; Department for Transport: London, UK, 2015.
- Kaiwartya, O.; Abdullah, A.H.; Cao, Y.; Altameem, A.; Prasad, M.; Lin, C.; Liu, X. Internet of vehicles: Motivation, layered architecture, network model, challenges, and future aspects. IEEE Access 2016, 4, 5356–5373.
- Arena, F.; Pau, G. An overview of vehicular communications. Future Internet 2019, 11, 27.
- Ondruš, J.; Kolla, E.; Vertaľ, P.; Šarić, Ž. How Do Autonomous Cars Work? Transp. Res. Procedia 2020, 44, 226–233.
- Khatab, E.; Onsy, A.; Varley, M.; Abouelfarag, A. Vulnerable objects detection for autonomous driving: A review. Integration 2021, 78, 36–48.
- Ahangar, M.N.; Ahmed, Q.Z.; Khan, F.A.; Hafeez, M. A survey of autonomous vehicles: Enabling communication technologies and challenges. Sensors 2021, 21, 706.
- Zhu, H.; Yuen, K.; Mihaylova, L.; Leung, H. Overview of environment perception for intelligent vehicles. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2584–2601.
- van Brummelen, J.; O’Brien, M.; Gruyer, D.; Najjaran, H. Autonomous vehicle perception: The technology of today and tomorrow. Transp. Res. Part C Emerg. Technol. 2018, 89, 384–406.
- Yoneda, K.; Suganuma, N.; Yanase, R.; Aldibaja, M. Automated driving recognition technologies for adverse weather conditions. IATSS Res. 2019, 43, 253–262.
- SAE On-Road Automated Vehicle Standards Committee and Others, Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles; SAE International: Warrendale, PA, USA, 2018.
- Dai, D.; Chen, Z.; Bao, P.; Wang, J. A Review of 3D Object Detection for Autonomous Driving of Electric Vehicles. World Electr. Veh. J. 2021, 12, 139.
- Kovačić, K.; Ivanjko, E.; Gold, H. Computer vision systems in road vehicles: A review. arXiv 2013, arXiv:1310.0315.
- Ilas, C. Electronic sensing technologies for autonomous ground vehicles: A review. In Proceedings of the 2013 8th International Symposium on Advanced Topics in Electrical Engineering (ATEE), Bucharest, Romania, 23–25 May 2013; pp. 1–6.
- Aqel, M.O.; Marhaban, M.H.; Saripan, M.I.; Ismail, N.B. Review of visual odometry: Types, approaches, challenges, and applications. SpringerPlus 2016, 5, 1897.
- Shi, W.; Alawieh, M.B.; Li, X.; Yu, H. Algorithm and hardware implementation for visual perception system in autonomous vehicle: A survey. Integr. VLSI J. 2017, 59, 148–156.
- Campbell, S.; O’Mahony, N.; Krpalcova, L.; Riordan, D.; Walsh, J.; Murphy, A.; Ryan, C. Sensor technology in autonomous vehicles: A review. In Proceedings of the 2018 29th Irish Signals and Systems Conference (ISSC), Belfast, UK, 21–22 June 2018; pp. 1–4.
- Kocić, J.; Jovičić, N.; Drndarević, V. Sensors and sensor fusion in autonomous vehicles. In Proceedings of the 2018 26th Telecommunications Forum (TELFOR), Belgrade, Serbia, 20–21 November 2018; pp. 420–425.
- Rosique, F.; Navarro, P.J.; Fernández, C.; Padilla, A. A systematic review of perception system and simulators for autonomous vehicles research. Sensors 2019, 19, 648.
- Haris, M.; Glowacz, A. Road Object Detection: A Comparative Study of Deep Learning-Based Algorithms. Electronics 2021, 10, 1932.
- Yoon, K.; Song, Y.; Jeon, M. Multiple hypothesis tracking algorithm for multi-target multi-camera tracking with disjoint views. IET Image Process. 2018, 12, 1175–1184.
- Mousavian, A.; Anguelov, D.; Flynn, J.; Kosecka, J. 3D bounding box estimation using deep learning and geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7074–7082.
- Chen, X.; Kundu, K.; Zhang, Z.; Ma, H.; Fidler, S.; Urtasun, R. Monocular 3D object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2147–2156.
- Wang, Z.; Wu, Y.; Niu, Q. Multi-sensor fusion in automated driving: A survey. IEEE Access 2019, 8, 2847–2868.
- Asvadi, A.; Garrote, L.; Premebida, C.; Peixoto, P.; Nunes, U.J. Multimodal vehicle detection: Fusing 3D-LIDAR and color camera data. Pattern Recognit. Lett. 2018, 115, 20–29.
- Zhang, X.; Xu, W.; Dong, C.; Dolan, J.M. Efficient L-shape fitting for vehicle detection using laser scanners. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; pp. 54–59.
- Taipalus, T.; Ahtiainen, J. Human detection and tracking with knee-high mobile 2D LIDAR. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Karon Beach, Thailand, 7–11 December 2011; pp. 1672–1677.
- Shao, X.; Zhao, H.; Nakamura, K.; Katabira, K.; Shibasaki, R.; Nakagawa, Y. Detection and tracking of multiple pedestrians by using laser range scanners. In Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; pp. 2174–2179.
- Rozsa, Z.; Sziranyi, T. Obstacle prediction for automated guided vehicles based on point clouds measured by a tilted LIDAR sensor. IEEE Trans. Intell. Transp. Syst. 2018, 19, 2708–2720.
- García, F.; Jiménez, F.; Naranjo, J.E.; Zato, J.G.; Aparicio, F.; Armingol, J.M.; de la Escalera, A. Environment perception based on LIDAR sensors for real road applications. Robotica 2012, 30, 185–193.
- Shi, S.; Jiang, L.; Deng, J.; Wang, Z.; Guo, C.; Shi, J.; Wang, X.; Li, H. PV-RCNN: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection. arXiv 2021, arXiv:2102.00463.
- Zhou, Y.; Tuzel, O. VoxelNet: End-to-end learning for point cloud based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4490–4499.
- Yang, Z.; Sun, Y.; Liu, S.; Shen, X.; Jia, J. STD: Sparse-to-dense 3D object detector for point cloud. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Long Beach, CA, USA, 15–20 June 2019; pp. 1951–1960.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
- Qi, C.R.; Liu, W.; Wu, C.; Su, H.; Guibas, L.J. Frustum PointNets for 3D object detection from RGB-D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 918–927.
- Han, J.; Liao, Y.; Zhang, J.; Wang, S.; Li, S. Target Fusion Detection of LiDAR and Camera Based on the Improved YOLO Algorithm. Mathematics 2018, 6, 213.
- Liang, M.; Yang, B.; Wang, S.; Urtasun, R. Deep continuous fusion for multi-sensor 3D object detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 641–656.
- García, F.; García, J.; Ponz, A.; de la Escalera, A.; Armingol, J.M. Context aided pedestrian detection for danger estimation based on laser scanner and computer vision. Expert Syst. Appl. 2014, 41, 6646–6661.
- Garcia, F.; Martin, D.; de la Escalera, A.; Armingol, J.M. Sensor fusion methodology for vehicle detection. IEEE Intell. Transp. Syst. Mag. 2017, 9, 123–133.
- Rövid, A.; Remeli, V. Towards raw sensor fusion in 3D object detection. In Proceedings of the 2019 IEEE 17th World Symposium on Applied Machine Intelligence and Informatics (SAMI), Herlany, Slovakia, 24–26 January 2019; pp. 293–298.
- Liang, M.; Yang, B.; Chen, Y.; Hu, R.; Urtasun, R. Multi-task multi-sensor fusion for 3D object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 7345–7353.
- Xu, D.; Anguelov, D.; Jain, A. PointFusion: Deep sensor fusion for 3D bounding box estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 244–253.
- Shin, K.; Kwon, Y.P.; Tomizuka, M. RoarNet: A robust 3D object detection based on region approximation refinement. In Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 9–12 June 2019; pp. 2510–2515.
- Gong, Z.; Lin, H.; Zhang, D.; Luo, Z.; Zelek, J.; Chen, Y.; Nurunnabi, A.; Wang, C.; Li, J. A Frustum-based probabilistic framework for 3D object detection by fusion of LiDAR and camera data. ISPRS J. Photogramm. Remote Sens. 2020, 159, 90–100.
- Dou, J.; Xue, J.; Fang, J. SEG-VoxelNet for 3D vehicle detection from RGB and LiDAR data. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4362–4368.
- Fernández, C.; Izquierdo, R.; Llorca, D.F.; Sotelo, M.A. Road curb and lanes detection for autonomous driving on urban scenarios. In Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), Qingdao, China, 8–11 October 2014; pp. 1964–1969.
- Vitas, D.; Tomic, M.; Burul, M. Traffic Light Detection in Autonomous Driving Systems. IEEE Consum. Electron. Mag. 2020, 9, 90–96.
- Levinson, J.; Askeland, J.; Dolson, J.; Thrun, S. Traffic light mapping, localization, and state detection for autonomous vehicles. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 5784–5791.
- Mu, G.; Xinyu, Z.; Deyi, L.; Tianlei, Z.; Lifeng, A. Traffic light detection and recognition for autonomous vehicles. J. China Univ. Posts Telecommun. 2015, 22, 50–56. Available online: https://www.sciencedirect.com/science/article/pii/S1005888515606240 (accessed on 9 November 2021).
- Redmon, J. Darknet: Open Source Neural Networks in C. 2013. Available online: https://pjreddie.com/darknet/ (accessed on 7 November 2021).
- Darknet. Available online: https://github.com/pjreddie/darknet (accessed on 9 November 2021).
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3354–3361.
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. The KITTI Vision Benchmark Suite. 2015. Available online: Http://Www.Cvlibs.Net/Datasets/Kitti (accessed on 9 November 2021).
- Scanning Rangefinder Distance Data Output/UTM-30LX Product Details|HOKUYO AUTOMATIC CO., LTD. Available online: https://www.hokuyo-aut.jp/search/single.php?serial=169 (accessed on 9 November 2021).
- Fang, Z.; Zhao, S.; Wen, S.; Zhang, Y. A Real-Time 3D Perception and Reconstruction System Based on a 2D Laser Scanner. J. Sens. 2018, 2018.
- Choi, D.; Bok, Y.; Kim, J.; Shim, I.; Kweon, I. Structure-From-Motion in 3D Space Using 2D Lidars. Sensors 2017, 17, 242.
| Paper | Modality | Limitation |
|---|---|---|
| Multi-task multi-sensor fusion for 3D object detection [40] | RGB + 3D point cloud | Expensive 3D LiDAR; not real-time |
| Frustum PointNets for 3D object detection from RGB-D data [34] | RGB-D | 0.12 s per frame; not real-time |
| PointFusion: deep sensor fusion for 3D bounding box estimation [41] | RGB + 3D point cloud | 1.3 s per frame; not real-time |
| RoarNet: a robust 3D object detection based on region approximation refinement [42] | RGB + 3D point cloud | Expensive 3D LiDAR; not real-time |
| A frustum-based probabilistic framework for 3D object detection by fusion of LiDAR and camera data [43] | RGB + 3D point cloud | Only detects static objects |
| SEG-VoxelNet for 3D vehicle detection from RGB and LiDAR data [44] | RGB + 3D point cloud | Only detects vehicles; not real-time |
| MVX-Net: multimodal VoxelNet for 3D object detection | RGB + 3D point cloud | Not real-time |
| 3D-CVF: generating joint camera and LiDAR features using cross-view spatial feature fusion for 3D object detection | RGB + 3D point cloud | Inference time of 75 ms per frame (13.33 FPS) on an NVIDIA GTX 1080 Ti |
| PI-RCNN: an efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module | RGB + 3D point cloud | Not real-time |
| Image guidance-based 3D vehicle detection in traffic scene | RGB + 3D point cloud | Only detects vehicles; 4 FPS |
| EPNet: enhancing point features with image semantics for 3D object detection | RGB + 3D point cloud | Not real-time |
| Benchmark | Easy | Moderate | Hard |
|---|---|---|---|
| Car | 56% | 36.23% | 29.55% |
| Pedestrian | 29.98% | 22.84% | 22.21% |
| Cyclist | 9.09% | 9.09% | 9.09% |