Advanced Point Cloud Techniques for Improved 3D Object Detection: A Study on DBSCAN, Attention, and Downsampling
Abstract
1. Introduction
- The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is applied as a point cloud preprocessing step. Its density-based clustering detects and removes outliers and noise from the raw point cloud, improving data quality and the accuracy of the subsequent detection stages.
- In the pillar feature network layer, a point-wise self-attention mechanism [15] is combined with a spatial attention mechanism to further suppress noise within the point cloud pillars, emphasize crucial feature information, and strengthen the network's feature extraction capability (a minimal sketch follows this list).
- In the downsampling module, CSPNet, which splits the gradient flow, replaces the conventional convolutional blocks of the original module [16]. Propagating the gradient flow through separate network paths reduces computational complexity and improves the network's detection performance.
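To make the second component concrete, here is a minimal PyTorch sketch that combines point-wise self-attention over the points in each pillar with a spatial attention gate. The class name `DualAttentionPillarEncoder`, the single-head attention, and the mean/max channel statistics feeding the gate are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DualAttentionPillarEncoder(nn.Module):
    """Sketch: point-wise self-attention plus a spatial attention gate
    over pillar features. Shapes: x is (P, N, C) = (pillars, points, channels)."""

    def __init__(self, channels: int):
        super().__init__()
        # Point-wise self-attention: each point attends to the other points
        # in the same pillar, which helps suppress stray noise points.
        self.self_attn = nn.MultiheadAttention(channels, num_heads=1, batch_first=True)
        # Spatial attention: a per-point gate computed from channel statistics
        # (mean and max), emphasizing informative locations within the pillar.
        self.spatial_gate = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.self_attn(x, x, x)                          # (P, N, C)
        x = x + attn_out                                               # residual connection
        stats = torch.stack([x.mean(dim=-1), x.amax(dim=-1)], dim=-1)  # (P, N, 2)
        return x * self.spatial_gate(stats)                            # gated features

# Usage: 100 pillars, 32 points each, 64 feature channels.
enc = DualAttentionPillarEncoder(64)
pillars = torch.randn(100, 32, 64)
print(enc(pillars).shape)  # torch.Size([100, 32, 64])
```

The residual connection keeps the original pillar features intact, so the two attention branches only reweight them; this matches the stated goal of suppressing noise and emphasizing crucial features rather than replacing the encoder.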
2. Related Work
2.1. PointPillars Network Analysis
2.2. Point Cloud Processing Based on DBSCAN Clustering Algorithm
- Initial selection: randomly select an unvisited point and mark it as visited.
- Find neighborhood points: identify all points within a specified radius (eps) of the selected point; these points are termed neighborhood points.
- Determine core points: if the number of neighborhood points is greater than or equal to minPts, the point is designated a core point and a new cluster is formed that includes all of its neighborhood points. Otherwise, the point is provisionally classified as noise (it may later be absorbed into a cluster as a border point).
- Expand the cluster: for each neighborhood point of the core point, repeat the neighborhood search and the core point test, adding newly found neighborhood points to the current cluster. If a neighborhood point is itself a core point, its neighborhood points are added to the current cluster according to the density reachability principle.
- Repeat the process: continue with the remaining unvisited points, repeating the above steps until all points have been processed. A minimal code sketch applying this procedure to LiDAR denoising follows.
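The procedure above is what scikit-learn's `DBSCAN` implements, so point cloud denoising reduces to discarding the points it labels as noise (label `-1`). A minimal sketch, assuming illustrative `eps` and `min_pts` values rather than the settings used in this work:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def denoise_point_cloud(points: np.ndarray, eps: float = 0.5, min_pts: int = 10) -> np.ndarray:
    """Remove outliers from an (N, 3+) LiDAR point cloud with DBSCAN.

    Points labeled -1 (noise, i.e. not density-reachable from any core
    point) are discarded; all clustered points are kept.
    """
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(points[:, :3])
    return points[labels != -1]

# Example: a dense cluster of object points plus a few far-away outliers.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0.0, 0.2, size=(200, 3)),  # object points
                   rng.uniform(-20, 20, size=(5, 3))])   # sparse noise
filtered = denoise_point_cloud(cloud, eps=0.5, min_pts=10)
print(cloud.shape, "->", filtered.shape)  # noise points removed
```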
2.3. Integration of Attention Mechanisms
2.4. Improvements in Downsampling
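The introduction describes replacing the conventional convolutional blocks in the downsampling module with CSPNet [16]. Below is a minimal PyTorch sketch of a CSP-style downsampling block consistent with that description; the layer widths, the SiLU activations, and the class name `CSPDownsampleBlock` are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class CSPDownsampleBlock(nn.Module):
    """Sketch of a CSP-style downsampling block: channels are split into two
    paths, only one passes through the convolutional stack, and the two
    gradient flows are re-merged by concatenation [16]."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        half = out_ch // 2
        self.down = nn.Sequential(                     # stride-2 entry conv
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.SiLU())
        self.part1 = nn.Conv2d(out_ch, half, 1, bias=False)  # shortcut path
        self.part2 = nn.Sequential(                           # dense path
            nn.Conv2d(out_ch, half, 1, bias=False),
            nn.Conv2d(half, half, 3, padding=1, bias=False),
            nn.BatchNorm2d(half), nn.SiLU())
        self.fuse = nn.Conv2d(out_ch, out_ch, 1, bias=False)  # transition

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.down(x)
        return self.fuse(torch.cat([self.part1(x), self.part2(x)], dim=1))

# Usage: halve a 64-channel BEV feature map's resolution, 64 -> 128 channels.
block = CSPDownsampleBlock(64, 128)
print(block(torch.randn(1, 64, 248, 216)).shape)  # torch.Size([1, 128, 124, 108])
```

Splitting the channels so that only one path traverses the convolutional stack lets gradients reach earlier layers through two distinct routes while roughly halving the computation of the dense path.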
3. Experimental Results and Analysis
3.1. Overview and Description of the Experimental Dataset
3.2. Experimental Parameterization Design and Environment Construction
3.3. Analysis of Experimental Results
3.3.1. Performance Analysis of Mean Accuracy
3.3.2. Evaluation of Real-Time and Detection Performance of the Model
3.3.3. Ablation Experiments
3.3.4. Visualization of Experimental Results
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Bai, Z.; Wu, G.; Barth, M.J.; Liu, Y.; Sisbot, E.A.; Oguchi, K. PillarGrid: Deep Learning-based Cooperative Perception for 3D Object Detection from Onboard-Roadside LiDAR. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 1743–1749. [Google Scholar]
- Chen, X.; Ma, H.; Wan, J.; Li, B.; Xia, T. Multi-view 3D object detection network for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1907–1915. [Google Scholar]
- Chen, D.J.; Yu, W.J.; Gao, Y.B. 3D Object Detection of LiDAR Based on Improved PointPillars. Laser Optoelectron. Prog. 2023, 60, 447–453. [Google Scholar]
- Ku, J.; Mozifian, M.; Lee, J.; Harakeh, A.; Waslander, S.L. Joint 3D proposal generation and object detection from view aggregation. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1–8. [Google Scholar]
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3354–3361. [Google Scholar]
- Li, J.N.; Wu, Z.; Xu, T.F. Research Progress of 3D Object Detection Technology Based on Point Cloud Data. Acta Opt. Sin. 2023, 43, 296–312. [Google Scholar]
- Li, X.L.; Zhou, Y.E.; Bi, T.F.; Yu, Q.; Wang, Z.; Huang, J.; Xu, L. A Review on the Development of Key Technologies for Lightweight Sensing Lidar. Chin. J. Lasers 2022, 49, 263–277. [Google Scholar]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5099–5108. [Google Scholar]
- Zhou, Y.; Tuzel, O. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4490–4499. [Google Scholar]
- Yan, Y.; Mao, Y.; Li, B. SECOND: Sparsely Embedded Convolutional Detection. Sensors 2018, 18, 3337. [Google Scholar] [CrossRef] [PubMed]
- Yin, T.; Zhou, X.; Krähenbühl, P. CenterPoint: Center-based 3D Object Detection and Tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 11784–11793. [Google Scholar]
- Sheng, H.L.; Cai, S.J.; Zhao, N.; Deng, B.; Huang, J.; Hua, X.S.; Zhao, M.J.; Lee, G.H. Rethinking IoU-Based Optimization for Single-Stage 3D Object Detection. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 544–561. [Google Scholar]
- Yang, Q.; Kong, D.; Chen, J.; Li, X.; Shen, Y. An Improved PointPillars Method Based on Density Clustering and Dual Attention Mechanism. Laser Optoelectron. Prog. 2024, 61, 2412003. [Google Scholar]
- Xu, H.; Dong, X.; Wu, W.; Yu, B.; Zhu, H. A Two-Stage Pillar Feature-Encoding Network for Pillar-Based 3D Object Detection. World Electr. Veh. J. 2023, 14, 146. [Google Scholar] [CrossRef]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Wang, C.-Y.; Liao, H.-Y.M.; Wu, Y.-H.; Chen, P.-Y.; Hsieh, J.-W.; Yeh, I.-H. CSPNet: A New Backbone that can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2020), Seattle, WA, USA, 13–19 June 2020; pp. 390–400. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37. [Google Scholar]
- Wang, Z.; Liu, L.; Yu, X.; Zhang, C.; Zhao, W. 3D Bounding Box Estimation Using Deep Learning and Geometry. In Proceedings of the International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1–10. [Google Scholar]
- Ku, J.; Saldana, A.; Watterson, J.; Mertz, C.; Khandelwal, S.; Maturana, D. Joint 3D proposal generation and object detection from a single RGB-D image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1085–1094. [Google Scholar]
- Wang, Y.Y.; Wang, Y.N.; Liu, J.X.; Ren, J. Research on Application of Port Logistics Big Data Based on Hadoop. J. YanShan Univ. 2023, 47, 216–220. [Google Scholar]
- Elfwing, S.; Uchibe, E.; Doya, K. Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. Neural Netw. 2018, 107, 3–11. [Google Scholar] [CrossRef]
- Hu, J.; An, Y.P.; Xu, W.C.; Xiong, Z.; Liu, H. 3D Object Detection Based on Deep Semantic and Positional Information Fusion of Laser Point Clouds. Chin. J. Lasers 2023, 50, 200–210. [Google Scholar]
- Qiu, S.; Wu, Y.; Anwar, S.; Li, C. Investigating Attention Mechanism in 3D Point Cloud Object Detection. In Proceedings of the 2021 International Conference on 3D Vision (3DV), London, UK, 1–3 December 2021; pp. 403–412. [Google Scholar]
- Zhai, Z.; Wang, Q.; Pan, Z.; Gao, Z.; Hu, W. Muti-Frame Point Cloud Feature Fusion Based on Attention Mechanisms for 3D Object Detection. Sensors 2022, 22, 7473. [Google Scholar] [CrossRef] [PubMed]
- Li, X.; Liang, B.; Huang, J.; Peng, Y.; Yan, Y.; Li, J.; Shang, W.; Wei, W. Pillar-Based 3D Object Detection from Point Cloud with Multiattention Mechanism. Wirel. Commun. Mob. Comput. 2023, 2023, 5603123. [Google Scholar] [CrossRef]
- Wang, L.; Song, Z.; Zhang, X.; Wang, C.; Zhang, G.; Zhu, L.; Li, J.; Liu, H. SAT-GCN: Self-Attention Graph Convolutional Network-Based 3D Object Detection for Autonomous Driving. Knowl. Based Syst. 2023, 259, 110080. [Google Scholar] [CrossRef]
- Wang, Z.; Fu, H.; Wang, L.; Xiao, L.; Dai, B. SCNet: Subdivision Coding Network for Object Detection Based on 3D Point Cloud. IEEE Access 2019, 7, 120449–120462. [Google Scholar] [CrossRef]
- Cao, P.; Chen, H.; Zhang, Y.; Wang, G. Multi-View Frustum PointNet for Object Detection in Autonomous Driving. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 3896–3899. [Google Scholar]
- Wang, S.; Lu, K.; Xue, J.; Zhao, Y. DA-Net: Density-Aware 3D Object Detection Network for Point Clouds. IEEE Trans. Multimed. 2023, 1–14. [Google Scholar] [CrossRef]
- Li, C.; Gao, F.; Han, X.; Zhang, B. A New Density-Based Clustering Method Considering Spatial Distribution of LiDAR Point Cloud for Object Detection of Autonomous Driving. Electronics 2021, 10, 2005. [Google Scholar] [CrossRef]
- Shi, S.; Guo, C.; Jiang, L.; Wang, Z.; Shi, J.; Wang, X.; Li, H. PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10529–10538. [Google Scholar]
- Wang, Y.; Jiang, Z.; Li, Y.; Hwang, J.N.; Xing, G.; Liu, H. RODNet: A Real-Time Radar Object Detection Network Cross-Supervised by Camera-Radar Fused Object 3D Localization. IEEE J. Sel. Top. Signal Process. 2021, 15, 954–967. [Google Scholar] [CrossRef]
- Zheng, K.; Zheng, Y.; Zhang, Y.; Li, B.; Wang, Z.; Li, L. TANet: Robust 3D object detection via dual attention network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 8805–8814. [Google Scholar]
- Zhang, W.; Xu, L.; Zhang, X.; Liu, W.; Liao, R.; Li, Z. PRGBNet: Point cloud representation with graph-based neural network for 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2022; pp. 3471–3480. [Google Scholar]
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85. [Google Scholar]
| Model | Car Easy | Car Mod. | Car Hard | Ped. Easy | Ped. Mod. | Ped. Hard | Cyc. Easy | Cyc. Mod. | Cyc. Hard |
|---|---|---|---|---|---|---|---|---|---|
| MV3D | 86.02 | 76.90 | 68.49 | N/A | N/A | N/A | N/A | N/A | N/A |
| SECOND | 88.07 | 79.37 | 77.95 | 55.10 | 46.27 | 44.76 | 73.67 | 56.04 | 48.78 |
| VoxelNet | 89.35 | 79.26 | 77.39 | 46.13 | 40.74 | 38.11 | 66.70 | 54.76 | 50.55 |
| TANet | 91.58 | 86.54 | 81.19 | 60.58 | 51.38 | 47.54 | 79.16 | 63.77 | 56.21 |
| AVOD-FPN | 88.53 | 83.79 | 77.90 | 58.75 | 51.50 | 47.54 | 68.09 | 57.48 | 50.77 |
| PRGBNet | 91.39 | 85.73 | 80.68 | 38.07 | 29.32 | 26.94 | 73.09 | 57.59 | 51.78 |
| HDNET | 89.14 | 86.57 | 78.32 | N/A | N/A | N/A | N/A | N/A | N/A |
| PointPillars | 88.35 | 86.10 | 79.83 | 58.66 | 50.23 | 47.19 | 79.14 | 62.25 | 56.00 |
| Ours | 90.09 | 87.74 | 83.72 | 62.94 | 55.07 | 53.94 | 78.70 | 67.78 | 62.34 |
| Model | Car Easy | Car Mod. | Car Hard | Ped. Easy | Ped. Mod. | Ped. Hard | Cyc. Easy | Cyc. Mod. | Cyc. Hard |
|---|---|---|---|---|---|---|---|---|---|
| MV3D | 71.09 | 62.35 | 55.12 | N/A | N/A | N/A | N/A | N/A | N/A |
| SECOND | 83.13 | 73.66 | 66.20 | 51.07 | 42.56 | 37.29 | 70.51 | 53.85 | 46.90 |
| VoxelNet | 77.47 | 65.11 | 57.73 | 39.48 | 33.69 | 31.50 | 61.22 | 48.36 | 44.37 |
| TANet | 84.39 | 75.94 | 68.82 | 53.72 | 44.34 | 40.49 | 75.70 | 59.44 | 52.53 |
| AVOD-FPN | 81.94 | 71.88 | 66.38 | 50.58 | 42.81 | 40.88 | 64.00 | 52.18 | 46.61 |
| PRGBNet | 83.99 | 76.04 | 71.17 | 44.63 | 37.37 | 34.92 | 75.24 | 61.70 | 55.32 |
| PointPillars | 79.05 | 74.99 | 68.30 | 52.08 | 43.53 | 41.49 | 75.78 | 59.07 | 52.92 |
| Ours | 84.34 | 77.90 | 74.15 | 53.08 | 49.22 | 46.20 | 81.18 | 62.10 | 56.13 |
| Model | Car Easy | Car Mod. | Car Hard | Ped. Easy | Ped. Mod. | Ped. Hard | Cyc. Easy | Cyc. Mod. | Cyc. Hard |
|---|---|---|---|---|---|---|---|---|---|
| PP | 88.35 | 86.10 | 79.83 | 58.66 | 50.23 | 47.19 | 79.14 | 62.25 | 56.00 |
| PP + DBSCAN | 88.29 | 85.63 | 83.26 | 58.11 | 51.23 | 49.99 | 79.00 | 61.16 | 58.99 |
| PP + SC | 89.87 | 87.25 | 84.72 | 58.62 | 53.26 | 48.68 | 79.62 | 64.21 | 58.01 |
| PP + CSPNet | 89.66 | 86.55 | 83.96 | 60.01 | 52.64 | 46.62 | 78.39 | 66.86 | 59.21 |
| Both | 90.09 | 87.74 | 83.72 | 62.94 | 55.07 | 53.94 | 78.70 | 67.78 | 62.34 |
| Model | Car Easy | Car Mod. | Car Hard | Ped. Easy | Ped. Mod. | Ped. Hard | Cyc. Easy | Cyc. Mod. | Cyc. Hard |
|---|---|---|---|---|---|---|---|---|---|
| PP | 79.05 | 74.99 | 68.30 | 52.08 | 43.53 | 41.49 | 75.78 | 59.07 | 52.92 |
| PP + DBSCAN | 78.06 | 75.62 | 73.11 | 52.39 | 45.88 | 44.18 | 74.66 | 59.21 | 56.21 |
| PP + SC | 82.47 | 76.24 | 71.12 | 53.00 | 46.62 | 44.61 | 78.88 | 61.02 | 53.99 |
| PP + CSPNet | 81.04 | 75.22 | 70.65 | 53.66 | 46.98 | 44.21 | 79.24 | 59.02 | 52.22 |
| Both | 84.34 | 77.90 | 74.15 | 53.08 | 49.22 | 46.20 | 81.18 | 62.10 | 56.13 |
Zhang, W.; Dong, X.; Cheng, J.; Wang, S. Advanced Point Cloud Techniques for Improved 3D Object Detection: A Study on DBSCAN, Attention, and Downsampling. World Electr. Veh. J. 2024, 15, 527. https://doi.org/10.3390/wevj15110527