A New Deep Model for Detecting Multiple Moving Targets in Real Traffic Scenarios: Machine Vision-Based Vehicles
Abstract
1. Introduction
2. Materials and Methods
2.1. Four-Scale Detection
2.2. Introduction of CBAM
2.3. Soft-NMS
3. Experiments and Results Analysis
3.1. Evaluation Indicators
3.2. Experiment Based on KITTI Dataset
3.3. Experiment Based on BDD100K Dataset
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Ablation study of the proposed improvements:

| Model | Improvements | Car AP@0.5 (%) | Pedestrian AP@0.5 (%) | Cyclist AP@0.5 (%) | mAP@0.5 (%) | Model Size (MB) |
|---|---|---|---|---|---|---|
| A | YOLOv4 (baseline) | 87.52 | 68.21 | 78.42 | 78.05 | 256.2 |
| B | A + added scale detection layer | 88.31 | 71.06 | 80.45 | 79.94 | 258.7 |
| C | B + DIoU-based Soft-NMS | 88.53 | 71.31 | 80.54 | 80.13 | 258.7 |
| D | B + CBAM | 89.15 | 72.68 | 81.02 | 80.95 | 269.3 |
| E | D + DIoU-based Soft-NMS | 89.52 | 73.03 | 81.15 | 81.23 | 269.3 |
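Models C and E replace standard NMS with Soft-NMS using DIoU as the overlap measure: instead of discarding every box that overlaps a higher-scoring one, neighbouring scores are decayed in proportion to their DIoU with the kept box. The following is a minimal illustrative sketch of that idea, not the authors' implementation; the Gaussian decay and the `sigma`/`score_thresh` values follow the common Soft-NMS formulation and are assumptions here.

```python
import numpy as np

def diou(box, boxes):
    """DIoU between one [x1, y1, x2, y2] box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter)
    # squared distance between box centers
    cdist = ((box[0] + box[2]) - (boxes[:, 0] + boxes[:, 2])) ** 2 / 4 \
          + ((box[1] + box[3]) - (boxes[:, 1] + boxes[:, 3])) ** 2 / 4
    # squared diagonal of the smallest enclosing box
    ex1 = np.minimum(box[0], boxes[:, 0]); ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2]); ey2 = np.maximum(box[3], boxes[:, 3])
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - cdist / diag

def diou_soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS with DIoU overlap; returns kept indices, best first."""
    scores = scores.astype(float).copy()
    keep, idxs = [], list(range(len(scores)))
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        if not idxs:
            break
        rest = np.array(idxs)
        d = diou(boxes[best], boxes[rest])
        # decay neighbouring scores instead of hard suppression
        scores[rest] *= np.exp(-np.clip(d, 0, None) ** 2 / sigma)
        idxs = [i for i in idxs if scores[i] > score_thresh]
    return keep
```

Because overlapping boxes are down-weighted rather than deleted, a second true object that happens to overlap a stronger detection (common for occluded pedestrians and cyclists) can survive with a reduced score, which is consistent with the small AP gains from C over B and E over D in the table.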
Comparison with mainstream detectors on the KITTI dataset:

| Algorithm | Car AP@0.5 (%) | Pedestrian AP@0.5 (%) | Cyclist AP@0.5 (%) | mAP (%) | FPS (frames/s) |
|---|---|---|---|---|---|
| Faster R-CNN | 83.07 | 62.78 | 60.83 | 68.89 | 14.21 |
| Cascade R-CNN | 88.15 | 75.24 | 74.50 | 79.30 | 8.20 |
| SSD | 75.33 | 50.06 | 49.67 | 58.35 | 45.13 |
| YOLOv3 | 80.28 | 69.01 | 75.06 | 74.78 | 40.93 |
| YOLOv4 | 87.52 | 68.21 | 78.42 | 78.05 | 51.68 |
| Improved YOLOv4 | 89.52 | 73.03 | 81.15 | 81.23 | 47.32 |
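The per-class AP@0.5 columns count a detection as a true positive when its IoU with an unmatched ground-truth box reaches 0.5, and mAP averages AP over the three classes. As a rough illustration (not the paper's evaluation code; the greedy matching and the rectangle-rule area under the raw precision-recall curve are simplifying assumptions), single-class AP can be sketched as:

```python
import numpy as np

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, gts, iou_thresh=0.5):
    """AP for one class. detections = [(score, box)], gts = list of boxes.
    Score-sorted detections are greedily matched to their best-IoU ground truth;
    a ground-truth box can be matched at most once."""
    detections = sorted(detections, key=lambda d: -d[0])
    matched = [False] * len(gts)
    tp = np.zeros(len(detections)); fp = np.zeros(len(detections))
    for i, (_, box) in enumerate(detections):
        ious = [iou(box, g) for g in gts]
        j = int(np.argmax(ious)) if ious else -1
        if j >= 0 and ious[j] >= iou_thresh and not matched[j]:
            tp[i] = 1; matched[j] = True
        else:
            fp[i] = 1
    rec = np.cumsum(tp) / max(len(gts), 1)
    prec = np.cumsum(tp) / np.maximum(np.cumsum(tp) + np.cumsum(fp), 1e-9)
    # rectangle-rule area under the raw precision-recall curve
    ap, prev_r = 0.0, 0.0
    for r, p in zip(rec, prec):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```

mAP then is the mean of `average_precision` over the Car, Pedestrian, and Cyclist classes; benchmark implementations additionally interpolate the precision-recall curve before integrating.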
Comparison with mainstream detectors on the BDD100K dataset:

| Algorithm | Car AP@0.5 (%) | Pedestrian AP@0.5 (%) | Cyclist AP@0.5 (%) | mAP (%) | FPS (frames/s) |
|---|---|---|---|---|---|
| Faster R-CNN | 60.02 | 48.83 | 46.17 | 51.67 | 13.10 |
| Cascade R-CNN | 65.77 | 50.41 | 47.36 | 54.51 | 7.40 |
| SSD | 50.35 | 39.26 | 38.76 | 42.79 | 44.52 |
| YOLOv3 | 62.72 | 47.60 | 48.32 | 52.88 | 40.28 |
| YOLOv4 | 72.26 | 50.86 | 54.78 | 59.30 | 51.45 |
| Improved YOLOv4 | 73.92 | 54.26 | 56.53 | 61.57 | 46.83 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, X.; Xiong, H.; Zhan, L.; Królczyk, G.; Stanislawski, R.; Gardoni, P.; Li, Z. A New Deep Model for Detecting Multiple Moving Targets in Real Traffic Scenarios: Machine Vision-Based Vehicles. Sensors 2022, 22, 3742. https://doi.org/10.3390/s22103742