An Improved YOLO Model for UAV Fuzzy Small Target Image Detection
Abstract
1. Introduction
2. Object Detection Algorithm
2.1. Overview of Algorithm Series
2.2. YOLOv5s
3. Improvement of YOLOv5s Algorithm
3.1. SPD-Conv Module
3.2. Attention Mechanism
3.3. Transposed Convolution
3.4. Improved Loss Function
4. Experiment and Discussion
4.1. Experimental Environment
4.2. Model Parameters
4.3. Dataset
4.4. Evaluation Metrics
4.5. Experiment
4.6. Ablation Experiment
- Detection results of the original YOLOv5s are shown in Figure 16.
- Detection results with the SPD-Conv module are shown in Figure 17.
- Detection results with the CA attention mechanism are shown in Figure 18.
- Detection results with transposed convolution are shown in Figure 19.
- Detection results with all modules added are shown in Figure 20.
4.7. Comparison of Algorithm Improvements
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Liu, Y.; Yang, F.; Hu, P. Small-object detection in UAV-captured images via multi-branch parallel feature pyramid networks. IEEE Access 2020, 8, 145740–145750.
- Yu, W.; Yang, T.; Chen, C. Towards Resolving the Challenge of Long-tail Distribution in UAV Images for Object Detection. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; IEEE Press: Waikoloa, HI, USA, 2021; pp. 3257–3266.
- Zhang, X.; Izquierdo, E.; Chandramouli, K. Dense and small object detection in UAV vision based on cascade network. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 118–126.
- Law, H.; Deng, J. CornerNet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8 September 2018; pp. 734–750.
- Xie, C.; Wu, J.; Xu, H. Small object detection algorithm based on improved YOLO5 in UAV image. Comput. Eng. Appl. 2023, 1–11. Available online: http://kns.cnki.net/kcms/detail/11.2127.TP.20230214.1523.050.html (accessed on 10 March 2023).
- Zhang, Q.; Wu, Z.; Zhou, L.; Liu, X. Research on vehicle and pedestrian target detection method based on improved YOLOv5. Chin. Test 2023, 1–8. Available online: http://kns.cnki.net/kcms/detail/51.1714.TB.20230228.0916.002.html (accessed on 10 March 2023).
- Li, Y.; Wang, S.; Chen, W.; Tian, Z.; Hou, L. Improved YOLOv5 target detection algorithm based on Ghost module. Mod. Electron. Tech. 2023, 46, 29–34.
- Li, X.; Zhen, Z.; Liu, B.; Liang, Y.; Huang, Y. Object Detection Based on Improved YOLOv5s for Quadrotor UAV Auto-Landing. Comput. Meas. Control. 2023, 1–10.
- Tian, X.; Jia, Y.; Luo, X.; Yin, J. Small Target Recognition and Tracking Based on UAV Platform. Sensors 2022, 22, 6579.
- Cheng, Q.; Wang, H.; Zhu, B.; Shi, Y.; Xie, B. A Real-Time UAV Target Detection Algorithm Based on Edge Computing. Drones 2023, 7, 95.
- Li, B.; Xiao, C.; Wang, L.; Wang, Y.; Lin, Z.; Li, M.; An, W.; Guo, Y. Dense Nested Attention Network for Infrared Small Target Detection. IEEE Trans. Image Process. 2023, 32, 1745–1758.
- Ibrokhimov, B.; Kang, J.Y. Two-Stage Deep Learning Method for Breast Cancer Detection Using High-Resolution Mammogram Images. Appl. Sci. 2022, 12, 4616.
- Martin, Š.; Gašper, S.; Božidar, P. Cephalometric Landmark Detection in Lateral Skull X-ray Images by Using Improved Spatial Configuration-Net. Appl. Sci. 2022, 12, 4644.
- Li, C.; Zhen, T.; Li, Z. Image Classification of Pests with Residual Neural Network Based on Transfer Learning. Appl. Sci. 2022, 12, 4356.
- Li, C. Small target detection algorithm based on YOLOv5. Chang. Inf. Commun. 2021, 34, 30–33.
- Tian, F.; Jia, H.-P.; Liu, F. Small Target Detection in Oilfield Operation Field Based on Improved YOLOv5. Comput. Syst. Appl. 2022, 31, 159–168.
- Braun, M.; Krebs, S.; Flohr, F.; Gavrila, D.M. EuroCity Persons: A Novel Benchmark for Person Detection in Traffic Scenes. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 1844–1861.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE Press: Salt Lake City, UT, USA, 2017; Volume 39, pp. 1137–1149.
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
- Li, Q.; Deng, Z.; Luo, X.; Gu, X.; Wang, S. SSD Object Detection Algorithm with Attention and Cross-Scale Fusion. J. Front. Comput. Sci. 2022, 16, 2575–2586.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; Volume 106, pp. 936–944.
- Sunkara, R.; Luo, T. No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects. In Machine Learning and Knowledge Discovery in Databases, Proceedings of the European Conference, ECML PKDD 2022, Grenoble, France, 19–23 September 2022; Part III; Springer Nature Switzerland: Cham, Switzerland, 2022; pp. 443–459.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. Natl. Univ. Singapore 2021, arXiv:2103.02907v1.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 3–8.
- Dai, J.; Zhao, X.; Li, L.; Liu, W.; Chu, X. Improved Yolov5-based for Infrared Dim-small Target Detection under Complex Background. Infrared Technol. 2022, 44, 504–512. Available online: http://kns.cnki.net/kcms/detail/53.1053.TN.20220415.1612.002.html (accessed on 15 March 2023).
- Song, Z.; Zhang, Y.; Liu, Y.; Yang, K.; Sun, M. MSFYOLO: Feature fusion-based detection for small objects. IEEE Lat. Am. Trans. 2022, 20, 823–830.
- Qu, J.; Su, C.; Zhang, Z.; Razi, A. Dilated convolution and feature fusion SSD network for small object detection in remote sensing images. IEEE Access 2020, 8, 82832–82843.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2980–2988.
- Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 2778–2788.
Software and Hardware | Version or Model |
---|---|
Operating system | Windows 11 |
CPU | Intel Core i7-13700K |
GPU | NVIDIA GeForce RTX 3080 Ti |
CUDA | 12.0 |
PyTorch version | 11.8 |
Python version | 3.9 |
Software | PyCharm 2021 |
Parameter Name | Parameter Settings |
---|---|
Weights | yolov5s.pt |
Img-size | 640 × 640 |
Epochs | 300 |
Batch-size | 16 |
Max-det | 1000 |
Parameter Name | Parameter Settings | Parameter Explanation |
---|---|---|
Lr0 | 0.01 | Initial learning rate |
Lrf | 0.1 | Final learning-rate factor (final rate = Lr0 × Lrf) |
Momentum | 0.937 | Optimizer momentum |
Weight_decay | 0.0005 | Weight decay factor |
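
Lr0 and Lrf together fix the start and end points of the learning-rate schedule over the 300 training epochs. The sketch below shows two decay shapes commonly used with YOLOv5-style training, linear and cosine. The source does not state which schedule was used, so both functions (and their names, `linear_lr` and `cosine_lr`) are illustrative assumptions rather than the authors' configuration; warm-up is ignored.

```python
import math

LR0 = 0.01     # initial learning rate (Lr0 in the table)
LRF = 0.1      # final learning-rate factor (Lrf in the table)
EPOCHS = 300   # number of training epochs


def linear_lr(epoch, lr0=LR0, lrf=LRF, epochs=EPOCHS):
    """Linear decay from lr0 down to lr0 * lrf (assumed schedule)."""
    factor = (1 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


def cosine_lr(epoch, lr0=LR0, lrf=LRF, epochs=EPOCHS):
    """Half-cosine decay from lr0 down to lr0 * lrf (assumed schedule)."""
    factor = ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1.0) + 1.0
    return lr0 * factor


if __name__ == "__main__":
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: linear={linear_lr(epoch):.5f}  "
              f"cosine={cosine_lr(epoch):.5f}")
```

Under either shape the rate starts at Lr0 = 0.01 and ends at Lr0 × Lrf = 0.001; the two differ only in how quickly they move between those endpoints.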
Algorithm | SPD-Conv | CA | T-Conv | mAP/% | P/% | R/% |
---|---|---|---|---|---|---|
YOLOv5s | | | | 65.21 | 67.24 | 69.76 |
SPD-Conv | √ | | | 75.34 | 69.42 | 74.54 |
CA | | √ | | 74.09 | 68.77 | 74.58 |
T-Conv | | | √ | 74.10 | 68.79 | 72.70 |
SPD-Conv + CA | √ | √ | | 78.53 | 71.52 | 75.23 |
SPD-Conv + T-Conv | √ | | √ | 77.12 | 71.36 | 73.62 |
CA + T-Conv | | √ | √ | 76.28 | 72.36 | 73.71 |
Ours | √ | √ | √ | 80.17 | 73.45 | 76.97 |
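
The ablation figures are easier to compare as gains over the unmodified baseline. The following sketch copies the mAP, P, and R values from the ablation table and reports each configuration's improvement in percentage points. The helper name `gains_over_baseline` is hypothetical and is used here only for illustration.

```python
# Ablation results copied from the table: (mAP, P, R), all in percent.
ABLATION = {
    "YOLOv5s": (65.21, 67.24, 69.76),
    "SPD-Conv": (75.34, 69.42, 74.54),
    "CA": (74.09, 68.77, 74.58),
    "T-Conv": (74.10, 68.79, 72.70),
    "SPD-Conv + CA": (78.53, 71.52, 75.23),
    "SPD-Conv + T-Conv": (77.12, 71.36, 73.62),
    "CA + T-Conv": (76.28, 72.36, 73.71),
    "Ours": (80.17, 73.45, 76.97),
}


def gains_over_baseline(results, baseline="YOLOv5s"):
    """Return percentage-point gains (mAP, P, R) of each row over the baseline."""
    base = results[baseline]
    return {
        name: tuple(round(value - ref, 2) for value, ref in zip(values, base))
        for name, values in results.items()
        if name != baseline
    }


gains = gains_over_baseline(ABLATION)

if __name__ == "__main__":
    for name, (d_map, d_p, d_r) in gains.items():
        print(f"{name:<18} mAP {d_map:+.2f}  P {d_p:+.2f}  R {d_r:+.2f}")
```

Read this way, each single module adds roughly 9 to 10 points of mAP on its own, and the full combination adds 14.96 points. The individual gains therefore overlap rather than add up.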
Algorithm | Layers | Parameters/M | FLOPs/G | Latency/ms |
---|---|---|---|---|
Original YOLOv5s | 270 | 7.02 | 15.9 | 20.25 |
SPD-Conv | 277 | 8.56 | 33.3 | 23.25 |
CA | 280 | 7.05 | 16 | 21.23 |
T-Conv | 283 | 8.57 | 23.6 | 22.58 |
SPD-Conv + CA | 317 | 8.6 | 33.4 | 23.89 |
SPD-Conv + T-Conv | 317 | 9.06 | 36.2 | 24.1 |
CA + T-Conv | 317 | 9.36 | 38.3 | 22.98 |
Ours | 317 | 9.91 | 40.1 | 24.35 |
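
The cost of the added modules can be expressed relative to the original network. The sketch below takes the parameter count, FLOPs, and latency of the baseline and the full model from the complexity table and computes the relative increase of each. It also converts latency to frames per second, assuming the reported latency is per image with no batching; the source does not state this, and the helper names are hypothetical.

```python
# (parameters in millions, FLOPs in G, latency in ms), copied from the table.
COST = {
    "Original YOLOv5s": (7.02, 15.9, 20.25),
    "Ours": (9.91, 40.1, 24.35),
}


def relative_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


def frames_per_second(latency_ms):
    """Throughput implied by a per-image latency (assumes no batching)."""
    return 1000.0 / latency_ms


base_params, base_flops, base_latency = COST["Original YOLOv5s"]
ours_params, ours_flops, ours_latency = COST["Ours"]

overhead = {
    "parameters": relative_increase(ours_params, base_params),
    "flops": relative_increase(ours_flops, base_flops),
    "latency": relative_increase(ours_latency, base_latency),
}

if __name__ == "__main__":
    for key, value in overhead.items():
        print(f"{key:<10} +{value:.1f}%")
    print(f"baseline: {frames_per_second(base_latency):.1f} FPS, "
          f"ours: {frames_per_second(ours_latency):.1f} FPS")
```

The comparison shows that FLOPs grow much faster than latency, so on the tested GPU the extra computation is only partly reflected in inference time.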
Algorithm | mAP/% | P/% | R/% | Latency/ms | Parameters/M |
---|---|---|---|---|---|
YOLOv5s | 65.21 | 67.24 | 69.76 | 20.25 | 7.02 |
YOLOv4 | 67.32 | 68.36 | 71.45 | 22.72 | 8.39 |
SSD | 81.23 | 74.21 | 74.49 | 25.16 | 8.63 |
Faster-RCNN | 79.85 | 77.49 | 76.49 | 30.25 | 13.71 |
RetinaNet [30] | 67.36 | 69.45 | 69.32 | 30.12 | 6.45 |
TPH-YOLOv5 [31] | 79.06 | 72.56 | 75.12 | 21.16 | 8.36 |
Fast-YOLOv4 [10] | 70.25 | 71.36 | 69.78 | 19.58 | 8.78 |
Improved YOLOv4 [9] | 68.36 | 68.51 | 67.12 | 17.15 | 6.32 |
Ours | 80.17 | 73.45 | 76.97 | 24.35 | 9.91 |
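
Because the compared detectors trade accuracy against speed, one way to read the comparison table is to ask which models are not beaten on both mAP and latency at once. The sketch below is an illustrative analysis that is not part of the original experiments. It copies the mAP and latency columns and keeps the models for which no other model has mAP at least as high and latency at least as low, with one of the two strictly better. The function name `pareto_front` is hypothetical.

```python
# (mAP in percent, latency in ms), copied from the comparison table.
RESULTS = {
    "YOLOv5s": (65.21, 20.25),
    "YOLOv4": (67.32, 22.72),
    "SSD": (81.23, 25.16),
    "Faster-RCNN": (79.85, 30.25),
    "RetinaNet": (67.36, 30.12),
    "TPH-YOLOv5": (79.06, 21.16),
    "Fast-YOLOv4": (70.25, 19.58),
    "Improved YOLOv4": (68.36, 17.15),
    "Ours": (80.17, 24.35),
}


def pareto_front(results):
    """Return the models not dominated on (higher mAP, lower latency)."""
    front = []
    for name, (m_ap, latency) in results.items():
        dominated = any(
            other_map >= m_ap
            and other_latency <= latency
            and (other_map > m_ap or other_latency < latency)
            for other, (other_map, other_latency) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    for name in pareto_front(RESULTS):
        m_ap, latency = RESULTS[name]
        print(f"{name:<16} mAP {m_ap:.2f}%  latency {latency:.2f} ms")
```

On these two columns alone, the proposed model is not dominated by any other entry: SSD reaches a higher mAP but is slower, and every faster model has a lower mAP.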
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chang, Y.; Li, D.; Gao, Y.; Su, Y.; Jia, X. An Improved YOLO Model for UAV Fuzzy Small Target Image Detection. Appl. Sci. 2023, 13, 5409. https://doi.org/10.3390/app13095409