Recognition of Maize Tassels Based on Improved YOLOv8 and Unmanned Aerial Vehicles RGB Images
Abstract
1. Introduction
- (1) Create a new maize tassel UAV image dataset in JPG format and label the data in VOC format, naming it the maize tassels dataset (MTD).
- (2) Assess the performance, reliability, and efficiency of the improved YOLOv8 model in recognizing small-target maize tassels in complex backgrounds.
- (3)
2. Materials and Methods
2.1. The Study Area
2.2. Data Collection and Dataset Construction
2.3. Data Augmentation
2.4. SPPF and SPPELAN
2.5. Attention Mechanisms
2.5.1. CBAM Attention Module
2.5.2. Global Attention Mechanism
2.6. Target Detection Network Structure of Improved YOLOv8
- (1) A microscale detection head (MDH) was integrated into the head of the YOLOv8 architecture. It is fed from a feature map produced by only 4-fold downsampling, so it generates larger prediction maps (160 × 160) than the other detection heads. This higher-resolution shallow feature map better captures the fine details of small maize tassels.
- (2) The SPPF module in the feature-extraction part of the backbone was replaced with the SPPELAN module, which combines low-level detail features with high-level semantic features to obtain richer target feature information.
- (3) Borrowing the idea of residual networks, an additional connection was introduced within the backbone (the gray dotted line in Figure 9). The new connection transfers shallow feature information to the deeper layers, which strengthens the backpropagation of the network gradient, avoids losing the feature information of small targets, and reduces gradient attenuation.
- (4) To further highlight the feature information of the maize tassels, an attention module was added to the feature-fusion neck. The new module combines the GAM and the CBAM; its structure is shown in Figure 10. It uses the global channel attention of the GAM and the spatial attention of the CBAM. First, the channel weights are computed with the GAM's 3D permutation combined with a multilayer perceptron (MLP) and multiplied with the input feature map to produce a channel-weighted feature map; this avoids the pooling-based dimensionality reduction of the CBAM channel branch and the information loss it causes. For spatial attention, the SAM module of the CBAM is retained: taking the channel-weighted feature map as input, it concatenates the results of two channel-wise pooling operations, applies a convolution, and generates the spatial attention weights with a sigmoid activation. Compared with the spatial attention of the GAM, this learns the channel-attention information effectively while reducing the module's computation. Finally, the spatial weights are multiplied with the channel-weighted feature map to obtain the dual-attention feature map. Integrating this GAM–CBAM attention module in the neck of YOLOv8 mitigates the loss of spatial information and focuses the model on the maize tassel region, which increases robustness and improves recognition ability.
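The dual-attention flow described in (4) can be illustrated with a minimal NumPy sketch: GAM-style channel attention (3D permutation plus an MLP over the channel axis, with no pooling) followed by CBAM-style spatial attention (channel-wise average and max pooling, concatenation, convolution, sigmoid). The tensor sizes, weight initializations, and reduction ratio below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gam_channel_attention(x, w1, w2):
    """GAM-style channel attention: permute channels last, run a
    2-layer MLP over the channel axis (no pooling, so no spatial
    information is discarded), permute back, and gate the input."""
    # 3D permutation: (C, H, W) -> (H, W, C)
    perm = np.transpose(x, (1, 2, 0))
    hidden = np.maximum(perm @ w1, 0.0)        # ReLU, channel reduction
    weights = sigmoid(hidden @ w2)             # back to C channels
    return x * np.transpose(weights, (2, 0, 1))

def cbam_spatial_attention(x, conv_kernel):
    """CBAM-style spatial attention: average- and max-pool along the
    channel axis, concatenate the two maps, convolve, apply sigmoid."""
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = conv_kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1], x.shape[2]
    att = np.zeros((h, w))
    for i in range(h):                          # naive 2D convolution
        for j in range(w):
            att[i, j] = np.sum(padded[:, i:i + k, j:j + k] * conv_kernel)
    return x * sigmoid(att)

def gam_cbam_block(x, w1, w2, conv_kernel):
    """Dual attention: GAM channel attention, then CBAM spatial attention."""
    return cbam_spatial_attention(gam_channel_attention(x, w1, w2), conv_kernel)

rng = np.random.default_rng(0)
c, h, w, r = 8, 6, 6, 4                         # channels, height, width, reduction ratio
x = rng.standard_normal((c, h, w))
w1 = rng.standard_normal((c, c // r)) * 0.1     # hypothetical MLP weights
w2 = rng.standard_normal((c // r, c)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.01  # 7x7 conv over the 2 pooled maps
y = gam_cbam_block(x, w1, w2, kernel)
print(y.shape)  # (8, 6, 6): same shape as the input, re-weighted
```

In a real network the MLP and the 7 × 7 convolution would be learned layers (e.g. in PyTorch); the sketch only shows how the channel and spatial weights gate the feature map in sequence.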
2.7. Accuracy Evaluation
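The results tables below report precision, recall, and F1-score. As a reference, a minimal sketch of the standard definitions from true-positive, false-positive, and false-negative counts (the counts below are illustrative, not taken from the paper):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from TP, FP, and FN counts."""
    precision = tp / (tp + fp)                   # correct detections / all detections
    recall = tp / (tp + fn)                      # correct detections / all ground truths
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# illustrative counts, not from the paper
p, r, f1 = precision_recall_f1(tp=936, fp=64, fn=76)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.936 0.925 0.93
```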
3. Results
3.1. Model Training and Validation
3.2. Ablation Experiment
3.3. Comparative Analysis of Various Deep Learning Networks
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Name | Parameters and Versions |
|---|---|
| Central Processing Unit (CPU) | Intel Core i7-14700K @ 3.40 GHz |
| Random Access Memory (RAM) | 32 GB |
| Solid State Drive (SSD) | SHPP41-1000GM (1 TB) |
| Graphics Card (GPU) | NVIDIA GeForce RTX 4070 Ti Super (16 GB) |
| Operating System (OS) | Microsoft Windows 11 Professional |
| Programming Environment (ENVS) | PyTorch 1.12.0 + Python 3.8.10 |
| Method | Model Number | MDH | New Connections | GAM | CBAM | GAM + CBAM | SPPELAN | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv8 | - | - | - | - | - | - | - | 86.6 | 85.2 |
| Improved model | A1 | √ | - | - | - | - | - | 89.1 | 87.8 |
| Improved model | A2 | √ | √ | - | - | - | - | 89.9 | 88.1 |
| Improved model | A3 | √ | √ | √ | - | - | - | 90.6 | 89.1 |
| Improved model | A4 | √ | √ | - | √ | - | - | 89.5 | 88.5 |
| Improved model | A5 | √ | √ | - | - | √ | - | 91.1 | 90.3 |
| Improved model | A6 | √ | √ | - | - | √ | √ | 93.6 | 92.5 |
Model | Precision (%) | Recall (%) | mAP50 (%) | F1-Score (%) | FPS |
---|---|---|---|---|---|
Faster R-CNN | 80.2 | 75.8 | 84.2 | 77.9 | 2.28 |
RT-DETR | 90.3 | 87.8 | 89.4 | 89.1 | 62.5 |
YOLOv5 | 85.4 | 86.1 | 85.1 | 85.7 | 83.3 |
YOLOv9 | 91.2 | 85.1 | 91.4 | 88.1 | 33.5 |
YOLOv10 | 91.0 | 88.2 | 93.9 | 89.5 | 66.7 |
Improved YOLOv8 | 93.6 | 92.5 | 96.8 | 93.1 | 58.8 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wei, J.; Wang, R.; Wei, S.; Wang, X.; Xu, S. Recognition of Maize Tassels Based on Improved YOLOv8 and Unmanned Aerial Vehicles RGB Images. Drones 2024, 8, 691. https://doi.org/10.3390/drones8110691