An Improved Rotating Box Detection Model for Litchi Detection in Natural Dense Orchards
Abstract
1. Introduction
1. Background interference: Since litchis often grow at an angle or in clusters, horizontal bounding boxes drawn around them enclose a large amount of background and overlap with neighboring boxes, reducing litchi detection accuracy.
2. Complex backgrounds and occlusion: Litchis growing in a natural environment appear against diverse and intricate backgrounds and are often obscured by branches or leaves.
3. Low multi-scale recognition accuracy: With a wide viewing angle, the size of litchi fruits varies greatly, and litchi detection accuracy is low.
2. Related Work
3. Materials and Methods
3.1. Experimental Area and Data Acquisition
3.2. Data Annotation
3.3. Dense Litchi Detection Network Architecture
3.3.1. Network Structure of YOLOv8
- Backbone: The backbone mainly consists of CBS, C2f, and SPPF modules. The CBS module comprises a convolution, batch normalization (BN), and a SiLU activation function. BN keeps the feature distribution consistent across layers, helping to avoid vanishing gradients during training; the CBS module compresses and expands feature information by changing the number of feature channels, balancing the network's computation speed and accuracy. The C2f module extracts deep feature information; it can be embedded at any position, or replace any convolutional layer, to enhance the backbone's performance.
- Neck: The convolution structure of the PAN-FPN upsampling stage in YOLOv5 is removed; the downsampling operation is performed first, followed by the upsampling operation. Replacing the C3 module with the C2f module makes the model lighter and more adaptable to targets of different sizes and shapes.
- Head: The current mainstream decoupled-head structure is adopted, which effectively reduces the number of parameters and the computational complexity while enhancing the model's generalization ability and robustness. The anchor-based design used in earlier YOLO versions to predict anchor-box position and size has been abandoned in favor of an anchor-free method that directly predicts the target's center point, width, and height. Eliminating anchor boxes further improves the model's detection speed and accuracy.
- Loss function: The model uses CIoU loss as the bounding-box regression loss and further improves regression accuracy by minimizing the Distribution Focal Loss (DFL). It also adopts the Task-Aligned Assigner sample-allocation strategy, which uses a high-order combination of the classification score and the IoU to guide the selection of positive and negative samples. This aligns high classification scores with high IoU, effectively improving the model's detection accuracy.
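For axis-aligned boxes, the CIoU loss used above can be sketched as follows; the function name and the (x1, y1, x2, y2) box format are illustrative choices, not the authors' implementation:

```python
import math

def ciou_loss(box_p, box_g):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    CIoU = IoU - rho^2 / c^2 - alpha * v, and the loss is 1 - CIoU, where
    rho is the distance between box centres, c the diagonal of the smallest
    enclosing box, and v measures aspect-ratio consistency.
    """
    # Intersection area
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    iou = inter / (wp * hp + wg * hg - inter)

    # Squared centre distance and enclosing-box diagonal
    cxp, cyp = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cxg, cyg = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    rho2 = (cxp - cxg) ** 2 + (cyp - cyg) ** 2
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / (1 - iou + v + 1e-9)

    return 1 - iou + rho2 / c2 + alpha * v
```

Identical boxes give a loss of zero; unlike plain IoU loss, the centre-distance term still provides a gradient when the boxes do not overlap.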
3.3.2. The Transformer Module
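The core computation of a Transformer module is multi-head attention (MHA). The NumPy sketch below is illustrative only: the weight shapes, the absence of masking and positional encoding, and the function signature are assumptions, not the paper's exact module.

```python
import numpy as np

def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    """Minimal multi-head self-attention over x of shape (seq, d_model)."""
    seq, d_model = x.shape
    d_head = d_model // num_heads

    # Project and split into heads: (num_heads, seq, d_head)
    def split(m):
        return m.reshape(seq, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)

    # Scaled dot-product attention per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax
    heads = weights @ v                                  # (heads, seq, d_head)

    # Concatenate heads and apply the output projection
    return heads.transpose(1, 0, 2).reshape(seq, d_model) @ wo
```

Each head attends over all positions, which is what lets a Transformer block aggregate global context that a fixed-size convolution cannot.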
3.3.3. Increased ECA-Net Mechanisms
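ECA-Net derives channel attention from global average pooling followed by a 1-D convolution across neighboring channels, avoiding the dimensionality reduction of earlier channel-attention blocks. A minimal NumPy sketch, with illustrative fixed kernel weights in place of learned ones:

```python
import numpy as np

def eca(feature_map, kernel_size=3):
    """Efficient Channel Attention over a (C, H, W) feature map."""
    c = feature_map.shape[0]
    pooled = feature_map.mean(axis=(1, 2))               # (C,) global avg pool

    # 1-D convolution across channels with a shared kernel, 'same' padding
    kernel = np.full(kernel_size, 1.0 / kernel_size)     # illustrative weights
    pad = kernel_size // 2
    padded = np.pad(pooled, pad)
    mixed = np.array([padded[i:i + kernel_size] @ kernel for i in range(c)])

    weights = 1.0 / (1.0 + np.exp(-mixed))               # sigmoid gate
    return feature_map * weights[:, None, None]          # rescale each channel
```

Because only a small 1-D kernel is involved, ECA adds almost no parameters, which is consistent with the near-identical parameter counts of M0 and M2 in the ablation table.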
3.3.4. Head Prediction Branch Improvements
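Since the model predicts oriented bounding boxes (OBBs), the head's rotated-box output can be visualized by converting a box to its four corner points. The (cx, cy, w, h, θ) parameterization below is an assumption for illustration, not necessarily the paper's exact encoding:

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Corner points of an oriented bounding box, angle theta in radians.

    Returns the four corners in order, rotated about the box centre.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2),
                   (w / 2, h / 2), (-w / 2, h / 2)):
        # Rotate the centre-relative offset, then shift to the centre
        corners.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return corners
```

With θ = 0 this reduces to an ordinary horizontal box; a nonzero angle lets the box hug a tilted litchi cluster with far less background inside it.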
4. Results and Discussion
4.1. Training Environment and Equipment Description
4.2. Evaluation Metrics
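The precision, recall, and average-precision computations behind the metrics in this section can be sketched as follows (mAP averages AP over classes); the function names are illustrative:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(recalls, precisions):
    """Area under a precision-recall curve by rectangular summation.

    `recalls` must be sorted ascending; precision is first made
    monotonically decreasing from the right, as is usual for AP.
    """
    interp = list(precisions)
    for i in range(len(interp) - 2, -1, -1):
        interp[i] = max(interp[i], interp[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, interp):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```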
4.3. Ablation Test
4.4. Performance Comparison of Different Models
4.5. Class Activation Graph Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
YOLOv8n | You Only Look Once, version 8, nano scale |
Otsu | maximum inter-class variance method |
LDA | Linear discriminant analysis |
C2f | Cross Stage Partial Network Bottleneck with Two Convolutions |
C3 | Cross Stage Partial Network Bottleneck with Three Convolutions |
SVM | Support vector machines |
SPPF | Spatial Pyramid Pooling - Fast |
Faster R-CNN | faster region-based convolutional network |
SSD | Single Shot Multibox Detector |
CBAM | Convolutional Block Attention Module |
HHBs | Horizontal bounding boxes |
OBBs | Oriented bounding boxes |
ECA | Efficient Channel Attention Network |
MLP | multilayer perceptron |
FPN | Feature pyramid networks |
CBS | Convolution, Batch normalization, and SiLU |
NMS | non-maximum suppression |
NLP | Natural language processing |
MHA | Multi-head attention |
BN | Batch normalization |
ChatGPT | Chatbot program Chat Generative Pre-trained Transformer |
SiLU | Sigmoid Linear Unit |
IOU | Intersection over Union |
CIOU | Complete intersection over Union |
mAP | Mean average precision |
DFL | Distribution Focal Loss |
Configuration | Parameter |
---|---|
Image Resolution | 4032 pixels × 3024 pixels (W × H) |
Training framework | Python, PyTorch framework |
Pretrained model | ImageNet pretrained model |
Operating system | Ubuntu 18.04 |
Accelerated environment | CUDA 11 and cuDNN 7 |
Development environment | VS Code |
Computer configuration used in training and testing | Intel Core i7-8700K processor, 32 GB RDIMM (Huawei, China), 512 GB solid-state drive, 2 TB mechanical hard drive, NVIDIA RTX 3080 Ti graphics card |
Classes | C2fTRS | ECA | Change Head | Parameters | Precision | Recall | mAP |
---|---|---|---|---|---|---|---|
M0 | - | - | - | 3,380,646 | 75.7% | 65.1% | 74.0% |
M1 | ✓ | - | - | 3,316,390 | 81.4% | 63.6% | 74.7% |
M2 | - | ✓ | - | 3,380,652 | 80.0% | 64.2% | 74.6% |
M3 | - | - | ✓ | 3,802,728 | 81.7% | 67.4% | 77.4% |
M4 | ✓ | ✓ | - | - | 82.9% | 64.4% | 76.1% |
M5 | ✓ | - | ✓ | - | | 66.0% | 78.2% |
M6 | - | ✓ | ✓ | - | 84.1% | 67.6% | 78.5% |
M7 | ✓ | ✓ | ✓ | - | 84.6% | | |
Detector | Backbone | Precision | Recall | mAP |
---|---|---|---|---|
YOLOv8 | mobilenetv3-small | 76.9% | 53.8% | 67.5% |
YOLOv8 | mobilenetv3-large | 80.2% | 63.2% | 73.8% |
YOLOv8 | shufflenetv2 | 78.8% | 60.4% | 71.4% |
YOLOv8 | GhostNet | 80.8% | 60.9% | 73.3% |
YOLOv8 | proposed model |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, B.; Lu, H.; Wei, X.; Guan, S.; Zhang, Z.; Zhou, X.; Luo, Y. An Improved Rotating Box Detection Model for Litchi Detection in Natural Dense Orchards. Agronomy 2024, 14, 95. https://doi.org/10.3390/agronomy14010095