Instance Segmentation and 3D Pose Estimation of Tea Bud Leaves for Autonomous Harvesting Robots
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Acquisition and Processing
2.2. Instance Segmentation Model for Tea Bud Leaves
2.2.1. YOLOv8 Segmentation Model
- (i) The E-GELAN module is constructed and integrated into the Backbone for feature extraction, capturing the detailed morphological features and contextual information of tea bud leaves.
- (ii) DCNv2 and the Dynamic Head are employed to enhance the Neck and the YOLO Head, improving the differentiated representation of global and local features.
- (iii) The Wise-IoUv3 loss function is used to train the model, dynamically adjusting sample weights according to the varying shapes and scales of the targets and thereby enhancing the model's adaptability to the unstructured tea-garden environment (as sketched below).
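Since the article itself publishes no source code, the following is a minimal PyTorch sketch of the Wise-IoUv3 bounding-box loss, reconstructed from the formulas in Tong et al. (arXiv:2301.10051) rather than from the authors' implementation; the box format, the epsilon terms, and the hyperparameter defaults (alpha = 1.9, delta = 3) are assumptions carried over from that paper.

```python
import torch

def wise_iou_v3_loss(pred, target, iou_loss_mean, alpha=1.9, delta=3.0):
    """Wise-IoUv3 for (N, 4) boxes in (x1, y1, x2, y2) format.

    iou_loss_mean: running mean of the IoU loss (a detached scalar the
    caller maintains, e.g. as an exponential moving average).
    """
    # Plain IoU loss from intersection over union.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    l_iou = 1.0 - inter / (area_p + area_t - inter + 1e-7)

    # WIoUv1 distance attention over the smallest enclosing box; the
    # denominator is detached so it only rescales, never backpropagates.
    c_p = (pred[:, :2] + pred[:, 2:]) / 2
    c_t = (target[:, :2] + target[:, 2:]) / 2
    enc = torch.max(pred[:, 2:], target[:, 2:]) - torch.min(pred[:, :2], target[:, :2])
    r_wiou = torch.exp(((c_p - c_t) ** 2).sum(dim=1)
                       / ((enc ** 2).sum(dim=1) + 1e-7).detach())

    # v3 non-monotonic focusing: beta is each box's "outlier degree"
    # relative to the running mean loss; r damps gradients from both
    # the easiest and the lowest-quality samples.
    beta = l_iou.detach() / (iou_loss_mean + 1e-7)
    r = beta / (delta * alpha ** (beta - delta))
    return (r * r_wiou * l_iou).mean()
```

During training, `iou_loss_mean` would be updated each step, e.g. `m = 0.99 * m + 0.01 * float(l_iou.mean())`, so the focusing factor tracks the current difficulty of the batch.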
2.2.2. E-GELAN Module
2.2.3. DCNv2 and Dynamic Head
2.2.4. Wise-IoUv3 Loss Function
2.3. Dynamic Weight-Based Adaptive Pose-Estimation Method for Tea Bud Leaves
2.3.1. Tea-Bud-Leaves Local Point-Cloud Acquisition Based on ORB-SLAM3
2.3.2. Point Cloud Pre-Processing
2.3.3. Dynamic Weight-Based Adaptive Pose Estimation for Tea Bud Leaves
2.4. Evaluation Metrics
2.4.1. Instance-Segmentation Evaluation Metrics
2.4.2. Pose-Estimation Evaluation Metrics
3. Results and Discussion
3.1. Tea-Bud-Leaves Instance-Segmentation-Model Performance Evaluation
3.1.1. Ablation Experiments
3.1.2. Loss-Function Comparison Experiment
3.1.3. Visualization of Instance-Segmentation Results
3.1.4. Comparison with Advanced Segmentation Models
3.2. Performance Evaluation of Tea-Bud-Leaves Pose Estimation
3.2.1. Angle-Error Evaluation
3.2.2. Distance-Error Evaluation
3.2.3. Comparison with Other Pose-Estimation Methods
3.2.4. Visualization of Pose-Estimation Results
3.3. Limitations and Future Work
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Yu, X.L.; He, Y. Optimization of tea-leaf saponins water extraction and relationships between their contents and tea (Camellia sinensis) tree varieties. Food Sci. Nutr. 2018, 6, 1734–1740.
- Dong, Q.; Murakami, T.; Nakashima, Y. Recalculating the agricultural labor force in China. China Econ. J. 2018, 11, 151–169.
- Zhu, Y.; Wu, C.; Tong, J.; Chen, J.; He, L.; Wang, R.; Jia, J. Deviation tolerance performance evaluation and experiment of picking end effector for famous tea. Agriculture 2021, 11, 128.
- Zhang, S.; Yang, H.; Yang, C.; Yuan, W.; Li, X.; Wang, X.; Zhang, Y.; Cai, X.; Sheng, Y.; Deng, X.; et al. Edge device detection of tea leaves with one bud and two leaves based on ShuffleNetv2-YOLOv5-Lite-E. Agronomy 2023, 13, 577.
- Lin, Y.K.; Chen, S.F.; Kuo, Y.F.; Liu, T.L.; Lee, X.Y. Developing a guiding and growth status monitoring system for riding-type tea plucking machine using fully convolutional networks. Comput. Electron. Agric. 2021, 191, 106540.
- Zhao, C.-T.; Wang, R.-F.; Tu, Y.-H.; Pang, X.-X.; Su, W.-H. Automatic lettuce weed detection and classification based on optimized convolutional neural networks for robotic weed control. Agronomy 2024, 14, 2838.
- Hua, X.; Li, H.; Zeng, J.; Han, C.; Chen, T.; Tang, L.; Luo, Y. A review of target recognition technology for fruit picking robots: From digital image processing to deep learning. Appl. Sci. 2023, 13, 4160.
- Wu, X.; Tang, X.; Zhang, F.; Gu, J. Tea buds image identification based on Lab color model and K-means clustering. J. Chin. Agric. Mech. 2015, 36, 161–164+179.
- Zhang, L.; Zhang, H.; Chen, Y.; Dai, S.; Li, X.; Kenji, L.; Liu, Z.; Li, M. Real-time monitoring of optimum timing for harvesting fresh tea leaves based on machine vision. Int. J. Agric. Biol. Eng. 2019, 12, 6–9.
- Karunasena, G.; Priyankara, H. Tea bud leaf identification by using machine learning and image processing techniques. Int. J. Sci. Eng. Res. 2020, 11, 624–628.
- Zhang, L.; Zou, L.; Wu, C.; Jia, J.; Chen, J. Method of famous tea sprout identification and segmentation based on improved watershed algorithm. Comput. Electron. Agric. 2021, 184, 106108.
- Wang, Z.; Wang, R.; Wang, M.; Lai, T.; Zhang, M. Self-supervised transformer-based pre-training method with General Plant Infection dataset. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Urumqi, China, 18–20 October 2024; pp. 189–202.
- Wang, R.-F.; Su, W.-H. The application of deep learning in the whole potato production chain: A comprehensive review. Agriculture 2024, 14, 1225.
- Li, J.; Li, J.; Zhao, X.; Su, X.; Wu, W. Lightweight detection networks for tea bud on complex agricultural environment via improved YOLO v4. Comput. Electron. Agric. 2023, 211, 107955.
- Chen, T.; Li, H.; Chen, J.; Zeng, Z.; Han, C.; Wu, W. Detection network for multi-size and multi-target tea bud leaves in the field of view via improved YOLOv7. Comput. Electron. Agric. 2024, 218, 108700.
- Xie, S.; Sun, H. Tea-YOLOv8s: A tea bud detection model based on deep learning and computer vision. Sensors 2023, 23, 6576.
- Xu, W.; Zhao, L.; Li, J.; Shang, S.; Ding, X.; Wang, T. Detection and classification of tea buds based on deep learning. Comput. Electron. Agric. 2022, 192, 106547.
- Chen, Y.-T.; Chen, S.-F. Localizing plucking points of tea leaves using deep convolutional neural networks. Comput. Electron. Agric. 2020, 171, 105298.
- Li, Y.; He, L.; Jia, J.; Chen, J.; Lyu, J.; Wu, C. High-efficiency tea shoot detection method via a compressed deep learning model. Int. J. Agric. Biol. Eng. 2022, 15, 159–166.
- Lu, J.; Yang, Z.; Sun, Q.; Gao, Z.; Ma, W. A machine vision-based method for tea buds segmentation and picking point location used on a cloud platform. Agronomy 2023, 13, 1537.
- Zhang, F.; Sun, H.; Xie, S.; Dong, C.; Li, Y.; Xu, Y.; Zhang, Z.; Chen, F. A tea bud segmentation, detection and picking point localization based on the MDY7-3PTB model. Front. Plant Sci. 2023, 14, 1199473.
- Chen, T.; Li, H.; Lv, J.; Chen, J.; Wu, W. Segmentation network for multi-shape tea bud leaves based on attention and path feature aggregation. Agriculture 2024, 14, 1388.
- Li, H.; Zhu, Q.; Huang, M.; Guo, Y.; Qin, J. Pose estimation of sweet pepper through symmetry axis detection. Sensors 2018, 18, 3083.
- Lehnert, C.; Sa, I.; McCool, C.; Upcroft, B.; Perez, T. Sweet pepper pose detection and grasping for automated crop harvesting. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 2428–2434.
- Tao, Y.; Zhou, J. Automatic apple recognition based on the fusion of color and 3D feature for robotic fruit picking. Comput. Electron. Agric. 2017, 142, 388–396.
- Li, T.; Feng, Q.; Qiu, Q.; Xie, F.; Zhao, C. Occluded apple fruit detection and localization with a frustum-based point-cloud-processing approach for robotic harvesting. Remote Sens. 2022, 14, 482.
- Lin, G.; Tang, Y.; Zou, X.; Xiong, J.; Li, J. Guava detection and pose estimation using a low-cost RGB-D sensor in the field. Sensors 2019, 19, 428.
- Luo, L.; Yin, W.; Ning, Z.; Wang, J.; Wei, H.; Chen, W.; Lu, Q. In-field pose estimation of grape clusters with combined point cloud segmentation and geometric analysis. Comput. Electron. Agric. 2022, 200, 107197.
- Zhu, L.; Lai, Y.; Zhang, S.; Wu, R.; Deng, W.; Guo, X. Improved U-Net pitaya image segmentation and pose estimation method for picking robot. Trans. Chin. Soc. Agric. Mach. 2023, 1–16. Available online: http://kns.cnki.net/kcms/detail/11.1964.S.20230920.1558.002.html (accessed on 7 December 2024).
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
- Wang, C.Y.; Yeh, I.H.; Liao, H.Y.M. YOLOv9: Learning what you want to learn using programmable gradient information. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2025; pp. 1–21.
- Zhu, X.; Hu, H.; Lin, S.; Dai, J. Deformable ConvNets v2: More deformable, better results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9308–9316.
- Dai, X.; Chen, Y.; Xiao, B.; Chen, D.; Liu, M.; Lu, Y.; Zhang, L. Dynamic head: Unifying object detection heads with attentions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7373–7382.
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding box regression loss with dynamic focusing mechanism. arXiv 2023, arXiv:2301.10051.
- Yao, M.; Huo, Y.; Ran, Y.; Tian, Q.; Wang, R.; Wang, H. Neural radiance field-based visual rendering: A comprehensive review. arXiv 2024, arXiv:2404.00714.
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM3: An accurate open-source library for visual, visual–inertial, and multimap SLAM. IEEE Trans. Robot. 2021, 37, 1874–1890.
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN'95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12993–13000.
- Zhang, Y.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and efficient IOU loss for accurate bounding box regression. Neurocomputing 2022, 506, 146–157.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162.
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9157–9166.
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT++: Better real-time instance segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 1108–1121.
Dataset composition: image samples per split and label counts per category.

| Dataset | Image Samples | tea_Y Labels | tea_I Labels | tea_V Labels |
|---|---|---|---|---|
| train | 2294 | 5519 | 3037 | 2965 |
| val | 286 | 719 | 353 | 355 |
| test | 288 | 674 | 372 | 396 |
| Total | 2868 | 6912 | 3762 | 3716 |
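As a point of reference for the splits above, here is a minimal training sketch for the stock YOLOv8s-seg baseline using the Ultralytics API; the file `tea.yaml` and its contents are hypothetical stand-ins, since the authors' training configuration is not published with the article.

```python
from ultralytics import YOLO

# tea.yaml is an assumed dataset file (not the authors'); it would point
# at the train/val/test image folders and name the three classes:
#   names: {0: tea_Y, 1: tea_I, 2: tea_V}
model = YOLO("yolov8s-seg.pt")                 # pretrained segmentation baseline
model.train(data="tea.yaml", epochs=100, imgsz=640)
metrics = model.val(split="test")              # per-class AP@50 and mAP@50
```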
Ablation experiments: √ marks the modules enabled in each configuration.

| YOLOv8s-seg | E-GELAN | DCNv2 | Dynamic Head | Wise-IoUv3 | AP@50/Box tea_Y (%) | AP@50/Box tea_I (%) | AP@50/Box tea_V (%) | AP@50/Mask tea_Y (%) | AP@50/Mask tea_I (%) | AP@50/Mask tea_V (%) | mAP@50/Box (%) | mAP@50/Mask (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| √ | | | | | 92.7 | 88.3 | 85.4 | 92.7 | 88.4 | 84.4 | 88.8 | 88.5 |
| √ | √ | | | | 93.1 | 90.2 | 90.2 | 92.7 | 89.6 | 89.7 | 91.2 | 90.7 |
| √ | √ | √ | | | 93.4 | 91.8 | 88.6 | 93.2 | 91.8 | 88.3 | 91.3 | 91.1 |
| √ | √ | √ | √ | | 94.1 | 91.9 | 89.3 | 94.0 | 91.4 | 88.4 | 91.8 | 91.3 |
| √ | √ | √ | √ | √ | 94.4 | 91.3 | 90.4 | 94.3 | 91.0 | 90.2 | 92.0 | 91.9 |
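Within each configuration, mAP@50 is the arithmetic mean of the three per-category APs: for the full model, for example, the box branch averages (94.4 + 91.3 + 90.4)/3 ≈ 92.0% and the mask branch (94.3 + 91.0 + 90.2)/3 ≈ 91.9%, matching the last two columns.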
Loss-function comparison: √ marks the loss paired with Model A; the first row is the reference configuration trained with the default loss (its metrics match the first ablation row).

| Model A | Wise-IoUv3 | GIoU | DIoU | EIoU | AP@50/Box tea_Y (%) | AP@50/Box tea_I (%) | AP@50/Box tea_V (%) | AP@50/Mask tea_Y (%) | AP@50/Mask tea_I (%) | AP@50/Mask tea_V (%) | mAP@50/Box (%) | mAP@50/Mask (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| √ | | | | | 92.7 | 88.3 | 85.4 | 92.7 | 88.4 | 84.4 | 88.8 | 88.5 |
| √ | √ | | | | 93.7 | 92.4 | 89.2 | 93.7 | 92.0 | 89.0 | 91.8 | 91.6 |
| √ | | √ | | | 93.8 | 92.3 | 88.1 | 93.5 | 92.0 | 87.8 | 91.4 | 91.1 |
| √ | | | √ | | 92.6 | 89.5 | 87.8 | 92.6 | 89.6 | 87.8 | 90.0 | 90.0 |
Comparison with advanced segmentation models.

| Model | mAP@50/Box (%) | mAP@50/Mask (%) | mAP@50-90/Box (%) | mAP@50-90/Mask (%) |
|---|---|---|---|---|
| This Paper | 92.0 | 91.9 | 86.0 | 72.4 |
| Mask R-CNN | 75.7 | 73.9 | 57.4 | 49.7 |
| Cascade Mask R-CNN | 80.6 | 78.8 | 64.5 | 52.3 |
| YOLACT | 86.0 | 84.4 | 70.0 | 53.9 |
| YOLACT++ | 88.1 | 85.3 | 72.6 | 57.7 |
Angle-error evaluation of tea-bud-leaves pose estimation.

| Method | Maximum Error (°) | Average Error (°) | Median Error (°) | Median Absolute Deviation (°) |
|---|---|---|---|---|
| Dynamic Weight-Based Adaptive Estimation Method | 7.76 | 3.41 | 3.69 | 1.42 |
| Least Squares Method | 20.97 | 10.58 | 10.06 | 2.90 |
Distance-error evaluation of tea-bud-leaves pose estimation.

| Method | Maximum Error (mm) | Average Error (mm) | Median Error (mm) | Median Absolute Deviation (mm) |
|---|---|---|---|---|
| Dynamic Weight-Based Adaptive Estimation Method | 8.60 | 2.83 | 2.57 | 0.81 |
| Least Squares Method | 19.75 | 7.15 | 6.69 | 1.99 |
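The four statistics in the two error tables above are standard robust summaries; the small NumPy helper below (an illustration, not the authors' evaluation script) makes them concrete, in particular the median absolute deviation as the median of |e_i - median(e)|.

```python
import numpy as np

def pose_error_stats(errors):
    """Maximum, average, median, and median absolute deviation (MAD)
    of a list of per-sample pose errors (degrees or millimetres)."""
    e = np.asarray(errors, dtype=float)
    med = np.median(e)
    return {
        "max": float(e.max()),
        "mean": float(e.mean()),
        "median": float(med),
        "mad": float(np.median(np.abs(e - med))),
    }

# Example with made-up angle errors in degrees:
print(pose_error_stats([2.1, 3.4, 7.8, 3.7, 1.9]))
```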