Deep Learning-Based Image Recognition of Agricultural Pests
Abstract
1. Introduction
- We propose a new pest-identification model designed to perform well even with few samples, unbalanced categories, and large sample images;
- We devise a new sliding-window cropping method that enlarges the effective receptive field so that sample features which might otherwise be missed in large images are learned more carefully and comprehensively;
- We integrate the attention mechanism with the FPN layer of the model so that it focuses on the sample features most useful for the task at hand;
- We augment the data for categories with few samples and for unbalanced samples to mitigate their adverse effects.
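The sliding-window cropping idea above can be sketched roughly as follows; the window size (640) and overlap ratio (0.2) are illustrative assumptions for demonstration, not the settings used in the paper:

```python
# Illustrative sketch of sliding-window cropping for large images.
# Window size and overlap are assumed values, not the paper's settings.

def window_starts(length, win, stride):
    """Start offsets so that windows of size `win` cover [0, length)."""
    if length <= win:
        return [0]
    starts = list(range(0, length - win, stride))
    starts.append(length - win)  # last window flush with the image border
    return starts

def sliding_windows(img_w, img_h, win=640, overlap=0.2):
    """Return (x, y, w, h) crop boxes covering the whole image with overlap."""
    stride = max(1, int(win * (1 - overlap)))
    return [(x, y, min(win, img_w), min(win, img_h))
            for y in window_starts(img_h, win, stride)
            for x in window_starts(img_w, win, stride)]
```

With these assumed settings, a 1280 × 1280 sample is covered by nine overlapping 640 × 640 crops, while images smaller than the window are kept whole.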
2. Related Work
2.1. Existing Object-Detection Frameworks
2.2. The CBAM Attention Mechanism
2.3. The Architecture of ResNet and ResNeXt
3. Method
3.1. Data Preprocessing
3.1.1. Sliding Window Cropping Method
3.1.2. Data Enhancement
3.2. Feature Extraction
3.3. Attention-Based FPN
3.4. Cascaded Structure
4. Experiment
4.1. Dataset Processing and Partitioning
4.2. Evaluation Indicators
4.3. Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Model | Backbone | Mem (GB) | Box AP |
|---|---|---|---|
| Cascade R-CNN | R-50-FPN | 4.2 | 40.1 |
| Cascade R-CNN | R-101-FPN | 6.2 | 42.3 |
| Cascade R-CNN | X-101-32×4d-FPN | 7.6 | 43.7 |
| YOLOX | YOLOX-s | 7.6 | 40.5 |
| YOLOX | YOLOX-l | 19.9 | 49.4 |
| YOLOX | YOLOX-x | 28.1 | 50.9 |
| Faster R-CNN | R-50-FPN | 4.0 | 37.4 |
| Faster R-CNN | R-101-FPN | 6.0 | 39.4 |
| Faster R-CNN | X-101-32×4d-FPN | 7.2 | 41.2 |
| Model | Backbone | Optimizer | Learning Rate | Momentum | Image Size | Batch Size | Epochs |
|---|---|---|---|---|---|---|---|
| Cascade R-CNN | R-50-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Cascade R-CNN | R-101-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Cascade R-CNN | X-101-32×4d-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| YOLOX | YOLOX-s | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| YOLOX | YOLOX-l | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| YOLOX | YOLOX-x | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Faster R-CNN | R-50-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Faster R-CNN | R-101-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Faster R-CNN | X-101-32×4d-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Ours | X-101-32×4d-FPN | SGD | 0.005 | 0.9 | 1280 × 1280 | 64 | 100 |
| Method | Backbone | mAP_0.5 | mAP_0.5:0.95 | Precision | F1-Score |
|---|---|---|---|---|---|
| Cascade R-CNN | R-101-FPN | 70.62 | 42.35 | 45.36 | 67.64 |
| Cascade R-CNN | X-101-32×4d-FPN | 72.35 | 45.12 | 68.54 | 71.21 |
| YOLOX | YOLOX-s | 66.39 | 40.16 | 64.37 | 63.87 |
| YOLOX | YOLOX-l | 79.82 | 61.82 | 75.42 | 77.15 |
| Faster R-CNN | R-101-FPN | 65.45 | 49.67 | 51.56 | 63.24 |
| Faster R-CNN | X-101-32×4d-FPN | 68.16 | 51.84 | 57.63 | 66.83 |
| Ours | X-101-32×4d-FPN | 84.16 | 65.23 | 67.79 | 82.34 |
| Method | Backbone | Sliding Window Cutting | Add Attention to the FPN | mAP_0.5 | mAP_0.5:0.95 | Precision | F1-Score |
|---|---|---|---|---|---|---|---|
| Ours | X-101-32×4d-FPN | ✗ | ✗ | 72.35 | 45.12 | 68.54 | 71.21 |
| Ours | X-101-32×4d-FPN | ✓ | ✗ | 75.64 | 47.17 | 69.03 | 76.36 |
| Ours | X-101-32×4d-FPN | ✗ | ✓ | 76.38 | 46.58 | 67.52 | 77.49 |
| Ours | X-101-32×4d-FPN | ✓ | ✓ | 84.16 | 65.23 | 67.79 | 82.34 |
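As a rough illustration of the CBAM-style attention applied to an FPN feature map, the following NumPy sketch computes channel attention from pooled descriptors through a shared MLP, then spatial attention from channel-wise average and max maps. The weights are random placeholders, and the spatial branch is simplified to a 1 × 1 mixing of the two maps rather than CBAM's 7 × 7 convolution:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam(x, w1, w2, conv_w):
    """CBAM-style attention over a feature map x of shape (C, H, W).

    w1, w2  -- shared-MLP weights, shapes (C//r, C) and (C, C//r)
    conv_w  -- two scalars mixing the [avg, max] spatial maps (a 1x1
               simplification of CBAM's 7x7 spatial convolution)
    """
    # Channel attention: shared MLP on global average- and max-pooled vectors.
    avg = x.mean(axis=(1, 2))                                   # (C,)
    mx = x.max(axis=(1, 2))                                     # (C,)
    ca = sigmoid(w2 @ np.maximum(w1 @ avg, 0)
                 + w2 @ np.maximum(w1 @ mx, 0))                 # (C,)
    x = x * ca[:, None, None]
    # Spatial attention: mix channel-wise average and max maps.
    sa = sigmoid(conv_w[0] * x.mean(axis=0)
                 + conv_w[1] * x.max(axis=0))                   # (H, W)
    return x * sa[None, :, :]
```

The output keeps the input's shape, so a module like this can be slotted between FPN levels without changing the rest of the detection pipeline, which is the design choice the ablation above evaluates.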
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, W.; Sun, L.; Zhen, C.; Liu, B.; Yang, Z.; Yang, W. Deep Learning-Based Image Recognition of Agricultural Pests. Appl. Sci. 2022, 12, 12896. https://doi.org/10.3390/app122412896