1. Introduction
Citrus fruits are a crucial cash crop globally, and their yield and quality directly affect food supply, the agricultural economy [1], and international trade. However, citrus is susceptible to various diseases [2] caused by fungi and other pathogens [3]. Diseases often occur on the foliage, which not only leads to premature senescence and dropping of citrus leaves but also affects fruit development and quality [4]. There are many types of citrus diseases [5], and once a tree is infected, they spread rapidly from tree to tree, resulting in large infected areas. This not only directly threatens the sustainable development of the citrus industry but also increases costs and the economic burden on fruit growers. If the type of disease and the degree of infection can be identified and assessed in the early stages of leaf infection, fruit growers can implement effective preventive measures and solutions, increasing citrus production and economic returns [6].
Citrus leaf disease classification, as an important area of plant pathology, has received considerable research attention [7]. As technology has progressed [8], different approaches have been applied in this field to enhance the accuracy and efficiency of disease identification. Traditional methods for diagnosing citrus leaf diseases primarily rely on manual experience and visual inspection, which are easy to implement but suffer from high subjectivity, low accuracy, and low efficiency. Because they also require highly specialized and experienced technicians, they are not suitable for disease prevention and detection in large orchards. Biological detection methods can identify diseases accurately [9]. However, they require professional testing equipment [10], laboratory operation [11] by professional testing personnel, and complex procedures. Because they are time-consuming and costly, they are generally suited to accurate quantitative analysis of diseases and do not meet the requirements of disease prevention and control in large orchards. Consequently, developing a fast, accurate, and efficient method for diagnosing citrus leaf diseases is of great practical significance [12]. With the emergence of artificial intelligence, machine learning algorithms have advanced significantly [13]. These advances primarily include techniques based on image processing and machine vision [14]. Such methods typically rely on manually crafted features and classifiers [15], such as support vector machines (SVM) [16], K-nearest neighbors (KNN) [17], decision trees, and others [18], for disease recognition and classification. However, these approaches often encounter limitations when confronted with intricate image variations and diverse disease manifestations. Moreover, manually designing features requires considerable expertise and experience, and different feature extraction methods may be needed for different diseases, which increases the complexity and difficulty of application. Compared to hyperspectral images, RGB images occupy a central position in computer graphics and image processing because of their rich colors, intuitive appearance, wide versatility, low cost, and ease of processing. They can accurately reproduce rich visual effects and meet a variety of complex image processing needs.
Recently, deep learning (DL) technology [19] has achieved remarkable results in image classification and object detection [20], owing to its powerful feature extraction and pattern recognition capabilities. In particular, the convolutional neural network (CNN) [21] can automatically extract image features that are more expressive and robust. By automatically learning hierarchical feature representations, it can better capture image structure and improve classification accuracy. Mohanty et al. [22] applied transfer learning to the PlantVillage dataset to recognize 26 types of diseases and achieved an overall classification accuracy of 99.35% using Alexnet and Googlenet. Wagle and Remachandran [23] trained a variety of networks by transfer learning, including Alexnet, VGG16, Googlenet, MobilenetV2, and Squeezenet, in their study of tomato leaf disease recognition. Their experimental results showed that the VGG16 network attained the highest recognition accuracy, 99.17%. Sladojevic et al. [24] used fine-tuned Caffenet networks to recognize different crop leaf diseases and achieved an average accuracy of 96.3%. Agarwal et al. [25] trained a CNN model containing eight hidden layers on the PlantVillage dataset and achieved an accuracy of 98.4%. Rangarajan et al. [26] used transfer learning with pre-trained Alexnet and VGG16 models to recognize six types of tomato diseases as well as healthy leaves, achieving accuracies of 97.29% and 97.49%, respectively. Xing et al. [27] improved parameter utilization by proposing a weakly densenet-16 network with a cross-channel feature fusion method, achieving 93.33% accuracy on a citrus pest and disease dataset. Kundu et al. [28] proposed a DL framework based on real datasets labeled by plant pathologists. This framework used the K-means algorithm and a maizenet model to achieve automatic detection, disease prediction, and loss estimation for maize diseases with an accuracy of 98.5%, and it was integrated into a web application to provide a convenient tool for plant pathologists. Gangwar et al. [29] developed a transformer model that classified tomato leaf images (with normal and complex backgrounds) into 13 classes with 93.51% accuracy. It requires less storage space and a shorter training time and can be deployed on IoT devices, drones, and other equipment for real-time monitoring. Dhaka et al. [30] comprehensively evaluated the application of IoT and DL models for plant disease monitoring and classification. They analyzed the advantages and disadvantages of multiple architectures, identified shortcomings in existing research, and compared model performance on publicly available datasets to provide a reference for selecting the optimal model. In summary, deep learning, with its automatic feature extraction, high-level semantic understanding, strong generalization, concise processing flow, and good scalability, has shown remarkable proficiency in image classification. As one of the current mainstream methods, DL provides strong theoretical support and practical guidance for citrus leaf disease classification [31].
From traditional manual methods to instrumental detection and on to today's more intelligent methods, the prevention and management of crop diseases have always been a pressing issue in agriculture [32]. Disease recognition technology has undergone several stages of development [33]. Despite the growing global importance of the citrus industry in recent years, the accurate identification of citrus diseases still faces many challenges. Current research focuses on the identification, prevention, and control of common diseases, while systematic research data and effective identification methods are lacking for emerging or regionally specific diseases. Currently, the diagnosis of citrus diseases relies on symptom observation and laboratory testing, but these methods often lag behind the onset of infection. This gap in early diagnosis technology not only limits early detection of and rapid response to diseases but also affects the sustainable development of the citrus industry.
In view of this, this study aims to fill this gap by developing an efficient and accurate intelligent identification system for citrus diseases through the combined use of image processing and deep learning. The specific research objectives are: (1) to construct a citrus disease dataset containing a wide range of disease species and diverse samples to provide a solid data foundation for subsequent research; (2) to explore and optimize feature extraction and representation methods applicable to citrus disease images to improve the accuracy and efficiency of disease identification; (3) to design and implement one or more efficient classification models that can accurately identify citrus diseases in complex and changing field environments; and (4) to validate the effectiveness and practicability of the proposed methods and to provide a scientific basis and technical support for the precise prevention and control of citrus diseases.
This article is structured as follows. Section 2 introduces the dataset, data preprocessing, related technical methodologies, the proposed CNN model in detail, and the experimental design. Section 3 explains the experimental results, and Section 4 summarizes the research results and discusses their significance and future research directions. Finally, the conclusions and perspectives are presented in Section 5.
2. Materials and Methodologies
2.1. Data Acquisition
At present, few datasets of citrus pests and diseases are available, and they usually suffer from limited disease coverage, mislabeled diseases, and uneven image quality. To overcome these challenges, we screened and integrated data from multiple sources to construct our citrus leaf disease dataset: (1) Open-source data: PlantVillage, the Citrus Pest Identification Challenge dataset, the Kaggle citrus leaf dataset, images from Baidu’s webpage, and images from published papers and datasets (citrus samples from orchards in the state of Tamaulipas, Mexico). (2) Self-collected data: 1300 photos of 6000 × 4000 pixels taken in different citrus orchards in Guilin (25°16′30″ N 110°17′46″ E) and Guiping (23°20′08″ N 110°19′49″ E), Guangxi Zhuang Autonomous Region, China. These self-collected pictures were shot with a Canon EOS R50 camera (Tokyo, Japan) with 24 megapixels, autozoom, and no flash, and were saved in JPG format. Details of the sources of the dataset are shown in Table 1. To ensure the diversity and practical value of the data, we consulted citrus pest and disease experts and fruit growers and reviewed relevant information to fully understand the specific conditions of the diseased leaves. Ultimately, we integrated images from different sources and constructed a comprehensive and representative disease dataset for this research.
In terms of quantity, the dataset encompassed a total of more than three thousand citrus leaf images. This can help the model to demonstrate better generalization performance in practical applications. In terms of disease types, the dataset covered a wide range of diseases commonly found in citrus production, such as zinc deficiency, citrus huanglongbing (HLB), greasy spot disease, citrus leaf miner, and citrus canker. In addition, in order to build a complete classification system, the dataset contained images of healthy leaves as benchmark categories. Each category possessed a sufficient number of image samples, and these samples showed good diversity in terms of disease severity, leaf morphology, and background environment, simulating complex recognition scenarios in the real world.
2.2. Data Preprocessing
In image classification tasks, CNNs require sufficient data to effectively learn sample features. An insufficient amount of data may cause the model to fall into a local optimum, triggering overfitting and degradation of recognition accuracy. To solve these problems, the amount and diversity of data should be increased to enhance the generalization ability of the network. Therefore, data augmentation on the original dataset is necessary. It can expand the dataset and increase the diversity and number of samples, thereby meeting the data requirements of CNNs and improving the model’s recognition accuracy. Before constructing the DL model, the original dataset was thoroughly preprocessed to enhance the quality, diversity and applicability of the data. The following is a detailed description of the series of preprocessing steps we took for the citrus leaf disease dataset.
Data enhancement: To increase the generalization ability and robustness of the model, we performed data enhancement operations, including random cropping, zooming, translation, rotation, and adjustment of image brightness and contrast [34]. By simulating different shooting conditions and changes in perspective, data enhancement helped the model to better adapt to the various scenarios that may occur in real applications [35].
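The augmentation operations listed above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: images are represented as 2-D lists of grayscale values, rotation is restricted to 90° steps for simplicity, and all function names and parameter values are hypothetical.

```python
import random

def random_crop(img, out_h, out_w, rng):
    """Cut a random out_h x out_w window from a 2-D list of pixel values."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - out_h)
    left = rng.randint(0, w - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]

def horizontal_shift(img, dx, fill=0):
    """Translate the image dx pixels to the right (left if dx < 0), padding with fill."""
    w = len(img[0])
    out = []
    for row in img:
        if dx >= 0:
            out.append([fill] * min(dx, w) + row[:max(w - dx, 0)])
        else:
            out.append(row[-dx:] + [fill] * min(-dx, w))
    return out

def rotate90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    """Apply contrast * pixel + brightness and clip the result to 0..255."""
    return [[max(0, min(255, round(contrast * p + brightness))) for p in row]
            for row in img]

# Build one augmented sample from a toy 4 x 4 image.
rng = random.Random(0)
image = [[10 * r + c for c in range(4)] for r in range(4)]
augmented = adjust_brightness_contrast(
    rotate90(random_crop(image, 3, 3, rng)), brightness=20, contrast=1.1)
```

In practice each transform would be applied with some probability to every training image, so that each epoch sees slightly different versions of the same leaf.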
Size adjustment: Due to inconsistent sizes and resolutions of the original images, inputting them directly into the model may affect the stability and efficiency of training. Therefore, we resized all images to a uniform size, such as 256 × 256 or 512 × 512 pixels so that the model can process the input data more efficiently.
Normalization: Normalization is a data preprocessing technique commonly used in DL. It maps the pixel values of an original image from the interval 0 to 255 to the interval 0 to 1 by a linear transformation. This operation does not change the information content of the image, only the scale of its values, and can ease neural network computation, thus accelerating model training. In this study, we also normalized the images in the dataset to enhance the convergence speed and recognition accuracy.
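As a small worked example of this linear transformation, the sketch below divides 8-bit pixel values by 255. The function name is hypothetical, and a single-channel image stored as a 2-D list is assumed.

```python
def normalize(img):
    """Scale 8-bit pixel values linearly from [0, 255] to [0.0, 1.0]."""
    return [[p / 255.0 for p in row] for row in img]

pixels = [[0, 51, 255]]
print(normalize(pixels))  # [[0.0, 0.2, 1.0]]
```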
Histogram equalization: To improve the contrast and clarity of the image, we applied histogram equalization. This operation redistributes the pixel intensities of the image, making its details more prominent, which helped the model to better recognize the disease features on the leaves.
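To make the redistribution of intensities concrete, here is a minimal sketch of classical histogram equalization based on the cumulative distribution of gray levels. It assumes a single-channel 8-bit image stored as a 2-D list; how the authors handled the three RGB channels is not stated, so this illustrates the technique rather than their exact procedure.

```python
def equalize_histogram(img, levels=256):
    """Histogram-equalize a grayscale image given as a 2-D list of ints in [0, levels)."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in img]
    # Look-up table that maps the occupied gray levels onto the full range.
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / (n - cdf_min)) for c in cdf]
    return [[lut[p] for p in row] for row in img]

low_contrast = [[100, 101], [102, 103]]
print(equalize_histogram(low_contrast))  # [[0, 85], [170, 255]]
```

The four nearly identical gray levels are spread across the whole 0 to 255 range, which is the contrast gain described above.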
Median filter and Gaussian filter: To remove noise and interference from the image, we used a median filter and a Gaussian filter, respectively. The median filter effectively removes salt-and-pepper noise, while the Gaussian filter smooths the image and reduces the effect of high-frequency noise. These filtering operations helped to improve the signal-to-noise ratio of the image and make disease features more clearly recognizable.
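The effect of the median filter on impulse noise can be seen in a short sketch. A 3 × 3 window is assumed here (the paper does not state the kernel size), border pixels use whichever neighbors exist, and the Gaussian filter is omitted for brevity.

```python
from statistics import median

def median_filter3x3(img):
    """Apply a 3 x 3 median filter to a 2-D list; borders use the available neighbors."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(median(window))
    return out

# A flat gray patch with a single bright ("salt") outlier in the middle.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter3x3(noisy)
print(clean[2][2])  # 100
```

Because the outlier never reaches the middle of any sorted window, it is removed without blurring the surrounding values, which a mean filter would not achieve.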
Through these preprocessing steps, we enhanced the quality and diversity of the citrus leaf disease dataset, laying a solid foundation for the subsequent disease classification task. After data enhancement, the dataset comprised a total of 19,791 images, encompassing five types of leaf diseases as well as healthy leaves. The images were randomly split into training, validation, and test sets at a ratio of 7:2:1, with 14,777 images allocated to the training set, 3546 images to the validation set, and 1468 images to the test set.
Figure 1 shows some examples of preprocessing.
Table 2 shows the number of pictures for each type of disease in the citrus leaves dataset.
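A random 7:2:1 split such as the one described above can be sketched as follows. The function name, the fixed seed, and the use of integer identifiers in place of image paths are assumptions for illustration only.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=42):
    """Shuffle items and split them into train/validation/test parts by the given ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 20 10
```

The reported counts (14,777 / 3546 / 1468 of 19,791) correspond to roughly 74.7%, 17.9%, and 7.4%, so this sketch will not reproduce those exact numbers.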
2.3. Technical Methodologies
2.3.1. Transfer Learning
In disease classification tasks, it is difficult to collect a large amount of labeled data. Transfer learning can utilize pre-trained models to solve this problem and achieve better performance even with a small sample dataset [36]. The principle of transfer learning is to utilize already-learned knowledge or models and apply them to new tasks or domains to accelerate the learning process and improve performance [37]. The core of transfer learning lies in finding similarities between the source and target domains and then utilizing these similarities to transfer knowledge. This transfer can be performed at different levels, such as model parameters, features, or samples. Since the pre-trained model has already learned a large number of image features, it requires significantly less training data and training time when transferred to the disease classification task. This makes transfer learning an efficient learning method that can save considerable computational resources and time. Meanwhile, the model is able to learn a more generalized feature representation, thus demonstrating superior performance and stronger generalization ability on new tasks. In this study, to expedite the training process and enhance performance, we leveraged pre-trained weights learned on large datasets, which provide robust feature extraction capabilities. Once these weights were loaded, we appended a fully connected layer to the existing network architecture to enhance its adaptability for leaf disease classification. During the training phase, we employed backpropagation while permitting the fine-tuning of all layers, including the convolutional layers. This approach allowed all parameters of the pre-trained model to be updated continually, optimizing its performance for the specific task of leaf disease classification.
Figure 2 shows the representation of deep CNN feature extraction using transfer learning.
2.3.2. Convolutional Neural Networks
CNNs were originally proposed by Yann LeCun in 1998 through the LeNet-5 model [38] and successfully applied to handwritten digit recognition. Nowadays, the CNN has become a widely used DL algorithm [39], which usually consists of multiple layers, such as input, convolutional, pooling, fully connected, and output layers. To enhance the expressive power of neural networks, nonlinear processing units, i.e., activation functions, are introduced to obtain more nonlinear feature information. Common activation functions include Sigmoid, Tanh, and ReLU. CNNs show significant advantages in leaf disease classification. The convolutional layers automatically extract key features, such as edges and textures [40]. As the number of network layers increases, the convolutional layers capture more advanced features, thus improving classification accuracy. The pooling layers reduce model complexity and computation while enhancing generalization and highlighting features. The fully connected layers integrate features to form a global representation and map it to specific disease categories for flexible classification. Overall, the CNN extracts features through the convolutional layers, simplifies them with the pooling layers, and integrates and classifies them with the fully connected layers, which together provide an efficient, accurate, and flexible solution for the leaf disease classification task.
2.3.3. Alexnet
Alexnet [41] is an influential DL network architecture, especially in the field of computer vision. It employs an eight-layer architecture consisting of five convolutional layers and three fully connected layers. In the convolutional layers, Alexnet extracts features from the input image using convolutional kernels of different sizes. In addition, Alexnet incorporates the dropout technique to prevent overfitting and was trained in parallel on two GPUs to speed up computation. Although Alexnet has achieved remarkable results on image classification tasks, it has some drawbacks. First, due to its relatively complex network structure, Alexnet has a large number of parameters, which requires considerable computational resources and storage space. Second, Alexnet may encounter computational efficiency problems when dealing with high-resolution images. Furthermore, Alexnet may not be as efficient as later, more lightweight network structures for some specific tasks.
2.3.4. VGG
VGG [42] is a deep CNN developed by the Visual Geometry Group at the University of Oxford and researchers at DeepMind. It mainly explores the relationship between the depth of a CNN and its performance. By repeatedly stacking small 3 × 3 convolutional kernels and 2 × 2 max pooling layers, VGG builds a deep CNN with 16–19 layers. The design philosophy of VGG lies in its simplicity and consistency: the network is composed entirely of 3 × 3 convolutional kernels and 2 × 2 max pooling layers, without any special layers. This design allows the network to capture finer-grained image features and perform well on different image datasets. The key features of the VGG network include its depth, the use of only 3 × 3 convolutional kernels, and the use of ReLU as the activation function.
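One way to see why stacking small kernels is attractive is to compare parameter counts. The sketch below is a back-of-the-envelope calculation, not taken from the paper: it assumes stride-1 convolutions and a hypothetical width of 64 input and output channels.

```python
def conv_params(kernel, in_ch, out_ch, bias=True):
    """Number of learnable parameters in one kernel x kernel convolutional layer."""
    return kernel * kernel * in_ch * out_ch + (out_ch if bias else 0)

def stacked_receptive_field(kernel, n_layers):
    """Receptive field of n stacked stride-1 convolutions with the same kernel size."""
    return 1 + n_layers * (kernel - 1)

channels = 64  # hypothetical layer width
two_3x3 = 2 * conv_params(3, channels, channels)
one_5x5 = conv_params(5, channels, channels)

print(stacked_receptive_field(3, 2))  # 5
print(two_3x3, one_5x5)               # 73856 102464
```

Two stacked 3 × 3 layers cover the same 5 × 5 neighborhood as a single 5 × 5 layer with fewer parameters, and they apply a nonlinearity twice instead of once.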
2.3.5. Resnet
Resnet was proposed in 2015 by Microsoft Research [43]. In the ImageNet classification competition, Resnet increased the depth of the network to 152 layers, far exceeding the 19 layers of VGG, one of the top-performing networks of the previous year. The emergence of Resnet is of great historical significance for deep neural networks. Resnet excels in image tasks by using shortcut (residual) connections to address the degradation problem in deep networks, which allows very deep networks to be trained effectively. In addition, Resnet uses batch normalization layers, which further help in training deeper structures.
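The idea of a residual connection can be shown with a deliberately simplified sketch: the block computes F(x) and adds the input x back before the final activation. Fully connected layers stand in for convolutions and batch normalization is omitted, so this is an illustration of the shortcut only, with hypothetical names and toy weights.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(v, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]

def residual_block(x, w1, b1, w2, b2):
    """y = ReLU(F(x) + x), where F is two dense layers with a ReLU in between."""
    hidden = relu(linear(x, w1, b1))
    f_x = linear(hidden, w2, b2)
    return relu([f + xi for f, xi in zip(f_x, x)])

# With all-zero weights F(x) = 0, so the block simply passes its input through.
x = [1.0, 2.0, 0.5]
zeros_w = [[0.0] * 3 for _ in range(3)]
zeros_b = [0.0] * 3
print(residual_block(x, zeros_w, zeros_b, zeros_w, zeros_b))  # [1.0, 2.0, 0.5]
```

Because the identity mapping is available "for free", adding more such blocks should not make the network worse than a shallower one, which is the intuition behind the degradation argument.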
2.4. The Proposed Network Structure (MMFN)
In the DL-based task of citrus leaf disease classification, we combined the ideas of ensemble learning and transfer learning in a model named the Multi-Models Fusion Network (MMFN). First, we used pre-trained models from transfer learning to initialize the deep neural networks, accelerating training and improving model performance. Subsequently, we trained multiple DL models based on transfer learning and fused their outputs to improve classification accuracy and generalization. This strategy effectively utilized existing resources and significantly improved the accuracy and efficiency of citrus leaf disease classification. Specifically, we selected three classical DL models, Alexnet, VGG, and Resnet, which have excellent performance in the field of image recognition, and fused them in combination with transfer learning. This process involved not only an in-depth understanding and tuning of each model but also careful design of the fusion approach. Our goal was to ensure that the final model could cope effectively with a variety of complex leaf disease scenarios while maintaining the advantages of the individual models. Since Alexnet, VGG, and Resnet have different network structures and depths, they are able to extract features from images at different levels of detail. Alexnet is more adept at capturing low-level features such as texture, and VGG helps to capture more detailed features by stacking multiple small 3 × 3 convolutional kernels, which improves the network’s ability to represent nonlinearities. Resnet, on the other hand, introduces a residual module that alleviates the degradation and vanishing-gradient problems. It performs well in deep networks and is better at capturing high-level features such as shape and structure. Therefore, the complementary nature of these models is exploited to extract the most representative features from each model.
By using this fusion strategy, we expected to combine the strengths of each model to improve the recognition accuracy of citrus leaf diseases. This integrated application of DL techniques provides a new solution to the citrus leaf disease classification problem, as shown in Figure 3.
To verify the effectiveness of MMFN, we conducted a series of experiments on a large number of citrus leaf disease images. The image data covered both healthy leaves and various diseased leaves, thus ensuring the broad applicability and accuracy of the model. In addition, we optimized and adapted the model to meet the needs of disease recognition in different environments and conditions. The effectiveness of DL in citrus leaf disease classification is well established. During model training, we used the Adam optimizer with appropriate learning rate and momentum values to ensure that the model could converge stably and efficiently. A cross-entropy loss function was employed to measure the disparity between the model predictions and the actual labels, guiding the model’s optimization direction.
To enhance the accuracy and generalization ability of the leaf disease classification task, we first performed a series of preprocessing operations on the images, including random cropping, scaling, translation, rotation, and adjustment of brightness and contrast. These steps aimed to enrich the training set through data expansion and enhancement so that the model can cope with many different image transformations. Subsequently, we normalized the augmented images to ensure a consistent data distribution and resized them so that they could be input uniformly. On the input side of the model, we fed one batch of image data at a time. To expedite model training and enhance performance, we initialized the model with pre-trained weights, which were trained on other large datasets and contained rich feature extraction capabilities. After loading the weights, we added a fully connected layer to the original network to better adapt it to the leaf disease classification task. During training, we used backpropagation without freezing the parameters of the convolutional layers, i.e., all layers were fine-tuned and all parameters of the pre-trained model were continuously updated to adapt to the leaf disease classification task. Throughout the model’s learning process, we focused particularly on the feature extraction stage. To convert the high-dimensional convolutional features efficiently into low-dimensional representations, we attached a linear mapping layer after the last convolutional layer of each model. This layer compresses the convolutional features to 128 dimensions, mapping the features learned by the different models to a common low-dimensional space. This not only preserved the core information in the features but also effectively eliminated redundant data.
Subsequently, we concatenated these 128-dimensional features from the three models and fed them to a fully connected layer with an input dimension of 128 × 3 = 384. This fully connected layer was optimized with a cross-entropy loss function and L2 regularization to avoid overfitting, and it ultimately output the predicted probability of each category for every image in the batch. In this way, we achieved dimensionality reduction and redundancy removal while retaining the key feature information. This not only improved the computational efficiency of the model but also enhanced its performance on the leaf disease classification task.
Figure 4 shows the detailed feature extraction and model fusion of the suggested framework.
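The fusion head described above (one 128-dimensional projection per backbone, concatenation to 128 × 3 = 384 dimensions, then a classifier over the six classes) can be sketched in plain Python. This is a shape-level illustration under stated assumptions, not the authors' PyTorch implementation: the backbone feature sizes (20, 30, 40) are toy values, the weights are random rather than trained, and all names are hypothetical.

```python
import math
import random

def make_linear(in_dim, out_dim, rng):
    """Create a randomly initialized dense layer as (weights, bias)."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def linear(x, layer):
    weights, bias = layer
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(features, projections, classifier):
    """Project each branch to 128-d, concatenate, and classify the fused vector."""
    fused = []
    for feat, proj in zip(features, projections):
        fused.extend(linear(feat, proj))
    return fused, softmax(linear(fused, classifier))

rng = random.Random(0)
branch_dims = {"alexnet": 20, "vgg": 30, "resnet": 40}   # toy feature sizes
projections = [make_linear(d, 128, rng) for d in branch_dims.values()]
classifier = make_linear(128 * 3, 6, rng)                # 5 diseases + healthy
features = [[rng.random() for _ in range(d)] for d in branch_dims.values()]

fused, probs = fuse_and_classify(features, projections, classifier)
print(len(fused), len(probs))  # 384 6
```

Projecting every branch to the same width before concatenation keeps any single backbone from dominating the fused vector simply because its raw feature map is larger.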
2.5. Experimental Setting and Evaluation Indicators
The experimental platform was a server running the 64-bit Windows 10 operating system, equipped with an Intel(R) Core(TM) i7-10700KF processor clocked at 3.80 GHz, 16 GB of memory, and an NVIDIA GeForce RTX 3070 GPU. The software environment was Python 3.7, and the experiments were conducted using the PyTorch 1.12.1 deep learning framework with CUDA 11.3 (pytorch1.12.1+cuda113). The integrated development environment (IDE) was PyCharm 2020.3.3 x64.
This experiment employed the adaptive Adam optimizer, which dynamically adjusts the learning rate of each parameter by computing first-order and second-order moment estimates of the gradients. The batch size was set to 32, the initial learning rate was 0.001, and each experiment was trained for 100 epochs. Additionally, we adopted an exponential learning rate decay strategy with a decay factor of 0.9, so that the learning rate decreased progressively during training.
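The decay schedule can be written out explicitly. The sketch below assumes that the factor of 0.9 is applied once per epoch, which is how exponential decay schedulers are commonly configured; the paper does not state the decay interval, so this is an assumption.

```python
def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` decay steps: lr_0 * gamma ** epoch."""
    return initial_lr * gamma ** epoch

# Values from the text: initial rate 0.001, decay factor 0.9, 100 epochs.
schedule = [exponential_lr(0.001, 0.9, epoch) for epoch in range(100)]
for epoch in (0, 1, 10, 50, 99):
    print(f"epoch {epoch:3d}: lr = {schedule[epoch]:.3e}")
```

Under this assumption the rate drops by about two thirds within the first ten epochs, so most of the fine-tuning happens with very small update steps.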
The cross-entropy loss function is used to assess the agreement between the model’s predicted probabilities and the true labels. A larger difference between the predicted and true probability distributions results in a higher cross-entropy value, and vice versa. The cross-entropy loss function is given in Equation (1):

$$L = -\sum_{i=1}^{C} y_i \log(\hat{y}_i) \tag{1}$$

where $y$ represents the true vector distribution, $\hat{y}$ represents the predicted vector distribution, and $C$ is the number of classes.
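As a small numerical illustration of the cross-entropy loss, the sketch below evaluates the negative sum of true probabilities times the logarithm of the predicted probabilities for a one-hot label. The function name and the clipping constant `eps` (added to avoid taking the logarithm of zero) are assumptions for illustration.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

label = [0, 1, 0]                                   # one-hot true class
confident = cross_entropy(label, [0.1, 0.8, 0.1])   # equals -log(0.8)
uncertain = cross_entropy(label, [0.4, 0.3, 0.3])   # equals -log(0.3)
print(confident < uncertain)  # True
```

With a one-hot label only the predicted probability of the true class contributes, so the loss shrinks toward zero as that probability approaches 1.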
To assess the performance of a trained DL model in image classification tasks, quantitative metrics are needed. The commonly used evaluation metrics are accuracy, precision, recall, and F1 value [44]. Accuracy is the most commonly used and intuitive evaluation metric to measure a model’s overall correctness. It is defined as the ratio of correctly predicted samples to the total number of samples. However, accuracy alone may not fully evaluate a model’s performance, especially in cases of class imbalance. Precision is an evaluation metric for the prediction results that indicates the proportion of samples predicted to be positive that are actually positive. A higher precision signifies greater reliability in the model’s prediction of positive categories. Recall is an evaluation metric for real samples, indicating the proportion of actual positive samples correctly identified by the model. A higher recall means the model successfully identifies more true positive samples. The F1 value (F1-Score) is the harmonic mean of precision and recall, combining both metrics to provide a balanced evaluation of the model’s performance. The formulas for calculating accuracy, precision, recall, and F1 value are shown in Equations (2), (3), (4), and (5):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{4}$$

$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5}$$

where TP represents the number of positive samples correctly predicted as positive, TN represents the number of negative samples correctly predicted as negative, FP represents the number of negative samples incorrectly predicted as positive, and FN represents the number of positive samples incorrectly predicted as negative.
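The four metrics follow directly from the confusion counts, as the short sketch below shows. The counts used in the example are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 90 TP, 85 TN, 15 FP, 10 FN.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For multi-class problems these quantities are typically computed per class in a one-vs-rest fashion and then averaged.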
4. Discussion
After numerous training iterations, we obtained a model with stable performance. The results showed that MMFN achieved remarkable results in the task of citrus leaf disease classification. Specifically, (1) in the classification of healthy and unhealthy leaves, the model achieved an average accuracy of 99.72% on the validation set, with precision, recall, and F1 scores all reaching 99%. (2) During the classification of multiple diseases, the model attained an average accuracy of 98.68% on the validation set, with precision, recall, and F1 scores also performing well. At the same time, our ablation experimental results also showed that the MMFN model achieved satisfactory performance on both binary and multi-class classification tasks. The ablation study also further confirmed the importance of each network in the model.
Rehman et al. [45] used deep learning combined with image enhancement, feature fusion, and the WOA optimization algorithm to classify six citrus diseases with an accuracy of 95.7%. Khattak et al. [46] constructed a CNN model using an integrated approach to differentiate healthy fruits/leaves from those suffering from five common citrus diseases (e.g., black spot and ulcer). The model extracted complementary features and performed well on several evaluation metrics, with a test accuracy of 94.55%. Elaraby et al. [47] evaluated a new method for recognizing and classifying six citrus diseases on a library of citrus disease images and a combined dataset. Two CNNs, Alexnet and VGG19, were used to build and test the method, which achieved 94% accuracy at its best overall performance. Lin et al. [48] improved the identification of citrus pests and diseases by using deep convolutional neural networks (DCNN). With image preprocessing and DCNN techniques, the recognition accuracy was improved by about 12% compared to traditional methods. Compared to existing studies, our study has made significant progress in citrus pest and disease identification, especially in classification accuracy and disease coverage. However, we also note that current public datasets suffer from problems such as scarcity, limited pest species coverage, classification errors, and variability in disease manifestations across regions and seasons, which limit further improvement of model performance to some extent. To overcome these challenges, we will work on constructing larger and higher-quality citrus pest datasets, explore more advanced data processing and enhancement techniques, and address the hardware demands of large-scale computation so that large amounts of data, or disease data from large orchard areas, can be processed. In addition, the model’s limitations require in-depth critical analysis, for example, with respect to unbalanced datasets, the diversity of data collection environments, the identification of mixed and complex diseases, and real-time response. Detecting diseases on the inner leaves of trees is also a more significant challenge than detecting diseases on the outer leaves. We plan to enhance the ability of the model to analyze different levels of the tree canopy by introducing advanced imaging techniques and sensor data to improve the detection of hidden diseases. Although the MMFN model performs well in most cases, it may still be deficient in identifying certain rare or complex diseases. This may be because the model failed to fully learn the unique characteristics of these diseases during training.
Therefore, in future studies we plan to further improve the recognition ability and robustness of the model by introducing more disease samples, optimizing the model structure, or adopting an ensemble learning approach. To make our model more accessible to field workers and farmers, we are also considering developing a user-friendly interface that simplifies interaction with the model. Finally, we hope that this study will inspire more researchers to pay attention to and invest in citrus pest and disease recognition techniques. We believe that continuous effort and innovation can provide more accurate and efficient pest control solutions for agricultural production, helping to safeguard the healthy growth of crops and food security.
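The integrated (ensemble) learning approach mentioned above as future work could take several forms. One simple and widely used variant is soft voting, in which the class probabilities predicted by several models are averaged before the final class is chosen. The following is a minimal sketch under stated assumptions: the function name `soft_vote`, the three class labels, and the probability values are hypothetical and serve only to illustrate the idea; they do not describe how MMFN fuses its component models.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities from several models (soft voting).

    prob_lists: one probability vector per model, all of the same length.
    weights:    optional per-model weights; equal weights if omitted.
    Returns (predicted_class_index, averaged_probabilities).
    """
    if not prob_lists:
        raise ValueError("at least one model output is required")
    n_classes = len(prob_lists[0])
    if any(len(p) != n_classes for p in prob_lists):
        raise ValueError("all models must output the same number of classes")
    if weights is None:
        weights = [1.0] * len(prob_lists)
    if len(weights) != len(prob_lists):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    # Weighted mean of the probability assigned to each class.
    averaged = [
        sum(w * p[c] for w, p in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


# Hypothetical outputs of three models for one leaf image.
labels = ["canker", "greasy spot", "healthy"]  # illustrative labels only
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
predicted, averaged = soft_vote(outputs)
print(labels[predicted], [round(p, 3) for p in averaged])
```

In this toy case the second model alone would choose a different class, but the averaged probabilities favor the first class, which illustrates how voting can smooth over the errors of individual models.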
5. Conclusions
To address the problems of traditional citrus leaf disease detection, such as the limited number of categories, slow operation speed, and low recognition accuracy, we proposed a method for detecting citrus leaf diseases based on a model fusion strategy incorporating transfer learning. In this paper, we discussed in depth the application and effectiveness of DL in citrus leaf disease classification. By analyzing in detail the performance of DL models such as CNNs on complex image classification problems, we demonstrated the great potential of DL in the field of agricultural disease diagnosis. The primary contribution of this paper is an efficient DL model, MMFN, which can accurately identify whether a leaf is diseased and classify a variety of citrus leaf diseases. In extensive experimental validation, MMFN demonstrated significant advantages in both recognition accuracy and efficiency.
(1) The model can accurately distinguish between diseased and disease-free leaves: the average accuracy on the validation set was 99.72%, and the precision, recall, and F1 score were all 99%.
(2) In the classification of multiple diseases, the average accuracy across disease types on the validation set reached 98.68%. The classification accuracies for citrus canker and greasy spot were above 99%. The overall precision, recall, and F1 scores were also high.
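For reference, accuracy, precision, recall, and F1 score of the kind reported above are conventionally derived from a confusion matrix. The sketch below illustrates the standard definitions only: the function name `per_class_metrics` and the toy two-class matrix are assumptions made for this example and are not the evaluation code or data of this study.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy and per-class precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true other
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    accuracy = correct / total if total else 0.0
    return accuracy, metrics


# Toy matrix (illustrative values): rows are true classes, columns are predictions.
toy = [
    [8, 2],
    [1, 9],
]
accuracy, metrics = per_class_metrics(toy)
print(f"accuracy = {accuracy:.4f}")
for k, m in enumerate(metrics):
    print(k, {name: round(value, 4) for name, value in m.items()})
```

On the toy matrix the overall accuracy is 17/20 = 0.85, and the first class has a precision of 8/9 and a recall of 0.8. When classes are unbalanced, per-class precision and recall are more informative than overall accuracy alone.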
In summary, we faced challenges such as the limited number of disease types in the dataset, similarity in disease symptoms leading to misclassification, and the complexity of the data collection environment. At the same time, we found that DL showed great potential and application value in citrus leaf disease classification. As technology continues to advance and data become more abundant, the field of agricultural science and technology will inevitably be shaped by emerging AI-driven technology trends and systems [
49]. By combining citrus disease classification research with AI and multidisciplinary integration, advances in this field can significantly improve citrus yield and quality, reduce pesticide use, and promote sustainable agricultural development with significant economic benefits. We have reason to expect that DL will play an increasingly crucial role in the future diagnosis of agricultural diseases and provide more efficient and accurate disease identification methods for agricultural production. This will not only help improve the yield and quality of citrus and other crops but also lay a solid foundation for the advancement of smart agriculture.