1. Introduction
Citrus trees in Florida are currently in decline due to multiple factors, including plant diseases, hurricanes, and winter freezes. In 2022–2023, citrus production in Florida totaled 18.1 million boxes, a decrease of 60% from the 2021–2022 season's 45.3 million boxes [1]. Citrus diseases, affecting the health of both leaves and fruits, continue to threaten the productivity and economic stability of citrus growers across the United States. Huanglongbing (HLB), or citrus greening disease, which is lethal to citrus trees, lacks a cure, and is transmitted by the Asian citrus psyllid, poses a greater concern to Florida than to other states [2]. Citrus canker was detected in Florida a few decades ago and has re-emerged several times, leading to the destruction of trees [3]. It stands out as a prevalent threat, caused by a bacterial pathogen that induces necrotic lesions on leaves, stems, and fruit [4]. Infected leaves exhibit irregular shaping, stunting, or puckering. Severe infections can lead to adverse outcomes such as leaf loss, premature fruit shedding, twig deterioration, overall tree debilitation, and blemishes on the fruit. Greasy spot, triggered by Mycosphaerella citri, and citrus melanose, induced by Diaporthe citri, also contribute to the array of challenges faced by citrus trees [5]. Scab, initiated by the fungus Elsinoe fawcettii, poses a notable threat to lemon varieties, Rangpur lime, and rough lemon rootstocks, resulting in irregular outgrowths resembling scabs or warts on fruit, leaves, and twigs, often accompanied by leaf distortion [6]. In Florida, zinc deficiency manifests initially as small yellow blotches between green veins on leaves, progressing with worsening deficiency to overall yellowing except for the green venous areas [7]. Not only do citrus diseases impact production; adverse weather conditions such as hurricanes, winter freezes, light snow, sleet, and freezing rain, as well as labor shortages, also contribute to reduced production. Consequently, safeguarding the profitability and marketability of citrus hinges on effective control and identification of infected fruits. It is crucial to detect disease infections reliably and accurately, as infections can be mitigated by human intervention, unlike weather conditions, which are uncontrollable. However, general detection methods rely on manual inspection, which is prone to human error, or on conventional techniques such as electron microscopy or polymerase chain reaction, which are time-consuming, expensive, and require laboratory skills. As a long-term strategy for disease detection, automated systems capable of accurately and tirelessly classifying citrus diseases at faster speeds are necessary.
Most studies on leaf disease detection focus on visual or color digital imaging, utilizing image classification and recognition techniques. Hyperspectral imaging (HSI) systems have proven to be reliable, accurate, and non-destructive, allowing for in-depth spectral analysis with high spectral resolution. Moreover, fluorescence imaging systems have been widely employed to evaluate the physiological characteristics of plant leaves under environmental stress or disease conditions. When excited at specific wavelengths, chlorophyll molecules emit light over characteristic ranges of wavelengths. HSI's ability to capture these unique spectral signatures of disease across a broad spectrum enables disease detection and classification tasks [8,9]. Previous research explored multispectral detection to assess the effects of citrus diseases on grapefruits, achieving a classification accuracy of 95.7% [10]. That research demonstrated the potential of two-band multispectral imaging, though it acknowledged the risk of missing small canker lesions. Thomas et al. also explored a combined reflectance and transmission hyperspectral imaging system for investigating plant pathogens [11]. Their findings suggested that reflectance-based measurements facilitate early detection, while transmission measurements provide supplementary insights for understanding and quantifying the complex spatio-temporal dynamics of plant–pathogen interactions. Bauriegel and Herppich utilized hyperspectral and chlorophyll fluorescence imaging systems for early detection of Fusarium head blight in wheat [12]. They measured Fusarium infection using a hyperspectral imaging system in a laboratory environment and a chlorophyll fluorescence imaging system in the field. Existing hyperspectral and fluorescence systems require complex setups and operate individually, resulting in longer data collection times and the need to manually align data, which lowers accuracy.
In addition, accurately distinguishing between healthy and infected leaves, or between leaf diseases with similar visible symptoms, remains a challenging task. In recent years, machine learning methods, including support vector machines (SVM), k-nearest neighbors (KNN), artificial neural networks (ANN), and convolutional neural networks (CNN), have expedited and advanced automated systems, yielding promising outcomes in disease detection tasks [13,14]. In agriculture, HSI combined with machine learning has been demonstrated to be a highly accurate method for detecting and classifying diseases on the leaves of various plants, such as citrus [15,16], grapevines [17], mangroves [18], and squash [19]. Xie et al. utilized HSI to detect diseases on tomato leaves and applied extreme learning machine (ELM) models to the full spectrum, as well as a successive projections algorithm (SPA)-ELM to selected wavelengths [20]. The ELM model achieved higher accuracy (100%) than the SPA-ELM model (97.1%) on the testing data. The SPA-ELM method features a simplified model, shorter calculation time, and the potential for developing multispectral-based detection instruments. Wu et al. employed reflectance HSI with ELM, SVM, and KNN for the detection of gray mold on strawberry leaves [21]. All three machine learning classifiers achieved high accuracies of over 90% on the test set. Liu et al. also demonstrated the potential of reflectance HSI with machine learning models for assessing apple mosaic disease-related leaf chlorophyll content on apple leaves [22]. The highest accuracy (98.89%) on the validation data sets was achieved using a random forest model based on several wavelengths, average leaf chlorophyll content, and their combinations. Chen et al. investigated a fluorescence imaging system with machine learning models, specifically a convolutional neural network for transfer learning employing ResNet50, for the identification of leaves infected by cucumber downy mildew [23]. The chlorophyll fluorescence parameters resulting from plant stress due to disease provide valuable information about the physiological characteristics of plants. These parameters were utilized as inputs for deep learning models, with the enhanced ResNet model achieving an accuracy of 94.76% and facilitating early disease detection. Weng et al. performed chlorophyll fluorescence measurements on leaves using three LEDs at 650 nm for citrus HLB [24]. Their least squares SVM achieved an accuracy of 95.0% for Navel oranges and 96.0% for Satsuma.
While numerous studies have investigated solely reflectance- or fluorescence-based hyperspectral measurement, there has been limited research on the potential of simultaneous reflectance and fluorescence imaging in a single system for identifying leaf diseases. Employing simultaneous hyperspectral reflectance and fluorescence imaging could offer advantages in distinguishing diseases (e.g., high classification accuracy). This study explored the potential of a portable hyperspectral reflectance and fluorescence imaging system combined with machine learning to distinguish various diseases on citrus leaves. The front and back sides of leaves infected with six different diseases, as well as a healthy control set, were tested using the hyperspectral imaging system, and the hyperspectral leaf images were analyzed with machine learning classifiers.
2. Materials and Methods
2.1. Portable Hyperspectral Imaging System
A portable hyperspectral imaging system was developed for this study. The system consists of two LED sources: a visible and near-infrared (VNIR) broadband light for reflectance and an ultraviolet-A (UV-A) excitation light for fluorescence (Metaphase Technologies, Bristol, PA, USA) (Figure 1). The wavelengths of the VNIR LEDs are 428, 650, 810, 850, 890, 910, and 940 nm, whereas the wavelength of the UV-A is 365 nm. The LED intensities can be adjusted using two digital dimming controllers, each managing three channels. The first controller handles wavelengths of 365, 428, and 650 nm, while the second controller manages wavelengths of 810, 850, 890, 910, and 940 nm. The line beams are generated by rod focal lenses that focus the light onto the leaf sample on the sample holder, and are overlapped by tilting the lights 6° from the vertical position. The sample holder was designed with dimensions of 254 × 197 × 15 mm³ and printed on a 3D printer (F370, Stratasys, Eden Prairie, MN, USA) with black thermoplastic. A customized reflectance standard panel measuring 254 × 32 × 15 mm³ (Labsphere, North Sutton, NH, USA) was side-mounted on the sample holder for flat-field correction of the reflectance images. A miniature line-scan hyperspectral camera (Nano-Hyperspec VNIR, Headwall Photonics, Bolton, MA, USA), consisting of an imaging spectrograph and a 12-bit CMOS focal plane array detector (1936 × 1216 pixels), was used together with a 5 mm focal-length wide-angle lens (Edmund Optics, Barrington, NJ, USA) and a long-pass (>400 nm) gelatin filter (Kodak, Rochester, NY, USA) to acquire reflectance and fluorescence signals. The long-pass filter was used to eliminate second-order effects from the UV-A excitation. To block ambient light, an aluminum-framed enclosure with dimensions of 56 × 36 × 56 cm³, clad with black aluminum composite boards, enclosed the LED lights, the camera, the sample holder, the reflectance panel, and the moving stage, whereas the power supplies and controllers were mounted outside. The components (i.e., the two lights, the camera, and the stage) were connected to a laptop through a powered four-port USB hub. The entire system was mounted on a compact 45 × 60 cm² optical breadboard, making it suitable for on-site and field experiments.
2.2. System Software and Operation
The system software, developed in LabVIEW (v2022, National Instruments, Austin, TX, USA) on a Windows 11 laptop computer, is shown in Figure 2. To operate the system, software development kits (SDKs) from the hardware manufacturers were used with LabVIEW to implement parameterization and data transfer functions. The SDKs included User Datagram Protocol (UDP) for LED light control, Universal Serial Bus (USB) for camera control, serial communication for stage movement control, and the LabVIEW Vision Development Module (VDM) for image and spectrum display. During a measurement, the VNIR line light was turned on for 10 s to stabilize the LED output. The translation stage then moved the sample holder leftward while the hyperspectral camera collected line-scan reflectance signals from the standard panel and the passing samples. When the sample holder completed its pass, reflectance image acquisition ended and the VNIR light was turned off. Subsequently, the UV-A line light was turned on for 10 s to stabilize. The camera then acquired line-scan fluorescence signals as the stage returned to its starting position, with the UV-A light turned off upon completion, concluding a full imaging cycle.
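The control flow of one imaging cycle can be summarized as the following minimal MATLAB sketch. The hardware helpers here are hypothetical stubs standing in for the UDP/USB/serial SDK calls of the actual LabVIEW software, which is not reproduced in this excerpt.

```matlab
% Minimal sketch of one full imaging cycle (reflectance pass, then
% fluorescence pass on the return trip). setLight and scanWhileMoving are
% hypothetical stubs for the real SDK calls. Save as runImagingCycle.m to run.
function runImagingCycle()
    setLight('VNIR', 'on');  pause(10)       % stabilize LED output (10 s)
    refl = scanWhileMoving('leftward');      % reflectance scan of panel + leaves
    setLight('VNIR', 'off');

    setLight('UVA', 'on');   pause(10)       % stabilize UV-A output (10 s)
    fluo = scanWhileMoving('return');        % fluorescence scan on the return pass
    setLight('UVA', 'off');

    save('cycle.mat', 'refl', 'fluo')        % stand-in for the two BIL files
end

function setLight(name, state)               % stub: would send a UDP command
    fprintf('LED %s -> %s\n', name, state);
end

function cube = scanWhileMoving(direction)   % stub: would drive stage + camera
    fprintf('scanning (%s)\n', direction);
    cube = zeros(10, 10, 348);               % tiny placeholder; real cubes are
end                                          % 810 x 250 x 348 (spatial x lines x bands)
```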
The spatial resolution along the translation direction for a predetermined scan distance depended on the stage moving speed and the total number of scans. For instance, when the total number of scans was 250 lines over a distance of 250 mm, with the stage moving at 3.3 mm/s, the scan took about 76 s, resulting in an approximate spatial resolution of 1 mm/pixel. The stage moving speed was adjusted to synchronize continuous line-scan image acquisition with the translation stage movement, based on the exposure time of the camera. Empirically, a reciprocal relationship between moving speed (V in mm/s) and exposure time (T in s) was determined (V = 0.99/T), leading to speeds of 3.3 mm/s for an exposure time of 0.3 s and 1.65 mm/s for an exposure time of 0.6 s. In addition to the continuous moving mode, the hyperspectral system could operate in an incremental, step-by-step line-scanning mode. The software displayed the reflectance and fluorescence images along with an original spectrum and spatial profile, updating them line by line to show real-time scan progress while the image was being acquired. After completion, the reflectance and fluorescence images were each saved to separate data files in standard band-interleaved-by-line (BIL) format.
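As a compact check of these relationships, the following sketch uses only the quantities quoted above:

```matlab
% Stage speed from exposure time (empirical: V = 0.99/T) and the resulting
% scan time and translation-direction resolution, using the values above.
T        = 0.3;              % camera exposure time (s)
V        = 0.99 / T;         % stage speed (mm/s) -> 3.3 mm/s
dist     = 250;              % scan distance (mm)
nLines   = 250;              % total number of line scans
scanTime = dist / V;         % ~75.8 s, i.e., about 76 s
resY     = dist / nLines;    % 1 mm/pixel along the translation direction
fprintf('%.2f mm/s, %.1f s, %.2f mm/pixel\n', V, scanTime, resY)
```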
2.3. Sample Preparation and Measurement
Mature leaves from Valencia orange trees exhibiting symptoms of canker, HLB, greasy spot, melanose, scab, and zinc deficiency (each with only one condition, excluding any with multiple infections) were sampled at the Citrus Research & Education Center (CREC) of the University of Florida in Lake Alfred, FL. The leaf samples were handpicked, placed in zipper bags in a cooler, and transferred to the laboratory within 4 h. The leaves were refrigerated and imaged within 48 h of collection. The diseases were identified by a plant pathologist based on visual inspection of the symptoms on each leaf. Depending on leaf size, 4 to 16 leaves from the same class were imaged simultaneously, and both the front and back sides were scanned. The total numbers of leaves in the control, canker, HLB, greasy spot, melanose, scab, and zinc deficiency classes were 107, 105, 101, 114, 117, 73, and 125, respectively. The citrus leaf samples were placed on the sample holder, and the sample holder and the reflectance panel were moved by a linear motorized stage (FUYU Technology, Chengdu, China) within the scope of the hyperspectral camera for line-scan image acquisition. The lens-to-sample distance was fixed at 285 mm, where the spatial resolution along the camera scanning-line direction was 0.33 mm/pixel. Each camera frame captured a region of interest (ROI) of 810 × 348 pixels (spatial × spectral), covering a 270 mm long instantaneous field of view and a spectral region from 395 to 1005 nm.
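The acquisition geometry implies the stated resolutions directly; a small sketch of the arithmetic, using only the figures quoted above:

```matlab
% Cross-scan-line spatial resolution and spectral sampling implied by the
% acquisition parameters quoted above.
fovX    = 270;                    % instantaneous field of view along the line (mm)
nXpix   = 810;                    % spatial pixels per frame
resX    = fovX / nXpix;           % ~0.33 mm/pixel across the scan line
nBands  = 348;                    % spectral pixels per frame
dLambda = (1005 - 395) / nBands;  % ~1.75 nm per raw spectral pixel
dBinned = 3 * dLambda;            % ~5.3 nm after 3-pixel binning (Section 2.4)
fprintf('%.2f mm/pixel, %.2f nm raw, %.1f nm binned\n', resX, dLambda, dBinned)
```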
2.4. Hyperspectral Image Processing and Machine Learning
Hyperspectral reflectance and fluorescence images of each leaf, together with the standard panel, were acquired using 810 spatial pixels and 250 line scans over a 250 mm distance, resulting in two 810 × 250 × 348 (X × Y × λ) raw hypercubes. A flowchart summarizing the data processing and machine learning procedures is shown in Figure 3. Initial data smoothing was achieved by averaging groups of three neighboring pixels in both the spectral dimension (λ) and the spatial dimension (X) along the camera's scanning line, producing two 270 × 250 × 116 reduced hypercubes with a spatial resolution of approximately 1 mm/pixel in both the X and Y dimensions and a spectral interval of 5.3 nm. VNIR reflectance values from the standard reference panel (i.e., 100%) were used for flat-field correction to convert the intensity values in the averaged image to relative reflectance values (0–100%). A mask image was generated from the single-band reflectance image at 815 nm, where leaves exhibit high reflectance, using a single threshold value computed with Otsu's method. This mask was then used to remove the background from both the reflectance and fluorescence images acquired from the same leaf samples. The hyperspectral image processing procedures were executed using in-house programs developed in MATLAB (R2024a, MathWorks, Natick, MA, USA). Following this, all leaf pixels in the masked reflectance image at 815 nm were grouped into 5 × 5-pixel windows to reduce spectral variation and the amount of data required for machine learning. For each window, the mean (M) and standard deviation (STD) of the leaf pixel intensities were calculated, and any window in which more than 10% of the 25 pixels fell outside the range of M ± 3STD was excluded from further analysis. The spectra of the remaining pixel windows were averaged in the spatial domain while maintaining full spectral resolution, and these mean spectra were used for machine learning classification. This segmentation method produced reflectance, fluorescence, and combined reflectance-and-fluorescence datasets, which were used to develop algorithms for leaf disease classification. The spectral pixel-based data were labeled by leaf disease type and input into the Classification Learner app in MATLAB (R2024a, MathWorks, Natick, MA, USA) to assess the effectiveness of each spectral measurement in disease detection. Furthermore, the averages of the spectral pixel-based data for individual leaves, termed leaf-based data, were labeled by disease type for comparison with the pixel-based results. To reduce data dimensionality and enhance computational efficiency, principal component analysis (PCA) was applied, retaining the first ten principal components. Both the full spectra and the PCA scores were used for machine learning classification, and their accuracies were compared. Optimizable machine learning classifiers from nine general categories (naive Bayes, efficient linear, trees, kernel, ensemble, KNN, discriminant, neural network, and SVM) were tested to evaluate classification performance. For example, SVMs use hyperplanes to divide the classes with a large margin and kernels to handle complex datasets; an overly large margin with few support vectors can decrease classification performance, resulting in underfitting. To streamline the evaluation of misclassification costs and model training, equal penalties were assigned to all leaf disease misclassifications, and the default hyperparameters of the MATLAB Classification Learner app were used. For each class, a 70/30 split was used to create the training and test datasets. For the training dataset, five-fold cross-validation was employed to minimize overfitting and assess the models' generalization abilities: each spectral dataset was randomly split into five equal-sized folds, with models trained on four folds and validated on the remaining one, and classification performance was evaluated reliably by averaging the errors across the five folds.
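For concreteness, the sketch below reconstructs the main preprocessing steps (flat-field correction, Otsu masking, 5 × 5-pixel window filtering, PCA, and the 70/30 split). It is an illustration under stated assumptions, not the authors' in-house code: the synthetic hypercube, the panel strip location, and the ~815 nm band index are placeholders, and graythresh/imbinarize and pca/cvpartition require the MATLAB Image Processing and Statistics and Machine Learning Toolboxes, respectively.

```matlab
% Illustrative reconstruction of the Section 2.4 pipeline (not the authors'
% in-house code). A synthetic hypercube stands in for a real BIL file.
cube = 0.20 + 0.02*rand(270, 250, 116);                     % binned cube (X x Y x lambda)
cube(60:200, 50:200, :) = 0.75 + 0.02*rand(141, 151, 116);  % synthetic "leaf"
cube(1:20, :, :) = 0.95;                                    % synthetic panel strip

% Flat-field correction: convert intensities to relative reflectance (0-100%).
white = mean(cube(1:20, :, :), [1 2]);    % mean panel spectrum (1 x 1 x 116)
refl  = 100 * cube ./ white;

% Leaf mask from the single band nearest 815 nm using Otsu's threshold.
bandIdx = 80;                             % placeholder index of the ~815 nm band
band    = mat2gray(refl(:, :, bandIdx));
mask    = imbinarize(band, graythresh(band));
mask(1:20, :) = false;                    % exclude the panel strip from the mask

% Group leaf pixels into 5x5 windows; drop windows where >10% of the 25
% pixels fall outside M +/- 3*STD at ~815 nm; average the kept spectra.
spectra = [];
for ix = 1:5:size(refl, 1) - 4
    for iy = 1:5:size(refl, 2) - 4
        wm = mask(ix:ix+4, iy:iy+4);
        if ~all(wm(:)), continue, end     % keep windows fully on the leaf
        w = refl(ix:ix+4, iy:iy+4, :);
        g = w(:, :, bandIdx);             % window intensities at ~815 nm
        M = mean(g(:));  S = std(g(:));
        if mean(g(:) < M - 3*S | g(:) > M + 3*S) > 0.10, continue, end
        spectra(end+1, :) = squeeze(mean(w, [1 2]))'; %#ok<AGROW>
    end
end

% PCA (first ten components) and a 70/30 train/test split.
[~, score] = pca(spectra);
X  = score(:, 1:10);                      % first ten principal-component scores
cv = cvpartition(size(X, 1), 'HoldOut', 0.3);
Xtrain = X(training(cv), :);  Xtest = X(test(cv), :);
% Xtrain, with disease labels, would then be evaluated with five-fold
% cross-validation in the MATLAB Classification Learner app.
```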
2.5. Performance Assessment
The confusion matrix plot was utilized to evaluate the performance of the selected classifier for each class. Accuracy, precision, recall, and F1 score for each class served as indicators of how well the model distinguished between classes. Precision indicates the accuracy of positive predictions, showing how many of the predicted positive cases were actually positive. Recall measures the ability of the classifier to correctly identify positive cases, showing how many of the actual positives were identified. The F1 score is the harmonic mean of precision and recall, providing a single metric that balances both measures, which is particularly useful under uneven class distributions; it also provides a class-wise performance metric rather than the overall performance across all classes. Accuracy, precision, recall, and F1 score can be expressed in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN).
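The metric equations themselves do not appear in this excerpt; the standard per-class (one-vs-rest) definitions, consistent with the description above, are:

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN}\\
\text{Precision} &= \frac{TP}{TP + FP}\\
\text{Recall}    &= \frac{TP}{TP + FN}\\
\text{F1 score}  &= 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```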
4. Discussion
In this research, the portable hyperspectral imaging system provided reflectance and fluorescence spectral data for the leaf samples to analyze citrus disease infections. Absorption in the reflectance spectra occurs at 675 nm, where chlorophyll A absorbs most strongly, and emission peaks arise in the fluorescence spectra at 691 nm and 744 nm for the front side and 738 nm for the back side [25]. These findings align with those reported by Chappelle et al. (1992), who related reflectance features to chlorophyll content [26]. The variation between the fluorescence emission peaks at 744 nm and 738 nm might be due to different chlorophyll distributions caused by the different amounts of sunlight reaching the front and back sides of the leaves. Furthermore, the 686 nm emission is related to photosystem II, and the 700–750 nm range is correlated with both photosystems II and I (PSII and PSI), which are integral pigment–protein complexes crucial in the primary stages of photosynthesis [27]. The infected leaves, except those afflicted with scab, showed reduced intensities around 750 nm due to the disturbance of PSI. Between 400 and 650 nm, the infected leaves showed a slightly higher peak around 550 nm compared to the control due to the onset of disease.
The machine learning classifiers were applied to analyze the hyperspectral data and classify the infected citrus leaves. The PCA analysis showed different patterns compared to the full-spectra analysis. The fluorescence data for the front and back sides of the leaves demonstrated higher classification accuracy than the combined reflectance and fluorescence data for most of the classifiers, except naïve Bayes, efficient linear, and discriminant analysis. The classification accuracies using the reflectance data were lower than those using the combined reflectance and fluorescence data. Overall, the gap between the lowest and highest accuracies in the PCA analysis was not as large as in the full spectra. The PCA analysis also exhibited a similar pattern, with a small gap between the classifiers' accuracies among the pixel-based results. Leaf-based classification provided higher accuracies and may be beneficial for faster computation and improved generalization, despite the potential bias and loss of detail.
The combined reflectance and fluorescence data showed superiority over the individual reflectance and fluorescence data in both pixel-based and leaf-based analyses. The full spectra for both pixel-based and leaf-based analyses provided higher accuracies than PCA. This suggests that more bands provide more information about leaf conditions that is not visible in the reflectance or fluorescence images alone. Furthermore, the best accuracies of 90.7% and 94.5% for pixel-based and leaf-based analyses on the back sides of the leaves were slightly higher than those of 88.0% and 90.9%, respectively, on the front sides. Depending on the type of disease, the impact of the infection on either the front or back side of the leaves varies. Specifically, the back sides of leaves retain more humidity and moisture; they are therefore more vulnerable to bacteria, viruses, or pathogens that favor those conditions, and are possibly exposed to more severe conditions than the front side. Most of the diseases, including greasy spot, are more prevalent in humid and wet conditions. The results imply that the back side of a leaf encounters higher disease severity than the front side, resulting in higher classification accuracy.
The classification of healthy and infected plant leaves using machine learning and deep learning methods has gained attention, reflecting the growth of AI in this field. Our findings showed that the machine learning models provided accuracies between 90% and 95% for the classification of citrus leaf diseases. Previous work by Singh et al. used deep learning-based models to classify bean rust disease and angular leaf spot disease, showing 91.74% accuracy [28]. Paul et al. utilized convolutional neural network (CNN) models and the transfer learning-based models VGG-16 and VGG-19 to classify tomato leaf diseases, reporting a highest accuracy of 95% [29]. Nikith et al. compared CNN models with KNN and SVM models for leaf diseases such as bacterial blight, brown spot, and powdery mildew [30]. The CNN model showed an accuracy of 96%, whereas the KNN and SVM models achieved 64% and 76%, respectively. Overall, the CNN method often provides higher accuracy than machine learning models such as SVM. However, both machine learning and deep learning methods are powerful tools for leaf disease classification. Deep learning methods are suitable for complicated high-dimensional data, whereas machine learning methods with PCA are straightforward and offer greater interpretability. Therefore, machine learning models remain comparatively simple yet effective for leaf disease classification.