1. Introduction
Green plums are widely distributed across hills and sloping forests throughout China, and they are also grown in Vietnam, Thailand, the Philippines, Indonesia, and other neighboring countries. The fruit is sweet in taste and flat in nature, with a thin skin, thick and juicy flesh, a small stone, a crisp texture, high acidity, and a high fruit acid and vitamin C content, giving it high nutritional value [1]. Green plums contain a variety of natural acids, such as citric acid, that are indispensable for human metabolism [2]. As a rare alkaline fruit, the green plum also contains threonine and other amino acids as well as flavonoids, which support normal protein synthesis and metabolic function in the body and have clear preventive and curative effects on common cardiovascular, urinary, and digestive diseases. Green plums are therefore a popular dual-purpose medicinal and food resource with multiple health care functions [3].
However, green plums are susceptible to collisions, pests, and diseases during growth, picking, storage, and transportation. Surface defects such as rot, scars, cracks, and rain spots degrade the quality of the fruit and its products and reduce their economic value. Classifying the four main defect types (rot, scars, cracks, and rain spots) helps direct fruit to suitable products, such as preserves, extracts, and wine [4], improves processing efficiency, and reduces waste. In China, green plums are mostly sorted manually, which is inefficient, inaccurate, and costly. Therefore, to increase the economic added value of green plums and their products, it is of great significance to sort green plums according to multiple defect types and realize an automatic detection and classification system.
As living standards improve, consumers pay more attention to fruit quality, and nondestructive detection and processing techniques for fruit inspection have developed rapidly [5,6]. Many studies have focused on measuring fruit quality and defects using multispectral or hyperspectral imaging [7,8]. Kleynen et al. [9] developed a visible/near-infrared multispectral vision system using pattern matching algorithms and defect segmentation based on Bayes' theorem; relative to an actual intact rate of 90%, fewer than 2% of defective apples were classified as intact. Blasco et al. [10] used multispectral data and morphological characteristics to identify and classify 11 kinds of surface defects on citrus fruit (chilling injury, Penicillium digitatum, scales, medfly, anthracnose, stem-end injury, scarring, thrips scarring, sooty mold, phytotoxicity, and oleocellosis); more than 2000 citrus fruits were examined, and the correct classification rate reached 86%. Huang et al. [11] used a new non-contact multi-channel spectroscopy system for nondestructive testing of internal defects in apples and established a classification model based on partial least squares discriminant analysis (PLSDA); the overall accuracy for the three detection orientations (stem end toward the light source, calyx end toward the light source, and stem-calyx axis perpendicular to the light source) reached 91.5%, 89.2%, and 93.1%, respectively. Based on visible/near-infrared (Vis-NIR) hyperspectral imaging, Zhang et al. [12] developed a multispectral image classification algorithm; using Nanfeng citrus fruit as the test material, four types of defects (anthracnose, scarring, decay, and thrips scarring) were detected and classified with an accuracy of 96.63%. However, the high price and specific working bands of hyperspectral equipment limit its use on fruit sorting production lines.
In recent years, with the development of machine vision, researchers in China and abroad have widely applied nondestructive machine vision testing to the identification and classification of agricultural products [13]. Using texture uniformity measurement, Hassan and Nashat [14] achieved a 100% recognition rate for healthy olives, 99% for olives with small defects, and 98% for olives with large defects. Capizzi et al. [15] used a radial basis probabilistic neural network (RBPNN) to classify citrus surface defects by color and texture; by calculating the gray level co-occurrence matrix, texture and gray level features of the defect area were extracted, and five categories, including morphological defects, slight color defects, black mold, and good fruits, were classified with a total error rate of 2.75%. Based on computer vision, Yogesh et al. [16] extracted fruit hardness, size, contour, and texture features and used a support vector machine (SVM) classifier to classify defects of apples, pears, and pomegranates with accuracy rates of 98.5%, 96.6%, and 92.5%, respectively. Sujatha et al. [17] used histogram of oriented gradients (HOG) features and a bagged decision tree (BDT) classifier to sort apples into healthy and defective categories with an accuracy of 96%. Bhargava and Barisal [18] compared the classification quality of four classifiers, k-nearest neighbor (k-NN), support vector machine (SVM), sparse representation classifier (SRC), and artificial neural network (ANN), on four fruits (apples, bananas, oranges, and avocados); system performance was verified by k-fold cross-validation, and with k = 10 the maximum detection accuracies were 80.00% (k-NN), 85.51% (SRC), 91.03% (ANN), and 98.48% (SVM).
Traditional machine learning methods require manual feature extraction, which can easily leave relevant features incomplete and thus reduce recognition accuracy. In recent years, with the development of computer science, the combination of machine vision and deep learning has gradually been applied to fruit quality sorting. Villacres and Auat Cheein [19] developed, tested, and evaluated a deep learning method using a portable artificial vision system to improve cherry harvesting, with an accuracy of 85%. Wang and Chen [20] applied an improved eight-layer convolutional neural network (CNN) to the classification of 18 kinds of fruits, including Anjou pears, blackberries, black grapes, blueberries, Bosc pears, cantaloupes, and watermelons, with an accuracy of 95.67%. Wan et al. [21] applied an improved Faster R-CNN to the recognition of apples, mangos, and oranges, with an average recognition rate of about 91%. da Costa et al. [22] applied ResNet50 to the recognition of external defects on tomatoes, with a recognition rate of 94.6%. Zhou et al. [23] analyzed the potential of deep learning as an advanced data mining tool in food sensory and consumption research; their survey shows that deep learning outperforms manual feature extraction and traditional machine learning algorithms and is a promising tool for food quality and safety inspection [24].
A CNN automatically extracts features from the image dataset through its convolutional layers, compresses the feature maps in the pooling layers to reduce computational complexity while retaining the main features, and finally passes the pooled features through the fully connected layers to a classifier. This overcomes the limitations of manual feature extraction. In this study, an improved VGG convolutional neural network was applied within a machine vision system to detect and classify multiple types of surface defects on green plums, improving the accuracy and speed of green plum defect detection and providing technological support for the automation of green plum processing.
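As a minimal illustration of this convolution-pooling-classification pipeline (not the exact improved VGG architecture used in this study), a PyTorch sketch is shown below; the layer counts, channel widths, and the five output classes are assumptions made for illustration only.

```python
# Minimal sketch of the conv -> pool -> fully connected pipeline described
# above. Layer counts, channel widths, and the 5 output classes (rot, scar,
# crack, rain spot, normal) are illustrative assumptions, not the exact
# improved-VGG network used in this study.
import torch
import torch.nn as nn

class PlumClassifierSketch(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        # Convolutional layers extract features automatically from the image.
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # pooling compresses the feature map
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # The pooled features are flattened and sent to the classifier.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```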
The contributions of this paper are (a) multi-defect classification for green plums; (b) the use of a convolutional neural network for the classification and recognition of multiple green plum defects; and (c) the application of a stochastic weight averaging (SWA) optimizer and w-softmax loss function in the deep-learning-based green plum defect detection network, enabling the network to accurately learn green plum features, avoid premature convergence, and effectively improve classification performance.
3. Results
All code for the green plum defect detection network was written in Python, using the PyTorch deep learning framework to define the network computation graph, and a transfer learning strategy was used to accelerate training. The software, hardware, and compilation environment configuration of this experiment is shown in Table 2.
The training parameters of the green plum defect detection network were set as follows: the batch size was 32, the number of SWA-averaged epochs was 10, the initial learning rate was 0.1, the weight decay was 1e-4, and the SWA learning rate was 0.05. The green plum training set was imported to train the network until the minimum loss was obtained; the loss value stabilized after 20 epochs, at which point training of the green plum defect detection network model was complete.
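For reference, a minimal PyTorch sketch of this training configuration is given below, using the standard torch.optim.swa_utils module. The split of 20 total epochs with the final 10 averaged by SWA, as well as the model, data loader, and loss placeholders, are assumptions based on the parameters reported above, not the authors' exact training script.

```python
# Sketch of the reported training configuration: batch size 32, initial
# learning rate 0.1, weight decay 1e-4, SWA learning rate 0.05, 10 averaged
# epochs. `model`, `train_loader`, and `criterion` are placeholders for the
# green plum defect detection network, its data, and the w-softmax loss.
import torch
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

def train_with_swa(model, train_loader, criterion,
                   epochs=20, swa_start=10, device="cuda"):
    model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
    swa_model = AveragedModel(model)            # keeps the running weight average
    swa_scheduler = SWALR(optimizer, swa_lr=0.05)

    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        if epoch >= swa_start:                  # average weights over the last 10 epochs
            swa_model.update_parameters(model)
            swa_scheduler.step()

    update_bn(train_loader, swa_model, device=device)  # recompute BN statistics
    return swa_model
```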
The test set of green plum data was imported into the trained green plum defect detection network model, and the test results were obtained. From Table 3, it can be seen that the average precision of the green plum defect detection network for the recognition and classification of green plum surface defects reached 93.8%, and the test time for a single green plum was 84.69 ms. The recognition rates ranked rot > normal > rain spot > scar > crack. The rot features were the most distinctive, with the highest recognition rate of 99.25%, followed by non-defective (normal) green plums at 95.65% and rain spots at 93%, while scars (84.29%) and cracks (78.13%) were recognized less well.
The confusion matrix in Figure 5 obtained by the green plum defect detection network shows that, of the 280 scar defect images, 236 were correctly identified, 30 were mistaken for rot, 9 for cracks, and 5 for rain spots; of the rot defect images, 794 were correctly identified and 6 were misidentified as scars; of the 460 normal green plum images, 440 were recognized correctly, 4 were misidentified as scars, and 16 as rain spots; of the 160 crack defect images, 125 were correctly identified, 8 were mistaken for scars, 17 for rot, and 10 for rain spots; and of the 800 rain spot defect images, 744 were correctly identified, 19 were mistaken for scars, 3 for rot, and 33 for normal green plums.
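As an illustration of how such per-class counts can be tallied from the trained model's test predictions, a minimal sketch is shown below; the model and data loader names are placeholders, and the class ordering is an assumption.

```python
# Sketch of building a confusion matrix such as Figure 5 from the trained
# model's predictions on the test set. Class order is assumed; rows are true
# classes, columns are predicted classes.
import torch
from sklearn.metrics import confusion_matrix

CLASSES = ["scar", "rot", "normal", "crack", "rain spot"]

@torch.no_grad()
def build_confusion_matrix(model, test_loader, device="cuda"):
    model.eval().to(device)
    y_true, y_pred = [], []
    for images, labels in test_loader:
        logits = model(images.to(device))
        y_pred.extend(logits.argmax(dim=1).cpu().tolist())
        y_true.extend(labels.tolist())
    return confusion_matrix(y_true, y_pred, labels=list(range(len(CLASSES))))
```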
As the confusion matrix in Figure 5 shows, the proportions of rain spots misjudged as scar defects and as normal green plums were relatively high, at 2.4% and 4.1%, respectively. Screenshots of the identification results are shown in Figure 6, where red boxes mark misrecognized images. The rain spot defect in frame no. 1 of Figure 6a was mistakenly identified as a scar, mainly because the image contains a fruit stem that the network mistook for a scar; scar features are more prominent than rain spot features, which led to the misjudgment. The rain spot defect in frame no. 2 was identified as a normal green plum: the rain spots in the image are off-center, few in number, and shallow, so they were not recognized.
The confusion matrix in Figure 5 also shows that scars and rot are easily confused: scar defects misjudged as rot accounted for 10.7%. The red boxes in Figure 6b show this confusion. Both scars and rot are irregular in shape and similar in color, although scars are usually darker. In Figure 6b, the rot in box 1 was misjudged as a scar because its color was darker than typical rot; the scar in box 2 was misjudged as rot, possibly due to interference from the color of the green plum itself; and the rot in box 3 was misjudged as a scar because the image showed scars and rot coexisting, with the outer ring of the defect scarred and the inner region rotten. The crack in box 4 was misjudged as a scar because the crack in the image was shallow and did not show the typical characteristics of cracks.
4. Discussion
The VGG network has good classification performance, and the green plum defect detection network was built by improving upon it. For comparison, the green plum data were imported into the source VGG network for testing, with the momentum set to 0.9, the learning rate set to 1e-4, and the batch size set to 32. The VGG network was then compared with the green plum defect detection network in terms of loss curve and test results.
The green plum defect detection network used the SWA optimizer, while the VGG network used the SGD optimizer. As shown in Figure 7, the VGG network required 140 training epochs to reach the minimum loss, whereas the loss curve of the green plum defect detection network converged to the minimum loss and reached a fitted state after 100 epochs. During training, the loss curve of the VGG network converged slowly and required more training time, while the green plum defect detection network converged faster and reached a stable state. This shows that the SWA optimizer achieves good convergence quickly and stably. The green plum defect detection network therefore performed better than the VGG network in the identification and classification of green plum defects: its loss value decreased faster during training, and it reached the convergence requirement sooner.
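For clarity, the two optimizer configurations compared here are sketched below; only the hyperparameters reported in the text are used, and the model arguments are placeholders.

```python
# The two optimizer settings compared above, as reported in the text: plain
# SGD for the source VGG baseline (momentum 0.9, learning rate 1e-4) versus
# SGD wrapped with stochastic weight averaging for the proposed network.
import torch
from torch.optim.swa_utils import AveragedModel, SWALR

def make_baseline_optimizer(vgg_model):
    # Source VGG network: standard SGD, converged after about 140 epochs.
    return torch.optim.SGD(vgg_model.parameters(), lr=1e-4, momentum=0.9)

def make_swa_optimizer(plum_model):
    # Green plum defect detection network: SGD base optimizer plus SWA,
    # converged after about 100 epochs with a more stable loss curve.
    base = torch.optim.SGD(plum_model.parameters(), lr=0.1, weight_decay=1e-4)
    return base, AveragedModel(plum_model), SWALR(base, swa_lr=0.05)
```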
The test results in Table 4 show that the green plum defect detection network can identify the main features of each defect and distinguish the defects accordingly. Not every CNN achieves a high recognition rate after training; incorrectly identifying the main features produces misjudgments, and the network structure and parameters determine how well the main features are recognized. The green plum defect detection network was optimized on the basis of the VGG network. As shown in Table 4, its average precision for the identification and classification of green plum surface defects reached 93.8%, while the VGG network reached only 84%. The green plum defect detection network improved the recognition accuracy for rot, rain spot, scar, and crack defects as well as for normal green plums; the recognition of crack defects increased from 55.63% with the VGG network to 78.13%, the largest improvement. The network achieved 99.25% accuracy on rot defects and a 95.65% recognition rate for normal green plums. These results show that the w-softmax loss function helps the CNN learn more discriminative features, enlarging the gaps between categories and reducing the gaps within categories, thereby improving classification accuracy. The test time per image was 84.69 ms for the green plum defect detection network and 86.56 ms for the VGG network; because the fully connected layer of the green plum defect detection network has 1024 neurons, the reduced number of fully connected parameters shortens the test time and increases the test speed.
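The exact form of the w-softmax loss is defined in the methods section and is not reproduced here; as a hedged illustration only, the sketch below shows one common margin-based modification of softmax cross-entropy that produces the effect described above (larger inter-class gaps, smaller intra-class gaps). The margin and scale values are arbitrary assumptions, not the paper's settings.

```python
# Illustrative margin-based softmax loss (not the paper's exact w-softmax):
# a margin is subtracted from the logit of the true class before softmax,
# forcing the network to separate classes by at least that margin.
import torch
import torch.nn.functional as F

def margin_softmax_loss(logits: torch.Tensor, targets: torch.Tensor,
                        margin: float = 0.35, scale: float = 30.0) -> torch.Tensor:
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    adjusted = scale * (logits - margin * one_hot)
    return F.cross_entropy(adjusted, targets)
```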
Jahanbakhshi used a self-designed CNN [36] (referred to as the lime network in this article) to classify limes into two categories, normal and defective, with an accuracy of 100%. Since the appearance of green plums is similar to that of limes, the 18-layer CNN designed by Jahanbakhshi was reproduced in Python for this study, and green plum images were imported into it for training to identify multiple defect types. As shown in Table 4, the average recognition rate of the lime network for green plum defects was 77.2%, although its recognition rate for rot defects reached 99%, showing that the lime network performs well on single, salient defects. The green plum defect detection network is deeper than the lime network and can recognize more features; it therefore achieves higher recognition rates for the more complex scar and crack defects and a higher average recognition rate.
ResNet won first place in the classification task of the ImageNet competition [37]. It is a relatively recent and efficient architecture that alleviates the gradient explosion and vanishing gradient problems caused by deepening a network. For comparison, this article chose the ResNet-18 network, which has a similar number of layers to the green plum defect detection network. As shown in Table 4, the average recognition rate of ResNet-18 for green plum defects was 90.18%, with a 94.1% recognition rate for rot defects. The better choice of the number of neurons in the classification layer gives the green plum defect detection network an advantage in green plum defect detection.
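A minimal sketch of such a ResNet-18 baseline, built from the stock torchvision model with its final layer replaced for the five green plum classes, is shown below; the use of ImageNet-pretrained weights is an assumption based on the transfer learning strategy mentioned earlier.

```python
# Sketch of the ResNet-18 comparison baseline: the stock torchvision model
# with its final fully connected layer replaced for the five green plum
# classes. Whether ImageNet-pretrained weights were used is an assumption.
import torch.nn as nn
from torchvision.models import resnet18

def build_resnet18_baseline(num_classes: int = 5) -> nn.Module:
    model = resnet18(weights="IMAGENET1K_V1")   # transfer learning from ImageNet
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```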
To evaluate the network models' classification of green plum defects, the precision and recall criteria were used. Precision is the ratio of samples correctly predicted as positive (TP) to all samples predicted as positive (TP + FP); taking the rot defect as an example, it is the proportion of green plums correctly identified as rot among all plums identified as rot. Recall is the ratio of samples correctly predicted as positive (TP) to all actual positive samples (TP + FN), i.e., the proportion of actual rot defects that were correctly identified. Since the number of samples differs between defect types, the F1-Measure was used to balance precision and recall; the larger the F1-Measure, the better the model's classification. These three parameters, precision, recall, and F1-Measure, were used to judge the classification of the network models, as shown in Table 5.
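As an illustration, the sketch below computes these three metrics per class with scikit-learn from the test-set labels and predictions gathered earlier; the class ordering is an assumption.

```python
# Per-class precision, recall, and F1-Measure as reported in Table 5:
# precision = TP / (TP + FP), recall = TP / (TP + FN),
# F1 = 2 * precision * recall / (precision + recall).
from sklearn.metrics import classification_report

CLASSES = ["scar", "rot", "normal", "crack", "rain spot"]

def report_metrics(y_true, y_pred):
    return classification_report(
        y_true, y_pred,
        labels=list(range(len(CLASSES))),
        target_names=CLASSES,
        digits=2,
    )
```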
The index evaluation in Table 5 shows that the green plum defect detection network performed best, followed by the ResNet-18 network and the VGG network, with the lime network performing worst. The ResNet-18 network is slightly better than the VGG network in the detection of scars, cracks, and rot. The lime network has a clear advantage in rot detection but still struggles to identify multiple defect types. The precision, recall, and F1-Measure values of the green plum defect detection network are all higher than those of the other networks. Its F1-Measure for rot defect identification is 0.97, the best result, followed by normal green plums and rain spot defects, both with an F1-Measure of 0.94.
The green plum defect detection network uses small 3 × 3 convolution kernels, which capture richer features and increase the feature recognition rate. Batch normalization (BN) normalizes the layer inputs, reducing excessive parameter shifts caused by differing data distributions. The w-softmax loss function helps the network enlarge the gaps between categories, reduce the gaps within categories, and improve feature discrimination, thereby improving classification accuracy. The SWA optimizer converges quickly and stably to a good solution. The results show that the green plum defect detection network learns more green plum defect features and classifies defects better than the other models.
5. Conclusions
In this study, addressing the difficulty of correctly identifying and classifying the various surface defects of green plums, we introduced a convolutional network to realize multi-defect classification of green plums and applied the SWA optimizer and w-softmax loss function to the green plum defect detection network. The network achieved an average recognition rate of 93.8% for green plum defects, among which the recognition rate of rot defects reached 99.25% and that of normal green plums reached 95.65%, and the detection time for a single green plum image was 84.69 ms. Comparing the loss values of the green plum defect detection network and the VGG network during training showed that the loss of the green plum defect detection network decreased faster, reached a lower value, and converged sooner. Comparing recognition rates with the source VGG network and the lime network showed that the green plum defect detection network greatly improved the recognition of each defect. Finally, the performance evaluation of the compared models verified the superiority of the green plum defect detection network over the other network methods.
However, this study did not identify green plum stems, which led to the stems of normal green plums being mistaken for defects. In future work, identification of green plum stems will be added to further improve the classification accuracy of green plum defects. In addition, the recognition rates for scars and cracks were low, possibly due to the small sample sizes of the scar and crack data; in the next experiment, we will increase the sample sizes to further optimize the green plum defect detection model.
In this study, a low-cost vision module composed of a camera and LEDs, combined with a deep learning CNN, was used to detect and classify green plums from a single static view, so the entire surface of the fruit could not be inspected. In future work, a comprehensive image acquisition device for green plums can be designed to realize online detection of green plum defects and provide technical support for automatic identification and sorting in automated green plum production.