Defect Detection in Food Using Multispectral and High-Definition Imaging Combined with a Newly Developed Deep Learning Model
Abstract
1. Introduction
- (1) We achieved cortical fiber detection based on segmentation of the physical features of food using deep learning. Most previous studies have focused on food classification, calorie estimation, and quality detection; few have addressed segmentation of the physical characteristics of food, and for complex, indistinct defects in particular, traditional methods have proven largely ineffective. To contribute to this field, this paper takes pickled mustard tuber as an example and realizes semantic segmentation of the physical features of food through deep learning.
- (2) An improved fusion method based on guided filtering was used to fuse MS and HD images. A Sigmoid function was introduced to normalize the weights used to generate the fused images. The method integrates features from multiple source images, making less conspicuous defects appear more distinct and thereby aiding defect identification. The detailed structure of the proposed method is discussed in Section 2.2.
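The exact fusion rule is given in Section 2.2; the following is only a minimal sketch of the general pattern, using the guided filter (cf. [31,32]) to smooth a Sigmoid-normalized weight map before blending. The Laplacian-based saliency measure and the `gain` parameter are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def box_filter(img, r):
    """Mean filter over a (2r+1)x(2r+1) window, edge-padded, via an integral image."""
    k = 2 * r + 1
    p = np.pad(img, r, mode="edge")
    c = np.pad(p.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = img.shape
    return (c[k:k + h, k:k + w] - c[:h, k:k + w]
            - c[k:k + h, :w] + c[:h, :w]) / (k * k)

def guided_filter(I, p, r=4, eps=1e-3):
    """Edge-preserving smoothing of p, guided by image I (cf. [32])."""
    mI, mp = box_filter(I, r), box_filter(p, r)
    a = (box_filter(I * p, r) - mI * mp) / (box_filter(I * I, r) - mI * mI + eps)
    b = mp - a * mI
    return box_filter(a, r) * I + box_filter(b, r)

def fuse(ms, hd, r=4, eps=1e-3, gain=8.0):
    """Weighted fusion of an MS band and an HD image (float arrays in [0, 1])."""
    def saliency(x):  # absolute Laplacian response as a local-detail proxy (assumption)
        return np.abs(np.roll(x, 1, 0) + np.roll(x, -1, 0)
                      + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)
    # The Sigmoid squashes the saliency difference into a [0, 1] weight map
    w = 1.0 / (1.0 + np.exp(-gain * (saliency(ms) - saliency(hd))))
    # Guided filtering aligns weight-map edges with image structure before blending
    w = np.clip(guided_filter(ms, w, r, eps), 0.0, 1.0)
    return w * ms + (1.0 - w) * hd
```

Pixels where the MS band carries more local detail lean toward the MS value, and vice versa, which is how fusion can make faint defects more distinct than either source alone.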
- (3) A novel image segmentation model named UNet4+, built on the semantic segmentation models UNet++ and UNet3+, is proposed for the extraction of cortical fibers. The model employs multiscale semantic connections and dense convolutional layers, enabling the extraction of fine-grained, intricate, deep-level features of the target object, which yields superior performance on complex objects compared to conventional models. This approach therefore facilitates more effective detection of objects similar to cortical fibers in pickled mustard tubers. The detailed structure of the proposed model is discussed in Section 2.3.
- (4) We compared the performance of the proposed model on MS, HD, and Fusion images. Detailed results and discussion can be found in Section 3.2.
- (5) We compared the recognition results of our model with those of related segmentation models (UNet++ and UNet3+) on the dataset used in this paper. Detailed results and discussion can be found in Section 3.5.
2. Materials and Methods
2.1. Image Acquisition
2.2. Image Fusion
2.3. Defect Detection Based on UNet4+
- Encoder: Visual Geometry Group Network-16 (VGG-16) serves as the backbone of the entire network, namely, Xj,0 (j ∈ [0, 4]). Layers X0 and X1 each consist of two convolutional layers, while the remaining layers consist of three. The convolved feature maps from layers X1 to X4 are upsampled and passed to the decoding layers.
- Decoder: The network contains several decoding blocks, which gather extensive information from different scales. As shown in Figure 4, each block fuses adjacent feature maps with upsampled data from the block to its lower left. Every pair of encoding blocks plus a decoding block can be regarded as a small UNet [36]. In addition, when more than two decoding blocks are involved, skip connections link coarse-grained and fine-grained information, helping the model learn more useful features.
- Skip connection: To capture more effective information, we drew inspiration from UNet3+ and added multiscale skip connections to the network. These connections make training more efficient by associating high-level information with low-level semantic information (such as color, borders, and texture) throughout encoding and decoding. Figure 5 shows the multiscale skip connection process and how the feature maps of X0,3 and X1,2 are constructed. As in UNet, decoder layer X0,3 directly receives the feature map from the same-scale decoder layer X0,2 and the upsampled result from the higher-scale layer X1,2, which deliver low-level information and high-level semantic information, respectively. Moreover, a series of multiscale skip connections passes higher-level semantic information from encoder layer X3,1 and decoder layer X2,1 via bilinear interpolation, with the scale factor chosen according to the required expansion. A 3 × 3 convolution then updates the number of channels and reduces the amount of unnecessary information.
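The upsample-concatenate-convolve pattern described above can be sketched as follows. This is an illustrative numpy sketch, not the trained network: the function names, feature shapes, and random kernels are assumptions, and only the data flow (bilinear upsampling of deeper maps, concatenation with the same-scale map, then a channel-updating 3 × 3 convolution) mirrors the text.

```python
import numpy as np

def bilinear_upsample(x, scale):
    """Bilinear interpolation of an (H, W, C) feature map by an integer scale factor."""
    h, w, _ = x.shape
    ys = (np.arange(h * scale) + 0.5) / scale - 0.5
    xs = (np.arange(w * scale) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0.0, 1.0)[:, None, None]
    wx = np.clip(xs - x0, 0.0, 1.0)[None, :, None]
    top = x[y0][:, x0] * (1 - wx) + x[y0][:, x1] * wx
    bot = x[y1][:, x0] * (1 - wx) + x[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def conv3x3(x, kernels):
    """'Same' 3x3 convolution; kernels has shape (3, 3, C_in, C_out)."""
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernels.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            out += p[dy:dy + h, dx:dx + w] @ kernels[dy, dx]
    return out

def skip_fuse(same_scale, deeper_feats, scales, kernels):
    """Build a decoder input, e.g. X0,3 from same-scale X0,2 plus upsampled deeper maps."""
    maps = [same_scale] + [bilinear_upsample(f, s) for f, s in zip(deeper_feats, scales)]
    # The 3x3 convolution updates the channel count and filters redundant information
    return conv3x3(np.concatenate(maps, axis=-1), kernels)
```

Each deeper map is expanded by its own scale factor so that every input arrives at the decoder layer's resolution before the concatenation.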
- Deep supervision [37,38]: Similar to UNet++, this model uses deep supervision, which concurrently minimizes detection error and improves the directness and transparency of the hidden-layer learning process; each supervision head consists of a 1 × 1 convolution. The final result is produced by deep supervision that combines the information from decoder layers X0,1, X0,2, X0,3, and X0,4.
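At inference time, the combination step above reduces to applying a 1 × 1 convolution head to each full-resolution decoder map and merging the results. The sketch below assumes a simple sum followed by a sigmoid; the paper's exact head weights and training losses [37,38] are not shown.

```python
import numpy as np

def deep_supervised_output(decoder_maps, head_weights):
    """Combine decoder outputs X0,1..X0,4 under deep supervision (assumed sum rule).

    decoder_maps: list of (H, W, C) feature maps; head_weights: list of (C, 1) arrays.
    A 1x1 convolution is just a per-pixel matrix multiply over the channel axis.
    """
    logits = sum(x @ w for x, w in zip(decoder_maps, head_weights))
    return 1.0 / (1.0 + np.exp(-logits))  # per-pixel defect probability
```

During training, the same heads also let a loss be attached to each decoder layer individually, which is what makes the hidden-layer learning process more direct.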
- Differences from other models: In comparison to UNet3+, the proposed model simplifies lateral connections and introduces vertical multiscale connections, a design choice that enables the model to capture more complex fine-grained features across scales. Unlike UNet3+, we employed a multi-layer dense convolutional network, allowing the model to extract features at various scales. The features obtained at different scales are then fused through the dense convolutional network, enhancing the model's ability to achieve superior results on complex targets.
2.4. Design of Experiments
2.5. Assessment System
3. Results and Discussion
3.1. Spectral Attributes of Cortical Fiber and Meat Tissues
3.2. Performance of Image Fusion
3.3. Performance of UNet4+
3.4. Comparison of Three Types of Images Based on UNet4+
3.5. Comparison of the UNet4+ Model with UNet++ and UNet3+
3.6. Discussion
- Innovation: Our model combined multiscale connections with dense convolutional networks. Multiscale connections can connect features of different fine-grained sizes across layers. The dense convolutional network further amalgamates the characteristic information from multiscale connections. This elevates the model’s complexity, enabling the acquisition of a broader array of features. Consequently, compared to traditional models, our model exhibited improved detection capabilities for small, irregular objects.
- Quantitative Comparison: Our model achieved higher accuracy than the other models in quantitative comparisons, with only a modest increase in parameter count. It also showed strong resistance to interference when detecting complex targets, making it well suited to such detection tasks.
- Production Efficiency: In terms of production, our model can yield better results within a specified time frame, leading to higher efficiency compared to other models.
- Architectural Design: Our model utilized multi-scale connections and multiple layers of densely connected convolutional networks, enabling the extraction of finer and more diverse features from the target. Consequently, it was effective at recognizing complex targets like cortical fibers in pickled mustard.
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Zhu, L.; Spachos, P.; Pensini, E.; Plataniotis, K.N. Deep learning and machine vision for food processing: A survey. Curr. Res. Food Sci. 2021, 4, 233–249.
- Zhou, L.; Zhang, C.; Liu, F.; Qiu, Z.; He, Y. Application of Deep Learning in Food: A Review. Compr. Rev. Food Sci. Food Saf. 2019, 18, 1793–1811.
- ElMasry, G.; Wang, N.; Vigneault, C.; Qiao, J.; ElSayed, A. Early detection of apple bruises on different background colors using hyperspectral imaging. LWT-Food Sci. Technol. 2008, 41, 337–345.
- Pandiselvam, R.; Mayookha, V.P.; Kothakota, A.; Ramesh, S.V.; Thirumdas, R.; Juvvi, P. Biospeckle laser technique—A novel non-destructive approach for food quality and safety detection. Trends Food Sci. Technol. 2020, 97, 1–13.
- Mohammadian, N.; Ziaiifar, A.M.; Mirzaee-Ghaleh, E.; Kashaninejad, M.; Karami, H. Nondestructive Technique for Identifying Adulteration and Additives in Lemon Juice Based on Analyzing Volatile Organic Compounds (VOCs). Processes 2023, 11, 1531.
- Haff, R.P.; Toyofuku, N. X-ray detection of defects and contaminants in the food industry. Sens. Instrum. Food Qual. Saf. 2008, 2, 262–273.
- Chandrapala, J.; Oliver, C.; Kentish, S.; Ashokkumar, M. Ultrasonics in food processing—Food quality assurance and food safety. Trends Food Sci. Technol. 2012, 26, 88–98.
- Baranowski, P.; Mazurek, W.; Wozniak, J.; Majewska, U. Detection of early bruises in apples using hyperspectral data and thermal imaging. J. Food Eng. 2012, 110, 345–355.
- Vadivambal, R.; Jayas, D.S. Applications of Thermal Imaging in Agriculture and Food Industry—A Review. Food Bioprocess Technol. 2011, 4, 186–199.
- Mo, C.; Kim, G.; Kim, M.S.; Lim, J.; Cho, H.; Barnaby, J.Y.; Cho, B.K. Fluorescence hyperspectral imaging technique for foreign substance detection on fresh-cut lettuce. J. Sci. Food Agr. 2017, 97, 3985–3993.
- Ok, G.; Shin, H.J.; Lim, M.-C.; Choi, S.-W. Large-scan-area sub-terahertz imaging system for nondestructive food quality inspection. Food Control 2019, 96, 383–389.
- Noordam, J.C.; van den Broek, W.H.A.M.; Buydens, L.M.C. Detection and classification of latent defects and diseases on raw French fries with multispectral imaging. J. Sci. Food Agr. 2005, 85, 2249–2259.
- Seo, Y.; Kim, G.; Lim, J.; Lee, A.; Kim, B.; Jang, J.; Mo, C.; Kim, M.S. Non-Destructive Detection Pilot Study of Vegetable Organic Residues Using VNIR Hyperspectral Imaging and Deep Learning Techniques. Sensors 2021, 21, 2899.
- Sonia, Z.; Lorenzo, C.; Ilaria, C. THz Imaging for Food Inspections: A Technology Review and Future Trends. In Terahertz Technology; Borwen, Y., Ja-Yu, L., Eds.; IntechOpen: Rijeka, Croatia, 2021.
- Feng, Y.; Sun, D. Application of Hyperspectral Imaging in Food Safety Inspection and Control: A Review. Crit. Rev. Food Sci. Nutr. 2012, 52, 1039–1058.
- He, H.J.; Sun, D. Inspection of harmful microbial contamination occurred in edible salmon flesh using imaging technology. J. Food Eng. 2015, 150, 82–89.
- Cui, S.; Ling, P.; Zhu, H.; Keener, H.M. Plant Pest Detection Using an Artificial Nose System: A Review. Sensors 2018, 18, 378.
- Su, W.; He, H.; Sun, D. Non-Destructive and rapid evaluation of staple foods quality by using spectroscopic techniques: A review. Crit. Rev. Food Sci. Nutr. 2017, 57, 1039–1051.
- Gill, J.; Sandhu, P.S.; Singh, T. A review of automatic fruit classification using soft computing techniques. In Proceedings of the ICSCEE, Johannesburg, South Africa, 15–16 April 2014; pp. 91–98.
- Fan, K.J.; Su, W. Applications of Fluorescence Spectroscopy, RGB- and MultiSpectral Imaging for Quality Determinations of White Meat: A Review. Biosensors 2022, 12, 76.
- Liu, Z.; Wang, L.; Liu, Z.; Wang, X.; Hu, C.; Xing, J. Detection of Cotton Seed Damage Based on Improved YOLOv5. Processes 2023, 11, 2682.
- Cen, H.Y.; Lu, R.F.; Ariana, D.P.; Mendoza, F. Hyperspectral Imaging-Based Classification and Wavebands Selection for Internal Defect Detection of Pickling Cucumbers. Food Bioprocess Technol. 2014, 7, 1689–1700.
- Otter, D.W.; Medina, J.R.; Kalita, J.K. A Survey of the Usages of Deep Learning for Natural Language Processing. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 604–624.
- Ma, C.; Pan, C.G.; Ye, Z.; Ren, H.B.; Huang, H.; Qu, J.X. Gout Staging Diagnosis Method Based on Deep Reinforcement Learning. Processes 2023, 11, 2450.
- Hof, R.D. Deep Learning. Available online: https://www.technologyreview.com/technology/deep-learning/ (accessed on 14 November 2023).
- Medus, L.D.; Saban, M.; Frances-Villora, J.V.; Bataller-Mompean, M.; Rosado-Munoz, A. Hyperspectral image classification using CNN: Application to industrial food packaging. Food Control 2021, 125, 107962.
- Gulzar, Y.; Ünal, Z.; Aktas, H.; Mir, M.S. Harnessing the Power of Transfer Learning in Sunflower Disease Detection: A Comparative Study. Agriculture 2023, 13, 1479.
- Dhiman, P.; Kaur, A.; Balasaraswathi, V.R.; Gulzar, Y.; Alwan, A.A.; Hamid, Y. Image Acquisition, Preprocessing and Classification of Citrus Fruit Diseases: A Systematic Literature Review. Sustainability 2023, 15, 9643.
- Yang, Y.; Liu, Z.; Huang, M.; Zhu, Q.; Zhao, X. Automatic detection of multi-type defects on potatoes using multispectral imaging combined with a deep learning model. J. Food Eng. 2023, 336, 111213.
- ChinaDaily. Fuling 'Zhacai': The Tasty Chinese Pickled Vegetable. Available online: http://www.chinadaily.com.cn/a/201903/27/WS5c9b0e29a3104842260b2dcd.html (accessed on 14 November 2023).
- Li, S.; Kang, X.; Hu, J. Image Fusion with Guided Filtering. IEEE Trans. Image Process. 2013, 22, 2864–2875.
- He, K.; Sun, J.; Tang, X. Guided Image Filtering. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1397–1409.
- Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314.
- Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
- Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.; Wu, J. UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, 4–8 May 2020; pp. 1055–1059.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Volume 9351, pp. 234–241.
- Nashat, S.; Abdullah, A.; Aramvith, S.; Abdullah, M.Z. Support vector machine approach to real-time inspection of biscuits on moving conveyor belt. Comput. Electron. Agric. 2011, 75, 147–158.
- Lee, C.; Xie, S.; Patrick, G.; Zhang, Z.; Tu, Z. Deeply-Supervised Nets. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, San Diego, CA, USA, 9–12 May 2015; pp. 562–570.
- Glenn, J.; Alex, S.; Jirka, B.; Stan, C.; Liu, C.; Hogan, A.; NanoCode012; Laughing; lorenzomammana; tkianai; et al. Ultralytics/yolov5: v3.0. Available online: https://zenodo.org/records/3983579 (accessed on 14 November 2023).
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2999–3007.
- Goutte, C.; Gaussier, E. A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. In Advances in Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2005; pp. 345–359.
| Image | R (%) | P (%) | Dice (%) |
|---|---|---|---|
| HD | 81.64 | 57.98 | 66.81 |
| MS | 80.62 | 61.92 | 68.60 |
| Fusion | 82.87 | 68.13 | 73.91 |
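The R, P, and Dice columns follow the standard pixel-wise definitions of recall, precision, and the Dice coefficient (cf. [41]). A minimal sketch of how such values are computed from a predicted mask and a ground-truth mask:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Pixel-wise recall (R), precision (P), and Dice in percent, from boolean masks."""
    tp = np.sum(pred & gt)    # defect pixels correctly detected
    fp = np.sum(pred & ~gt)   # background pixels flagged as defect
    fn = np.sum(~pred & gt)   # defect pixels missed
    r = tp / (tp + fn)
    p = tp / (tp + fp)
    dice = 2 * tp / (2 * tp + fp + fn)
    return 100 * r, 100 * p, 100 * dice
```

Dice rewards overlap symmetrically, which is why the Fusion images, with both higher recall and higher precision, also score highest on Dice.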
| Model | R (%) | P (%) | Dice (%) | Size (MB) | Speed (ms) |
|---|---|---|---|---|---|
| UNet++ | 79.09 | 56.34 | 64.19 | 11.70 | 17 |
| UNet3+ | 75.32 | 37.16 | 46.50 | 7.83 | 22 |
| UNet4+ | 82.87 | 68.13 | 73.91 | 13.20 | 31 |
| Model | Backbone | Feature Layers | Input Size | Epochs |
|---|---|---|---|---|
| UNet++ | VGG-16 | [16, 32, 64, 128, 256] | 256 × 256 × 3 | 200 |
| UNet3+ | VGG-16 | [16, 32, 64, 128, 256] | 256 × 256 × 3 | 200 |
| UNet4+ | VGG-16 | [16, 32, 64, 128, 256] | 256 × 256 × 3 | 200 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: Deng, D.; Liu, Z.; Lv, P.; Sheng, M.; Zhang, H.; Yang, R.; Shi, T. Defect Detection in Food Using Multispectral and High-Definition Imaging Combined with a Newly Developed Deep Learning Model. Processes 2023, 11, 3295. https://doi.org/10.3390/pr11123295