1. Introduction
Initially, human experts primarily used digital retinal images in ophthalmic clinics to identify Diabetic Retinopathy (DR) and related eye diseases. The retinal blood vessels are important diagnostic indicators for DR and for pathologies of other systems that manifest in the human eye. A growing body of research indicates that the accurate extraction of retinal blood vessels aids the analysis of other related ophthalmic diseases. The manual extraction of blood vessels requires an expert ophthalmologist’s skills. Although such manual extraction is possible, it is time-consuming and prone to human error when working with large image datasets. Therefore, automated systems for the precise extraction of retinal blood vessels are urgently needed to reduce the workload of expert ophthalmologists. An example of the automatic extraction of blood vessels is illustrated in
Figure 1.
Considering the typical retinal blood vessels and the background region information in the digital retinal images shown in
Figure 1, three challenges make the retinal vessel-extraction task difficult:
Retinal images often lack sufficient contrast and quality, making it difficult to extract the Region of Interest (ROI).
Most retinal images suffer from imbalanced illumination and salt-and-pepper noise, making it difficult to distinguish the blood vessels from the background.
Retinal blood vessels come in many shapes, sizes, and unexpected forms, making it challenging to identify large and small blood vessels.
In these circumstances, research on vessel extraction from retinal images has attracted a large number of researchers.
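The salt-and-pepper noise noted among the challenges above is commonly suppressed with a median filter during preprocessing. The following minimal pure-Python sketch (illustrative only, not part of the proposed pipeline) replaces each pixel with the median of its neighborhood:

```python
def median_filter(img, k=3):
    """Apply a k x k median filter to a 2D grayscale image (list of lists).

    Border pixels are handled by clamping coordinates to the image edge.
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(img[yy][xx])
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out

# A flat patch corrupted by one salt (255) and one pepper (0) pixel:
noisy = [
    [100, 100, 100, 100],
    [100, 255, 100, 100],
    [100, 100, 0, 100],
    [100, 100, 100, 100],
]
clean = median_filter(noisy)
```

A 3 × 3 window usually suffices for isolated noise pixels; larger windows remove more noise but risk blurring thin vessels.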
Related works and state-of-the-art studies inform this paper’s proposed method, which is based on machine learning and deep-learning methods. A selection of studies on retinal vessel extraction from 2015 to 2022 was reviewed to ensure that the summary of extraction techniques is as up-to-date as possible. In choosing the articles, keywords such as blood vessel classification, blood vessel extraction, retinal imaging, DR segmentation, exudate segmentation, and red lesion extraction were used. I searched for articles in databases such as Web of Science, PubMed, Science Direct, and IEEE Xplore. This review includes 216 papers from Q1 to Q4 journals and international conferences.
Image extraction generally involves classifying ROI using specific image intensities or geometric features to suppress the unwanted background. This way, only specific areas are extracted as retinal blood-vessel regions. Typically, the anatomy of the retina consists of observable characteristics, such as the optic disc, retinal blood vessels, and eye pathologies that appear in retinal images. In addition, the researchers used retinal images from various data sources in their analysis, captured at different times, resolutions, and intensities. Accordingly, determining whether retinal blood-vessel extraction achieves the same accuracy as that of an expert diagnosis is a challenging task that has attracted many researchers.
Recent implementations of deep-learning algorithms for retinal-image extraction were proposed by Soomro et al. [
1]. Imran et al. [
2] evaluated the effectiveness of different deep-learning algorithms for automating blood-vessel extraction. However, their method falsely extracted blood vessels in abnormal images, and micro blood vessels were erroneously extracted due to noise. Chen et al. [
3] proposed deep-learning algorithms for retinal-image analysis. Their proposed method demonstrated significant efficacy in extracting retinal blood vessels. However, it had limitations in extracting thin blood vessels, especially in noisy and low-intensity images. A thorough analysis of conventional supervised, unsupervised, and neural network methodologies and statistics was presented by Khan et al. [
4]. Jia et al. [
5] analyzed deep-learning and traditional machine learning methods for retinal-image extraction. However, the ability of their approach to accurately extract a micro blood vessel was limited. Li et al. [
6] proposed effective deep-learning techniques for extracting blood vessels. Next, the main focus of Badar et al. [
7] was on deep learning techniques for retinal-image analysis. The main features of their proposed method were its improved extraction performance and reduced running time. The extraction of retinal blood vessels was automated using several supervised and unsupervised methods. The unsupervised method is the most prevalent technique for autonomously extracting the retinal blood vessel [
8,
9]. These methods can be divided into three categories: matching filters [
10,
11,
12], extraction based on blood-vessel tracing [
13,
14,
15], and model-based extraction [
16]. Since they cannot exploit hand-labeled ground-truth images, unsupervised algorithms show some performance limitations. In contrast, supervised algorithms are trained using annotations and can take advantage of real datasets. Supervised models carry out retinal blood-vessel extraction in two steps: feature extraction and pixel classification. These features can be further divided into those created manually and those learned automatically. In machine learning, feature extraction from retinal images is done manually, using specific methods such as the K-Nearest Neighbor (KNN) method [
17] and the Support Vector Machine (SVM) method [
18], are used. The generalization ability is lacking when features are manually selected because manual selection is application-specific and new features cannot be extracted [
19].
For image processing, deep learning techniques, particularly Convolutional Neural Networks (CNNs), have received much attention [
20,
21]. Deep learning techniques use enormous amounts of data to learn features while automatically minimizing human intervention. They are superior at extraction because they can automatically learn multi-level patterns and are not constrained by a particular system. The following issues are typically present in proposed deep-learning techniques: (1) the model’s down-sampling factor is too high, causing numerous micro blood vessels to lose feature information that can never be recovered; (2) the extraction of DR lesions and retinal blood vessels produces results that are low in accuracy; and (3) the extracted blood-vessel image contains a great deal of noise, making it difficult to obtain accurate blood-vessel information.
Jiang et al. [
22] suggested a network based on a fully convolutional version of AlexNet. They used Gaussian smoothing to lessen the discontinuity between the optic disc and the replacement region. This method has the advantage of correctly extracting the retinal blood vessels even in cases of DR. However, its application is limited to situations where the retinal blood vessels are connected. Li et al. [
23] built a Fully Convolutional Network (FCN) with skip connections to improve retinal blood-vessel extraction. The extraction accuracy was increased by active learning, using fewer manually marked samples, and an iterative training method further improved the model’s performance. This strategy has produced positive results with different datasets, and smaller datasets may yield even better results.
In addition to proposing a fully convolutional network, Atli and Gedik [
24] were the first to use up-sampling and downsampling to capture small and large blood vessels, respectively. Their suggested method used the STARE (Structured Analysis of the Retina) dataset for retinal blood-vessel extraction. Zhang and Chung [
25] considered extraction as a multi-class task and included an edge-aware technique. Pixels were divided into five categories covering the background and micro blood vessels, enabling the network to concentrate on blood-vessel boundary zones. They made use of deep supervision to ease optimization. However, their method had limitations in accurately extracting blood vessels one to several pixels wide. Although it achieved accurate results, its overall sensitivity and specificity were low compared with conventional methods and could be improved by parameter tuning.
Mishra et al. [
26] suggested a straightforward U-net and introduced data-aware deep supervision to improve micro blood-vessel extraction. They determined the average input diameter of the retinal blood vessels and the layer-wise effect, adding further layers as needed. Receptive fields were used to identify layers that notably extracted aspects of retinal blood vessels.
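The layer-wise receptive-field analysis described above can be made concrete with the standard formula RF = 1 + sum over layers of (k_i - 1) multiplied by the product of the strides of earlier layers. The sketch below computes this for a hypothetical layer stack; the configuration is illustrative, not taken from [26]:

```python
def receptive_field(layers):
    """Compute the theoretical receptive field of stacked conv/pool layers.

    `layers` is a list of (kernel_size, stride) tuples, in forward order.
    Each layer enlarges the receptive field by (k - 1) times the cumulative
    stride ("jump") of all preceding layers.
    """
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Hypothetical stack: three 3x3 convolutions (stride 1), a 2x2 max-pool
# (stride 2), then one more 3x3 convolution.
rf = receptive_field([(3, 1), (3, 1), (3, 1), (2, 2), (3, 1)])
```

Comparing such values against the average vessel diameter indicates which layers respond to micro vessels and which to large ones.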
Deformable convolution was suggested by Jin et al. [
27] for classifying blood vessels. The flexible convolution block learned offsets to modify the receptive fields, enabling it to capture blood vessels in various sizes and shapes. In comparison to U-net and the deformed convolution network [
28], the proposed deformable U-net outperformed them on the DRIVE (Digital Retinal Images for Vessel Extraction), STARE, and CHASE (Child Heart and Health Study in England) datasets, as well as on two additional datasets: WIDE [
29] and SYNTHE (Synthesizing Retinal and Neuronal Images) [
30].
In order to extract micro blood vessels, Dharmawan et al. [
31] presented a Contrast-Limited Adaptive Histogram Equalization (CLAHE) method combined with a novel matched filter. They built the new matched filter on multiple scales and modified the Dolph–Chebyshev type I function. This combination improved extraction performance through a preprocessing step whose output was fed directly to the fine extraction process. Compared with standard CLAHE, their method discovered more micro blood vessels, although it still committed many extraction errors. Dilated convolution has also been used to increase the receptive fields for blood-vessel extraction [
32]. Lopes et al. [
33] also examined the effects of a series of downsampling methods, such as max-pooling, convolution with a 2-by-2 kernel, and convolution with a 3-by-3 kernel. Their method can help address low-intensity images and micro blood-vessel extraction in cases of DR disease. According to Soomro et al. [
34], superior outcomes were achieved when employing convolution as the downsampling operation. They used morphological reconstruction during the post-processing stage to eliminate small spurious pixels in the extracted images. A black ring surrounds the Field of View (FOV) in retinal imaging; because the black ring contains no information, networks should pay closer attention to the FOV. This method can identify normal and abnormal images more effectively but requires a mathematical-morphology method to extract the micro blood vessels in the extracted image.
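The post-processing idea of eliminating small spurious pixels can be approximated by removing small connected components. The sketch below uses a simple flood fill rather than the morphological reconstruction of [34], so it is an illustration of the general idea, not a reimplementation:

```python
from collections import deque

def remove_small_components(mask, min_size):
    """Remove 4-connected foreground components smaller than min_size pixels.

    `mask` is a 2D list of 0/1 values; a cleaned copy is returned.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                # Flood-fill to collect one connected component.
                comp, q = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < min_size:
                    for y, x in comp:  # erase the isolated speckle
                        out[y][x] = 0
    return out

mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],  # single-pixel speckle at (2, 3)
    [0, 0, 0, 0],
]
cleaned = remove_small_components(mask, min_size=2)
```

The minimum size must be chosen carefully, since overly aggressive cleanup can also delete genuine micro vessels.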
The region of interest has been identified and feature representations have been strengthened in blood-vessel extraction via the attention mechanism [
35]. Luo et al. [
36], Lian et al. [
37], and Lv et al. [
38] manually built attention masks that were the same size as the retinal images to detect the region of interest. Li et al. [
39], Li et al. [
40], and Fu et al. [
41] provided only a few examples. According to Tang et al. [
42], the attention modules and networks they created learned their own attention mappings rather than relying on experts. Although this method cannot segment micro blood vessels, it may further improve the accuracy of results in the future. In other works, blood-vessel boundaries or micro blood vessels were extracted independently before being combined to create a complete extraction [
43,
44]. Other researchers proposed segmenting data from coarse to fine by cascading different subnetworks [
45]. The original images and the extracted output from previous sub-models were used as the input for the next sub-model.
When using deep learning to extract retinal blood vessels, the following issues have been reported:
There is a need for training samples with clear labels. Even though there are many retinal images, collecting annotated data is quite challenging because it involves qualified experts, takes a lot of time, and is expensive.
The current retinal image samples are of low quality. As a result, deep learning models cannot develop more robust feature representations. The performance of the proposed methods is hampered by image noise, low intensity, and various features of DR diseases.
There is an issue concerning training-sample class imbalance. The performance of networks is harmed by the disparity in the number of positive and negative training examples. Class imbalance affects large and micro blood vessels, DR lesions, and backgrounds. Because there are more non-blood-vessel pixels than blood-vessel pixels, deep learning models frequently categorize boundary pixels as non-blood-vessel pixels.
Since erroneous extraction of micro blood-vessel pixels has less impact on the overall loss, the network performs worse on micro blood vessels than on large blood vessels.
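A common remedy for this imbalance (not necessarily the one adopted in the cited works) is to up-weight the rare vessel class in the loss. A minimal sketch of class-weighted binary cross-entropy:

```python
import math

def weighted_bce(y_true, y_pred, pos_weight):
    """Binary cross-entropy with an extra weight on positive (vessel) pixels.

    Up-weighting the rare vessel class counteracts the dominance of the many
    background pixels in the loss.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, 1e-7), 1 - 1e-7)  # clamp for numerical stability
        if t == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1 - p)
    return total / len(y_true)

# Identical-magnitude mistakes on a vessel pixel and a background pixel:
labels = [1, 0]
preds = [0.3, 0.7]  # both predicted poorly
unweighted = weighted_bce(labels, preds, pos_weight=1.0)
weighted = weighted_bce(labels, preds, pos_weight=10.0)
```

With `pos_weight` above 1, an error on a vessel pixel contributes far more to the loss than the same error on a background pixel, pushing the optimizer toward the minority class.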
This research focuses on retinal blood-vessel extraction, with an emphasis on an extraction method that does not require specialized hardware for training, decreases computational time, eliminates hyper-parameter adjustment, and minimizes memory usage. I chose the WKFCM-DBF method because of its many advantages. WKFCM-DBF identifies retinal blood vessels without a ground-truth image, which is a great advantage because large annotated datasets are typically unavailable for large-scale screening programs. In addition, the Dilation-Based Functions and optimal thresholding are based on attributes of micro blood vessels in retinal images.
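The optimal-thresholding component is not specified in detail at this point; Otsu's method is a standard way to select such a threshold from an intensity histogram, and the sketch below is offered only as an illustration of that general idea, not as the thresholding rule used in WKFCM-DBF:

```python
def otsu_threshold(hist):
    """Find the threshold maximizing between-class variance (Otsu's method).

    `hist` is a 256-bin grayscale histogram (list of pixel counts);
    pixels with intensity <= t belong to the dark class.
    """
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0              # mean of the dark class
        mu1 = (sum_all - sum0) / w1  # mean of the bright class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal histogram: dark "vessel" pixels near 40, bright background near 200.
hist = [0] * 256
for v in (38, 40, 42):
    hist[v] = 100
for v in (198, 200, 202):
    hist[v] = 300
t = otsu_threshold(hist)
```

For a clearly bimodal histogram like this, the selected threshold falls at the upper edge of the dark mode, cleanly separating the two populations.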
The remaining portions of the article are divided into the following sections. A detailed description of the data preparation, image normalization, color space selection, noise removal, and Optic Disc (OD) feature extraction is provided in
Section 2.
Section 3 discusses the proposed WKFCM-DBF method together with the standard FCM and WKFCM methods. It also covers the method-training procedures and the test setup. The experimental results, materials, evaluation criteria, robustness of extraction, and comparison of the proposed WKFCM-DBF method with state-of-the-art methods are provided in
Section 4. Finally, the conclusion is drawn in
Section 5.
4. Experimental Results
The following subsections provide a performance analysis by comparing the proposed preprocessing techniques, a novel WKFCM-DBF method, and state-of-the-art algorithms. To validate the effectiveness of the proposed method for extracting blood vessels, I compared my approach with those of previous studies.
4.1. Materials
Many publicly available retinal datasets detail retinal anatomy and blood vessels. Training and testing algorithms on such retinal databases is essential in retinal analysis and blood-vessel extraction. I evaluated the extraction method using three publicly available datasets:
DRIVE: A collection of retinal images from the Netherlands, covering a wide age range of patients [
17].
STARE: A collection of 80 retinal images from the United States [
57].
DiaretDB0 (Standard Diabetic Retinopathy Database Calibration Level 0): 130 retinal images from Kuopio University Hospital [
58].
I used the retinal images of the DRIVE dataset from a DR screening program in the Netherlands. The screening program included 400 diabetic patients ranging in age from 25 to 90 years. Forty retinal images were randomly selected from the 400 diabetic patients, of which 7 images showed symptoms of DR and the remaining 33 images did not. Each image was captured in JPEG format using a Canon CR5 non-mydriatic camera with a field of view of 45 degrees, 8 bits per color channel, and 768 × 584 pixels. There were 40 images in total across DRIVE’s training and test sets. This database has two manual classifications; I used 12 images as training ground truth and the remaining 28 images to compare the proposed extraction method with manual extraction by an experienced ophthalmologist.
STARE was conceived in 1975 by Michael Goldbaum, M.D., of the University of California, San Diego. The STARE dataset has 80 images with corresponding ground-truth images used for blood-vessel extraction, 49 of which are normal, while the remaining 31 show various types of abnormality. The first expert manually marked the large blood vessels in the retinal images, while the second and third experts marked the micro blood vessels. These manual extraction results are commonly used as ground-truth images for computing performance. Each image was captured in PPM format using a TOPCON TRV-50 camera with a field of view of 35 degrees, 605 × 700 pixels, and 8 bits per color channel.
Finally, I obtained the retinal image of the DiaretDB0 from Kuopio University Hospital. The screening population consisted of 130 color retinal images, of which 110 showed symptoms of DR and the remaining 20 did not show any DR. Each image was captured in JPEG format with a field of view of 50 degrees, 1500 × 1152 pixels, and 8 bits per color channel. The information from the three selected datasets is summarized in
Table 1.
4.2. Performance Measurement
Accuracy alone is not enough to determine the quality of a blood-vessel extraction method. In this section, I applied mathematical metrics commonly used to measure the performance of blood-vessel extraction algorithms: sensitivity, specificity, and accuracy. The extracted retinal vessel image was converted to a binary image to distinguish the vessels from the retinal background, and this binary vessel image was compared pixel by pixel with its corresponding ground-truth image from the test set, manually marked by an experienced ophthalmologist. Four parameters, TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative), indicate where the proposed method correctly and incorrectly extracted the blood vessels relative to the ground-truth image marked by expert ophthalmologists.
TP represents vessels correctly extracted by the proposed method.
FN represents retinal vessels extracted as non-vessels by the proposed method.
TN represents non-vessels correctly extracted by the proposed method.
FP represents non-vessels incorrectly extracted as vessels by the proposed method.
Using these four parameters, sensitivity, specificity, and accuracy can be calculated via Equations (19)–(21).
Sensitivity (SEN) represents the proportion of blood-vessel pixels correctly extracted as blood vessels by the proposed method. Specificity (SPEC) represents the proportion of non-blood-vessel pixels correctly extracted as non-vessels. Accuracy (ACC) represents the proportion of all pixels correctly classified as blood-vessel or non-blood-vessel pixels.
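For illustration, the three metrics can be computed from flattened binary masks as follows; the equation-number comments assume the standard formulas correspond to Equations (19)–(21):

```python
def vessel_metrics(pred, truth):
    """Pixel-wise SEN, SPEC, and ACC for binary vessel masks (flat 0/1 lists)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    sen = tp / (tp + fn)           # assumed Eq. (19): TP / (TP + FN)
    spec = tn / (tn + fp)          # assumed Eq. (20): TN / (TN + FP)
    acc = (tp + tn) / len(truth)   # assumed Eq. (21): (TP + TN) / all pixels
    return sen, spec, acc

# Tiny example: 3 vessel pixels and 5 background pixels in the ground truth.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 1]  # one missed vessel, one false alarm
sen, spec, acc = vessel_metrics(pred, truth)
```

In practice, the masks come from flattening the extracted binary image and the ground-truth image over the FOV pixels only.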
4.3. Analysis of Proposed Blood-Vessel Extraction
The proposed method was simulated within a MATLAB environment (The Mathworks, Inc.) and executed on a personal computer with an Intel (R) Core (TM) i7-6700K CPU at 4.00 GHz and 8 GB DDR3 RAM. To confirm the proposed method in retinal vessel extraction, I visualized and compared the activations from the different stages presented.
Figure 9 shows a visual comparison of three images extracted by the proposed method from the DRIVE, STARE, and DiaretDB0 datasets. The metrics defined in Equations (19)–(21) were used for the proposed method. Their averages were determined by applying the traditional FCM and the proposed WKFCM-DBF method, with non-enhanced and enhanced images, to the 167 test images of the DRIVE, STARE, and DiaretDB0 datasets, as shown in
Table 2 and
Table 3, respectively.
As previously mentioned, the proposed method was developed using data from publicly available datasets. Therefore, even though this is a machine learning implementation, the extraction performance still varies across datasets. The proposed method with preprocessing performs better on the STARE dataset than on the DRIVE and DiaretDB0 datasets. The results for the DiaretDB0 dataset show the lowest accuracy among all datasets, implying that its retinal blood vessels are not classified as well as those of the other datasets. Nevertheless, the accuracy on the DiaretDB0 dataset is still high (98.09%). Below is a detailed analysis of the numerical results presented in
Table 2.
If the proposed WKFCM-DBF method is applied to non-preprocessed images or to images with noise or artifacts, the SEN and SPEC values may decrease.
If the proposed WKFCM-DBF method is applied to preprocessed or noise-free images, it achieves higher accuracy than without preprocessing. Nevertheless, these results show that the proposed method still performs well even when the noise level is high; on unseen, noise-free images from different datasets, it performs as well as on the training dataset.
Next, the proposed WKFCM-DBF method was compared with the traditional FCM method and tested on the DRIVE, STARE, and DiaretDB0 databases. From
Table 3, the accuracy of the proposed method on the STARE dataset improved by 26.99% over the 71.52% achieved by the traditional FCM method, which implies increased accuracy in retinal blood-vessel image extraction.
The following is an illustration of a detailed analysis of the numerical results presented in
Table 3:
The proposed WKFCM-DBF method consistently outperforms the standard FCM algorithm on these three measures. The significant improvement in ACC scores over standard FCM indicates that the proposed algorithm can better segment micro blood vessels. In particular, this demonstrates that the proposed improvements highlight blood vessels of various widths.
The kernel-function technique has good extraction ability and was incorporated into the fuzzy objective function of the WKFCM-DBF method. The optimal number of clusters and the fuzzy weighting improve the clustering accuracy. Therefore, the WKFCM-DBF method performs better than the traditional FCM algorithm in extracting retinal blood vessels with similar structures.
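For reference, the textbook FCM baseline that WKFCM-DBF extends can be sketched on 1-D intensities as follows; this is the standard algorithm, not the proposed weighted-kernel variant:

```python
def fcm_1d(xs, c=2, m=2.0, iters=50):
    """Textbook fuzzy c-means on 1-D intensities (a baseline, not WKFCM-DBF).

    Returns cluster centres and the fuzzy membership matrix u[i][j].
    Centres are initialized at the data extremes for determinism.
    """
    centres = [min(xs), max(xs)] if c == 2 else xs[:c]
    u = [[0.0] * c for _ in xs]
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        for i, x in enumerate(xs):
            d = [abs(x - ck) + 1e-9 for ck in centres]
            for j in range(c):
                u[i][j] = 1.0 / sum((d[j] / dk) ** (2.0 / (m - 1.0)) for dk in d)
        # Centre update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        for j in range(c):
            num = sum((u[i][j] ** m) * x for i, x in enumerate(xs))
            den = sum(u[i][j] ** m for i in range(len(xs)))
            centres[j] = num / den
    return centres, u

# Dark "vessel" intensities versus bright "background" intensities:
xs = [20.0, 22.0, 25.0, 200.0, 205.0, 210.0]
centres, u = fcm_1d(xs)
```

On well-separated intensity groups, the centres converge near the two group means and the fuzzy memberships approach hard assignments, which is the behavior the kernel and weighting extensions refine for overlapping vessel/background distributions.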
4.4. Extraction Analysis with Private Dataset
The performance of the WKFCM-DBF method on a private dataset was analyzed. The private dataset contains 1800 retinal images (750 × 750 pixels) with a 45-degree field of view; of these, 200 contained DR lesions. The data were divided into a training set (30%) and a test set (70%). Each image in the training set had one manual annotation, while each image in the test set had three manual annotations provided by three experts. I used the same evaluation methodology as for the publicly available datasets, with the ground-truth images annotated by the three experts used for performance evaluation. I compared the performance of the WKFCM-DBF method using the same parameters.
Table 4 displays the pixel-based evaluation of the WKFCM-DBF method and optimal global thresholding with the non-enhanced and enhanced images for large and micro blood-vessel segmentation.
In the private dataset experiment, the proposed method obtained 95.38%, 95.60%, and 95.42% for SEN, SPEC, and ACC, respectively. These values are slightly lower than those achieved on the public datasets. However, incorporating the preprocessing approach and the WKFCM-DBF method with optimal parameters improved the technique’s overall performance.
4.5. Performance Comparison and Analysis
The performance comparison with other algorithms was carried out via a single-dataset test (the DRIVE dataset). The proposed WKFCM-DBF method has no training process and therefore does not strictly require single- or cross-dataset tests. However, to compare the proposed WKFCM-DBF method fairly with other methods, I compared its results with the results the other methods reported on the same dataset. Each method’s best SEN, SPEC, and ACC values are highlighted in bold in
Table 5, denoting the best extraction results. As shown in
Table 5, on the DRIVE dataset, the SEN, SPEC, and ACC values of the WKFCM-DBF method are better than those of the state-of-the-art algorithms. It can also be observed in the same table that its ACC value is the best among the traditional clustering methods. The sensitivities of [
22] and my method are the first- and second-best, respectively, while the algorithm of [
24] is the third-best among all extraction methods. As the best, second-best, and third-best deep learning algorithms, respectively, refs. [
23,
24,
40] all have higher specificity values than my method, though mine is only slightly lower. However, the average accuracy of my proposed method ranks first as a result of its high sensitivity and specificity.
The proposed WKFCM-DBF method outperforms state-of-the-art methods on pixel-based evaluation metrics, including SEN, SPEC, and (in most cases) ACC, on the three published datasets DRIVE, STARE, and DiaretDB0. The careful preprocessing step helped prove the hypothesis and increased the proposed method’s efficiency by preserving retinal quality information for the extraction stage. This critical hypothesis was validated based on the SEN, SPEC, and ACC metrics. Moreover, the average time required to process one image on a computer with a 4.00 GHz Intel(R) Core(TM) i7-6700K CPU and 8 GB RAM under the Microsoft Windows 10 32-bit operating system was 4 s per image for all datasets. The similar execution times across datasets were due to the original images being normalized and resized before being fed to the extraction method. As a result, the WKFCM-DBF method was computationally efficient and fast compared to many state-of-the-art techniques.
Table 6 shows the comparison results for the DRIVE, STARE, and DiaretDB0 datasets. The extraction accuracy on the STARE database was exceptionally high, which implies the extraction success of the proposed method. This achievement demonstrates the validity of the complete preprocessing stage that enhances retinal images before transferring them to the novel blood-vessel extraction method via a three-stage combination model.