1. Introduction
Lung cancer has very high morbidity and mortality in clinical practice. Early-stage lung cancer is more likely to be cured than late-stage tumors, so early diagnosis and treatment are important for improving patients' clinical outcomes [1]. Statistically, early lung cancers usually present as a solitary pulmonary nodule (SPN). Clinically, it is crucial to segment the sequence of lung parenchymal images rapidly without compromising accuracy in order to subsequently segment the pulmonary nodules and distinguish benign from malignant features [2]. Lung parenchyma segmentation refers to the division of the lung parenchyma into a number of areas of interest with specific properties. In terms of image processing, computer-aided diagnosis of pulmonary diseases allows the identification of suspected pulmonary nodules. In Computed Tomography (CT) images of the lungs, pulmonary nodules appear as spheroidal abnormal tissue located in a complex juxtapleural structure. Surrounding tissues, such as the chest wall and blood vessels, may be attached to the pulmonary nodules, thus hindering their detection or segmentation.
Statistically, juxtapleural nodules account for about 6% to 17% of solitary pulmonary nodules [3]. These nodules are attached to the pleura, and their gray level and density are similar to those of the pleura, whereas most of the existing segmentation methods are based on gray-level thresholding. After segmentation, juxtapleural nodules therefore often appear as a depression area in the lung parenchyma, termed juxtapleural nodular depression. This type of pulmonary nodule is the most difficult to segment, but the segmentation results have a significant impact on the accuracy of image analysis, auxiliary processing, and other post-processing steps. In addition, most of the existing methods of lung parenchyma segmentation are subject to either under-segmentation or edge leakage when applied to complex structures of lung parenchyma. Consequently, the required area of interest cannot be obtained while retaining a good edge and the details of the outline of the lungs. In order to extract a complete lung parenchyma containing the juxtapleural nodules and to improve the effectiveness of ancillary diagnosis, the depressed area must be repaired.
The aim of this paper is the automated segmentation of sequential CT images containing juxtapleural nodules. To achieve this goal, this paper proposes an automated segmentation algorithm for sequential CT images based on fractal geometry and an improved convex hull algorithm. After the initial segmentation, the proposed algorithm uses the fractal geometry method to detect the concave boundary of the concave area under study and then uses the improved convex hull algorithm to repair it, aiming at an accurate segmentation of the lung parenchyma. In addition, the proposed algorithm exploits the correlation between adjacent images of the CT sequence in the automated threshold segmentation of the sequence. To verify the effectiveness of the algorithm, 97 cases were chosen from 5800 sequential CT images for the segmentation test. The evaluation results confirmed that the proposed segmentation method significantly improved the segmentation speed while reaching 92.45% Pixel Accuracy (PA) and 95.9% Intersection over Union (IoU) for lung parenchyma segmentation with juxtapleural nodules.
2. Related Work
A number of existing models and algorithms for lung parenchyma segmentation are related to the work of this paper. They can be classified mainly by the method used: threshold [4,5,6,7,8], clustering [9], region-growing [10], graph theory-based [11], and active contour model [12,13] methods. Sudha and Jayasheree [14] proposed a lung parenchyma segmentation method that combines the threshold method with the morphological opening operation in order to extract pulmonary nodules. However, when pulmonary nodules adhere to the lung wall, these nodular areas are excluded by threshold segmentation and region growing. The reason for this is the similar gray level of the pulmonary nodules and the surrounding tissue. In addition, the lung boundary appears concave, making the extraction and identification of the tumor, blood vessels, and trachea difficult. Therefore, it is necessary to analyze and repair the concave and depressed lung parenchyma boundary.
Retico et al. [15] demonstrated that the rolling-sphere method for processing CT images of pulmonary nodules can effectively reduce false positives when analyzing juxtapleural traction-type pulmonary nodules. Bian and Yan [16] combined the threshold and region-growing methods to segment the lung parenchyma and used the rolling-ball method to repair the extracted lung boundary. Messay et al. [17] used morphological methods to detect and segment pulmonary nodules; their computer-aided diagnostic system achieved a sensitivity of 82.66%. Zhou et al. [18] segmented lung CT images with juxtapleural nodules by using an iterative adaptive averaging algorithm and an adaptive curvature threshold method to reinsert the missing juxtapleural nodules. Gong et al. [19] proposed a lung parenchyma segmentation algorithm based on gray integral projection and fuzzy C-means clustering, combined with the rolling-sphere method to repair the boundary area. However, the selection of the sphere radius was a critical problem: if the radius was too large, the lung parenchyma would be over-segmented; if it was too small, the parenchyma would be under-segmented and the repaired lung boundary would be incomplete. Wei et al. [20] proposed a method to repair the lung parenchyma boundary by combining an improved chain code algorithm and the Bresenham algorithm. Li et al. [21] presented a lung parenchyma segmentation algorithm combining the region-growing and morphology methods. They also proposed a two-dimensional convex hull algorithm to repair the profile of the lung parenchyma. Compared to the previous work, this algorithm had a higher accuracy, and the depression of the juxtapleural nodules could be accurately repaired. However, like previous convex hull algorithms, it was still unable to repair vascular depression.
On the basis of the above review, the rolling-ball and morphology-based methods are the most commonly used methods of boundary restoration, as they are simple and fast to implement. However, because the shape and size of pulmonary nodules vary unpredictably, it is difficult to find morphological templates of appropriate size. Templates that are too small cannot include all lung nodules, whereas templates that are too large can include excessive non-lung areas and other border areas. The curvature-based method [22,23,24] is also commonly used; it assumes that a sudden change of the boundary curvature corresponds to a defective boundary region. However, this assumption does not hold when noise or local curvature also produces sudden changes, and the boundary repair by this method is not satisfactory when such changes are present in the pulmonary boundary.
An observation regarding the existing lung parenchymal segmentation algorithms is that the majority of the images are segmented as single-slice CT images, ignoring the correlation between adjacent images in a sequence. In recent years, a large number of scholars have studied the segmentation methods of sequences of lung parenchymal images.
Geng et al. [25] used the grayscale threshold iteration method to rapidly and automatically select seed points for region growing and to extract each lung parenchyma image in the sequence of CT images. However, the algorithm is sensitive to background noise. Liming et al. [26] proposed a new framework of lung parenchyma segmentation in which an optimized thresholding method and a boundary tracking algorithm were used to segment the lung parenchyma. The method could effectively eliminate the influence of background noise, but it could also eliminate some of the lung parenchyma during processing. Yan-hua et al. [27] proposed a method of 3D connectivity growth. The method used adaptive thresholds to select seed points, 3D connectivity markers to identify the lung parenchyma, and the morphological method to remove tracheal noise. In the final output, the lung parenchyma mask was generated, and the lung parenchymal images were extracted. The segmentation results were satisfactory, but the processing of the juxtapleural traction characteristics of the lung parenchyma was not effective. Luo et al. [28] used an improved active contour model algorithm, which required the initial contour to be circled manually and segmented the edges of the lung parenchyma images in the serial CT images semi-automatically. Their segmentation results were satisfactory, but the method was not time-efficient.
3. The Method
This section proposes an algorithm to perform an accurate segmentation of lung parenchyma presenting juxtapleural traction nodules. In addition to the repair of pleura-type depressions based on fractal geometry and the improved convex hull, the algorithm includes an initial segmentation of CT images with automated threshold iteration, removal of bronchial tissue, and detection of the pulmonary border. The flow of the lung parenchyma segmentation algorithm is shown in Figure 1.
3.1. The Construction of the Initial Contour
Among the existing algorithms of lung parenchyma segmentation, the threshold segmentation algorithm has the advantage of high calculation speed and accuracy. Considering the high correlation between adjacent slices of sequential CT images, we chose the threshold segmentation method to pre-process the original images and obtain the initial contour of the sequential juxtapleural traction images. The initial outline contained some noise derived from the trachea, bronchus, and other tissues. In order not to miss any important information in the lung parenchyma segmentation, such noise was tolerated at this stage. The open and close operations were then used to smooth and fill the edge and interior of the lung parenchyma.
3.1.1. The Automated Threshold Iterative Segmentation of Sequential CT Image
It is difficult to choose a suitable global threshold to obtain the ideal initial contour because of the differences in the gray levels of lung nodules. Taking into account the high correlation between adjacent slices of sequential CT images, the automated threshold iteration method was chosen to construct the initial contour of the lung parenchyma of sequential CT images; the algorithm is shown in Algorithm 1. In Figure 2, columns (a–d) display a sequence of images: the first row shows the original sequential CT images and the second row shows the results of the automated threshold iterative segmentation of the lung parenchyma.
1. Get the initial contour of the first CT image.

Algorithm 1. The automated threshold iteration sequence segmentation algorithm.
Input: the first image of the sequential CT images
Output: initial outline of the parenchyma
1: Traverse the CT image to get the maximum gray value Gmax and the minimum gray value Gmin in the region.
2: Calculate the initial threshold T according to formula (1).
3: Segment the image: if the gray value of a pixel is greater than the threshold T, the pixel set N (nodule region) is expanded; otherwise the set B (background region) is expanded.
4: Compute the mean gray values $\bar{B}$ = (B1 + B2 + … + Bn)/n and $\bar{N}$ = (N1 + N2 + … + Nn)/n.
5: Calculate the threshold T again according to formula (2).
6: Iterate until the threshold converges, which gives the globally optimal threshold.

2. Because of the high correlation between sequential CT images, the optimal threshold of image i is used as the initial threshold of image i + 1. This decreases the number of iterations and significantly improves the speed of segmentation.
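The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: formulas (1) and (2) are not reproduced in the text, so the sketch assumes the usual form of the iterative threshold method, namely an initial threshold at the midpoint of Gmax and Gmin and an update to the mean of the two class averages. The function names, the convergence tolerance `eps`, and the flat-list image representation are all hypothetical.

```python
def iterative_threshold(pixels, t_init=None, eps=0.5, max_iter=100):
    """Return (threshold, iterations) for a flat list of gray values.

    Assumed forms (not taken from the paper's formulas):
      initial threshold = (Gmax + Gmin) / 2
      updated threshold = (mean above T + mean at or below T) / 2
    """
    if t_init is None:
        t = (max(pixels) + min(pixels)) / 2.0
    else:
        t = float(t_init)
    for it in range(1, max_iter + 1):
        above = [p for p in pixels if p > t]    # set N in Algorithm 1
        below = [p for p in pixels if p <= t]   # set B in Algorithm 1
        if not above or not below:
            return t, it                        # one class is empty: stop
        t_new = (sum(above) / len(above) + sum(below) / len(below)) / 2.0
        if abs(t_new - t) < eps:                # assumed stopping rule
            return t_new, it
        t = t_new
    return t, max_iter


def segment_sequence(slices):
    """Pass the converged threshold of slice i as the start value for slice i + 1."""
    thresholds = []
    t_prev = None
    for pixels in slices:
        t_prev, _ = iterative_threshold(pixels, t_init=t_prev)
        thresholds.append(t_prev)
    return thresholds
```

Starting a slice from the converged threshold of its neighbor typically needs fewer iterations than starting from the midpoint of the gray range, which is the effect step 2 relies on.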
3.1.2. The Removal of Bronchus and Trachea Noise
On completion of the initial segmentation, a lung parenchyma image could be obtained by the automated threshold iteration method, together with the corresponding globally optimal threshold T. The image can then be converted to a binary image through formula (3).
On completion of the above series of operations, it is possible to obtain the contours of some interfering objects and determine the noise. In our previous work [
29], we improved the region growth method and the open and close operations to obtain final lung parenchyma images by smoothing and filling the edges and the interior of the pulmonary nodules. The process of trachea and bronchus removal and lung contour refining is shown in
Figure 3.
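As a rough illustration of the binarization and of the open and close operations mentioned above, the following sketch works on small images stored as lists of rows. It is not the improved procedure of [29]: the 3 × 3 structuring element, the treatment of pixels outside the image as background, and the convention that gray values above T map to 1 are assumptions made only for this example, and all function names are hypothetical.

```python
def binarize(image, t):
    """Map gray values above threshold t to 1 and the rest to 0 (assumed convention)."""
    return [[1 if v > t else 0 for v in row] for row in image]


def _morph(mask, op):
    """Apply a 3x3 min (erosion) or max (dilation) filter; outside the image counts as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = op(window)
    return out


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return _morph(_morph(mask, min), max)


def closing(mask):
    """Dilation followed by erosion: fills holes smaller than the 3x3 element."""
    return _morph(_morph(mask, max), min)
```

Opening suppresses small isolated responses such as bronchial fragments, while closing fills small interior holes. Because the outside of the image is treated as background, a region touching the image border is eroded there; a production implementation would usually handle the border explicitly.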
3.2. Fractal Geometry-Based Pulmonary Boundary Detection
Because the gray values of the juxtapleural nodules are similar to those of the chest wall, and some nodules overlap with the juxtapleural blood vessels, the pulmonary nodule region attached to the lung wall is often excluded from the extracted lung region. If this region is excluded, the accuracy of the computer-aided diagnosis system will be greatly affected. Thus, it is essential to repair the area attached to the lung wall.
3.2.1. Adaptive Meshing
Although the existing repair methods are effective, all require human intervention and none adapts well to different samples. These repair methods take into account only local properties without considering the other boundary properties. In order to improve the efficiency of the segmentation, we meshed only the smallest circumscribed rectangle (a × b) containing the lung parenchyma. Since the two-dimensional contour of the lung varies greatly from top to bottom across the entire CT sequence, this section presents an adaptive meshing scheme for the CT images, as shown in formula (4).
In formula (4), N1 is the number of connected regions in the image, A1 is the area of all lung regions, and Ar is the area of the smallest circumscribed rectangle.
3.2.2. Fractal Geometry-Based Pulmonary Boundary Detection
The lung border produced by the segmentation is a curve that varies according to a certain rule. Fractal theory can investigate, describe, and analyze complex, random, disordered, or difficult-to-quantify objects at a deeper level. In particular, the fractal dimension is an effective way to describe the degree to which a fractal fills space [30]. To do that, the entire image is abstracted as a set F in two-dimensional space, and the analysis of the fractal features of its boundary is equivalent to calculating the fractal dimension of the two-dimensional figure. The box dimension is a simple and automated method, which can be applied with or without self-similarity [31]. It is fast and suitable for the juxtapleural traction nodules considered in this paper, and it is used here to calculate the fractal dimension and detect the points to be repaired.
According to the Graham method, CT images are searched to obtain the boundary points, and the grids containing the boundary points are stored as a grid set Grid = {|}.
N(s) denotes the number of grid squares of side length s that intersect the image [32]. The slope of the line defined by $\log N(s)$ and $\log(1/s)$ is the fractal dimension, and the linear regression equation used to estimate the fractal dimension is formula (5):
The fractal dimension of the statistical histogram for the area block is shown in
Figure 4.
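The box-counting estimate can be illustrated with a short sketch. It assumes the standard definition of the box dimension as the least-squares slope of log N(s) against log(1/s); the function names, the integer pixel coordinates, and the choice of box sizes are hypothetical and not taken from the paper.

```python
import math


def box_count(points, s):
    """Number of s-by-s grid boxes that contain at least one boundary point."""
    return len({(x // s, y // s) for x, y in points})


def box_dimension(points, sizes):
    """Least-squares slope of log N(s) against log(1/s) over the given box sizes."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(sizes)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

For a straight run of boundary pixels the estimate is close to 1, and for a completely filled patch it is close to 2; an irregular boundary segment falls between these values, which is what makes the dimension usable as a roughness score for each grid block.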
From Figure 4 it is evident that the fractal dimensions show a trend of polarization. Where the pulmonary border is smooth, the fractal dimension is small. Where the boundary of the lung has high randomness and complexity due to the irregularities caused by pulmonary nodules, the fractal dimension is larger.
3.2.3. Adaptive Threshold-Based Defect Boundary Selection
Because the proportion of defective boundaries caused by adhesive pulmonary nodules was small, the number of small regions containing this part of the border was small as well. In order to select the defective boundary accurately, an adaptive threshold for sequential CT images was proposed, as given by formula (6).
The quantities in formula (6) are the fractal dimension of the i-th grid block, the average value of the fractal dimension over the grid blocks, and the variance. The mean square error (MSE) was calculated by using formula (7) [33]:
On the basis of the results of a number of experiments, the mean square error was small when n = 1. However, fewer nodules could be identified by setting n = 1, because the extent of the pulmonary border affected by pulmonary nodules was relatively small. Therefore, the final adaptive threshold was determined by formula (8).
For a grid block, if the fractal dimension of its internal boundary is larger than the threshold, the lung boundaries in the block need to be repaired and the block is stored in the grid set. If the fractal dimension of the internal boundary is less than the threshold, the lung boundaries within the block need not be patched.
Through the fractal geometry method, the defect boundary was obtained, and the repair rate and accuracy were improved.
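The selection rule can be sketched as follows. Since formulas (6) and (8) are not reproduced in the text, the sketch assumes a threshold of the form mean + n × standard deviation of the per-block fractal dimensions; this form, the default value of `n`, and the function names are assumptions for illustration only and may differ from the threshold actually used.

```python
import statistics


def adaptive_threshold(dims, n=1.0):
    """Assumed threshold: mean of the block dimensions plus n standard deviations."""
    return statistics.mean(dims) + n * statistics.pstdev(dims)


def blocks_to_repair(dims, n=1.0):
    """Indices of grid blocks whose fractal dimension exceeds the adaptive threshold."""
    t = adaptive_threshold(dims, n)
    return [i for i, d in enumerate(dims) if d > t]
```

With block dimensions `[1.0, 1.0, 1.0, 1.0, 2.0]` and `n = 1`, only the last block exceeds the threshold and would be passed on to the repair step. A larger `n` selects fewer blocks, which illustrates the trade-off in choosing n discussed above.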
3.3. The Improved Convex Hull Repair Algorithm
On the basis of the review of existing work, this paper proposes a lenticular edge repair algorithm. This algorithm can repair not only the juxtapleural nodular depression, but also the depression between the two lungs, adjacent to the heart and mediastinum. By following the convex hull principle, the algorithm avoids the over-repair and under-repair that occur with the rolling-ball method or the morphological method. The detailed steps of the proposed algorithm are shown in Algorithm 2.
Algorithm 2. The improved convex hull repair algorithm.
Input: grid block to be patched
Output: patched parenchyma edge set
1: Store the boundaries within the grid block to be modified in the point set Q.
2: Update Q with convex hull theory. Traverse Q, selecting three points from Q at each step and testing whether the point under test is a concave point or a bump point. A bump point is deleted; in the remaining case one point is preserved and one is deleted. Update Q.
3: Traverse pairs of points in Q. If D < T, the boundary between the two points is to be repaired.

The final result is the set of edge points of the lungs, which is combined with the unpatched grid blocks and represents the repaired lung parenchyma images.
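The convex hull principle behind the repair can be illustrated with the standard monotone chain construction. This is a sketch of the plain two-dimensional convex hull, not of the improved algorithm in Algorithm 2, and it omits the distance test of step 3; the function names and the example coordinates are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def dented_points(boundary):
    """Boundary points that lie inside the hull, i.e. candidates for repair."""
    hull = set(convex_hull(boundary))
    return [p for p in boundary if p not in hull]
```

For a boundary such as `[(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]`, the point `(2, 2)` forms a dent in the top edge; it is dropped from the hull, and the hull edge from `(4, 4)` to `(0, 4)` bridges the depression. Applying the hull only inside the grid blocks flagged as defective, rather than to the whole lung contour, is what keeps the repair local.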
4. Analysis of the Results
4.1. The Dataset
In order to verify the effectiveness and the real-time characteristics of the proposed method, we applied it to the Lung Image Database Consortium image collection (LIDC–IDRI) dataset. The LIDC–IDRI database is the largest open lung nodule database in the world and contains 1018 cases. For each of the images in the samples, four experienced chest radiologists performed a two-stage diagnosis. In the first stage, each radiologist independently diagnosed and marked the location of the lesions. In the subsequent second stage, each radiologist independently reviewed the other three radiologists' marks and gave his/her own final diagnosis. Such a two-stage process can verify all results as completely as possible without the interference of other radiologists. Therefore, the diagnosis results can be used as a gold standard reference. According to the diagnostic results of the corresponding lesions marked in the XML files, the CT images of 50 cases of sequential juxtapleural nodules were selected. In addition, some of the image datasets used in the experiment were collected from the positron emission tomography/computed tomography (PET/CT) detection center of a collaborating hospital. In the experiment, the imaging data of 47 patients in the hospital were selected. For each patient there were 299 serial lung CT images, and the sequential images of juxtapleural traction pulmonary nodules were selected for the experiment.
4.2. Analysis of the Experimental Results
In the experiment, 50 CT images were selected for training and 47 images were used to test the accuracy and time complexity of the algorithm. In order to verify the accuracy of the segmentation algorithm, we segmented the CT images of juxtapleural nodules with the proposed method and with the region-growing, watershed, and rolling-ball methods. Because multiple factors could interfere, the accuracy of segmentation was not judged during the experimental process itself. Instead, the results of the segmentation algorithms were compared with the comprehensive lung parenchyma area manually segmented by four experienced radiologists. In the course of the experiment, the manually segmented images in the medical records were the ultimate gold standard.
4.2.1. Qualitative Assessment
The CT sequence contained a large number of images. In this section, the same sequence with juxtapleural nodules was segmented by four kinds of algorithms, as shown in Figure 5. The columns show: (a) original CT image; (b) manual segmentation results obtained by a doctor; (c) segmentation results of the improved segmentation algorithm; (d) segmentation results of the improved region growth (RG) algorithm; (e) segmentation results of the regional growth and rolling-ball repair algorithm; (f) segmentation results of the proposed algorithm.
For the CT images with intact lung parenchyma, as shown in the first row and the third row of
Figure 5, the segmentation by all the algorithms could ensure the integrity of the segmentation. However, on the basis of the automated threshold segmentation CT sequence images, seed points could be selected manually, and the threshold could be passed, greatly improving the efficiency of segmentation. For the sequence images of lung parenchyma having juxtapleural nodules, as shown in the second row of
Figure 5, the RG and Watershed algorithms lost the juxtapleural nodules, and the repair results of the rolling-ball method lost some parts of the juxtapleural nodules. In contrast, our method could ensure the integrity of lung parenchyma segmentation (f). For the non-connected areas of lung parenchyma, the regional growth algorithm lost some parts of the lung parenchyma, while the other algorithms' segmentation results were more complete.
A sample could contain multiple nodules in the same CT slice. We chose the same methods as in
Figure 5 for an experimental comparison, and the results are shown in
Figure 6.
The results showed that the regional growth algorithm still lost some parts of the lung parenchyma in the non-connected areas. The comparison between column (e) and column (f) showed that the rolling-ball method provided good repair results only for a single large juxtapleural nodule, whereas the method described in this paper could repair multiple juxtapleural nodules at the same time and provided better results.
In order to verify the repair ability of our method for juxtapleural concave areas, we compared the proposed algorithm and the rolling-ball method for the same sequence of slices, as shown in
Figure 7. In the ball-rolling method, the radius of repair had an important effect on the result.
For the concave area caused by the juxtapleural nodules, if the radius of the rolling-ball method was too small, it caused under-segmentation. Because the size of a juxtapleural nodule is not known in advance, setting the radius manually was unreliable, and the segmentation results were unsatisfactory. In contrast,
Figure 7 shows the effectiveness of the algorithm proposed in this paper.
4.2.2. Quantitative Analysis
In this paper, the pixel accuracy, Intersection over Union, and time complexity were used as the quantitative metrics for the evaluation of segmentation results. In the following analysis, we assumed that the actual segmented image was P and the reference segmented image was G.
- 1.
Pixel accuracy
Pixel Accuracy (PA) is expressed as the ratio of correctly labeled pixels to total pixels. It is computed from the number of pixels with the correct marks and the number of pixels in the gold standard. The higher the pixel accuracy, the better the image segmentation; the formula is as follows:
According to formula (9), the higher the value of PA, the more overlapped the results and the gold standard. The PA curve for lung parenchymal image segmentation with juxtapleural nodules is shown in
Figure 8.
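A minimal sketch of the PA computation on binary masks is given below. Because formula (9) is not reproduced in the text, the sketch follows the verbal definition above and assumes that a correctly marked pixel is one that is foreground in both the segmentation P and the gold standard G, divided by the number of foreground pixels in G; the function name and the list-of-rows mask representation are hypothetical.

```python
def pixel_accuracy(pred, gold):
    """Assumed PA: foreground pixels shared by pred and gold, over gold foreground pixels."""
    correct = sum(
        1
        for pred_row, gold_row in zip(pred, gold)
        for p, g in zip(pred_row, gold_row)
        if p == 1 and g == 1
    )
    total = sum(1 for gold_row in gold for g in gold_row if g == 1)
    return correct / total if total else 0.0
```

For example, a segmentation that misses one of four gold-standard pixels yields a PA of 0.75. Under this definition a lost juxtapleural nodule lowers the numerator directly, which is consistent with the lower curves of the methods that fail to recover the nodules.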
It is clearly shown that both RG and Watershed had lower curves than the other methods, because both methods lost some juxtapleural nodules and therefore labeled fewer pixels correctly. The PA of the rolling-ball method was generally higher than that of these two algorithms. In a number of previous experiments, r = 65 was usually used. Compared with the algorithm described in this paper, however, its repair of the juxtapleural indentation was still deficient. The PA curve obtained with the method described in this paper indicates the most accurate segmentation results.
- 2.
Intersection over Union
Intersection over Union (IoU) is a ratio used to compare the similarity and diversity of sample sets; here it reflects the degree of coincidence between two segmented images. The higher the IoU, the better the image segmentation results. Assuming that the gold standard of the nodules in the image is G and the segmentation result of an automated method is P, the calculation of IoU is represented by formula (10).
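For reference, IoU on binary masks can be computed as in the sketch below, using the usual definition |P ∩ G| / |P ∪ G|. The function name, the list-of-rows mask representation, and the convention of returning 1.0 when both masks are empty are choices made for this example only.

```python
def iou(pred, gold):
    """Intersection over Union of two binary masks given as lists of rows."""
    inter = 0
    union = 0
    for pred_row, gold_row in zip(pred, gold):
        for p, g in zip(pred_row, gold_row):
            if p == 1 and g == 1:
                inter += 1
            if p == 1 or g == 1:
                union += 1
    # Two empty masks coincide completely (a convention chosen for this sketch).
    return inter / union if union else 1.0
```

Unlike PA as sketched earlier, IoU also penalizes pixels that the segmentation adds outside the gold standard, so over-repair of the boundary lowers the score as well as under-repair.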
From
Figure 9, it is evident that the IoU curves of the proposed algorithm and of the rolling-ball method were significantly higher than those of the RG and Watershed methods. Furthermore, the IoU curve of the algorithm described in this paper was slightly higher than that of the rolling-ball method, so the segmentation results of the proposed algorithm were the most similar to the gold standard results.
- 3.
Time complexity
The algorithm was also evaluated for time complexity. Taking the correlation of sequential images into account, we compared the shortest time, the longest time, and the average time consumed for segmenting the sequence of CT images. The results are shown in
Figure 10.
As shown in
Figure 10, the proposed method had the shortest processing time. The watershed algorithm had the longest processing time, and the processing time of the region-growing algorithm lay between the two. The average time for processing a single slice was 0.75 s in
Figure 10a and 0.63 s in
Figure 10b. These average times were significantly shorter than those of the other two methods. Moreover, the average processing time decreased as the number of images in the sequence increased. This demonstrated that the automated threshold transfer took the correlation of adjacent images into consideration, was able to find thresholds adapted to all the images, and sped up the segmentation of the image sequence.
In summary, the evaluation results of the four segmentation methods are compared in
Table 1.
The experimental results demonstrated that for the segmentation of juxtapleural nodules, both the method of this paper and the rolling-ball method were more efficient than the regional growth and watershed methods. In addition, the segmentation time of the method proposed in this paper was much lower than that of the other three methods.
5. Conclusions
Juxtapleural nodules have a very high probability of being malignant. Sequence image segmentation of the lung parenchyma with juxtapleural nodules is the basis for the subsequent pulmonary nodule segmentation and detection [36,37,38]. On the basis of fractal geometry and the improved convex hull algorithm, an automated segmentation algorithm for sequential CT images is proposed in this paper with the goal of retaining juxtapleural nodules in the segmented lung parenchyma. To verify the validity of the algorithm, 97 cases were selected from 5800 sequential CT images. The evaluation results were compared with the results of manual segmentation by professional radiologists, of region-growing and watershed segmentation, and of the rolling-ball repair algorithm. Through both a qualitative and a quantitative analysis of these comparisons, we can confirm that the method for lung parenchyma segmentation and repair proposed in this paper retained the juxtapleural nodules more completely and was faster than the other automated methods. The sequence segmentation results of our method are more accurate and, therefore, can be used as a strong reference by physicians for the diagnosis of lung cancer in clinical practice.
Nonetheless, there are some open issues requiring future work. This work only used traditional quantitative analysis to evaluate the experimental results and did not evaluate them in a Computer-Aided Diagnosis (CAD) system. In future work, the method of sequential image segmentation will be applied to a CAD system, and the possibility of applying it to segment other types of nodules will be explored.