Point-Sampling Method Based on 3D U-Net Architecture to Reduce the Influence of False Positive and Solve Boundary Blur Problem in 3D CT Image Segmentation
Round 1
Reviewer 1 Report
This manuscript describes a novel addition to the ADU-Net to reduce false positive segmentations and improve overall organ segmentation accuracy. The manuscript is well organized and well-written. I only have a few small suggestions:
- "Organ lesion is a huge threat to human health with a high mortality rate" -- I think there was a translation error here. The term "Organ lesion" used here odd/awkward. I think perhaps the authors meant "cancer", "malignant lesions", possibly "solid lesions" (organs is generally implied)
- "Even small boundary error is very likely to cause a larger medical accident. " This seems to be hyperbole, and 'accident' is usually more associated with mechanical or operational failure and not an incorrect diagnosis I would suggest something like "Even a small boundary error can lead to misdiagnosis". Also the authors may wish to modify this sentence to emphasize the effect of the false positive behaviours of the prior works (see lines 57-59) and (65-68) in which the sentence could be amended to something like "Even a small boundary error can lead to misdiagnosis (false positive), lead to further work up or unnecessary procedures"
- What about the danger of false negatives (undersegmentation) for these methods? Is this not an issue as well?
- Fig 2. Given the complex detail in this figure, it would be worthwhile to notify the reader at Line 118: "…Figure 2. Details on the symbols and notation in Figure 2 are discussed in the next sections." Otherwise, Figure 2 is very hard to understand when first encountered.
- Fig 2/3: Why is X1_0 to X0_1 upsampling (green arrow) while all the rest are orange arrows (point sampling)? This seems to break the symmetry somewhat -- why couldn't it also be point sampling? The authors should state the reason for this difference.
- Line 227 - change "Y is the ground true" to "Y is the ground truth"; also correct this throughout the text and in Figure 7 onwards.
- Figure 7 - the images really aren't large enough to make any real comparison. Assuming a full-width figure is OK for the journal, I would suggest moving the legend to the bottom of the figure and greatly enlarging the images (making maximum use of the space). Also, it may be worthwhile to overlap the Truth and Output Images or show a Difference image instead of the Truth Image (the truth could be shown in a corner -- it's the same for each row and does not need to be repeated). Having good images helps a good imaging paper!
- Line 273: "We observe that the feature represented…" Given that Figure 7 is so small, we can't observe that. When the figure is enlarged that may be possible for the reader. However, if the discrepancies are still subtle in Figure 7, an additional Figure may be necessary (zoomed in on the border of the segmentation perhaps?) to illustrate the limitation of the baseline UNet.
- Figure 8: "Ground Truth" - Again the figures are too small - suggest changing the figure to have two or three rows of larger images for each example. Similarly for Figures 9-10
- Is the HD measured in mm, taking the voxel sizes (FOV, slice spacing) into account? Please indicate the units. If not in mm, please indicate whether there are bias/domain-adaptation issues when the method is used on different CT data sets.
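For reference, a minimal sketch of the computation being asked about, i.e. reporting the Hausdorff distance in millimetres by scaling voxel coordinates with the physical spacing (an illustrative NumPy/SciPy example, not the authors' code; the function name and spacing values are hypothetical):

```python
import numpy as np
from scipy.spatial import cKDTree

def hausdorff_mm(mask_pred, mask_true, spacing):
    """Symmetric Hausdorff distance in mm between two binary 3D masks.

    Voxel indices are converted to physical coordinates via the
    (z, y, x) spacing before distances are taken, so anisotropic
    slice spacing and in-plane resolution are both accounted for.
    """
    pts_pred = np.argwhere(mask_pred) * np.asarray(spacing, dtype=float)
    pts_true = np.argwhere(mask_true) * np.asarray(spacing, dtype=float)
    # Directed distances: each foreground point to its nearest
    # neighbour in the other set; HD is the max over both directions.
    d_pred_to_true = cKDTree(pts_true).query(pts_pred)[0]
    d_true_to_pred = cKDTree(pts_pred).query(pts_true)[0]
    return max(d_pred_to_true.max(), d_true_to_pred.max())

# Example: CT volume with 2.5 mm slice spacing and 0.8 mm in-plane pixels.
# hd = hausdorff_mm(pred, gt, spacing=(2.5, 0.8, 0.8))
```

If the paper instead computes HD on raw voxel indices, the value is unitless and depends on the acquisition protocol, which is exactly the domain-adaptation concern raised above.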
Author Response
To start with, we would like to sincerely THANK the reviewers for their time, efforts and valuable comments on the manuscript. We have carefully revised the manuscript in accordance with two reviewers’ comments.
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
This paper modifies the authors' attention nested U-Net architecture to improve 3D segmentation of organs in CT images by reducing the false positive phenomenon and the boundary blur problem. Overall, the research was designed well and the results look promising. However, since various features were added, the processing time required to run the suggested model with these additional features should be addressed.
Minor comments:
- p4: five methodologies are introduced, but on p5 only four points are addressed. What are the five methodologies?
- p6: this needs additional calculation. How much did the computational complexity increase?
- p7: FPC is shown in Figure 6 -> Figure 5? deep supervision of Figure ?? -> Figure 5?
- Deep supervision is mentioned in Figs. 3 and 5, but no detail was found. It was explained in the authors' previous work, but is it the same? Please elaborate.
- p10: the formulas for Dice (Eq. 9) and F1-score (Eq. 11) are the same. What is the purpose of calculating the same thing? It would be better to exclude one (see the identity sketched after this list).
- p11: MRI -> CT?
- Table 2, Render 3D U-Net, DS x, FPC x -> why are the Dice and F1-score values different?
- Fig. 8, ground true -> ground truth
- Table 3, Attention 3D r2UNet, DS x, FPC x -> why are the Dice and F1-score values different?
- Fig. 9, ground true -> ground truth
- To prove the improvement from each suggested part, DS and FPC were compared against other models. However, it seems the new hybrid loss function was not compared with and without it. Can you also show the improvement from the loss function?
- As mentioned before, the increased processing time required to run the suggested model with the additional features should also be addressed.
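As background for the Dice/F1 comment above: for binary segmentation the two metrics are algebraically identical, which supports dropping one of the two equations. Writing X for the predicted mask and Y for the ground truth,

$$
\mathrm{Dice}(X,Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}
  = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}
  = \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
  = F_1 ,
$$

since $|X \cap Y| = \mathrm{TP}$, $|X| = \mathrm{TP} + \mathrm{FP}$, and $|Y| = \mathrm{TP} + \mathrm{FN}$.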
Author Response
To start with, we would like to sincerely THANK the reviewers for their time, efforts and valuable comments on the manuscript. We have carefully revised the manuscript in accordance with two reviewers’ comments.
Please see the attachment.
Author Response File: Author Response.pdf