Analysis of 2D and 3D Convolution Models for Volumetric Segmentation of the Human Hippocampus
Round 1
Reviewer 1 Report
Magnetic resonance imaging (MRI) provides the spatial resolution necessary to visualize brain structures, and it is sensitive enough to detect early changes in the human brain. During aging or in neurodegenerative conditions, many pathological changes occur, and MRI provides excellent detail for assessing the patient.
Accurate automated segmentation is difficult to achieve because of the size and density of the tissue. Machine learning approaches have been developed to meet this challenge. The present manuscript is well written and well planned. The authors also list the limitations of this study, which is understandable, as no single system is perfect.
Q1- There are anatomical differences between the right and left hippocampus. However, the algorithm was supposed to work similarly as it detected the left hippocampus.
Q2- Deep brain areas are harder to segment than cortical areas. Did the authors try to segment smaller areas like the dentate gyrus?
Author Response
Response to Reviewer 1 Comments
Submission Manuscript ID: BDCC-2311479: ANALYSIS OF 2D AND 3D-CONVOLUTION MODELS FOR VOLUMETRIC SEGMENTATION OF THE HUMAN HIPPOCAMPUS
We thank the reviewer for their constructive comments and are grateful to the editors for the opportunity to respond and revise our manuscript. Please find below our point-by-point responses to the comments and a revised manuscript.
Magnetic resonance imaging (MRI) provides the spatial resolution necessary to visualize brain structures, and it is sensitive enough to detect early changes in the human brain. During aging or in neurodegenerative conditions, many pathological changes occur, and MRI provides excellent detail for assessing the patient.
Accurate automated segmentation is difficult to achieve because of the size and density of the tissue. Machine learning approaches have been developed to meet this challenge. The present manuscript is well written and well planned. The authors also list the limitations of this study, which is understandable, as no single system is perfect.
Point 1: There are anatomical differences between the right and left hippocampus. However, the algorithm was supposed to work similarly as it detected the left hippocampus.
Response 1: We thank the reviewer for this comment.
We are not certain that we have understood this comment. For each architecture (U-Net, 2D U-Seg-Net, etc.), there are indeed two separate models, one for the left hippocampus and one for the right (e.g. one 3D U-Net for the left and one 3D U-Net for the right). This is because we recognise the anatomical differences between the left and right hippocampi and expect that a single model used to segment both would probably not give good results.
Point 2: Deep brain areas are harder to segment than cortical areas. Did the authors try to segment smaller areas like the dentate gyrus?
Response 2: We thank the reviewer for the above comment.
No, we did not try to segment smaller areas like the dentate gyrus, as the dataset did not have any labels for these sub-regions. The machine learning methods therefore have no way of “learning” them. We searched online for open-source medical images of the human brain with labels for these sub-regions of the hippocampus, but such datasets are extremely rare and contain few samples, whereas machine learning algorithms need many samples to learn.
Reviewer 2 Report
The authors present an analysis of 2D and 3D convolution models for volumetric segmentation of the human hippocampus. They compare their model, “2D Ensemble U-Seg Net”, with 3D UNet in terms of DSC and training time, and show that their model performs well. Overall, the paper is well written, but it needs some improvements that would help the reader understand it better.
1) 2D and 3D convolution models have been used in various medical image segmentation/classification applications. You need to include some other studies too, such as Sclera-Net for sclera segmentation, Cardio-Net for anatomical regions segmentation and DMFL_Net for classification of multiple chest diseases.
2) Since you discuss the limitations of the HarP dataset, why did you choose only this dataset? It would be better to test on other datasets to make your model more reliable.
3) The quality of the figures is very low. The text in the figures is blurry and not readable. Please recheck and fix it.
4) You compare your model with the 3D UNet model only. Does 3D UNet outperform all other models? If not, it would be better to include other models in your study and compare them with your model.
5) Also include some good and bad cases of segmentation with your model.
Author Response
Response to Reviewer 2 Comments
Submission Manuscript ID: BDCC-2311479: ANALYSIS OF 2D AND 3D-CONVOLUTION MODELS FOR VOLUMETRIC SEGMENTATION OF THE HUMAN HIPPOCAMPUS
We thank the reviewer for their constructive comments and are grateful to the editors for the opportunity to respond and revise our manuscript. Please find below our point-by-point responses to the comments and a revised manuscript.
REVIEWER 2:
The authors present an analysis of 2D and 3D convolution models for volumetric segmentation of the human hippocampus. They compare their model, “2D Ensemble U-Seg Net”, with 3D UNet in terms of DSC and training time, and show that their model performs well. Overall, the paper is well written, but it needs some improvements that would help the reader understand it better.
Questions:
1) 2D and 3D convolution models have been used in various medical image segmentation/classification applications. You need to include some other studies too, such as Sclera-Net for sclera segmentation, Cardio-Net for anatomical regions segmentation and DMFL_Net for classification of multiple chest diseases.
Response: We thank the reviewer for the above comment.
We are aware that 2D and 3D convolution models have been used for numerous other medical image segmentation and classification tasks. However, we are not sure why studies of those methods should be included in our paper, given that this study specifically targets segmentation of the human hippocampus. Additionally, the nature of the datasets used in the other medical segmentation cases (Sclera-Net, Cardio-Net) is quite different from that of hippocampus segmentation. They mostly deal with 2D X-rays or other 2D images, whereas hippocampus segmentation requires analysing volumetric 3D MRI scans. A comparison of 2D and 3D convolution models across all possible medical segmentation cases would require a substantial amount of time and, we believe, would be a topic spanning multiple papers/studies.
2) Since you discuss the limitations of the HarP dataset, why did you choose only this dataset? It would be better to test on other datasets to make your model more reliable.
Response: We thank reviewer #2 for the above comment.
The HarP dataset was the only “publicly” available dataset (subject to approval from ADNI/LONI) to which access was not too restricted. This dataset underwent stringent checks to ensure that the labels are consistent and adhere to clinical protocols for hippocampus segmentation. We tried very hard to find other datasets but, as with most medical data, brain MRI data is extremely hard to obtain online, since most medical institutions do not share it publicly. We did consider asking the National University Hospital for brain MRI data and hippocampus labels, but it takes about six months to process such a request and approve the data for research. Further, since the best hippocampus segmentations often come from manual labelling, we would have had to wait at least another six months to find a clinician to label the data (for free) for us. In addition, we had limited time for this study: only eight months to collect, pre-process and analyse the brain MRI data.
3) The quality of the figures is very low. The text in the figures is blurry and not readable. Please recheck and fix it.
Response: We thank the reviewer for raising the concern about the quality of the figures and the text in them.
We have checked the manuscript and found all images/figures to be of good quality; the images and text appear clear to us. Nevertheless, we have replaced the figures, although they look the same on our side. Note that this issue was not raised by the first reviewer. It may be a hardware/resolution issue on the reviewer’s computer.
4) You compare your model with the 3D UNet model only. Does 3D UNet outperform all other models? If not, it would be better to include other models in your study and compare them with your model.
Response: We thank the reviewer for this comment.
The models we compare were taken from previous studies on deep learning for hippocampus segmentation. Rather than including more models, we have added explanations of why these two models were chosen (see lines 232 to 239 in our revised manuscript). We would also like to reiterate that the purpose of this study was not to find or devise the “best” hippocampus segmentation model. (We believe there are thousands of ways of achieving hippocampus segmentation, and almost every paper on the topic is hyper-focused on this, whether it uses 2D or 3D convolutions.) Instead, this study set out to determine whether there are any empirical differences between 2D and 3D convolutions for hippocampus segmentation and, if so, to provide future researchers with some evidence for choosing ensemble 2D convolution models over the computationally expensive 3D convolution models.
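To illustrate why 3D convolutions are generally regarded as more computationally expensive than 2D ones, the following is a minimal sketch that counts the weights in a single convolution layer. The channel counts and the 3-wide kernel are hypothetical values chosen for illustration; they are not taken from the manuscript or from either model discussed here.

```python
def conv_weights(kernel_dims, in_channels, out_channels):
    """Number of weights (biases excluded) in one convolution layer."""
    n = in_channels * out_channels
    for k in kernel_dims:
        n *= k
    return n


# Hypothetical layer: 32 input channels, 64 output channels, kernel width 3.
weights_2d = conv_weights((3, 3), 32, 64)      # 3*3*32*64
weights_3d = conv_weights((3, 3, 3), 32, 64)   # 3*3*3*32*64

# With the same channel counts, the 3D layer has k times the weights
# of the 2D layer, where k is the kernel width (here 3).
ratio = weights_3d / weights_2d

print(weights_2d, weights_3d, ratio)
```

The same factor applies to the multiply-accumulate count per output voxel, so under these assumptions a 3D layer costs roughly three times as much as a 2D layer applied slice by slice over the same volume. Actual training time also depends on memory use, batch size and architecture, which this sketch does not model.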
5) Also include some good and bad cases of segmentation with your model.
Response: We thank the reviewer for this comment.
We are afraid we are unable to do so, as we no longer have the resources (GPUs) to run the models and extract cases of good and bad segmentation. However, we have already included figures that compare the segmentations produced by EnsembleUSegNet and 3D U-Net with the ground truth. We hope this will be sufficient.
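For readers unfamiliar with the DSC metric mentioned in the reviewer's report, agreement between a predicted segmentation and the ground truth is commonly summarised with the Dice similarity coefficient. The following is a minimal sketch on flattened binary masks; the function name and the toy masks are illustrative assumptions and do not come from the manuscript's code or data.

```python
def dice_coefficient(pred, truth):
    """DSC = 2 * |P and T| / (|P| + |T|) for flat binary masks (0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: each mask marks 3 voxels and they overlap on 2 of them.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(pred, truth)  # 2*2 / (3+3)
print(round(score, 3))
```

A per-scan score of this kind could be used to rank test cases and pick out good and bad segmentations, provided the trained models' predictions are still available.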
Round 2
Reviewer 2 Report
Most of my comments are addressed. I recommend acceptance of this article.