Automated Opportunistic Trabecular Volumetric Bone Mineral Density Extraction Outperforms Manual Measurements for the Prediction of Vertebral Fractures in Routine CT
Round 1
Reviewer 1 Report
I read with interest this manuscript on the automated estimation of vBMD for the prediction of VB fractures using CT. The concept is interesting and the study well conducted. However, certain changes are needed prior to publication. Specific comments:
Abstract OK
Introduction OK
Materials and methods
1. I would like to congratulate the authors for the asynchronous calibration method and the correction for the contrast medium.
2. How big was the volumetric ROI that was placed manually? Did the authors ensure that the basivertebral foramen was outside the ROI?
3. How did you select the variables to be included in multivariable analysis?
4. Nonparametric statistical tests need to be used unless you demonstrate data normality.
Results
1. Goodness of fit measures need to be provided for the linear regression analyses.
2. Please provide P-value for the whole multivariable models.
3. Please provide scatter plots showing the correlation of the data.
4. R² values need to be provided in the tables. Please specify whether you used standardised or unstandardised B coefficients.
The quality of English is ok
Author Response
Materials and methods
Reviewer comment 2.1: I would like to congratulate the authors for the asynchronous calibration method and the correction for the contrast medium.
Author response 2.1: Thank you very much for this comment.
Author action 2.1: -
Reviewer comment 2.2: How big was the volumetric ROI that was placed manually? Did the authors ensure that the basivertebral foramen was outside the ROI?
Author response 2.2: Thank you for this comment. The ROI size was kept constant at 4.5 cm³ to ensure comparability. The basivertebral foramen lay outside the ROI: only the vertebral bodies were analyzed, and all posterior vertebral structures, including the basivertebral foramina, were excluded prior to the analysis.
Author action 2.2: We added this aspect to the Materials and Methods section:
- (Page 5, lines 33-36) “Manual extraction of HUs on the other hand was performed by placing a volumetric ROI of 4.5 cm³ in the anterior trabecular region of vertebrae L1 to L4 (Figure 3) and BMD values were calculated from these as described above.”
Reviewer comment 2.3: How did you select the variables to be included in multivariable analysis?
Author response 2.3: Thank you for this very important comment and for pointing out the missing information on variable selection. The multivariable regression analysis was adjusted for age and sex, as both are potential confounders of vBMD results: both affect BMD in general, as shown, for example, by Havill et al. (Havill LM, Mahaney MC, Binkley TL, Specker BL. Effects of genes, sex, age, and activity on BMC, bone size, and areal and volumetric BMD. J Bone Miner Res. 2007 May;22(5):737-46. doi: 10.1359/jbmr.070213. PMID: 17444815).
Author action 2.3: This aspect was added to the manuscript in the Statistical analysis section:
- (Page 7, lines 39-41) “Adjustment for age and sex was done as both parameters have been proven to potentially influence BMD results [36].”
Reviewer comment 2.4: Nonparametric statistical tests need to be used unless you demonstrate data normality.
Author response 2.4: Thank you very much for pointing this out. We have demonstrated data normality and referred to this in the Statistical analysis section.
Author action 2.4: The following text passage in the manuscript refers to this point:
- (Page 7, lines 19-20) “Shapiro-Wilk test was performed to test for normal distribution of the data.”
Results
Reviewer comment 2.5: Goodness of fit measures need to be provided for the linear regression analyses.
Author response 2.5: Thank you for this important comment. As suggested, we provided goodness of fit measures for the linear regression models (manual: R² = 0.27, p<0.001, Durbin-Watson statistic 2.06; automatic: R² = 0.25, p<0.001, Durbin-Watson statistic 2.07).
Author action 2.5: Goodness of fit measures were added to the manuscript in both the Statistical analysis and Results sections:
- (Page 7, lines 32-33) “Goodness of fit measures were applied for linear regression analysis including Durbin-Watson statistics.”
- (Page 9, lines 12-14 and 24-26) “*Goodness of fit measures were applied for both linear regression models (manual: R²=0.27, p<0.001; Durbin-Watson statistic 2.06; automatic: R²=0.25, p<0.001; Durbin-Watson statistic 2.07).”
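For readers less familiar with the two measures named in response 2.5, the following is a minimal sketch of how R² and the Durbin-Watson statistic are computed for a simple linear regression. It is not the authors' analysis code: the function names and the age/vBMD values are invented for illustration only. A Durbin-Watson value close to 2, such as the reported 2.06 and 2.07, indicates little first-order autocorrelation in the residuals.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(y, fitted):
    """Share of the variance in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def durbin_watson(residuals):
    """Sum of squared successive residual differences over sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical example data (NOT from the study): age in years, vBMD in mg/cm^3.
age = [52, 58, 61, 65, 69, 72, 76, 80]
vbmd = [142.0, 128.5, 131.0, 110.2, 104.8, 98.1, 101.5, 82.3]

a, b = fit_line(age, vbmd)
fitted = [a + b * xi for xi in age]
residuals = [yi - fi for yi, fi in zip(vbmd, fitted)]

r2 = r_squared(vbmd, fitted)
dw = durbin_watson(residuals)
print(f"R^2 = {r2:.2f}, Durbin-Watson = {dw:.2f}")
```

The Durbin-Watson statistic always lies between 0 and 4; values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation.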
Reviewer comment 2.6: Please provide P-value for the whole multivariable models.
Author response 2.6: Thank you very much for pointing this out. We have provided p-values in Tables 2 and 3 for all analyses. In addition, we also reported the p-values in the text.
Author action 2.6: We added p-values in the Results section in the text as well:
- (Page 8, lines 22-34) “Further, the incidental VF status at follow-up was significantly associated with automatically extracted vBMD assessment adjusting for age and sex for both single vertebral levels from L1 to L4 (L1: p<0.001, L2: p<0.001, L3: p=0.001, L4: p=0.002) as well as for combinations of consecutive vertebral body heights from L1-L2 to L3-L4 (L1-2: p<0.001, L2-3: p<0.001, L3-4: p<0.001) (Tables 2 and 3). The incidental VF status at follow-up was significantly associated with manual trabecular vBMD assessment adjusting for age and sex at single vertebral levels L1 and L2 (L1: p=0.006, L2: p=0.029) and at consecutive levels L1-L2 and L2-L3 (L1-2: p<0.001, L2-3: p=0.007), respectively, while there was neither a significant association found at single vertebral levels L3 and L4 nor at the combination of the BMD values from L3 to L4 (Tables 2 and 3).”
Reviewer comment 2.7: Please provide scatter plots showing the correlation of the data.
Author response 2.7: Thank you very much for this suggestion. We fully agree that a visualization of our study results would be helpful. However, scatter plots are suited to visualizing the correlation between two continuous variables; in our opinion, they are not appropriate for nominally scaled, dichotomous data such as our outcome (fracture status at follow-up: “yes” vs. “no”). We therefore decided to provide boxplots that visualize the manually and automatically extracted vBMD values in both study groups.
Author action 2.7: Boxplots showing the minimum, first quartile, median, third quartile, and maximum of the automatically and manually derived baseline vBMD values in the patient cohorts with and without incidental vertebral fractures at follow-up were provided:
- (Page 8, lines 7-13): “Figure 4”
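As an illustration of the five values a boxplot of this kind displays, the sketch below computes the minimum, first quartile, median, third quartile, and maximum for two groups. The vBMD values and group names are invented and are not study data; the quartile convention (Python's "inclusive" method) is also an assumption, since the response does not state which convention the authors' software used.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical baseline vBMD values in mg/cm^3 (NOT from the study).
groups = {
    "no fracture at follow-up": [118.4, 102.7, 131.9, 95.3, 110.8, 124.6, 99.1],
    "fracture at follow-up": [78.2, 91.5, 66.9, 84.0, 72.3, 88.7, 60.4],
}

summaries = {name: five_number_summary(vals) for name, vals in groups.items()}
for name, summary in summaries.items():
    print(name, [round(v, 1) for v in summary])
```

Different statistical packages use slightly different quartile definitions, so the box edges can differ marginally between tools for small samples.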
Reviewer comment 2.8: R² values need to be provided in the tables. Please specify whether you used standardised or unstandardised B coefficients.
Author response 2.8: Thank you for this important comment. We provided R² values in the manuscript (in accordance with comment 2.5). We used standardised beta coefficients, which we now specify in the manuscript (legends of Tables 2 and 3).
Author action 2.8: We addressed these aspects in the manuscript as follows:
- (Page 9, lines 9-11): “Numbers are given as standardised regression coefficients (β) and 95% confidence intervals (95%-CI).”
- (Page 9, lines 20-22): “Numbers are given as standardised regression coefficients (β) and 95% confidence intervals (95%-CI).”
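The distinction the reviewer raises between standardised and unstandardised coefficients can be sketched as follows for a single predictor. This is an illustration under stated assumptions, not the authors' model: the data and function names are invented, and the study's multivariable models include further covariates. The unstandardised slope B depends on the units of the predictor, whereas the standardised coefficient β = B · SD(x) / SD(y) does not.

```python
import statistics


def slope(x, y):
    """Unstandardised least-squares slope B of y regressed on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def standardised_beta(x, y):
    """Slope rescaled by the sample standard deviations: B * SD(x) / SD(y)."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data (NOT from the study): age and vBMD in mg/cm^3.
age_years = [52, 58, 61, 65, 69, 72, 76, 80]
vbmd = [142.0, 128.5, 131.0, 110.2, 104.8, 98.1, 101.5, 82.3]
age_months = [12 * a for a in age_years]  # same predictor, different unit

b_years = slope(age_years, vbmd)
b_months = slope(age_months, vbmd)
beta_years = standardised_beta(age_years, vbmd)
beta_months = standardised_beta(age_months, vbmd)

print(f"B per year = {b_years:.3f}, B per month = {b_months:.3f}")
print(f"standardised beta = {beta_years:.3f} (years), {beta_months:.3f} (months)")
```

In a model with several predictors, the same effect is usually obtained by z-scoring all variables before fitting, which is why stating the coefficient type in the table legend matters for interpretation.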
--------------------------------------------------------------------------------------------------
Reviewer 2 Report
Dear author,
Thank you for the opportunity to review this article.
It is a very interesting topic about using artificial intelligence in fracture prevention among osteoporotic patients.
The introduction should be expanded. Are there any other AI or machine learning approaches that have already been proven to bring value to fracture prediction?
Low bone mass is a common problem in the elderly, but during childhood, the risk of fractures is also increased. Recent evidence shows that from early childhood the risk of fractures is increased in low bone mass individuals, quantified by Vitamin D and Calcium levels, as shown here: https://www.mdpi.com/1660-4601/20/4/3300. Did you correlate low BMD with vitamin D and Calcium?
Materials and Methods: a Flow diagram would make it easy to follow.
Lines 87-88: Why did the patients undergo the first MDCT scans? It is hard to guess because you excluded neoplastic patients, trauma patients, and spinal surgeries... Is it a part of a screening? It is suggestive that the second CT was conducted because of the vertebral fracture. Please elaborate on the subject for ethical reasons.
Discussions: Although Discussions are focused on various methods of assessment for osteoporosis/low BMD, in my opinion, this section should start with a short general talk about osteoporosis, methods of assessment, and fracture risk among the elderly due to poor BMD.
Other degenerative conditions are also a consequence of low BMD, such as adult scoliosis or hyperkyphosis. Diseases of the spine occur from early childhood, with recent studies showing that low bone quality could be a factor in the development and progression of scoliosis, as shown here: https://www.mdpi.com/2227-9067/9/5/758. Thus, a link between early low bone mass and risk of fractures in the elderly could be pointed out.
Could it be also optimized for metabolic diseases among children? It may have a plethora of applications, such as monitoring the fracture risk among Lobstein patients in relation to medical treatment or to the need to conduct spinal fusion.
Conclusions should be delineated based on the results, they should not be the results. Please rephrase them.
Author Response
Reviewer comment 3.1: The introduction should be expanded. Are there any other AI or machine learning approaches that have already been proven to bring value to fracture prediction?
Author response 3.1: Thank you for your comment. We now report further machine-learning and other algorithm-based studies on osteoporotic fracture prediction in the Introduction. Other published approaches are mainly risk assessment tools that require existing data as input to calculate a future fracture risk, or they generate parameters other than BMD, for example through texture analysis or feature extraction from conventional X-ray images. Our approach is the first to compare algorithm-generated vBMD from CT images with manual opportunistic vBMD measures for the prediction of future incidental vertebral fractures.
Author action 3.1: The following text passage was added to the manuscript:
- (Page 3, lines 3-12) “Previously published studies investigating machine learning- and CNN- approaches to predict future osteoporotic fractures are mainly risk assessment tools requiring the input of existing clinical examination data, such as aBMD derived from DXA [24-26]. Other approaches focused on machine learning combined with texture analysis of vertebrae, not taking BMD into consideration [27], or performed feature extraction by a deep-learning algorithm from lateral spine radiographs, but using non-CNN generated aBMD from DXA-scans [28].”
Reviewer comment 3.2: Low bone mass is a common problem in the elderly, but during childhood, the risk of fractures is also increased. Recent evidence shows that from early childhood the risk of fractures is increased in low bone mass individuals, quantified by Vitamin D and Calcium levels, as shown here: https://www.mdpi.com/1660-4601/20/4/3300. Did you correlate low BMD with vitamin D and Calcium?
Author response 3.2: Thank you for this very important point. Unfortunately, we could not correlate serum Vitamin D, Calcium levels or other laboratory parameters related to the bone metabolism with our vBMD measures due to the retrospective design of the study. Furthermore, as the indication for CT scanning was not BMD assessment, most patients did not undergo special blood testing for evaluation of bone health. We fully agree with the reviewer that this analysis is of great importance and that this should be investigated in future prospective studies. We have added this to the limitations section.
Author action 3.2: The following changes were made in the manuscript:
- (Page 12, lines 26-33) “Therefore, we could not correlate laboratory parameters such as Vitamin D or serum Calcium levels with the vBMD measures due to lacking patient blood samples, even though low bone mass quantified by Vitamin D and Calcium showed an increased low-energy trauma fracture risk in a pediatric population [55]. Thus, future studies need to be performed, that correlate automated vBMD measures with serum Vitamin D and Calcium levels in an adult study cohort.”
Reviewer comment 3.3: Materials and Methods: a Flow diagram would make it easy to follow.
Author response 3.3: Thank you for this suggestion.
Author action 3.3: We designed a flow diagram that visualizes the workflow of automated vBMD extraction from a routine MDCT scan, including the steps of automated spine recognition, labelling, segmentation, and erosion of the vertebral segments not needed for vBMD extraction:
- (Page 3, lines 19-30) “Figure 1: Flowchart illustrating the study’s workflow with regard to automated spine processing and vBMD extraction. Clinical routine MDCT scans of the thoracolumbar spine were retrospectively identified by using criteria for inclusion as described in the section 2.1 Subjects (1. a). Segmentation and labelling of vertebrae were performed using an CNN-based automated pipeline (https://anduin.bonescreen.de) (2. b, c). Segmentation masks of vertebral bodies were eroded by the cortical bone as well as posterior vertebral elements were removed using affine and deformable transformations (3. d). vBMD measures were extracted using asynchronous calibration and correction for contrast medium, if applicable (4.).”
Reviewer comment 3.4: Lines 87-88: Why did the patients undergo the first MDCT scans? It is hard to guess because you excluded neoplastic patients, trauma patients, and spinal surgeries... Is it a part of a screening? It is suggestive that the second CT was conducted because of the vertebral fracture. Please elaborate on the subject for ethical reasons.
Author response 3.4: The patients underwent the MDCT scans for indications other than osteoporosis screening; most were tumor staging scans. The inclusion criterion for the fracture patients was a history of two routine CTs (“baseline” and “follow-up”) showing the thoracolumbar spine, of which the more recent scan (“follow-up”) showed a new incidental vertebral fracture. Exclusion criteria were a history of vertebral metastasis or hematologic disorder, traumatic spine injury, a history of an osteoporotic fracture in a different region (e.g., hip), or previous spinal surgery.
Author action 3.4: Thank you for pointing out that this passage was misleading. We have rephrased this section to clarify the indications for the CT scans:
- (Page 4, lines 11-16) “Baseline and follow-up scans were routine CTs with other indication than to screen for osteoporosis, of which most were staging CTs in tumor patients. Exclusion criteria were a history of vertebral metastasis or hematologic disorder, traumatic spine injury, a history of an osteoporotic fracture in a different region (e.g., hip), or previous spinal surgery.”
Reviewer comment 3.5: Discussions: Although Discussions are focused on various methods of assessment for osteoporosis/low BMD, in my opinion, this section should start with a short general talk about osteoporosis, methods of assessment, and fracture risk among the elderly due to poor BMD.
Author response 3.5: Thank you for this suggestion. We agree with the reviewer and have added this information accordingly.
Author action 3.5: A general discussion about osteoporosis was added to the discussion section, highlighting prevalence in the elderly, risk of fractures and their consequences and current diagnostic tools:
- (Page 9, lines 29-32, page 10, lines 1-16) “Osteoporosis is a highly prevalent disorder in the elderly characterized by decreased BMD and microarchitectural deterioration of the bone [1,36]. This leads to an increased risk of low-energy trauma fractures, of which vertebral fractures are the most common [3]. Incidental osteoporotic fractures cause a significant decrease in quality of life [4], increased risk of future fractures [6] and bring burden to the socioeconomic system [37]. As an underdiagnosed condition [9,10], diagnostic tools for an early discovery of patients at risk for an incidental osteoporotic fracture are important. The Gold-Standard tool for the diagnosis of osteoporosis is DXA, measuring aBMD [11]. This highlights a major problem of DXA, measuring areal but not volumetric BMD, which is easily contaminated by intra- and extra-osseous soft tissue effects [13]. Avoiding this, qCT is a different, more accurate method for osteoporosis assessment, measuring vBMD from dedicated CT-scans performed to assess BMD, but has the disadvantage of additional radiation and costs [38]. Therefore, we focused on using opportunistically generated vBMD by a CNN-based approach, causing no additional radiation to the patient and minimizing the costs, for future vertebral fracture prediction.”
Reviewer comment 3.6: Other degenerative conditions are also a consequence of low BMD, such as adult scoliosis or hyperkyphosis. Diseases of the spine occur from early childhood, with recent studies showing that low bone quality could be a factor in the development and progression of scoliosis, as shown here: https://www.mdpi.com/2227-9067/9/5/758. Thus, a link between early low bone mass and risk of fractures in the elderly could be pointed out.
Author response 3.6: Detecting low bone quality in scoliosis patients earlier by means of automated opportunistic bone mineral density measurements could be valuable, as it could enable earlier treatment and potentially improve the individual patient’s outcome. To date, it has not been investigated how well automated vBMD measures work in scoliosis or how a potential outcome could be predicted. Future research could investigate this topic further.
Author action 3.6: The literature was mentioned and put into context with our research:
- (Page 12, lines 15-24) “In a future perspective, automated opportunistic vBMD extraction could be added to more routine MDCT scans in groups with increased fracture risk, also pediatric cohorts who receive regular MDCT-scans, to timely initiate medical or surgical treatment if needed. Especially due to the correlation of low bone quality in early childhood and the increased risk for the development and progression of scoliosis [54], early automated opportunistic detection of low vBMD could be helpful to diagnose and therefore treat such conditions earlier. This is suggestive to potentially prevent future fractures and has to be further investigated.”
Reviewer comment 3.7: Could it be also optimized for metabolic diseases among children? It may have a plethora of applications, such as monitoring the fracture risk among Lobstein patients in relation to medical treatment or to the need to conduct spinal fusion.
Author response 3.7: This is an interesting point. In general, indications of automated opportunistic bone mineral density measures could be widened in the future, for example for children with increased fracture risk and especially children who receive regular routine MDCT-scans, to check on bone mineral density and bone quality without exposure to additional radiation. As our approach aims to measure opportunistic vBMD, it is a prerequisite that the patient receives an MDCT-scan mainly for another indication than to screen for low vBMD.
Author action 3.7: The following adaptation was made to the manuscript:
- (Page 12, lines 15-24) “In a future perspective, automated opportunistic vBMD extraction could be added to more routine MDCT scans in groups with increased fracture risk, also pediatric cohorts who receive regular MDCT-scans, to timely initiate medical or surgical treatment if needed. Especially due to the correlation of low bone quality in early childhood and the increased risk for the development and progression of scoliosis [54], early automated opportunistic detection of low vBMD could be helpful to diagnose and therefore treat such conditions earlier. This is suggestive to potentially prevent future fractures and needs to be further investigated.”
Reviewer comment 3.8: Conclusions should be delineated based on the results; they should not be the results. Please rephrase them.
Author response 3.8: Thank you for this comment.
Author action 3.8: We rephrased our conclusions, highlighting the potential benefits for clinical practice, based on our results. These are mainly the potential identification of individuals at high fracture risk, without exposing the patient to additional radiation and minimizing costs for examination:
- (Page 12, lines 41-48) “This study shows a significant association between automated CNN-based measurements of opportunistic trabecular vBMD and incidental VFs at the thoracolumbar spine at 1.5-year follow-up, with potential superiority to manual vBMD measures. Therefore, automated vBMD measurements could be highly advantageous in clinical practice for identifying individuals at a high risk of an incidental vertebral fracture, without additional costs and radiation exposure.”
==========================================================
Round 2
Reviewer 2 Report
The paper is ready to be published.