1. Introduction
The human brain develops during growth and atrophies during aging, which results in structural changes in the brain [1,2]. These structural changes can be confirmed visually on medical images. In particular, magnetic resonance (MR) imaging (MRI) of the brain, which is noninvasive and radiation-free, is a useful method for confirming them. Because brain structure changes with age, it can be used to estimate a person's age, and this type of age estimation technique can be applied to two major fields: forensic medicine and the early detection of neurodegenerative diseases of the brain.
In forensic science, both living and deceased subjects can be targeted [3,4]. For living subjects, the technique is expected to be applied in situations where identification is required, such as asylum applications, criminal lawsuits, and youth sports [3], whereas for deceased subjects, it is expected to be applied to the identification of unidentified bodies [4]. Because previous studies have estimated age from other parts of the body, such as the teeth, clavicles, hands, and knees, combining the present method with these techniques is expected to improve the reliability of age estimation. The application to the early detection of neurodegenerative diseases of the brain is based on the facts that brain morphology changes with aging (atrophy due to a linear decrease in gray matter volume) and that these diseases cause brain atrophy greater than that of normal aging [5]. Since patients with neurodegenerative diseases develop brain atrophy caused by the systematic loss of nerve cells in the central nervous system, such changes can be confirmed on MRI and computed tomography (CT) images [6,7]. Therefore, patients with neurodegenerative diseases are assumed to show a larger discrepancy between their chronological age and the age estimated from their brain structure, and this discrepancy can be used for early detection. Furthermore, considering the global aging of the population and the fact that aging is considered a risk factor for neurodegenerative diseases, this type of technology is expected to be increasingly utilized in the future [8,9]. Early detection is highly significant because these diseases affect not only the patients themselves but also the quality of life of their families [10,11].
Regarding related work on age estimation methodology, one investigation aimed at improving the identification of age-relevant developmental disorders combined an X-ray image of the left hand with a dental examination that recorded dental status and included X-ray images of the dentition [12]; however, when skeletal development of the hand was complete, an additional X-ray examination of the clavicles was required. These results indicate that it is difficult to estimate age using the same physical indicator for all subjects. In addition, another report [13] points out a problem with the use of X-rays in age estimation because of the radiation exposure involved; as the International Atomic Energy Agency (IAEA) regulates the use and possible abuse of X-rays, a noninvasive method is desirable. Moreover, the usefulness of MRI-based methods as an alternative to age estimation methods that involve radiation exposure has been shown [14], and the degree to which imaging time can be reduced in age estimation using MR images of the hand and wrist joint has been examined. However, that method has the disadvantage that it is not applicable to a wide range of age groups because the number of subjects is small and their age range is limited. In summary, these reports indicate that a noninvasive age estimation method based on a single body part is required.
Recently, deep learning technology has continued to develop and has been applied to several medical image analyses [15,16,17,18,19,20], including, besides classification [21], semantic segmentation [22,23] and object detection [24,25] using convolutional neural networks (CNNs), as well as regression tasks [26,27]. This study attempted to estimate human age from brain MR images using regression analysis. Although medical methods are more useful than nonmedical methods for human age estimation [28] and age estimation methods using brain MR images via deep learning [29,30,31,32] have already been reported, no study on young subjects, such as minors, has been reported. Moreover, previous studies reported that sagittal images should be used for age estimation with two-dimensional (2D) brain MR images [32] and that central regions of the brain are better target regions than peripheral regions [31]. Therefore, this study aimed to examine the accuracy of age estimation from brain MR images using a 2D CNN, a type of deep learning technique, over a wider age range than in previous studies, from infants to elderly individuals.
2. Materials and Methods
2.1. Subjects
Images from two public databases were used. One is the Decoded Neurofeedback (DecNef) Project Brain Data Repository (https://bicr-resource.atr.jp/srpbsopen/ (accessed on 28 December 2022)), which is part of the Japanese Strategic Research Program of Brain Sciences (SRPBS) supported by the Japanese Advanced Research and Development Programs for Medical Innovation (AMED) [33]. The other is the “Healthy Brain Network” published by the Child Mind Institute (http://fcon_1000.projects.nitrc.org/indi/cmi_healthy_brain_network/ (accessed on 28 December 2022)) [34]. From these databases, this study obtained T1-weighted images of 1000 subjects (men = 500, women = 500) aged 5–79 years (Table 1). Because the images were sourced from multiple centers, their width and height ranged from 180 to 320 mm and from 138 to 320 mm, respectively. While equal numbers of male and female subjects were collected, care was taken to keep the distribution of the number of persons by age as uniform as possible.
2.2. Preprocessing
Since all data acquired in this study were in the Neuroimaging Informatics Technology Initiative (NIfTI) format and this study targets 2D images (sagittal plane), the data were converted to Joint Photographic Experts Group (JPEG) format. The accuracy of age estimation using 2D brain MR images is higher when the central region of the brain is targeted than when the peripheral regions are targeted. Therefore, we used a total of 11 images from each subject: the central image of the series obtained after conversion from NIfTI to JPEG format, plus the five images before and the five images after it.
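The slice selection described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `central_slice_indices` is hypothetical, indices are 0-based, and taking the integer midpoint as the "central" image for an even number of slices is an assumption, since the text does not state how that case was handled. Loading the NIfTI volume and choosing the sagittal axis are deliberately left out.

```python
def central_slice_indices(n_slices, half_width=5):
    """Return the indices of the central slice and `half_width` slices on each side.

    With the default half_width of 5 this yields 11 indices, matching the
    11 images per subject described in the text.
    """
    if n_slices < 2 * half_width + 1:
        raise ValueError("the series has too few slices for the requested window")
    # Assumption: the central image is the integer midpoint of the series.
    center = n_slices // 2
    return list(range(center - half_width, center + half_width + 1))


# Example: a series of 176 sagittal slices (an illustrative number).
indices = central_slice_indices(176)
print(len(indices), indices[0], indices[-1])
```

For a series of 176 slices, the sketch selects indices 83 to 93, that is, the midpoint 88 and five slices on either side.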
Since the lower limit of the subject age was set lower than that of previous studies, the possibility cannot be ruled out that the age estimation in the present study relies on the size of the cranium and brain rather than on structural differences of the brain. Because the proportion of the image occupied by the brain can be changed through the field of view (FOV) setting and other factors, the apparent size of the head in MR images tends to be similar to some extent. However, because head size is not standardized uniformly across images, the possibility that the age estimation relies only on the size of the cranium and brain cannot be eliminated. To exclude the influence of head size, data augmentation by rescaling the brain size was performed. Based on the measured head dimensions (Figure 1 and Figure 2), and because magnification would cut off part of the brain from the image (Figure 3), scaling factors of 0.8, 0.9, and 1.0 were used for data augmentation in this study. The measured angles of the subcallosal line (the line connecting the inferior surface of the genu of the corpus callosum to the inferior surface of the splenium of the corpus callosum) showed that the variation in head angle was normally distributed (mean = −1.0°). Therefore, data augmentation by rotation was not conducted; only data augmentation by scaling down or keeping the original size was conducted.
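The rescaling augmentation with factors 0.8, 0.9, and 1.0 can be illustrated with the following sketch. Several details are assumptions that the text does not specify: the shrunken image is pasted in the center of a zero-valued canvas of the original size, and simple index-based (nearest-neighbour style) sampling is used for resizing. The names `rescale_on_canvas` and `augment_by_scaling` are hypothetical.

```python
import numpy as np

SCALING_FACTORS = (0.8, 0.9, 1.0)  # factors stated in the text


def rescale_on_canvas(image, factor):
    """Shrink a 2D image by `factor` and center it on a zero canvas of the same size."""
    if not 0 < factor <= 1:
        raise ValueError("only scaling down or keeping the size is supported")
    height, width = image.shape
    new_h = max(1, round(height * factor))
    new_w = max(1, round(width * factor))
    # Map each output pixel back to a source pixel (index-based sampling).
    rows = np.minimum((np.arange(new_h) / factor).astype(int), height - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), width - 1)
    small = image[np.ix_(rows, cols)]
    # Assumption: the background is filled with zeros and the brain is centered.
    canvas = np.zeros_like(image)
    top, left = (height - new_h) // 2, (width - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = small
    return canvas


def augment_by_scaling(image):
    """Return one rescaled copy of `image` per scaling factor."""
    return [rescale_on_canvas(image, f) for f in SCALING_FACTORS]
```

Because every factor is at most 1.0, no part of the brain can leave the image, which is consistent with the stated reason for not magnifying.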
2.3. Architecture and Training
To construct a regression model for estimating chronological age (CA) from brain MR images, we developed a regression CNN based on ResNet-50 [35]. In the original ResNet-50, the softmax layer converts the network output into values ranging from 0 to 1, and the classification layer gives the final classification result. In this study, however, the output is the estimated age, so the regression model was created by replacing these layers with a regression layer, which outputs the estimated value before it would be converted to a value between 0 and 1 by the softmax layer. Using this model, we created three age estimation models (Table 2). For training, the stochastic gradient descent with momentum optimizer was used, and the initial learning rate, maximum number of epochs, and mini-batch size were set to 0.0001, 10, and 512, respectively.
Fivefold cross-validation (train:test = 4:1) was used for training and evaluation. We defined a subset as one of the five divisions of the dataset, and before creating the subsets, we adjusted the distribution of the number of persons in each age group to be as uniform as possible so as not to bias the age distribution within each subset. This pre-adjustment and the fivefold cross-validation were performed to prevent the resulting predictor from overfitting to specific data.
Table 3 shows the software and computer specifications used for training these CNNs.
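The age-balanced fivefold split described in this section can be sketched as follows. The text does not say how the age distribution was equalized across subsets, so the approach here (sorting subjects by age and dealing each consecutive block of five subjects across the five subsets) is an assumption, as are the function names. The split is made per subject, so that all images of one subject fall into the same subset.

```python
import random


def age_balanced_folds(ages, n_folds=5, seed=0):
    """Split subject indices into `n_folds` subsets with similar age distributions."""
    rng = random.Random(seed)
    # Sort subjects by age, then deal each block of n_folds neighbours
    # across the subsets in random order.
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    folds = [[] for _ in range(n_folds)]
    for start in range(0, len(order), n_folds):
        block = order[start:start + n_folds]
        rng.shuffle(block)
        for fold, subject in zip(folds, block):
            fold.append(subject)
    return folds


def cross_validation_splits(folds):
    """Yield (train, test) index lists, using each subset once as test data."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

With 1000 subjects, each iteration trains on 800 subjects and tests on 200, which corresponds to the train:test ratio of 4:1.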
2.4. Evaluating the Created Regression Models
The regression models were used to estimate CA from brain MR images in the test data. Since training and evaluation were conducted using fivefold cross-validation in this study, the average performance of the five regression models was evaluated as the performance of this method.
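Mean absolute error (MAE), root mean squared error (RMSE), correlation coefficient (R), and coefficient of determination ($R^2$) were used as evaluation indices. Since linear regression is used in this study, we calculated the $R^2$ value by squaring R, which was obtained from linear regression, and used it as an evaluation index. R, MAE, and RMSE are calculated using the following equations:
$$R = \frac{\sum_{i=1}^{n} \left(y_{obs,i} - \bar{y}_{obs}\right)\left(y_{pred,i} - \bar{y}_{pred}\right)}{\sqrt{\sum_{i=1}^{n} \left(y_{obs,i} - \bar{y}_{obs}\right)^{2}}\,\sqrt{\sum_{i=1}^{n} \left(y_{pred,i} - \bar{y}_{pred}\right)^{2}}}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|y_{obs,i} - y_{pred,i}\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(y_{obs,i} - y_{pred,i}\right)^{2}}$$
where $y_{obs,i}$ is the real age, $y_{pred,i}$ is the estimated age, $\bar{y}_{obs}$ is the average of $y_{obs,i}$, $\bar{y}_{pred}$ is the average of $y_{pred,i}$, and $n$ is the number of samples in the case.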
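The four evaluation indices can be computed as in the following sketch, which uses the standard definitions of MAE, RMSE, and the Pearson correlation coefficient and obtains the coefficient of determination by squaring R, as described in the text. The function name `evaluate_regression` and the example ages are illustrative only.

```python
import math


def evaluate_regression(y_obs, y_pred):
    """Return MAE, RMSE, R, and R2 for real ages `y_obs` and estimated ages `y_pred`."""
    if len(y_obs) != len(y_pred) or not y_obs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_obs)
    mae = sum(abs(o - p) for o, p in zip(y_obs, y_pred)) / n
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / n)

    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))
    ss_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("R is undefined when either series is constant")
    r = cov / math.sqrt(ss_obs * ss_pred)
    # R2 is obtained by squaring R, as stated in the text.
    return {"MAE": mae, "RMSE": rmse, "R": r, "R2": r ** 2}


# Illustrative values, not data from the study.
metrics = evaluate_regression([10, 20, 30, 40], [12, 18, 33, 41])
print(metrics)
```

Under fivefold cross-validation, this function would be applied to the test data of each of the five models and the resulting values averaged.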
2.5. Evaluation of the Significant Region in Age Estimation
Previous studies showed that central regions of the brain contribute more to age estimation than peripheral regions [31]. A detailed examination of the visualization map in that report shows that the MAE is smaller in the corpus callosum region, which indicates that the corpus callosum region is useful for age estimation.
Based on this finding, we created images from which the corpus callosum region was removed (Figure 4). By determining how the estimation accuracy changes before and after removal of the corpus callosum region, it is possible to examine whether this region is useful for age estimation in the present method. As the specific analysis procedure, we used the method for examining significant differences described in Section 2.6.
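The idea of removing a region and comparing the estimation error before and after removal can be sketched as follows. How the corpus callosum was delineated and what pixel value replaced it are not stated in the text, so the boolean mask `region_mask`, the zero fill value, and both function names are assumptions made for illustration.

```python
import numpy as np


def remove_region(image, region_mask, fill_value=0):
    """Return a copy of `image` with the pixels selected by `region_mask` overwritten."""
    if image.shape != region_mask.shape:
        raise ValueError("image and mask must have the same shape")
    removed = image.copy()
    removed[region_mask] = fill_value  # assumption: the region is blanked out
    return removed


def mean_absolute_errors(ages, pred_original, pred_removed):
    """Return the mean AE before and after removal of the region."""
    ages = np.asarray(ages, dtype=float)
    ae_before = np.abs(np.asarray(pred_original, dtype=float) - ages)
    ae_after = np.abs(np.asarray(pred_removed, dtype=float) - ages)
    return float(ae_before.mean()), float(ae_after.mean())
```

A larger mean AE after removal would point to the removed region being informative for the model; whether the change is statistically significant is a separate question.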
2.6. Statistical Analysis
In this method, three adjustments were considered when collecting data and creating a dataset/subset: (1) adjusting the distribution of the number of persons by age to be as uniform as possible between men and women, (2) adjusting the number of data to be as uniform as possible between men and women, and (3) adjusting the distribution of the number of persons by age to be as uniform as possible between each subset. Therefore, the five age estimation models A, B, C, D, and E, which were created using the fivefold cross-validation, are models that each learn the same number of data with similar age distributions, which contributes to showing similar age distribution trends for both Male_Model A–E and Female_Model A–E.
Based on this trend, we tested for statistical significance using the absolute error (AE) of each test datum. The AEs of 200 subjects for ALL_Model and of 100 subjects each for Male_Model and Female_Model were checked for normality with the Shapiro–Wilk test. When normality was confirmed, the F test and t-test were used to examine statistical significance; otherwise, the Wilcoxon rank sum test was used. A p value less than 0.05 was considered statistically significant. Specifically, we examined whether there is a statistically significant difference in the accuracy of age estimation between Male_Model and Female_Model and whether there is a statistically significant difference in the accuracy of age estimation before and after the deletion of the corpus callosum region. In the former case, a total of 25 patterns (cross comparison of the five Male_Models (Male_Model A–E) with the five Female_Models (Female_Model A–E)) were examined for statistical significance using AE. In the latter case, since the images before and after the deletion of the corpus callosum are paired (the original images are identical before and after the deletion), we examined the statistical significance of AE using five patterns each of ALL_Model, Male_Model, and Female_Model (15 patterns in total). The test data used as input were images without data augmentation and images from which the corpus callosum region had been removed.
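The test-selection procedure described above can be sketched with SciPy as follows. This is one possible reading of the text rather than the analysis code of the study: using the F test to choose between the equal-variance and unequal-variance t-test is an assumption, as are the two-sided formulation and the names `compare_absolute_errors` and `MODEL_PAIRS`.

```python
from itertools import product

import numpy as np
from scipy import stats

ALPHA = 0.05  # significance level stated in the text


def compare_absolute_errors(ae_a, ae_b, alpha=ALPHA):
    """Return (test_name, p_value, is_significant) for two samples of AE."""
    ae_a = np.asarray(ae_a, dtype=float)
    ae_b = np.asarray(ae_b, dtype=float)

    # Step 1: Shapiro-Wilk normality check on each sample.
    normal = True
    for sample in (ae_a, ae_b):
        _, p_normal = stats.shapiro(sample)
        normal = normal and p_normal >= alpha

    if normal:
        # Step 2a (assumption): a two-sided F test for equal variances
        # decides between Student's and Welch's t-test.
        f_stat = np.var(ae_a, ddof=1) / np.var(ae_b, ddof=1)
        df_a, df_b = ae_a.size - 1, ae_b.size - 1
        p_f = 2 * min(stats.f.cdf(f_stat, df_a, df_b), stats.f.sf(f_stat, df_a, df_b))
        _, p_value = stats.ttest_ind(ae_a, ae_b, equal_var=bool(p_f >= alpha))
        name = "t-test"
    else:
        # Step 2b: Wilcoxon rank sum test for non-normal samples.
        _, p_value = stats.ranksums(ae_a, ae_b)
        name = "Wilcoxon rank sum test"
    return name, float(p_value), bool(p_value < alpha)


# The 25 cross comparisons between Male_Model A-E and Female_Model A-E.
MODEL_PAIRS = [
    (f"Male_Model {m}", f"Female_Model {f}") for m, f in product("ABCDE", repeat=2)
]
```

Calling `compare_absolute_errors` once per entry of `MODEL_PAIRS` would reproduce the structure of the 25-pattern comparison.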
4. Discussion
To the best of our knowledge, this is the first study to examine deep learning-based age estimation from brain MR images over a wide age range, from young people, including infants, to elderly individuals. We were therefore concerned that the age estimation would be affected by differences in head size across the age range of the target population, but we did not observe any decrease in accuracy even after data augmentation by scaling down/equalizing the images. In fact, the accuracy of age estimation improved as the number of data increased through data augmentation, and we expect further improvement in accuracy from adding more data, including data in age ranges where data are lacking, and from increasing the number of data through further data augmentation.
Compared with previous studies, the MAE of the present study is inferior, whereas the correlation coefficient is comparable. Jiang et al. [32] reported age estimation using 2D images segmented into gray matter (GM) and white matter in the axial, sagittal, and coronal sections. Their best estimation accuracy was obtained for the combination of sagittal sections and GM, with an MAE of 3.57 years, which is better than the MAE of 5.25 years in the present study. Hepp et al. [31] reported an age estimation method using 3D brain MR images of 10,691 subjects, approximately 10 times as many as in the present study. In that report, the MAE was 3.21 years, approximately two years less than that of the present study. This may be because more features can be obtained from each subject with 3D images than with 2D images and because the larger number of subjects provides more features for each age than in the present study. Masaru et al. [30] reported an age estimation method using 3D brain MR images of 1101 subjects aged 20–80 years. In that report, the correlation coefficient was 0.96, which is equivalent to that of the present study, whereas the MAE was 3.67 years, approximately 1.6 years smaller than that of the present study. Although the differences in MAE may be attributed to differences in the number of subjects and the image format, an overview of the estimated age against CA suggests that the present method already shows the same trend as the previous studies.
The abovementioned comparison with previous studies suggests that the estimation accuracy of the present method will be further improved with an increase in the number of data and similar analysis using 3D-CNN.
Next, regarding the analysis of the significant region, the accuracy of age estimation decreased significantly after the deletion of the corpus callosum region in the overall evaluation of this method. This result suggests that the corpus callosum is a useful region for age estimation, which is similar to the trend reported in previous studies. The corpus callosum, which is composed of more than 190 million axons, connects the left and right cerebral hemispheres, which are separated by the median longitudinal fissure, and therefore plays a role in the transmission of information between the two hemispheres. Cognitive functions are affected by dysplasia and disconnection of the corpus callosum [36,37], which suggests that the corpus callosum plays a role in the brain functions that are impaired in higher brain dysfunction and cognitive dysfunction. Thus, given that brain atrophy and cognitive decline progress with aging, it is considered that morphological changes occur in the corpus callosum region with aging and that the CNN focused on these changes to estimate the age of the subjects. However, because we focused only on the corpus callosum region in this study, we cannot rule out that other regions also had a significant impact on age estimation. Therefore, the influence of other regions on age estimation should be examined using heat mapping techniques, such as Grad-CAM [38] and occlusion sensitivity [39].
Finally, regarding the difference in estimation accuracy between Male_Model and Female_Model, no statistically significant difference was found in 17 of 25 cases, and significant differences were found in eight cases. Since no statistically significant difference was found in the majority of cases, the results suggest that data collection focusing on age distribution regardless of sex may be feasible in research on age estimation that combines deep learning techniques and brain MR images. For example, the results indicate that the brain MR images of a “42-year-old man” and a “42-year-old woman” can both be treated as data of a 42-year-old person. This result therefore supports the view that data can be collected more easily, without the data shortage caused by having to consider the sex ratio at the time of data collection, as was done in the present study.
However, in eight of 25 cases (32%), there was a significant difference in the accuracy of age estimation. There are various theories regarding anatomical sex differences in brain structure. In particular, morphological differences in the corpus callosum between men and women were proposed by Lacoste-Utamsing and Holloway [40], but their validity has been debated following the recent spread of MRI and CT equipment [41,42,43]. It is therefore difficult to attribute the eight cases with significant differences between men and women to sex differences in brain structure, although it is likely that some structural differences caused them. Thus, it is important to confirm how the current results change when the volume of data is increased and to examine how they are represented by heat mapping techniques, such as Grad-CAM and occlusion sensitivity, as described above.
This study has a limitation. Although ResNet-50 was used in this study, original network models, rather than pre-trained models, are considered important for creating an age estimation model with higher performance. Given that a previous study [17] has already confirmed the use of original networks, we believe that the choice of network model will need to be considered in order to create an age estimation model with higher performance in the future.