1. Introduction
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized mainly by social impairments, commonly followed by communication challenges or restricted and repetitive patterns of behavior [
1]. ASD is a substantially heterogeneous disorder in which two diagnosed subjects may have completely different sets of symptoms. Some researchers estimated that approximately one in 44 children aged eight years is on the spectrum [
2]. Despite a possible gender bias regarding diagnosis, ASD seems to be a sex-related disorder, with a male-to-female ratio close to 3–4:1 [
2,
3,
4]. Current research points to ASD as a primarily hereditary disorder. Approximately 80–83% of ASD cases are due to genetic inheritance. Close to 17–20% are due to environmental risk factors, including problems during the gestation period and the parents’ age [
5,
6,
7].
Children and adolescents with an ASD diagnosis have medical expenses up to 6.2 times greater than those with typical development (TD), with general costs from 8.4 to 9.5 times greater than the average [
8]. In addition to medical expenses, intensive behavioral interventions needed for ASD treatment have costs from USD 40,000 to USD 60,000 per child per year [
9]. Moreover, most ASD individuals live in low- or middle-income countries and receive no proper support from health or social care systems, suffering from the high costs of (1) proprietary tools for diagnosis; (2) evidence-based intervention techniques, and (3) training of parents and professionals to conduct the ASD treatment process [
10].
Early diagnosis and proper interventions are critical factors in reversing the impairments generated by ASD in children. Unfortunately, there are no low-cost automated tests to identify the disorder. Instead, the ASD diagnosis is performed through clinical observation, which is challenging to accomplish in young children, especially in the early years of life [
11]. Early treatments may result in improved cognitive, behavioral, and social functioning, allowing, for a subset of people, an evolution that may lead to healthy adult life, as well as significant long-term societal cost reductions [
12]. However, most technological tools proposed to assist the ASD intervention process showed some common limitations [
13].
It is critical to comprehend the severity of each individual with ASD to plan personalized treatments and conduct more effective intervention processes. Nowadays, many protocols are used to support diagnosis, such as the Autism Diagnostic Observation Schedule (ADOS), the Autism Diagnostic Interview-Revised (ADI-R), and the Social Communication Questionnaire (SCQ). Among these, ADOS is currently one of the most widely used worldwide. ADOS divides ASD classification into autism (more severe symptoms), autism spectrum (less severe symptoms), and non-spectrum (those diagnosed outside of the spectrum) [
14].
An ADOS diagnosis consists of standard evaluation on three main domains: communication, social relations, and behavior. Each domain has a set of tasks to be evaluated, with different total scores. The ADOS diagnosis comprises four modules for a specific range of ages and language skills, each with different cut-offs for each of the three classes [
15]. Furthermore, current ASD diagnosis is performed by trained professionals, with the help of tools such as ADOS, which has both sensitivity and specificity above
[
16]. It is important to note that the current ADOS version mainly used is the ADOS-2 [
14], but due to our available samples, we used the ADOS in its classic version.
The last decade was marked by research looking for methods to take advantage of the recent evolution of machine learning (ML) to build automated ASD diagnosis processes [
17,
18,
19,
20]. The first works in this field date from mid-2010 [
21]. Since then, there has been an increase in the number of papers and improved outcomes. Many of these works used magnetic resonance imaging (MRI) and ML combined, aiming for a positive or negative ASD diagnosis by classifying subjects between ASD and TD [
21], as in [
18,
19,
20,
22,
23,
24].
One of MRI’s advantages is that it is a non-invasive procedure, being a prevalent method to scan the brain in living human beings [
25]. There are two main uses for brain MRI: (1) the structural scan, which scans brain tissues and assesses their differences; and (2) the functional scan, which tracks the oxygen flow in the brain. This second method is usually called functional MRI (fMRI) and allows the indirect measurement of brain activities in regions of interest (ROIs). From the measured oxygen levels, it is possible to determine which regions are more activated than others [
25,
26].
Many tasks can be applied to a subject during an fMRI scan; they range from resting state to very narrow activities, such as watching a video. Resting-state fMRI is usually called rs-fMRI, a name that delimits the activity performed during scan acquisition. The other activities, in general, do not have a specific name. The rs-fMRI is the easiest to apply and is also easy to compare between multiple studies, as it is easier to reproduce in the same setup than any other activity.
Additionally, other types of medical data are also combined with ML to diagnose ASD, as is the case of electroencephalograms (EEG), which measure brain activity by recording electrical signals originating from the brain. There are many different setups, but as in fMRI, many papers using resting-state recordings are available, such as [
27,
28,
29,
30]. However, some other setups, such as during the ADOS test [
31] or while watching videos [
32], are also available. Nevertheless, few EEG datasets with ASD diagnosis are publicly available.
Meanwhile, on the fMRI side, some universities have worked together and created the Autism Brain Imaging Data Exchange (ABIDE) [
33], an initiative that makes available more than 2000 brain fMRI scans for research purposes. In addition, all fMRI subjects gave consent to use their images. This initiative facilitates autism investigation by providing access to a database that otherwise would not be easily acquired. Moreover, the pre-processed data available on ABIDE I PREPROCESSED also contribute in this sense.
Therefore, we take into account the following premises: (1) early diagnosis and interventions lead to better outcomes for autism treatment, as well as long-term cost reduction; (2) ADOS scores allow a rating of the ASD severity; (3) ML techniques have shown promising results in classifying ASD vs. neurotypical subjects through the use of rs-fMRI; and (4) ADOS scores and ASD rs-fMRI data are available at ABIDE. This work aims to investigate the functional differences between autism spectrum and autistic individuals, looking for potential brain regions that may be associated with autism severity. To identify these regions, we applied ML to brain segments from rs-fMRI data to classify individuals from the two groups, selecting the segments with the greatest differences as potential biomarkers that should be more deeply investigated in future works.
The remainder of this paper is structured as follows:
Section 2 presents the methodology employed.
Section 3 and
Section 4 present and discuss our results, while
Section 5 concludes this work.
2. Methodology
This section presents this work’s methodology. It starts by describing the materials used in
Section 2.1, followed by a presentation of the ADOS sub-classes for ASD classification in
Section 2.2 and the region selection process in
Section 2.3. Then, we explain both the ML used to classify the samples in
Section 2.4 and the validation process in
Section 2.5. Finally, we present the final data source in
Section 2.6 and the accuracy, sensitivity, and specificity cut-off points in
Section 2.7.
2.1. Materials
In this work, we used the rs-fMRI data provided by ABIDE [
33]. The ABIDE I consortium currently offers 1100 rs-fMRI scans from subjects with and without ASD diagnosis. Since our work was not an ASD vs. TD classification, all rs-fMRI data of neurotypical subjects were discarded, leaving 505 preprocessed fMRI scans from subjects with ASD diagnosis. From these ASD data, only 202 had information concerning ADOS scores for communication, social interaction, and repetitive behavior, which are essential data in our classification approach. Thus, the final data comprised 202 ASD subjects.
The original data from fMRI are 3D images over time. Therefore, applying an atlas and a preprocessing pipeline is necessary to transform the 3D images into matrices representing the brain regions (columns) and their respective activities over time (rows). The preprocessing pipeline also removes noise and other undesirable artifacts, which improves the results.
2.1.1. Automated Anatomical Labeling (AAL)
An atlas is a brain mapping that allows us to evaluate brain activity through its regions. We used the AAL atlas [
34] available at ABIDE, as it is the most used atlas in the literature for ASD classification using fMRI and ML [
21], reaching meaningful outcomes in [
18,
20,
35,
36,
37].
In its third version, AAL segments the human brain into 116 ROIs. A detailed explanation of these regions can be seen in [
34].
Table 1 presents the AAL’s labels.
2.1.2. Preprocessing Pipeline
Different machines across multiple sites acquired the fMRI data available at ABIDE. Moreover, some sites used different total acquisition times. Thus, some rs-fMRI scans have more frames than others.
The ABIDE offers 884 preprocessed rs-fMRI scans in four pipeline options:
Connectome Computation System (CCS);
Configurable Pipeline for the Analysis of Connectomes (CPAC);
Data Processing Assistant for rs-fMRI (DPARSF);
Neuroimaging Analysis Kit (NIAK).
These pipelines have different methods and sequences to manage fMRI data, removing noise such as head motion, skull, and magnetic interference. We only used the DPARSF pipeline in this work [
26,
38,
39]. The criteria used for choosing DPARSF were analogous to those employed in the atlas definition process. Except for works where the authors create their preprocessing pipeline, DPARSF is the prevailing pipeline in a number of papers [
21], reaching meaningful outcomes in ASD classification using rs-fMRI and ML [
37,
40,
41,
42].
The DPARSF final product is a matrix $M_{Y \times X}$, where X is the number of columns, and Y is the number of rows. Each matrix column represents one ROI, according to the chosen atlas, and each matrix row represents one time point of the scan. The number of rows (Y) could differ for each fMRI, even using the same atlas. However, the X value must be the same for all fMRI using the same atlas. For example, in a DPARSF matrix, a value $m_{j,i}$ (row j, column i) represents the oxygen level of ROI i at time j.
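The matrix layout described above can be illustrated with a short sketch. It uses synthetic numbers rather than ABIDE data, and the names `n_timepoints`, `n_rois`, and `signal` are assumptions made only for this example.

```python
import numpy as np

# Synthetic stand-in for one subject's ROI time series (not real ABIDE data).
n_timepoints = 5   # Y: rows, one per acquired frame (may vary between scans)
n_rois = 116       # X: columns, one per AAL ROI (fixed by the atlas)

rng = np.random.default_rng(seed=0)
signal = rng.normal(size=(n_timepoints, n_rois))

# Signal of ROI i at time j: row j, column i (zero-based indices here).
i, j = 3, 2
value = signal[j, i]

# The full time course of a single ROI is one column of the matrix.
roi_series = signal[:, i]
print(signal.shape, roi_series.shape)
```

Two subjects scanned at different sites would differ only in the number of rows, which is why a fixed-length summary per ROI is needed before classification.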
2.2. ADOS Classification
We used the ADOS standard division for ASD diagnosis to investigate any functional differences in the severity of ASD. The ADOS standard division has previously defined cut-off points to classify subjects as autistic, ASD, or non-ASD.
Table 2 shows the maximum scores and the ASD and autism cut-off points for each module (ASD score groups according to the individual’s age) and domain areas. For each ADOS module, the first line indicates the maximum value; the second line shows the ASD cut-off point, and the third line indicates the autism cut-off point, according to the domain area.
We adopted the cut-off points from [
15] to determine into which class a given subject should be classified, based on their scores available on ABIDE. This way, if a subject scored in at least one domain above the “autism cut-off”, they were classified as Class 2 (autism). If the subject did not score above the “autism cut-off” but had at least one domain scoring above the “ASD cut-off”, they were classified as Class 1 (ASD). We classified the remaining subjects as non-ASD, discarding them.
Table 3 and
Table 4 show the ABIDE subjects’ distribution according to the ADOS class; the complete phenotypes of each subject are available on [
33].
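The labeling rule described in this section can be sketched as follows. The cut-off numbers and domain names below are hypothetical placeholders for a single module (the real values are those of Table 2), and reading "above the cut-off" as strictly greater than is an assumption of this sketch.

```python
# Hypothetical cut-offs for ONE module; the real values come from Table 2.
ASD_CUTOFF = {"communication": 2, "social": 4, "behavior": 1}
AUTISM_CUTOFF = {"communication": 4, "social": 7, "behavior": 3}


def ados_class(scores):
    """Return 2 (autism), 1 (ASD), or 0 (non-ASD, discarded)."""
    # Assumption: "above the cut-off" means strictly greater than the cut-off.
    if any(scores[d] > AUTISM_CUTOFF[d] for d in AUTISM_CUTOFF):
        return 2
    if any(scores[d] > ASD_CUTOFF[d] for d in ASD_CUTOFF):
        return 1
    return 0


# Illustrative subject: only the communication score exceeds the ASD cut-off.
example = {"communication": 3, "social": 0, "behavior": 0}
label = ados_class(example)
```

The autism check runs first so that a subject exceeding both cut-offs is assigned to Class 2, as in the rule above.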
2.3. Region Selection
We grouped the ROIs from AAL by macro regions, considering the region name. The result was a set of regions (SoRs) (e.g., precentral left and right as one SoRs, angular left and right as one SoRs). This process resulted in 35 SoRs containing the ROIs grouped by brain region. We also included one SoRs with all the ROIs.
Table 8 presents the resulting SoRs, where the set ID is the SoRs’ identification, and the RoIs IDs match the RoIs used in
Table 1.
This approach aimed: (1) to simplify the SVC classification; and (2) to give a more generic location of the functional differences between ASD classes in a manner that would allow better comparison between existing studies that use different atlases.
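One possible way to build such sets is sketched below. The label spelling, the ROI IDs, and the rule of grouping by the first token of the label are assumptions made for illustration; the actual sets are those listed in Table 8.

```python
from collections import defaultdict

# A few AAL-style labels keyed by ROI ID (illustrative subset, hypothetical IDs).
roi_labels = {
    1: "Precentral_L",
    2: "Precentral_R",
    3: "Cingulum_Ant_L",
    4: "Cingulum_Ant_R",
    5: "Cingulum_Mid_L",
    6: "Angular_L",
    7: "Angular_R",
}


def group_by_macro_region(labels):
    """Group ROI IDs by the first token of their label (assumed naming rule)."""
    groups = defaultdict(list)
    for roi_id, name in sorted(labels.items()):
        groups[name.split("_")[0]].append(roi_id)
    return dict(groups)


sors = group_by_macro_region(roi_labels)
# One extra set holding every ROI, as described in the text.
sors["All"] = sorted(roi_labels)
```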
2.4. SVC Classifying Algorithm
We used a supervised learning method, support vector machine (SVM), specifically the C-Support Vector Classification (SVC), to check the differences between ASD sub-classes. This method has three steps: training, validation, and test [
43,
44].
Based on an in-depth systematic review and meta-analysis available in [
21], we selected SVM as our ML method. SVM was the most used AI tool for solving ASD classification problems, showing some reliable results when applied in similar situations [
18,
20,
37,
45,
46]. The second most used method was the artificial neural network (ANN) [
21]. Both approaches have similar results in the literature, with SVM slightly better in terms of sensitivity [
21]. As our goal was to find potential regions of a biomarker, and due to the complexity of the problem, we decided to adopt SVM given its more direct comparison, facilitating the interpretability of the results. We used the SVM from the scikit-learn library available at [
47].
SVM places each object (in our case, each subject) in a multidimensional space according to the values of the selected features. First, the training part of the sample determines a curve that splits this space, as shown in
Figure 1, where each area corresponds to one class. Then, the validation part of the sample verifies the accuracy of the curve, and this process is repeated until the SVM reaches the best separation given the features, training sample, and validation sample. After this, the test sample is used to measure the SVM generalization.
We hypothesized that higher accuracy would reflect the existence of an interpretative way to differ each class. In other words, SoRs with higher accuracy potentially contain the regions where classes are more distinct regarding the features used. These findings can highlight the areas to consider for further investigations on functional brain activity and ASD severity.
As the main goal was to find regions where there is a functional brain difference in the ASD severity level, and there is a lack of data about SVM setups in previous works on fMRI related to ASD investigations, as observed in [
21], we chose a few educated-guess setups in our experiment. The setup was related to the variables
gamma,
coef0,
kernel,
class_weight,
degree, and
max_iter.
The gamma delimits how closely the final classification should follow the training sample, with larger values giving more rigid solutions and lower values giving more flexible solutions.
The
coef0 is the independent term of the kernel function, which scikit-learn uses only with the poly and sigmoid kernels. Meanwhile, the
kernel is the mathematical equation used to solve the problem, and the ones available from [
47] are
linear,
poly,
rbf,
sigmoid.
The class_weight option considers the size of each class in the training step, adjusting the weight accordingly. For example, regarding training, if Class 1 has three subjects and Class 2 has nine subjects, Class 1 will weigh three while Class 2 will weigh one. This process is meant to avoid the algorithm taking into account only the dominant class from training, which can jeopardize the SVM’s generalization capacity.
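The three-versus-nine example can be checked with a few lines of arithmetic. The sketch below assumes the inverse-frequency rule that scikit-learn documents for `class_weight="balanced"`, namely `n_samples / (n_classes * class_count)`; the variable names are illustrative.

```python
from collections import Counter

# Training labels from the example: three Class 1 and nine Class 2 subjects.
y_train = [1] * 3 + [2] * 9

counts = Counter(y_train)
n_samples, n_classes = len(y_train), len(counts)

# Inverse-frequency weights (assumed "balanced" rule).
weights = {label: n_samples / (n_classes * n) for label, n in counts.items()}

# Only the ratio between the weights matters for the 3-to-1 statement above.
ratio = weights[1] / weights[2]
print(weights, ratio)
```

The absolute weights differ from three and one, but their ratio is approximately three, which matches the relative weighting described in the text.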
The degree will define the curve degree of the equation that splits the SVM classification plane. Finally, max_iter is the total training iterations allowed to be used by the algorithm, stopping the training when the value is reached, regardless of the gain.
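For orientation, the sketch below shows how these variables map onto the scikit-learn `SVC` constructor. Every value here is a placeholder chosen for the example and not a setting used in this work, and the data are synthetic.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, imbalanced two-class data (placeholder for the real features).
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(40, 8))
y = np.array([1] * 10 + [2] * 30)

# Placeholder hyperparameters, NOT the values used in this work.
clf = SVC(
    kernel="poly",            # one of: linear, poly, rbf, sigmoid
    degree=3,                 # polynomial degree (poly kernel only)
    gamma="scale",            # kernel coefficient
    coef0=0.0,                # independent term (poly and sigmoid kernels)
    class_weight="balanced",  # reweight classes by inverse frequency
    max_iter=1000,            # hard cap on solver iterations
)
clf.fit(X, y)
pred = clf.predict(X)
```

With a small `max_iter`, scikit-learn may emit a convergence warning, which reflects the stopping behavior described above.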
Here, we used the following values for each variable:
2.5. Validation Process
We performed k-fold cross-validation to validate our process [
48,
49,
50]. We selected k = 10, which is recommended for samples larger than 200 objects. The SVM automatically split the sample into training and test sets; in this case, we used the standard 70% for training and 30% for testing. Therefore, nine folds were sent to the SVM and split 7/3 for training and testing, and the resulting model was then applied to the 10th fold for validation; the process was repeated until all 10 folds had been used as the validation sample.
We adopted the following division criteria to avoid bias noise:
Amount of subjects of a specific ADOS subclass in each fold, avoiding any fold having only subjects of the same subclass. For example, a fold without autistic subjects could bias the SVC always to answer ASD due to the lack of autistic subjects on training or validation.
We first divided our sample into two groups, ASD and autistic, one for each ADOS subclass. Then, we ordered them by subject ID, and for each group, we assigned one subject at a time to each fold in turn.
Thus, each fold had a balanced subclass distribution at the end of this process. Given our sample’s limitations, this process aimed to produce the most adaptive learning for our SVC.
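The fold assignment can be sketched as a round-robin over each subclass. The subject IDs below are synthetic, the group sizes (36 ASD and 166 autistic subjects) are those reported in the Results, and the helper name `round_robin_folds` is hypothetical.

```python
def round_robin_folds(subject_ids, k):
    """Sort the IDs and deal them out one at a time across k folds."""
    folds = [[] for _ in range(k)]
    for position, subject_id in enumerate(sorted(subject_ids)):
        folds[position % k].append(subject_id)
    return folds


# Synthetic IDs; only the group sizes come from the paper.
asd_ids = [f"asd_{n:03d}" for n in range(36)]
autism_ids = [f"aut_{n:03d}" for n in range(166)]

k = 10
# Deal each subclass separately, then merge fold by fold.
folds = [
    asd_part + autism_part
    for asd_part, autism_part in zip(
        round_robin_folds(asd_ids, k), round_robin_folds(autism_ids, k)
    )
]
```

Because each subclass is dealt separately, every fold receives three or four ASD subjects and sixteen or seventeen autistic subjects, so no fold contains a single subclass.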
2.6. Final Data Source
The resultant data were composed of two files for each subject. The first file contained a matrix where each column represented one of the 116 ROIs from the AAL atlas, and each row represented a picture of the brain over time. The second file was a vector with the subject’s phenotype data, including the ADOS score. Since the first row of each fMRI placed the ROI label, we removed it from the file sent to the SVM.
SVMs only accept vectors as input. Therefore, we converted the resulting matrix from DPARSF into a vector. We considered two conversion options: (1) construct a vector from the whole matrix, placing each matrix position at its own vector position; and (2) acquire the maximum, minimum, median, and average values for each ROI from each SoRs and create a vector $[\max_a, \min_a, \mathrm{median}_a, \mathrm{mean}_a, \ldots, \max_b, \min_b, \mathrm{median}_b, \mathrm{mean}_b]$, where a and b are, respectively, the first and the last ROI ID of a SoRs.
Both conversion options have advantages and drawbacks. The first option has the simplest preprocessing but demands more computing power for the SVC to process all data. On the other hand, the second option requires an additional preprocessing step, which takes the data from each subject and transforms them into the four values mentioned above, with some loss of information due to the transformation. However, due to the size reduction, the SVC requires less computing power to analyze all the data from all subjects. Thus, aiming for better scalability and facilitating human understanding of the results, we chose the second option for this paper.
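A minimal sketch of the second option is given below. The function name `sor_features`, the ordering of the four statistics inside the vector, and the toy matrix are assumptions made for the example.

```python
import numpy as np


def sor_features(time_series, roi_columns):
    """Summarize each selected ROI column as (max, min, median, mean)."""
    feats = []
    for col in roi_columns:
        series = time_series[:, col]
        feats.extend([series.max(), series.min(), np.median(series), series.mean()])
    return np.array(feats)


# Toy matrix: 3 time points (rows) x 3 ROIs (columns).
time_series = np.array(
    [
        [1.0, 10.0, 5.0],
        [2.0, 30.0, 5.0],
        [6.0, 20.0, 5.0],
    ]
)

# A set made of the first two ROIs yields 2 x 4 = 8 features.
vector = sor_features(time_series, roi_columns=[0, 1])
```

The vector length depends only on the number of ROIs in the set and not on the number of time points, so scans of different durations produce inputs of the same size.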
2.7. Accuracy, Sensitivity, and Specificity Restrictions, and Post-Hoc Tests
We imposed restrictions on the minimum accuracy, sensitivity, and specificity required to consider a functional difference between the two ASD sub-classes. The cut-off point was 60%, based on values achieved by other ASD vs. non-ASD classification studies [
22,
23,
24,
51,
52,
53]. Thus, we discarded results with accuracy (ACC), specificity (SPC), or sensitivity (SNS) below 60%.
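The filtering rule can be sketched as below, taking autism (Class 2) as the positive class so that sensitivity refers to autism and specificity to ASD. The label vectors are made-up examples, and the function names are hypothetical.

```python
def acc_sns_spc(y_true, y_pred, positive=2):
    """Accuracy, sensitivity, and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = (tp + tn) / len(pairs)
    sns = tp / (tp + fn)  # share of autistic subjects correctly identified
    spc = tn / (tn + fp)  # share of ASD subjects correctly identified
    return acc, sns, spc


def passes_cutoff(metrics, cutoff=0.60):
    """Keep a result only if every metric reaches the cut-off."""
    return all(m >= cutoff for m in metrics)


# Made-up labels: 4 autistic (2) and 2 ASD (1) subjects.
y_true = [2, 2, 2, 2, 1, 1]
y_pred = [2, 2, 2, 1, 1, 2]
metrics = acc_sns_spc(y_true, y_pred)
kept = passes_cutoff(metrics)
```

In this made-up case the result is discarded because specificity falls below the cut-off even though accuracy and sensitivity do not, which shows why all three metrics are checked.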
Finally, we applied three post-hoc tests on the features from the SoRs that achieved the cut-off: addition of phenotype data, t-test, and p-value. The addition of phenotype data aimed to investigate the effect of sex, age, and FIQ on SVM accuracy for each SoRs, while the t-test and p-value aimed to investigate the separability of the sample, that is, how the features differed between the two groups.
3. Results
This section presents the results of our ASD vs. autism classification experiments. All SoRs can be seen in
Table 8 and each ROI used by these sets can be seen in
Table 1. In this paper, we used specificity (SPC) related to the ASD classification and sensitivity (SNS) associated with the autistic classification.
Our experiments worked with a total of 202 subjects, which comprised 36 with ASD and 166 with autism, according to the ADOS scores.
Table 9 shows the SoRs with the ACC, SNS, and SPC greater than or equal to 60%.
ACC ranged from (SoRs 27) to (SoRs 11). SNS ranged from (SoRs 27) to (SoRs 11). SPC ranged from (SoRs 27) to (SoRs 30). This shows the existence of a non-random separation when considering five brain regions.
The
t-test of each feature allows us to understand the difference between the ASD and autistic groups. The
t-test results quantify the statistical difference between two given groups: positive values mean that the group 1 average is larger than group 2, while negative values mean that the group 2 average is larger than group 1.
Table 10 shows the
t-test result for each feature on each SoRs for which SVM had above threshold results, and the positive values mean that the ASD group average is larger than the autistic group for that feature, while negative values mean that the autistic group average is higher.
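The sign convention can be illustrated with a small sketch. The text does not state which variant of the t-test was used, so Welch's unequal-variance statistic is an assumption here, and the feature values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group1, group2):
    """Welch's t statistic; positive when group1 has the larger mean."""
    m1, m2 = mean(group1), mean(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    return (m1 - m2) / sqrt(v1 / len(group1) + v2 / len(group2))


# Invented values of one feature for each group.
asd_feature = [0.9, 1.1, 1.0, 1.2]
autism_feature = [0.5, 0.7, 0.6, 0.4, 0.8]

t_stat = welch_t(asd_feature, autism_feature)      # ASD as group 1
t_swapped = welch_t(autism_feature, asd_feature)   # groups exchanged
```

Swapping the group order flips the sign but not the magnitude, so the sign only carries meaning once the group order is fixed, here with ASD as group 1.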
Furthermore, reinforcing the
t-test result, the
p-value (scale [0,1]) of each feature from SoRs above the required threshold is listed in
Table 11. The highest
p-value was 0.96 for the mean on ROI 4 (Frontal Sup. Orb. Left), the third ROI from SoRs 1, with high values indicating a risk of not being able to distinguish the two groups from each other. On the other hand, lower values indicate a high possibility of discerning the two groups using the feature. The lowest
p-value was 0.02 from the min on ROI 72 (Putamen Left), the first ROI from SoRs 27. The SoRs 1 has a mean
p-value of 0.45 (0.43 STD), while SoRs 11 has a mean
p-value of 0.32 (0.14 STD); for SoRs 23, 27, and 30, the mean
p-value is 0.53 (0.51 STD), 0.30 (0.24 STD), and 0.30 (0.24 STD), respectively. Therefore, SoRs 11 has the lowest
p-value STD and one of the lowest
p-value means, which indicates a high probability of containing the largest set of features to classify ASD severity. It is worth noting that these values reflect only our sample and should not be used as a diagnostic tool as further research is needed to either confirm or deny our findings.
Moreover, we performed other trials adding phenotype information (age, sex, and full IQ). We used the same features and added the phenotype data in the vector sent to the ML algorithm. We executed the test for the three phenotypes together, one at a time, and all combinations of two phenotypes. We used the same process as in the main experiment; the results that reached the threshold defined in
Section 2.7, as well as the ACC gain, using the phenotype for each SoRs are shown in
Table 12. However, as shown by [
21], these features did not show a significant improvement, if any, in the sample.
Finally, we show the mean result for each of the features with high ACC both for ASD and autistic in
Table 13 and
Table 14, respectively.
4. Discussion
This paper assessed brain functional differences between ASD and autism using rs-fMRI and SVM classification (SVC). The measure used to distinguish ASD from autism was the ADOS score and cut-off points, as seen in
Table 2.
Our results highlight some brain regions that potentially can distinguish functional differences between both groups (ASD vs. autism). The main finding in distinguishing the two ASD sub-classes reached up to accuracy (SoRs 11). These results need to be taken with caution due to the limitations mentioned and given its Matthews Correlation Coefficient of 0.31 (scale [−1,1]), which is better than a random selection but still not ideal. However, our results show a promising path to investigate the functional difference between both ASD sub-classes.
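For reference, the Matthews Correlation Coefficient can be computed from the four confusion-matrix counts as sketched below. The counts in the usage lines are hypothetical and are not the confusion matrix of this study.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts, for illustration only.
perfect = mcc(tp=5, tn=5, fp=0, fn=0)
chance = mcc(tp=5, tn=5, fp=5, fn=5)
inverted = mcc(tp=0, tn=0, fp=5, fn=5)
```

Unlike accuracy, the coefficient uses all four counts, which makes it informative for an imbalanced sample such as the one used here.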
The best ACC was reached for SoRs 11, consisting of the cingulate gyrus (cingulum), and both left and right sides of the brain for the anterior, median, and posterior. We can conjecture that brain regions such as the cingulum ( ACC, SNS, SPC) and angular (SoRs 23) ( ACC, SNS, SPC) have the potential to differentiate the severity of ASD subjects taking into consideration the ACC reached on this experiment. These SoRs applied together with methods such as ADOS may in the future allow professionals to classify individuals. The frontal lobe (SoRs 1) ( ACC, SNS, SPC) also should be considered for further investigations as it shows reasonable ACC.
Our results support previous studies [
54,
55,
56] that point to functional differences in the cingulum region between ASD and TD. Likewise, [
19,
40] detected the thalamus as a key region for classifying ASD vs. TD, and [
57,
58,
59,
60] pointed to the frontal lobe as a region where ASD vs. TD can be differentiated from each other. Angular (SoRs 23) [
61,
62], Heschl (SoRs 30) [
63,
64], and putamen (SoRs 27) [
65,
66] also have consistently been linked to ASD.
Since these brain regions are commonly pointed to as an ASD vs. TD differential, we can also suppose, based on our results, that such regions have the potential to describe areas where functional activity may be a biomarker for ASD severity, supporting previous investigations [
64]. Therefore, we can presume the potential functional difference between subjects from the ASD group and the autism group using these ROIs.
5. Conclusions
Firstly, and most importantly, the field lacks sample data to strengthen the recent outcomes. We believe that all published studies have insufficient samples to ensure definitive conclusions on ML applied to fMRI for ASD diagnoses. For example, the ADOS used hundreds of thousands of subjects to validate its algorithm, while the sum of all subjects from all published papers regarding ML applied to fMRI (discounting the subjects duplicated for multiple studies) is not even close to this value. Therefore, any claim to solve the issue tends to be premature. Nevertheless, it is mandatory to research possible biomarkers while waiting for more available data to validate the findings.
We investigated the functional brain activity difference between ADOS ASD sub-classes (autism and ASD) using fMRI data from subjects previously diagnosed and available at ABIDE. The differences between each ASD sub-class were the ADOS score and cut-off points. We applied these data to train an ML classification algorithm (SVC) to classify the disorder severity, investigating the existence of functional brain differences across regions between both ASD sub-classes.
Our main contribution was the identification of five SoRs that potentially have discriminating patterns for ASD severity. Additionally, the suggested use of SoRs can help to improve investigations by allowing more clarity in interpreting and comparing the results, aiming to enable physicians to look up the same markers found by the ML. In this same aspect, opting to explore approaches using features more easily observed by human analyses, such as the maximum, minimum, mean, and standard deviation from each ROI, is also another contribution. These contributions can improve further research to give tools for physicians to utilize these signals when evaluating a subject, more than simply finding an ML to aid the ASD evaluation.
Our findings are consistent with previous studies on autism and brain development, bringing a promising approach to evaluating ASD subtypes. A computational aid system could improve medical diagnosis by delivering more tools for physicians’ evaluation, reducing analysis ambiguity. Further research, applied to a younger sample, can allow a computational system to assess individuals early, before the most severe symptoms begin. Distinguishing the severity of a subject can help in intervention selection, and earlier diagnosis can help set proper interventions to improve the individual’s quality of life.
Our study limitations lie mainly in the reduced sample size, which may prevent our outcomes from generalizing to all populations. However, we can speculate about these functional differences between the ASD subtypes.
Another limitation of the study was the mean age of the subjects (≃ 16 years old), which does not correspond to early diagnosis. Therefore, an additional experiment with younger subjects will be required to improve the results’ reliability.
For further works, an increase in the available subjects, including younger ones, would help to raise the accuracy and to clarify how much of our results can be generalized to all populations. In addition, the research community would benefit from more available fMRI data with the respective phenotype data (such as ADOS score, age at scan, sex, FIQ), allowing more accurate investigations.