Article

Parkinson’s Disease Gene Biomarkers Screened by the LASSO and SVM Algorithms

Tongji University School of Medicine, East Hospital, Department of Neurology, Tongji University, Shanghai 200070, China
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Brain Sci. 2023, 13(2), 175; https://doi.org/10.3390/brainsci13020175
Submission received: 14 November 2022 / Revised: 20 December 2022 / Accepted: 12 January 2023 / Published: 20 January 2023
(This article belongs to the Section Molecular and Cellular Neuroscience)

Abstract
Parkinson’s disease (PD) is a common progressive neurodegenerative disorder. Accumulating evidence has revealed the possible infiltration of peripheral immune cells into the substantia nigra, which may be essential for PD. Our study uses machine learning (ML) to screen for potential PD genetic biomarkers. Gene expression profiles were obtained from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were selected for the enrichment analysis. A protein–protein interaction (PPI) network was built with the STRING database (Search Tool for the Retrieval of Interacting Genes), and two ML approaches, namely least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE), were employed to identify candidate genes. An external validation dataset was used to further test the expression levels and diagnostic value of the candidate biomarkers. To assess diagnostic validity, we computed receiver operating characteristic (ROC) curves. The deconvolution tool CIBERSORT was employed to evaluate the composition of immune cells, and we performed correlation analyses on the basis of the training dataset. Twenty-seven DEGs were identified between the PD and control samples. The enrichment analysis showed a close association with inflammatory and immune-associated diseases. The LASSO and SVM-RFE algorithms screened eight and six characteristic genes, respectively. AGTR1, GBE1, TPBG, and HSPA6 were the overlapping hub genes strongly related to PD. The areas under the ROC curve (AUC) for AGTR1 (AUC = 0.933), GBE1 (AUC = 0.967), TPBG (AUC = 0.767), and HSPA6 (AUC = 0.633) suggested that these genes have good diagnostic value, and these genes were significantly associated with the degree of immune cell infiltration. AGTR1, GBE1, TPBG, and HSPA6 were identified as potential biomarkers for the diagnosis of PD and provide a novel viewpoint for further study of the immune mechanisms and therapy of PD.

1. Introduction

Parkinson’s disease (PD) is a common progressive neurodegenerative disorder. Its main pathological features are the loss of dopaminergic neurons in the substantia nigra (SN) and the abnormal aggregation of α-synuclein [1]. PD is usually diagnosed by physical examination and assessment of motor symptoms, such as resting tremor, muscle rigidity, and bradykinesia [2].
Currently, the treatment of PD mainly aims to relieve symptoms with levodopa, dopamine receptor agonists, etc. These treatments can control motor symptoms only in the early stages of PD and do not prevent dopaminergic neuronal damage. As the disease progresses and the response to the drugs decreases, long-term use of these drugs is accompanied by adverse effects, such as dyskinesia and symptom fluctuations [3], which seriously affect the quality of life of patients [4]. Therefore, better and more effective therapeutic strategies are needed, and the fundamental step toward this goal is to search for the underlying genetic and molecular mechanisms behind the pathogenesis of PD.
Increasing research indicates that both innate and adaptive immune responses play a pivotal role in the pathogenesis of PD [5]. The protein α-synuclein, considered the central component in the pathogenesis of PD, has been associated with the immune responses triggered by immune cells [6]. Thus, immunosuppressants, which immunologically restore the brain’s homeostatic environment, have been reported to affect the progression of PD [7]. Racette et al. performed a population-based case–control study that included 10,619 participants and found that using immunosuppressants, such as corticosteroids and inosine monophosphate dehydrogenase inhibitors, might decrease the risk of PD [8]. Peter et al. also found that early exposure to anti-tumor necrosis factor therapy could significantly reduce the incidence of PD in patients with inflammatory bowel disease [9]. Similarly, a recent national case–control study from Finland showed that using immunosuppressants helped reduce the risk of PD in patients with rheumatoid arthritis [10]. These studies suggest an essential role of immunosuppressants in PD and, at the same time, indicate that neuroinflammation is an important pathological feature in the pathogenesis of PD. Neuroinflammation has been shown to be regulated by immune cells, such as microglia (macrophages), astrocytes, and peripheral immune cells, as well as by cytokines [11]; among these, microglia play a significant mediating role [12], and the accumulation of α-synuclein is also associated with microglial activation. A considerable infiltration of CD4 and CD8 T cells was observed in the postmortem SN of patients, and it was shown that in the absence of T cells, α-synuclein could not upregulate the microglial proinflammatory response and no loss of neurons occurred; thus, T-cell infiltration is necessary for neuronal degeneration [13].
The activation of neuroinflammatory responses by these immune cells promotes the onset of neurotoxicity, which, in turn, leads to neuronal death. To sum up, these potential immune cell infiltrations can influence the pathogenesis of PD and are also potential targets for developing PD-modifying therapies [14].
With the rapid development of bioinformatics, bioinformatics analysis can, compared with time-consuming and expensive traditional experimental research, screen a larger number of potentially worthwhile genes more quickly and accurately and provide exploratory predictions at a lower cost to inform subsequent biological experiments and clinical applications [15]. A hub gene, defined as a gene with a high degree of connectivity in the gene expression network, is considered to play a pivotal role in the progression of the disease [16]. In previous studies, cytoHubba or STRING (Search Tool for the Retrieval of Interacting Genes) software was often used to screen for hub genes [17,18]. However, in this type of selection, whether to select the top 5 or top 10 of the differentially expressed genes (DEGs) as hub genes depends on the researchers’ preferences, which inevitably decreases the accuracy of the screening process and reduces the repeatability of the experiment [19,20]. To diminish this type of inaccuracy, various machine learning (ML) techniques have recently been added to bioinformatics analysis and have been shown to give the screening method better accuracy and stability [21,22]. The least absolute shrinkage and selection operator (LASSO) regression, a regularized linear regression method, can ignore unimportant features and build a sparse, easy-to-interpret model that helps prevent overfitting. The support vector machine recursive feature elimination (SVM-RFE) technique integrates the support vector machine into a recursive feature elimination strategy and uses the inherent feature-ranking ability of the support vector machine to screen key features over successive iterations. The combination of LASSO and SVM-RFE techniques has shown satisfactory accuracy and sensitivity in some fields, such as lung and pituitary tumors [23,24]. However, few studies have screened PD-related biomarkers with the combination of the LASSO and SVM-RFE techniques.
In this study, we creatively took advantage of the combination of these two ML techniques to identify the hub genes for PD and further analyzed these gene-related infiltration patterns of PD immune cells. This combination of these two machine learning techniques could not only screen the genes with significant features but also delete the gene that has the least influence on the pathogenesis of PD. We hope our study can reveal the information regarding neuroimmune-related pathogenesis of PD more accurately and provide some insights into searching for the potential targets of immunotherapy for PD.
In the present work, we first combined PD microarray collections from the Gene Expression Omnibus (GEO) database to identify DEGs and perform enrichment analysis. Then, we combined two ML algorithms, LASSO regression and SVM-RFE analysis, to identify the PD-related hub genes. Next, the deconvolution tool cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORT) was used to investigate the differences in immune cells in PD pathogenesis and explore the correlation between hub genes and immune cell infiltration. Finally, another PD microarray dataset that met the inclusion criteria was used for external validation. The flow chart of the present study design is shown in Figure 1.

2. Materials and Methods

2.1. Data Processing and Differential Gene Screening

The GEO database (https://www.ncbi.nlm.nih.gov/geo/, accessed on 1 November 2022) is an international public repository of high-throughput microarray and next-generation sequencing functional genomic datasets created and maintained by the National Center for Biotechnology Information [25]. Almost all research-relevant gene expression assay data can be found in this database. Here, we extracted PD-related microarray data from the GEO database using the following selection criteria: (1) the organism is Homo sapiens and the dataset type is expression profiling by array; (2) the samples come from the substantia nigra; and (3) the raw data can be employed for further analysis. We selected four independent datasets, GSE7621, GSE20141, GSE20333, and GSE49036, including 31 normal controls and 47 PD samples from the GPL570 ((HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array) and GPL201 ((HG-Focus) Affymetrix Human HG-Focus Target Array) platforms, as training sets, and the GSE20164 microarray was selected to validate the results. The details of the five datasets are shown in Table 1.
We converted the gene probes into gene symbols using annotation files, taking the average value when several probes corresponded to the same gene. Expression values were log2-transformed for normalization. We integrated these microarray data as the training dataset after removing batch effects with the surrogate variable analysis (“sva”) package [26]. A 2D principal component analysis (PCA) showed the inter-batch differences before and after correction. We applied the “limma” package [27] to screen the DEGs using the criteria of adj. p < 0.05 and |log2 fold change (FC)| > 1, and we employed the “pheatmap” and “ggplot2” packages [28] to create heatmaps and volcano plots to visualize these DEGs.
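The probe-to-gene averaging and the DEG thresholds described above can be sketched as follows. This is a minimal Python illustration rather than the R/limma code used in the study; the function names, probe identifiers, and gene labels are hypothetical, and the adjusted p-values and log2 fold changes are assumed to have been computed beforehand (for example, by limma).

```python
from collections import defaultdict


def collapse_probes(probe_values, probe_to_gene):
    """Average the log2 expression of all probes that map to the same gene.

    probe_values: dict mapping a probe ID to its log2 expression in one sample.
    probe_to_gene: dict mapping a probe ID to a gene symbol (annotation file).
    Probes without an annotation are dropped.
    """
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in grouped.items()}


def is_deg(adj_p, log2_fc, p_cut=0.05, fc_cut=1.0):
    """Apply the screening criteria: adj. p < 0.05 and |log2 FC| > 1."""
    return adj_p < p_cut and abs(log2_fc) > fc_cut


# Hypothetical toy input: two probes for GENE_A, one for GENE_B, one unannotated.
values = {"probe_1": 8.0, "probe_2": 10.0, "probe_3": 5.0, "probe_4": 7.0}
annotation = {"probe_1": "GENE_A", "probe_2": "GENE_A", "probe_3": "GENE_B"}
gene_level = collapse_probes(values, annotation)
```

Both thresholds must hold at once, so a gene with a large fold change but a non-significant adjusted p-value is not retained.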

2.2. Enrichment Analysis Method

To better understand the biological functions of DEGs, we used the “limma”, “clusterProfiler”, “org.Hs.eg.db”, and “DOSE” packages to perform the enrichment analysis, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Disease Ontology (DO). The GO database consists of a set of terms that annotate the properties of genes and gene products. It can be analyzed at three levels: biological processes (BP), cellular components (CC), and molecular functions (MF) [29]. KEGG is a database resource for understanding the network of high-level functions and interactions in biological systems, and a KEGG analysis assigns DEGs to specific pathways [30]. A DO analysis can identify diseases associated with these DEGs [31]. We used the “ggplot2” package to visualize these enrichment analyses, using adjusted p-values of <0.05 and q-values of <0.05 (Benjamini–Hochberg method) as cut-off thresholds.
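Over-representation analyses of this kind are commonly based on a hypergeometric test per term, followed by Benjamini–Hochberg adjustment across terms. The sketch below illustrates that calculation in Python under this assumption; it is not the clusterProfiler code used in the study, and the function names and counts are hypothetical.

```python
from math import comb


def enrichment_p(n_background, n_term, n_degs, n_overlap):
    """Hypergeometric tail probability P(X >= n_overlap).

    n_background: genes in the background set.
    n_term: background genes annotated to the term.
    n_degs: number of DEGs tested.
    n_overlap: DEGs annotated to the term.
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be reported when its adjusted p-value falls below the 0.05 cut-off.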

2.3. PPI Network Construction

The DEGs were imported into the online database STRING [32] (https://cn.string-db.org/, accessed on 6 November 2022), Homo sapiens was selected as the species, and the interaction score threshold was set to >0.15 to construct the protein–protein interaction (PPI) network. In this network, each node represents a target gene and the lines between the nodes represent interactions. The main modules in the PPI network were analyzed in Cytoscape 3.9.1 software [33] using the plug-in Molecular Complex Detection (MCODE) [34], with the filtering parameters set to degree cutoff = 2, node score cutoff = 0.2, k-core = 4, and max. depth = 100.
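The network construction step can be pictured as filtering a scored edge list and counting connections per node. The following Python sketch assumes a hypothetical list of (gene, gene, score) tuples exported from STRING; the function names and the toy edges are illustrative only, and the sketch does not reproduce the MCODE clustering itself.

```python
from collections import defaultdict


def build_network(scored_edges, min_score=0.15):
    """Keep interactions whose score exceeds min_score.

    Returns an adjacency dict; genes left without any edge do not appear,
    which mirrors hiding disconnected nodes.
    """
    adjacency = defaultdict(set)
    for gene_a, gene_b, score in scored_edges:
        if score > min_score and gene_a != gene_b:
            adjacency[gene_a].add(gene_b)
            adjacency[gene_b].add(gene_a)
    return dict(adjacency)


def node_degrees(adjacency):
    """Number of interaction partners per gene (node degree)."""
    return {gene: len(partners) for gene, partners in adjacency.items()}


# Hypothetical toy edges with placeholder gene names.
edges = [("A", "B", 0.9), ("B", "C", 0.2), ("C", "D", 0.1)]
network = build_network(edges)
degrees = node_degrees(network)
n_edges = sum(degrees.values()) // 2
```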

2.4. Machine Learning Screening and Validation Gene Biomarkers

To diagnose and predict diseases more accurately, researchers have proposed various ML algorithms. Here, we applied two ML algorithms to screen hub genes. LASSO regression [35] uses regularization to improve prediction accuracy by performing variable selection and complexity adjustment while fitting a generalized linear model. Variables are selectively retained in the model to obtain better performance, and the degree of complexity adjustment is controlled by the parameter λ, which sets the severity of the penalty. In this study, the value of λ was determined by cross-validation using the “glmnet” package of R to fit the model, with the response family set to “binomial”, alpha = 1, and nfolds = 10. An SVM-RFE analysis [36] was used to iteratively construct the model, and the best features were then selected with a sequential backward selection algorithm based on the maximum margin principle of SVM. Cross-validation of the trained model was used to select, as feature genes, the gene subset with the minimum error. The SVM classifier was implemented with the “e1071”, “kernlab”, and “caret” packages. Additionally, we combined the LASSO and SVM-RFE results and used the “Venn” package to obtain the hub genes shared by the two algorithms as potential biomarkers for PD. Furthermore, we confirmed the differences in the expression of the candidate genes in the validation dataset GSE20164.
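To illustrate how the L1 penalty drives coefficients of unimportant features to exactly zero, the sketch below implements L1-penalized least squares by cyclic coordinate descent in plain Python. It is a simplification under stated assumptions: the study fitted a penalized logistic model with glmnet, whereas this example uses squared-error loss, no intercept, and pre-centered data; the function names and toy data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows (samples), y a list of responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero column carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data: feature 0 carries a strong signal, feature 1 a weak one.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
```

In this toy case the weakly associated second feature is dropped (its coefficient is exactly zero) while the first is kept with a shrunken coefficient, which is the behaviour that makes LASSO usable as a feature selector.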

2.5. Diagnostic Value of Gene Biomarkers in PD

We applied receiver operating characteristic (ROC) curve analyses [37], verified with an external validation dataset, to determine the potential predictive value of the gene expression differences. The closer the area under the ROC curve (AUC) is to 1, the better the specificity and sensitivity of the screened gene, implying greater accuracy as a potential biomarker of disease. In this study, an AUC > 0.6 was taken to indicate relatively satisfactory diagnostic efficiency.
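The AUC can be read as the probability that a randomly chosen PD sample receives a higher score than a randomly chosen control. A minimal Python sketch of this pairwise definition is given below; the function name and the toy scores are hypothetical, and the code is not the R routine used in the study.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison; ties between a case and a control count 0.5.

    scores: per-sample score (e.g., a gene's expression level).
    labels: 1 for PD, 0 for control.
    """
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

For a gene that is lower in PD than in controls, the negated expression would be passed as the score so that an AUC above 0.5 still indicates discrimination.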

2.6. Analysis of Immune Cell Components

CIBERSORT [38] is a computational method that quantifies the cell composition of complex tissues on the basis of their gene expression profiles and can be applied to RNA mixtures on a large scale to study cell biomarkers and therapeutic targets. R’s “CIBERSORT” package was employed to quantify the relative proportions of 22 infiltrating immune cell types. Only samples with a CIBERSORT p < 0.05 were retained, and immune cell types with zero values were excluded, to obtain the immune cell infiltration matrix. The packages “corrplot” and “vioplot” were employed to draw the bar and violin charts from the immune cell infiltration matrix, which help demonstrate the correlations and differences in immune cells between the SN of PD patients and that of normal controls. Finally, the correlation between the screened genes and immune cells was assessed by Pearson correlation analysis to explore how these genes may regulate immune cell infiltration and thereby influence the development of PD.
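The gene–immune cell correlation step relies on the Pearson coefficient, which can be sketched in a few lines of Python. The function name and inputs are hypothetical; in practice, x would hold a gene's expression across samples and y the estimated fraction of one immune cell type in the same samples.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A negative coefficient means that samples with higher expression of the gene tend to have a lower estimated fraction of that cell type; it does not by itself establish that the gene regulates the cell population.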

2.7. Statistical Analysis

We implemented all statistical analyses with R software (version 4.2.2). For the continuous variables, we used Student’s t-test, and for the categorical variables, we used the Mann–Whitney U test. p < 0.05 indicated statistical significance.

3. Results

3.1. Recognition of DEGs

The PCA cluster charts showed a random distribution of samples after the removal of the batch differences (Figure 2A,B). Compared with the control sample, we acquired 27 DEGs, including 25 upregulated and 2 downregulated genes, from the SN of PD patients. In addition, we visualized DEGs with heatmaps and volcano plots (Figure 2C,D).

3.2. DEGs Gene Enrichment Analysis

Through the GO enrichment analysis, we found that these genes were mainly enriched in BP, CC, and MF terms such as neurotransmitter transport, dopamine biosynthetic process, synapse organization, presynapse, synaptic vesicle, exocytic vesicle, neuron projection terminus, and transport vesicle (Figure 3A). Furthermore, the KEGG pathway enrichment analysis showed enrichment of DEGs in cocaine addiction (hsa05030), dopaminergic synapse (hsa04728), amphetamine addiction (hsa05031), alcoholism (hsa05034), synaptic vesicle cycle (hsa04721), serotonergic synapse (hsa04726), and tyrosine metabolism (hsa00350) (Figure 3B). The DO enrichment analysis indicated that the DEGs were associated with inflammatory and tumor diseases of the nervous system. Autonomic nervous system neoplasm, neuroblastoma, peripheral nervous system neoplasm, Parkinson’s disease, and synucleinopathy were the top five enriched disease terms (Figure 3C). The above findings obtained with GO, KEGG, and DO suggest that an immune response mechanism is involved in PD.

3.3. PPI Network Construction

A PPI network of the 27 DEGs was built with the STRING database to investigate the interactions among robust DEGs. With a confidence threshold of >0.15 and disconnected nodes hidden, the PPI network comprised 25 nodes and 96 edges (Figure 4A). The PPI data were imported into the Cytoscape software, and we identified two significant modules on the basis of the MCODE filtering criteria. Subcluster 1 had the highest cluster score of 9.111 and included a total of 10 nodes and 41 edges, and Subcluster 2 had a score of 4, with 5 nodes and 8 edges (Figure 4B,C).

3.4. Application of Machine Learning and Validation of Candidate Gene Biomarkers

To extract PD-related gene biomarkers from DEGs, we used the LASSO model and SVM-RFE. LASSO regression selected eight genes (Figure 5A), and the SVM-RFE algorithm identified six characteristic genes (Figure 5B). Four genes overlapped between the two sets, namely angiotensin II type 1 receptor (AGTR1), glycogen branching enzyme (GBE1), trophoblast glycoprotein (TPBG), and heat shock 70-kDa protein 6 (HSPA6) (Figure 5C).
Using the validation dataset, we further verified the expression levels of the characteristic genes. As seen from the boxplots, in GSE20164, the expression levels of AGTR1 and GBE1 in the PD group were significantly lower than those in the control group (p < 0.05), while TPBG expression was lower and HSPA6 expression was higher, without statistical significance (Figure 6).

3.5. Value of Gene Biomarkers in PD

Furthermore, the diagnostic efficacy of the genes was verified using the ROC curve in the validation dataset. Figure 7 indicates the specific AUC and 95% CI of the characteristic diagnostic genes: AGTR1 (AUC = 0.933), GBE1 (AUC = 0.967), TPBG (AUC = 0.767), and HSPA6 (AUC = 0.633). The results for these genes in the validation dataset were relatively satisfactory, suggesting useful predictive capability.

3.6. Analysis of Immune Cell Infiltration

Immune infiltration in PD was computed using the CIBERSORT algorithm. The immune cell components of the disease and control groups can be seen in the bar chart (Figure 8A), and the differences in each immune cell type between the groups can be seen in the violin plot (Figure 8B). Compared with the control group, the proportions of memory B cells and activated dendritic cells (DCs) in the PD group were lower (p = 0.035 and p = 0.037, respectively) and that of M2 macrophages was higher (p = 0.024). Figure 8C indicates the correlations between the immune cells. Our results showed that memory B cells correlated positively with activated DCs (r = 0.42) and negatively with naive B cells (r = −0.61). M2 macrophages correlated negatively with M0 macrophages (r = −0.58) but positively with monocytes (r = 0.33). The activated DCs had a significant negative correlation with naive B cells (r = −0.26).

3.7. Correlation Analysis between the Identified Genes and Immune Cell Infiltration

The correlation analysis with the 22 types of immune cells could indicate how these identified genes take part in the development of PD by regulating immune cell infiltration. The results indicated that AGTR1 (R = −0.53, p = 0.0017), GBE1 (R = −0.38, p = 0.029), and TPBG (R = −0.46, p = 0.007) correlated negatively with monocytes. AGTR1 correlated negatively with M2 macrophages (R = −0.46, p = 0.0073), GBE1 correlated negatively with resting memory CD4 T cells (R = −0.35, p = 0.046), and HSPA6 correlated negatively with plasma cells (R = −0.45, p = 0.0089) (Figure 9).

4. Discussion

PD is the second most common neurodegenerative disease after Alzheimer’s disease, and its clinical assessment is usually difficult. It is often misdiagnosed because its symptoms overlap with those of other conditions [39]. The recent fast growth of bioinformatics has offered an effective solution for discovering and screening possible diagnostic genes. In the present study, 27 DEGs were screened by differential analysis from the PD expression profiles extracted from the GEO database. Then, four PD-related hub genes, namely AGTR1, GBE1, TPBG, and HSPA6, were identified by the LASSO and SVM-RFE algorithms. Later, these four hub genes were validated in the external dataset GSE20164. In addition, the CIBERSORT immune infiltration analysis revealed that these four hub genes were associated with increased infiltration of M2 macrophages and decreased infiltration of memory B cells and activated DCs during the progression of PD.
In the enrichment analysis of DEGs, the GO/KEGG analyses highlighted presynapse, synaptic vesicle, exocytic vesicle, neuron projection terminus, transport vesicle, catecholamine binding, serotonergic synapse, tyrosine metabolism, and other immune-related signaling pathways. Tansey et al. demonstrated the causal role of inflammation and immune pathways in PD pathogenesis [40]. The DO enrichment analysis showed more clearly the association of PD with tumor and inflammatory diseases, such as autonomic nervous system neoplasm, neuroblastoma, peripheral nervous system neoplasm, and synucleinopathy. This further implied the close relationship of PD to the immune response mechanism. Therefore, the data analyzed herein are of potential significance.
LASSO is a regression-based ML approach that can actively select from various potentially multicollinear variables. We selected genes by choosing the λ value that gave the smallest cross-validation error [41]. SVM is often viewed as one of the most salient and mature binary classification algorithms in microarray computing and is especially useful for gene expression analysis [42]. Using the LASSO and SVM-RFE models, we identified four specific genes, AGTR1, GBE1, TPBG, and HSPA6, and verified them using an external dataset.
The AGTR1 gene, a component of the renin–angiotensin system, is crucial to the pathophysiology of cardiovascular diseases [43]. AGTR1 stimulation leads to physiological and pathological reactions, such as vasoconstriction, inflammation, and proliferation [44]. Moreover, studies in Japan have shown that AGTR1 gene variation is related to sporadic PD [45]. Notably, renin–angiotensin system inhibitors have been reported to improve PD patients’ motor functions and reduce L-DOPA-related dyskinesia [46]. Moreover, the activation of AGTR1 in a PD mouse model was found to cause oxidative stress, leading to the loss of midbrain dopaminergic neurons, and its inhibition prevented this loss [47]. The latest research shows that a single neuronal subtype characterized by the expression of the gene AGTR1 is spatially confined to the ventral tier of the SN and is highly susceptible to loss in PD, indicating a molecular process associated with degeneration [48]. In summary, such results indicate that AGTR1 may affect the selective susceptibility of dopaminergic neurons and that inhibitors of this pathway may be neuroprotective.
The GBE1 gene belongs to the glycosyl hydrolase 13 family, and its mutations can cause adult polyglucosan body disease (APBD), a fatal adult-onset neurodegenerative disease featuring progressive sensory deficits and upper and lower motor neuron dysfunction [49]. In APBD, GBE activity is reduced and the glucan chains are too long, wind around each other, and form polyglucosan bodies (PBs), provoking neuroinflammation and neurodegeneration [50]. PBs formed in glia and neurons seem to clog these cells, which may explain the neurological presentation of the disease [51]. The low expression of GBE1 in patients with PD may also be related to neuroinflammation or neuronal tangles in the SN caused by PBs, which needs further study.
TPBG, also known as Wnt-activated inhibitory factor 1 (WAIF1), is a single-pass transmembrane protein [52] that is usually highly expressed in trophoblast cells and tumors. Studies have found that it is also highly expressed in normal adult tissues, such as the brain, and TPBG is considered a PD-related gene [53]. One study has shown that gene ablation of TPBG causes slight degeneration of dopaminergic neurons in the midbrain of older mice. Furthermore, transcriptome analysis of the SN in older TPBG knockout mice confirmed TPBG as a potential candidate gene related to PD [54].
The HSPA6 gene encodes Hsp70B’, one of the stress-induced HSP70 proteins. There are no studies on the HSPA6 gene in animal models of neurodegenerative disorders in view of its sole existence in the human genome [55]. Previous studies found that the transcription activity of the stress-related HSPA6 gene was increased in PD patients in peripheral blood mononuclear cells [56]. Through bioinformatics research, we discovered that the HSPA6 gene is highly expressed in the SN of PD patients, with potential diagnostic value in PD recognition.
However, numerous studies are required to further prove the reliability of the diagnostic value of these genes. Recently, many studies have indicated the accelerating effect of immune cell infiltration into brain tissue on the disease process [57]. Therefore, the calculation of immune infiltration using the CIBERSORT algorithm is of great importance for discovering several immune subtypes closely related to the biological process of PD.
The increased infiltration of M2 macrophages and the decreased infiltration of memory B cells and activated dendritic cells (DCs) may be associated with the development of PD by way of neural injury and inflammation. Macrophages are the most significant innate immune cells in the brain as well as the most important regulators of neurodegenerative disorders. Some studies on multiple sclerosis have shown that activation of M2 macrophages/microglia can promote healing and repair [58]. Unfortunately, information on M2 macrophage/microglia markers is relatively lacking in PD and in chronic α-synuclein-based animal models of PD. Our results seem to be in contrast to previous studies, but this also shows that the role of M2 macrophages in the immune response in PD is complex and needs to be further researched [59]. Researchers have shown that the number of B cells, including memory B cells, may be reduced in PD patients [60]. Moreover, DCs act as antigen-presenting cells in the pathogenesis of neuroinflammation. Studies have shown that immature DCs can activate endothelial cells more than activated DCs [61]. Our study implied that the number of activated DCs in PD patients decreased, consistent with previous findings.
As for the correlation analysis, the gene biomarkers AGTR1, GBE1, and TPBG were all significantly negatively correlated with infiltrating monocytes. AGTR1 was negatively correlated with M2 macrophages, GBE1 with resting memory CD4 T cells, and HSPA6 with plasma cells. The pathophysiological mechanisms of PD involve extensive changes in inflammatory and immune cells. AGTR1, GBE1, TPBG, and HSPA6 may therefore take part in the pathophysiological process of PD through these inflammatory and immune cells. In addition, current studies on some PD immune targets, such as Wnt-related signaling [62] and the G protein-coupled receptor GPR109A [63], add confidence to our further research; such targets may more precisely ameliorate neuroinflammation in diseases such as PD by controlling tissue or organ inflammation, thereby treating or delaying disease progression. Such approaches may also avoid the toxic side effects associated with the long-term use of drugs such as levodopa. Our study also provides new ideas for PD immune-related treatments.
However, our research also has some limitations. First, because of our limited sample size, a larger sample is needed for validation. Second, our samples are from public databases and published literature; further patient data from our research center could be collected for external validation, and multiple regression models could be used to validate the sensitivity and specificity of the selected biomarkers. In the future, we will focus on the function of the screened candidate genes and design complete in vivo and in vitro experiments for validation.

5. Conclusions

In this work, we ultimately identified four hub genes, namely AGTR1, GBE1, TPBG, and HSPA6, by the LASSO and SVM-RFE algorithms. In addition, we found that increased infiltration of M2 macrophages and decreased infiltration of memory B cells and activated DCs may be involved in the progression of PD by way of neurological damage and inflammation, and AGTR1, GBE1, TPBG, and HSPA6 were significantly associated with the degree of immune cell infiltration. Our study may provide some insights into searching for potential immunotherapy targets for delaying or halting the progression of PD. However, more research is warranted to verify the role of these genes in the neuroimmune and neuroinflammation-related pathogenesis of PD.

Author Contributions

Conceptualization, Y.B., L.W., J.Y. and D.H.; Methodology, Y.B., L.W. and D.H.; Software, Y.B. and L.W.; Validation, Y.B., L.W. and D.H.; Formal analysis, Y.B. and L.W; Investigation, Y.B. and J.Y.; Resources, Y.B., F.Y. and J.Y.; Data curation, Y.B., F.Y., J.Y. and D.H.; Writing—original draft, Y.B.; Writing—review & editing, Y.B., F.Y. and J.Y.; Visualization, F.Y., J.Y. and D.H.; Supervision, J.Y. and D.H.; Project administration, D.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data related to the research are presented in the article.

Acknowledgments

We acknowledge the GEO database for providing their platforms and contributors for uploading their meaningful datasets.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bloem, B.R.; Okun, M.S.; Klein, C. Parkinson’s disease. Lancet 2021, 397, 2284–2303. [Google Scholar] [CrossRef] [PubMed]
  2. Sveinbjornsdottir, S. The clinical symptoms of Parkinson’s disease. J. Neurochem. 2016, 139, 318–324. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Ahlskog, J.E.; Muenter, M.D. Frequency of levodopa-related dyskinesias and motor fluctuations as estimated from the cumulative literature. Mov. Disord. 2001, 16, 448–458. [Google Scholar] [CrossRef] [PubMed]
  4. Chapuis, S.; Ouchchane, L.; Metz, O.; Gerbaud, L.; Durif, F. Impact of the motor complications of Parkinson’s disease on the quality of life. Mov. Disord. 2005, 20, 224–230. [Google Scholar] [CrossRef] [PubMed]
  5. Harms, A.S.; Ferreira, S.A.; Romero-Ramos, M. Periphery and brain, innate and adaptive immunity in Parkinson’s disease. Acta Neuropathol. 2021, 141, 527–545. [Google Scholar]
  6. Zhu, B.; Yin, D.; Zhao, H.; Zhang, L. The immunology of Parkinson’s disease. Semin. Immunopathol. 2022, 44, 659–672. [Google Scholar] [CrossRef]
  7. Abdi, I.Y.; Ghanem, S.S.; El-Agnaf, O.M. Immune-related biomarkers for Parkinson’s disease. Neurobiol. Dis. 2022, 170, 105771. [Google Scholar]
  8. Racette, B.A.; Gross, A.; Vouri, S.M.; Camacho-Soto, A.; Willis, A.W.; Searles Nielsen, S. Immunosuppressants and risk of Parkinson disease. Ann. Clin. Transl. Neurol. 2018, 5, 870–875. [Google Scholar] [CrossRef]
  9. Peter, I.; Dubinsky, M.; Bressman, S.; Park, A.; Lu, C.Y.; Chen, N.J.; Wang, A. Anti-Tumor Necrosis Factor Therapy and Incidence of Parkinson Disease Among Patients with Inflammatory Bowel Disease. JAMA Neurol. 2018, 75, 939–946. [Google Scholar] [CrossRef]
  10. Paakinaho, A.; Koponen, M.; Tiihonen, M.; Kauppi, M.; Hartikainen, S.; Tolppanen, A.M. Disease-Modifying Antirheumatic Drugs and Risk of Parkinson Disease Nested Case-Control Study of People with Rheumatoid Arthritis. Neurology 2022, 98, E1273–E1281. [Google Scholar] [CrossRef]
  11. Marogianni, C.; Sokratous, M.; Dardiotis, E.; Hadjigeorgiou, G.M.; Bogdanos, D.; Xiromerisiou, G. Neurodegeneration and Inflammation—An Interesting Interplay in Parkinson’s Disease. Int. J. Mol. Sci. 2020, 21, 8421. [Google Scholar] [CrossRef] [PubMed]
  12. Badanjak, K.; Fixemer, S.; Smajic, S.; Skupin, A.; Grunewald, A. The Contribution of Microglia to Neuroinflammation in Parkinson’s Disease. Int. J. Mol. Sci. 2021, 22, 4676. [Google Scholar] [CrossRef] [PubMed]
  13. Subbarayan, M.S.; Hudson, C.; Moss, L.D.; Nash, K.R.; Bickford, P.C. T cell infiltration and upregulation of MHCII in microglia leads to accelerated neuronal loss in an alpha-synuclein rat model of Parkinson’s disease. J. Neuroinflamm. 2020, 17, 242. [Google Scholar] [CrossRef]
  14. Simon, D.K.; Tanner, C.M.; Brundin, P. Parkinson Disease Epidemiology, Pathology, Genetics, and Pathophysiology. Clin. Geriatr. Med. 2020, 36, 1–12. [Google Scholar] [CrossRef]
  15. Blauwendraat, C.; Nalls, M.A.; Singleton, A.B. The genetic architecture of Parkinson’s disease. Lancet Neurol. 2020, 19, 170–178. [Google Scholar] [CrossRef] [PubMed]
  16. Chen, H.Z.; Yang, J.K.; Wu, W.J. Seven key hub genes identified by gene co-expression network in cutaneous squamous cell carcinoma. BMC Cancer 2021, 21, 852. [Google Scholar] [CrossRef]
  17. Zheng, H.R.; Qian, X.H.; Tian, W.T.; Cao, L. Exploration of the Common Gene Characteristics and Molecular Mechanism of Parkinson’s Disease and Crohn’s Disease from Transcriptome Data. Brain Sci. 2022, 12, 774. [Google Scholar] [CrossRef]
  18. Shen, L.; Zhou, K.G.; Liu, H.; Yang, J.; Huang, S.Q.; Yu, F.; Huang, D.Y. Prediction of Mechanosensitive Genes in Vascular Endothelial Cells Under High Wall Shear Stress. Front. Genet. 2022, 12, 796812. [Google Scholar] [CrossRef]
  19. Fang, K.Y.; Liang, G.N.; Zhuang, Z.Q.; Fang, Y.X.; Dong, Y.Q.; Liang, C.J.; Chen, X.Y.; Guo, X.G. Screening the hub genes and analyzing the mechanisms in discharged COVID-19 patients retesting positive through bioinformatics analysis. J. Clin. Lab. Anal. 2022, 36, e24495. [Google Scholar] [CrossRef]
  20. Wu, L.J.; Tian, X.X.; Du, H.; Liu, X.M.; Wu, H.G. Bioinformatics Analysis of LGR4 in Colon Adenocarcinoma as Potential Diagnostic Biomarker, Therapeutic Target and Promoting Immune Cell Infiltration. Biomolecules 2022, 12, 1081. [Google Scholar] [CrossRef]
  21. Liu, S.H.; Wang, Y.L.; Jiang, S.M.; Wan, X.J.; Yan, J.H.; Liu, C.F. Identifying the hub gene and immune infiltration of Parkinson’s disease using bioinformatical methods. Brain Res. 2022, 1785, 147879. [Google Scholar] [CrossRef] [PubMed]
  22. Moradi, S.; Tapak, L.; Afshar, S. Identification of Novel Noninvasive Diagnostics Biomarkers in the Parkinson’s Diseases and Improving the Disease Classification Using Support Vector Machine. BioMed Res. Int. 2022, 2022, 5009892. [Google Scholar] [CrossRef] [PubMed]
  23. Wang, Y.; Huang, X.L.; Xian, B.; Jiang, H.J.; Zhou, T.; Chen, S.Y.; Wen, F.Y.; Pei, J. Machine learning and bioinformatics-based insights into the potential targets of saponins in Paris polyphylla smith against non-small cell lung cancer. Front. Genet. 2022, 13, 3123. [Google Scholar] [CrossRef] [PubMed]
  24. Huang, R.X.; Chen, D.Y.; Wang, H.M.; Zhang, B.H.; Zhang, Y.; Ren, W. SFRP2 is a Novel Diagnostic Biomarker and Suppresses the Proliferation of Pituitary Adenoma. J. Oncol. 2022, 2022, 4272525. [Google Scholar] [CrossRef] [PubMed]
  25. Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M.; et al. NCBI GEO: Archive for functional genomics data sets-update. Nucleic Acids Res. 2013, 41, D991–D995. [Google Scholar] [CrossRef] [Green Version]
  26. Leek, J.T.; Johnson, W.E.; Parker, H.S.; Jaffe, A.E.; Storey, J.D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 2012, 28, 882–883. [Google Scholar] [CrossRef] [Green Version]
  27. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.F.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [Google Scholar] [CrossRef]
  28. Ito, K.; Murphy, D. Application of ggplot2 to Pharmacometric Graphics. CPT: Pharmacomet. Syst. Pharmacol. 2013, 2, e79. [Google Scholar] [CrossRef]
  29. Ashburner, M.; Ball, C.A.; Blake, J.A.; Botstein, D.; Butler, H.; Cherry, J.M.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene Ontology: Tool for the unification of biology. Nat. Genet. 2000, 25, 25–29. [Google Scholar] [CrossRef] [Green Version]
  30. Kanehisa, M.; Furumichi, M.; Tanabe, M.; Sato, Y.; Morishima, K. KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017, 45, D353–D361. [Google Scholar] [CrossRef] [Green Version]
  31. Schriml, L.M.; Mitraka, E.; Munro, J.; Tauber, B.; Schor, M.; Nickle, L.; Felix, V.; Jeng, L.; Bearer, C.; Lichenstein, R.; et al. Human Disease Ontology 2018 update: Classification, content and workflow expansion. Nucleic Acids Res. 2019, 47, D955–D962. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Szklarczyk, D.; Franceschini, A.; Wyder, S.; Forslund, K.; Heller, D.; Huerta-Cepas, J.; Simonovic, M.; Roth, A.; Santos, A.; Tsafou, K.P.; et al. STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015, 43, D447–D452. [Google Scholar] [CrossRef]
  33. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504. [Google Scholar] [CrossRef] [PubMed]
  34. Bader, G.D.; Hogue, C.W. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 2003, 4, 2. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  35. Tibshirani, R. Regression shrinkage and selection via the lasso: A retrospective. J. R. Stat. Soc. Ser. B-Stat. Methodol. 2011, 73, 273–282. [Google Scholar] [CrossRef]
  36. Lin, X.H.; Li, C.; Zhang, Y.H.; Su, B.Z.; Fan, M.; Wei, H. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics. Molecules 2018, 23, 52. [Google Scholar] [CrossRef] [Green Version]
  37. Janssens, A.; Martens, F.K. Reflection on modern methods: Revisiting the area under the ROC Curve. Int. J. Epidemiol. 2020, 49, 1397–1403. [Google Scholar] [CrossRef]
  38. Newman, A.M.; Liu, C.L.; Green, M.R.; Gentles, A.J.; Feng, W.G.; Xu, Y.; Hoang, C.D.; Diehn, M.; Alizadeh, A.A. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 2015, 12, 453–457. [Google Scholar] [CrossRef] [Green Version]
  39. Hess, C.W.; Okun, M.S. Diagnosing Parkinson Disease. Continuum 2016, 22, 1047–1063. [Google Scholar] [CrossRef]
  40. Tansey, M.G.; Wallings, R.L.; Houser, M.C.; Herrick, M.K.; Keating, C.E.; Joers, V. Inflammation and immune dysfunction in Parkinson disease. Nat. Rev. Immunol. 2022, 22, 657–673. [Google Scholar] [CrossRef]
  41. Huang, H.W. Controlling the false discoveries in LASSO. Biometrics 2017, 73, 1102–1110. [Google Scholar] [CrossRef] [PubMed]
  42. Brown, M.P.S.; Grundy, W.N.; Lin, D.; Cristianini, N.; Sugnet, C.W.; Furey, T.S.; Ares, M.; Haussler, D. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. USA 2000, 97, 262–267. [Google Scholar] [CrossRef] [PubMed]
  43. Wagenaar, L.J.; Voors, A.A.; Buikema, H.; van Gilst, W.H. Angiotensin receptors in the cardiovascular system. Can. J. Cardiol. 2002, 18, 1331–1339. [Google Scholar] [PubMed]
  44. Griendling, K.K.; Lassegue, B.; Alexander, R.W. Angiotensin receptors and their therapeutic implications. Annu. Rev. Pharmacol. Toxicol. 1996, 36, 281–306. [Google Scholar] [CrossRef] [PubMed]
  45. Saito, S.; Iida, A.; Sekine, A.; Kawauchi, S.; Higuchi, S.; Ogawa, C.; Nakamura, Y. Catalog of 178 variations in the Japanese population among eight human genes encoding G protein-coupled receptors (GPCRs). J. Hum. Genet. 2003, 48, 461–468. [Google Scholar] [CrossRef]
  46. Reardon, K.A.; Mendelsohn, F.A.O.; Chai, S.Y.; Horne, M.K. The angiotensin converting enzyme (ACE) inhibitor, perindopril, modifies the clinical features of Parkinson’s disease. Aust. N. Z. J. Med. 2000, 30, 48–53. [Google Scholar] [CrossRef]
  47. Labandeira-Garcia, J.L.; Garrido-Gil, P.; Rodriguez-Pallares, J.; Valenzuela, R.; Borrajo, A.; Rodriguez-Perez, A.I. Brain renin-angiotensin system and dopaminergic cell vulnerability. Front. Neuroanat. 2014, 8, 67. [Google Scholar]
  48. Kamath, T.; Abdulraouf, A.; Burris, S.J.; Langlieb, J.; Gazestani, V.; Nadaf, N.M.; Balderrama, K.; Vanderburg, C.; Macosko, E.Z. Single-cell genomic profiling of human dopamine neurons identifies a population that selectively degenerates in Parkinson’s disease. Nat. Neurosci. 2022, 25, 588–595. [Google Scholar] [CrossRef]
  49. Akman, H.O.; Kakhlon, O.; Coku, J.; Peverelli, L.; Rosenmann, H.; Rozenstein-Tsalkovich, L.; Turnbull, J.; Meiner, V.; Chama, L.; Lerer, I.; et al. Deep Intronic GBE1 Mutation in Manifesting Heterozygous Patients with Adult Polyglucosan Body Disease. JAMA Neurol. 2015, 72, 441–445. [Google Scholar] [CrossRef] [Green Version]
  50. Gumusgoz, E.; Guisso, D.R.; Kasiri, S.; Wu, J.; Dear, M.; Verhalen, B.; Nitschke, S.; Mitra, S.; Nitschke, F.; Minassian, B.A. Targeting Gys1 with AAV-SaCas9 Decreases Pathogenic Polyglucosan Bodies and Neuroinflammation in Adult Polyglucosan Body and Lafora Disease Mouse Models. Neurotherapeutics 2021, 18, 1414–1425. [Google Scholar] [CrossRef]
  51. Gumusgoz, E.; Kasiri, S.; Guisso, D.R.; Wu, J.; Dear, M.; Verhalen, B.; Minassian, B.A. AAV-Mediated Artificial miRNA Reduces Pathogenic Polyglucosan Bodies and Neuroinflammation in Adult Polyglucosan Body and Lafora Disease Mouse Models. Neurotherapeutics 2022, 19, 982–993. [Google Scholar] [CrossRef] [PubMed]
  52. Zhao, Y.G.; Malinauskas, T.; Harlos, K.; Jones, E.Y. Structural Insights into the Inhibition of Wnt Signaling by Cancer Antigen 5T4/Wnt-Activated Inhibitory Factor 1. Structure 2014, 22, 612–620. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  53. Bossers, K.; Meerhoff, G.; Balesar, R.; van Dongen, J.W.; Kruse, C.G.; Swaab, D.F.; Verhaagen, J. Analysis of Gene Expression in Parkinson’s Disease: Possible Involvement of Neurotrophic Support and Axon Guidance in Dopaminergic Cell Death. Brain Pathol. 2009, 19, 91–107. [Google Scholar] [CrossRef] [PubMed]
  54. Park, S.; Yoo, J.E.; Yeon, G.B.; Kim, J.H.; Lee, J.S.; Choi, S.K.; Hwang, Y.G.; Park, C.W.; Cho, M.S.; Kim, J.; et al. Trophoblast glycoprotein is a new candidate gene for Parkinson’s disease. npj Park. Dis. 2021, 7, 110. [Google Scholar] [CrossRef]
  55. Noonan, E.J.; Place, R.F.; Giardina, C.; Hightower, L.E. Hsp70B’ regulation and function. Cell Stress Chaperones 2007, 12, 219–229. [Google Scholar] [CrossRef] [Green Version]
  56. Vavilova, J.D.; Boyko, A.A.; Troyanova, N.I.; Ponomareva, N.V.; Fokin, V.F.; Fedotova, E.Y.; Streltsova, M.A.; Kust, S.A.; Grechikhina, M.V.; Shustova, O.A.; et al. Alterations in Proteostasis System Components in Peripheral Blood Mononuclear Cells in Parkinson Disease: Focusing on the HSP70 and p62 Levels. Biomolecules 2022, 12, 493. [Google Scholar] [CrossRef]
  57. Tan, E.K.; Chao, Y.X.; West, A.; Chan, L.L.; Poewe, W.; Jankovic, J. Parkinson disease and the immune system—Associations, mechanisms and therapeutics. Nat. Rev. Neurol. 2020, 16, 303–318. [Google Scholar] [CrossRef]
  58. Moehle, M.S.; West, A.B. M1 and M2 immune activation in Parkinson’s disease: Foe and ally? Neuroscience 2015, 302, 59–73. [Google Scholar] [CrossRef] [Green Version]
  59. El-Deeb, N.K.; El-Tanbouly, D.M.; Khattab, M.A.; El-Yamany, M.F.; Mohamed, A.F. Crosstalk between PI3K/AKT/KLF4 signaling and microglia M1/M2 polarization as a novel mechanistic approach towards flibanserin repositioning in Parkinson’s disease. Int. Immunopharmacol. 2022, 112, 109191. [Google Scholar] [CrossRef]
  60. Stevens, C.H.; Rowe, D.; Morel-Kopp, M.C.; Orr, C.; Russell, T.; Ranola, M.; Ward, C.; Halliday, G.M. Reduced T helper and B lymphocytes in Parkinson’s disease. J. Neuroimmunol. 2012, 252, 95–99. [Google Scholar] [CrossRef] [Green Version]
  61. Arjmandi, A.; Liu, K.; Dorovini-Zis, K. Dendritic Cell Adhesion to Cerebral Endothelium: Role of Endothelial Cell Adhesion Molecules and Their Ligands. J. Neuropathol. Exp. Neurol. 2009, 68, 300–313. [Google Scholar] [CrossRef] [PubMed]
  62. Serafino, A.; Cozzolino, M. The Wnt/beta-catenin signaling: A multifunctional target for neuroprotective and regenerative strategies in Parkinson’s disease. Neural Regen. Res. 2023, 18, 306–308. [Google Scholar] [CrossRef] [PubMed]
  63. Taing, K.; Chen, L.; Weng, H.-R. Emerging roles of GPR109A in regulation of neuroinflammation in neurological diseases and pain. Neural Regen. Res. 2023, 18, 763–768. [Google Scholar] [PubMed]
Figure 1. Flowchart of this study.
Figure 2. PCA and DEGs in the substantia nigra between PD and normal controls. (A) PCA of the raw data before batch correction; (B) PCA after ComBat batch correction; (C) heatmap of the significant DEGs. The two colors denote opposite trends, with a darker color indicating a more pronounced trend; (D) volcano plot of the DEGs. Red and green denote upregulated and downregulated genes, respectively, while grey denotes no significant difference. PCA: principal component analysis; DEGs: differentially expressed genes.
Figure 3. The results of the enrichment analysis of differentially expressed genes (DEGs). (A) Gene Ontology (GO) enrichment analysis, where the x-axis refers to the generation, and the y-axis refers to the significantly enriched GO terms. (B) Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment shown as a Circos plot. Each column in the outermost circle corresponds to a KEGG pathway. The second circle represents the number of genes contained in each pathway. The redder the color, the more significant the enrichment of DEGs. The third circle represents the number of DEGs enriched. The innermost circle represents the proportion of DEGs among the enriched genes of the pathway. (C) Disease Ontology (DO) enrichment analysis, where the x-axis refers to the gene count, and the y-axis refers to the enriched diseases.
Figure 4. PPI network construction and two subcluster modules extracted by MCODE. (A) The interaction network among the proteins encoded by the DEGs (25 nodes and 96 edges). Each node refers to a protein, while each edge refers to a protein–protein interaction between two nodes. (B) Subcluster module 1 was extracted by MCODE and consisted of 10 nodes and 41 edges; MCODE score = 9.111. (C) Subcluster module 2 consisted of 5 nodes and 8 edges; MCODE score = 4.
Figure 5. Characteristic gene biomarkers jointly screened by LASSO and SVM-RFE. (A) Eight genes were extracted as PD gene biomarkers with the LASSO algorithm; (B) six genes were extracted as PD gene biomarkers with the SVM-RFE algorithm; (C) Venn diagram indicating the four overlapping genes between LASSO and SVM-RFE. LASSO: least absolute shrinkage and selection operator; SVM-RFE: support vector machine recursive feature elimination.
Figure 6. Expression levels of the four genes in the verification dataset GSE20164 for the substantia nigra samples of the control and PD groups. (A) AGTR1: p = 0.014; (B) GBE1: p = 0.002; (C) TPBG: p = 0.15; (D) HSPA6: p = 0.28. p < 0.05 denotes statistical significance. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
Figure 7. The ROC curves of the four genes in the validation dataset. (A) AGTR1: AUC = 0.933; (B) GBE1: AUC = 0.967; (C) TPBG: AUC = 0.767; and (D) HSPA6: AUC = 0.633. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
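For readers unfamiliar with how a single-gene AUC is obtained, the following sketch computes it as the probability that a randomly chosen PD sample scores higher than a randomly chosen control, with ties counted as one half (the Mann–Whitney formulation). The function name and the expression values are hypothetical; this is not the software used in the study and the numbers are not taken from GSE20164.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as 1/2."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values for one gene (illustration only).
pd_expression = [2.1, 3.4, 2.9]
control_expression = [1.0, 2.5, 1.8]
auc = roc_auc(pd_expression, control_expression)
print(f"AUC = {auc:.3f}")
```

For a gene that is downregulated in PD, this value falls below 0.5; swapping the two groups (or negating the scores) gives the AUC in the conventional 0.5 to 1 range.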
Figure 8. Analysis of the infiltrating immune cells. (A) Comparison of the proportions of 22 types of immune cells between the control group and the PD group. The x-axis refers to the immune cells, and the y-axis refers to the relative percentage. (B) Differences in immune cell infiltration. Blue and red represent the control group and the PD group, respectively. The x-axis represents the type of immune cells, and the y-axis represents the fraction. p < 0.05 denotes statistical significance (B cells memory, M2 macrophages, and activated dendritic cells show significant differential infiltration). (C) Correlations among the infiltrating immune cells. The x- and y-axes represent the immune cell types; red refers to a positive correlation, and blue refers to a negative correlation. A darker color represents a stronger association.
Figure 9. Immune cell infiltration correlations of the four selected genes. (A) lollipop plot of the correlation between AGTR1 and immune cells; (B,C) scatter plots of the significant correlations between AGTR1 and immune cells (M2 macrophages: R = −0.46, p = 0.0073; monocytes: R = −0.53, p = 0.0017); (D) lollipop plot of the correlation between GBE1 and immune cells; (E,F) scatter plots of the significant correlations between GBE1 and immune cells (T cells CD4 memory resting: R = −0.35, p = 0.046; monocytes: R = −0.38, p = 0.029); (G) lollipop plot of the correlation between TPBG and immune cells; (H) scatter plot of the significant correlation between TPBG and monocytes (R = −0.46, p = 0.007); (I) lollipop plot of the correlation between HSPA6 and immune cells; (J) scatter plot of the significant correlation between HSPA6 and plasma cells (R = −0.45, p = 0.0089). In the right column of the lollipop plots, p-values < 0.05 shown in red indicate statistical significance. AGTR1: angiotensin II type 1 receptor; GBE1: glycogen branching enzyme; TPBG: trophoblast glycoprotein; and HSPA6: heat shock 70-kDa protein 6.
Table 1. A summary of the PD datasets used in the analysis and independent validation.
Contributor | Accession | Platform | Samples (Normal/PD) | Country | Last Update Date
Middleton FA | GSE20141 | GPL570 | 8/10 | USA | 25 March 2019
Dijkstra AA | GSE49036 | GPL570 | 8/15 | The Netherlands | 25 March 2019
Ffrench-Mullen JM | GSE7621 | GPL570 | 9/16 | USA | 25 March 2019
Edna G | GSE20333 | GPL201 | 6/6 | Israel | 25 October 2022
Hauser MA | GSE20164 | GPL96 | 5/6 | USA | 10 August 2018
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Bao, Y.; Wang, L.; Yu, F.; Yang, J.; Huang, D. Parkinson’s Disease Gene Biomarkers Screened by the LASSO and SVM Algorithms. Brain Sci. 2023, 13, 175. https://doi.org/10.3390/brainsci13020175
