Article

Two-Sample Mendelian Randomization Study Identifies Tissue-Dependent Risk Genes in Autoimmune Diseases

Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
Curr. Issues Mol. Biol. 2024, 46(11), 12311-12321; https://doi.org/10.3390/cimb46110731
Submission received: 4 October 2024 / Revised: 27 October 2024 / Accepted: 29 October 2024 / Published: 31 October 2024
(This article belongs to the Section Bioinformatics and Systems Biology)

Abstract

Autoimmune diseases are among the most prevalent diseases across the world with genetic and environmental factors that contribute to their etiology. Because the exact causes of autoimmune diseases are largely unknown, a Mendelian randomization (MR) approach is used here to examine the potential causal association between gene expression levels and disease risk across various tissues. Specifically, this study focuses on six autoimmune diseases including Crohn’s disease, ulcerative colitis, rheumatoid arthritis, multiple sclerosis, type 1 diabetes mellitus, and systemic lupus erythematosus. Several of these diseases are currently treatable with immunosuppressants that target specific genes, such as TNF-alpha, IL-23, CD20, and more. In this study, a two-sample MR analysis is performed with multitissue expression quantitative trait loci (eQTLs) and large-scale genome-wide association studies to investigate how gene expression can influence the risk of developing these diseases. Our results show that genes HLA-DQA1/2, HLA-DRB1/6, HLA-DQB2, C4A, CYP21A2, and HLA-DQB1-AS1 have a high causal effect across several diseases and tissues, and almost all of these findings originate from the major histocompatibility complex (MHC) region on Chromosome 6. Our findings support the current knowledge of genes associated with these diseases while also revealing novel genes that can be used for drug therapies in the future. Although several drug therapies currently exist to treat this selection of autoimmune diseases, we provide further insights into the main, common pathways responsible for autoimmune disease pathogenesis and discuss novel genes that lack research focus.

1. Introduction

Autoimmune diseases are caused by abnormal immune activity in which the body attacks its own cells [1]. Approximately ten percent of the population has some form of autoimmunity, making it the third most common category of disease [2]. Autoimmune diseases are multifactorial and can be linked to both genetic and environmental factors. Genetic factors play a large role in regulating the proper function of the immune system, and genetic variation may lead to a dysregulated immune system and disease [3]. Although environmental conditions also play a large role in the risk of developing these diseases, we focus specifically on the genetic aspects. Autoimmune diseases vary greatly in pathogenesis, so we study six of the most common ones to provide insight into their shared and disease-specific genetic factors: Crohn’s disease (CD), ulcerative colitis (UC), rheumatoid arthritis (RA), multiple sclerosis (MS), type 1 diabetes mellitus (T1D), and systemic lupus erythematosus (SLE).
CD and UC are two forms of inflammatory bowel disease (IBD) [4]. CD is generally localized to the region from the end of the small intestine (terminal ileum) to the start of the colon, but it can occur in any part of the gastrointestinal (GI) tract. UC, in contrast, can occur anywhere in the colon and rectum. These two diseases share symptoms such as diarrhea, weight loss, abdominal pain, and other related GI symptoms [4]. RA is an autoimmune disease that primarily affects the joints [5]. RA can attack multiple joints at once, most commonly those of the hands, wrists, and knees, and is also known to affect other organs, such as the eyes, lungs, and heart. It is often associated with variants in the HLA complex and in genes encoding cytokines and is especially prevalent in older women [6]. MS is considered a continuum of inflammatory and neurodegenerative processes that fluctuate and evolve over time and vary among individuals [7]. MS does not directly fit into the standard T-cell autoimmune dogma [8], and there is debate over whether inflammation is the initial trigger of the disease or a secondary response [9]. Studies of common autoimmune diseases show that MS has a low rate of co-occurrence with other autoimmune diseases [2,10]. In MS, the myelin sheaths of nerve cells are damaged. Myelin sheaths are the protective covering of nerve cells and make the electrical impulses within a nerve cell more efficient, speeding up the signal. Because of this, damage to myelin can result in cognitive impairment, reduced motor function, and weakness or numbness in the limbs. T1D, unlike type 2 diabetes, where there is reduced insulin production or decreased insulin efficiency, is an autoimmune disease characterized by the destruction of the insulin-producing beta cells of the pancreas [11]. This destruction leads to a deficiency in insulin, which causes hyperglycemia, increased thirst, weight loss, fatigue, and extreme hunger.
SLE is an autoimmune disease that, unlike the other diseases mentioned, causes inflammation in a multitude of tissues and organs, including the skin, joints, kidneys, cardiovascular system, respiratory system, brain, nervous system, and blood [12]. There is a wide range of symptoms, including joint pain and swelling, skin rashes, chest pain and discomfort, and neurological symptoms. SLE has a female-to-male ratio of 9:1 and can be traced back to variants of the HLA complex and cytokines [13].
Through the use of genome-wide association studies (GWAS), many genes have been found to be correlated with the risk of autoimmune diseases, although these associations need to be further verified by functional studies and clinical trials [14]. GWAS have significantly advanced our understanding of autoimmune diseases, particularly highlighting the role of the HLA region and of non-HLA genes such as PTPN22 and IL23R [15,16,17]. While these studies have reported shared genetic factors across various autoimmune diseases, challenges remain, including the “missing heritability” and the functional interpretation of noncoding variants. Emerging approaches, including the integration of GWAS with multiomics data, the development of polygenic risk scores, and Mendelian randomization (MR) studies, aim to enhance the functional understanding and clinical applications of these findings.
Mendelian randomization studies can identify and analyze the causal effects of genetic variants on autoimmune diseases by integrating existing GWAS and functional genomics datasets [18]. MR simulates a randomized controlled trial, given that alleles are randomly inherited [19]. Single nucleotide polymorphisms (SNPs) are used as instrumental variables (IVs), and the causal effect is estimated as the ratio of an SNP’s effect on the outcome to its effect on the exposure. Each SNP is associated with a specific phenotype, so the ratio represents the causal influence of that phenotype on the outcome. GWAS and MR have been increasingly integrated to investigate the causal relationship between phenotypes and disease outcomes. Adding transcriptome data (eQTLs as IVs) also allows for a better understanding of the biological mechanisms of diseases [20]. Moreover, gene expression differs between tissues given their distinct functions, so tissue-specific eQTLs allow us to identify associations between specific tissues and diseases.
While GWAS and MR analyses have been used to study autoimmune diseases in previous studies [21,22], this study specifically focuses on gene expression levels to find out how gene regulation plays a role in autoimmune disease risk and how this differs across tissues. Specifically, we use eQTL data from the Genotype-Tissue Expression Consortium (GTEx, v8) and MR-Base to perform two-sample MR analysis [23,24]. Through the use of GWAS, MR, and eQTL, we investigate how the expression level of genes may influence the risk of developing autoimmune diseases. Moreover, we discuss any potential shared genetic dispositions and how they affect the pathogenesis of these diseases.

2. Materials and Methods

2.1. Mendelian Randomization (MR)

Mendelian randomization allows us to estimate a causal relationship between an exposure and an outcome by evaluating the effect of an instrumental variable (IV) on both [25]. Due to the mechanisms of genetic inheritance, the alleles a person carries at any given gene are expected to be randomly assigned. MR exploits this natural process to mimic a randomized controlled trial and detect causal relationships (Figure 1). This analysis framework depends on three assumptions: (1) the IV is associated with the exposure; (2) the IV is not associated with confounding variables that influence both the exposure and the outcome; and (3) the IV does not affect the outcome directly, but only through the given exposure. In this study, we use gene expression levels as the exposure and autoimmune disease risk as the outcome. Genetic variants that influence gene expression levels (eQTLs) serve as IVs and are tested against various autoimmune diseases to determine the causal effect of gene expression on disease risk. Because our study focuses on transcriptome data and most genes have only a single eQTL SNP as IV, analyses that thoroughly check these assumptions, including tests for horizontal pleiotropy, are not available.

2.2. Two-Sample MR (MR-Base)

To perform MR analysis, we use the ‘TwoSampleMR’ R Package (0.5.7, https://mrcieu.github.io/TwoSampleMR, accessed on 20 June 2024) and ‘MRInstruments’ R package (https://github.com/mrcieu/mrinstruments, accessed on 20 June 2024) from the MR-Base platform [23]. Two-sample MR allows datasets from different populations to be compared, is less prone to data overfitting compared to one-sample MR, and increases the power of our analysis. Through the MR-Base package, we harmonize eQTL data from GTEx (v8) with GWAS summary statistics data from the NHGRI-EBI GWAS Catalog [26]. The GWAS datasets (Table 1) are from these GWAS IDs (CD: ukb-b-8210, T1D: ebi-a-GCST90014023, Lupus: ebi-a-GCST90018917, RA: ebi-a-GCST90038685, MS: ukb-b-17670, UC: ebi-a-GCST90038684). GTEx data are publicly available and downloaded from the MRInstruments R package. Due to data availability, we focus predominantly on individuals of European ancestry.

2.3. Harmonizing and Data Cleaning

Harmonizing is the process of combining the summary-level statistics from the exposure and outcome studies and is required to determine any causal effect in MR. This is performed by ensuring that the effect allele, along with its corresponding effect size (beta) and standard error (se), in the outcome dataset matches the effect allele in the exposure dataset. One potential problem in two-sample MR is the presence of palindromic SNPs. These occur when the two alleles of an SNP are nucleotides that pair with each other. For example, if the effect allele is reported as A and the other allele as T, it becomes very difficult to determine which allele is the effect allele. SNPs can be reported on either the forward strand (5′ to 3′) or the reverse strand (3′ to 5′), which may result in the same SNP being reported as its complementary base (A may be reported as T if read from the reverse strand) [23]. While the effect allele can easily be determined for nonpalindromic SNPs, palindromic SNPs additionally require that effect allele frequencies are reported and that the minor allele frequency is clearly below 50%, so that the strand can be inferred from the frequencies. For these reasons, we remove all palindromic SNPs from this analysis. Additionally, we limit our exposure data to cis-eQTLs and remove trans-eQTL SNPs, as cis-eQTLs have stronger effects and are less likely to violate MR assumptions. Cis-eQTLs are genetic variants that affect the expression of nearby genes on the same haplotype, while trans-eQTLs are variants that are on different chromosomes or located far away from the regulated genes on the same chromosome [27]. The threshold for eQTL inclusion is p < 1.0 × 10−5, given that our analyses are confined to the top eQTL only. This threshold is used to maximize the number of genes that can be analyzed across tissues.
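The allele-matching logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure implemented in the TwoSampleMR package: the `SnpEffect` record and the `is_palindromic` and `harmonise` helpers are hypothetical names introduced here, alleles are assumed to be single upper-case nucleotides, and palindromic SNPs are always dropped rather than resolved by allele frequency.

```python
from dataclasses import dataclass
from typing import Optional

# Watson-Crick complements, used to detect palindromic SNPs and strand flips.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


@dataclass
class SnpEffect:
    """Hypothetical summary-statistic record for one SNP in one study."""
    snp: str
    effect_allele: str
    other_allele: str
    beta: float
    se: float


def is_palindromic(effect_allele: str, other_allele: str) -> bool:
    """True for A/T or C/G SNPs, whose alleles look identical on both strands."""
    return COMPLEMENT[effect_allele] == other_allele


def harmonise(exposure: SnpEffect, outcome: SnpEffect) -> Optional[SnpEffect]:
    """Align the outcome effect to the exposure effect allele.

    Returns the aligned outcome record, or None when the SNP is dropped
    (palindromic, or alleles that cannot be reconciled).
    """
    if is_palindromic(exposure.effect_allele, exposure.other_allele):
        return None
    target = (exposure.effect_allele, exposure.other_allele)
    for flip_strand in (False, True):
        ea, oa = outcome.effect_allele, outcome.other_allele
        if flip_strand:  # outcome may be reported on the opposite strand
            ea, oa = COMPLEMENT[ea], COMPLEMENT[oa]
        if (ea, oa) == target:
            return SnpEffect(outcome.snp, ea, oa, outcome.beta, outcome.se)
        if (oa, ea) == target:  # effect and other allele are swapped
            return SnpEffect(outcome.snp, oa, ea, -outcome.beta, outcome.se)
    return None
```

When the outcome study reports the opposite effect allele, the sign of its beta is reversed so that both effects refer to the same allele; the standard error is unchanged.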

2.4. Statistical Tests

A Wald ratio test estimates a causal effect by dividing the beta value (a measure of the influence of an SNP) for the outcome by the beta value for the exposure. A Wald ratio that is large relative to its standard error means that the estimate is significantly different from the null value of zero and that the null hypothesis of no causal effect can be rejected. This is consistent with vertical pleiotropy, in which the SNP affects the outcome through the exposure. Vertical pleiotropy is what MR aims to detect, while horizontal pleiotropy, where the SNP influences the outcome through a pathway other than the exposure, would violate the assumption that the SNP does not affect the outcome directly. Because of this, we focus on the gene variants that show a high Wald ratio and low p-value.
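As a rough illustration of the single-SNP calculation, the sketch below computes a Wald ratio, an approximate standard error, and a two-sided p-value. It assumes a first-order approximation in which the uncertainty of the SNP-exposure effect is ignored and a normal distribution is used for the test statistic; the function name `wald_ratio` and the example numbers are illustrative and are not taken from the study.

```python
import math


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float):
    """Single-instrument causal estimate: SNP-outcome effect / SNP-exposure effect.

    Assumption: the standard error is approximated by se_outcome / |beta_exposure|,
    which ignores the uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Illustrative values only: an eQTL with beta 0.5 on expression and 0.1 on disease risk.
estimate, se, p_value = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
```

The sign of the estimate indicates whether higher expression is associated with higher or lower disease risk, and the p-value is what gets compared against the multiple-testing threshold.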
Alternatively, the inverse variance-weighted (IVW) method allows us to analyze gene expression levels with multiple eQTL SNPs. This method finds the Wald ratio for each SNP and weights them based on the inverse of the variance of association between the exposure and outcome [28]. Wald ratios that have higher variance are weighted lower than those with lower variance, thus giving a more precise estimate of the causal effect of the exposure on the outcome.
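The weighting described above can be sketched as a fixed-effect inverse variance-weighted average of per-SNP Wald ratios. The sketch assumes independent SNPs and approximates the variance of each ratio as (se_outcome / beta_exposure)², so each weight is (beta_exposure / se_outcome)²; `ivw_estimate` is a hypothetical helper, not the MR-Base implementation.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW estimate and standard error from per-SNP summary statistics."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) < 2:
        raise ValueError("IVW needs at least two SNPs; use a Wald ratio for one")
    # Per-SNP Wald ratios and their inverse-variance weights.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return estimate, se
```

With this weighting, a SNP whose ratio is estimated precisely pulls the combined estimate toward its own value, which is the behavior the paragraph above describes.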
We use a Wald ratio test for eQTLs with single SNPs as instruments or an inverse variance-weighted (IVW) method for instruments with more than one SNP in MR-Base to test the causal effect between exposure and outcome. To account for multiple testing, for each disease, the p-value threshold was calculated by dividing 0.05 by the number of tests of the respective dataset after filtering out palindromic and trans-eQTL SNPs.
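The multiple-testing correction is simple arithmetic and can be checked directly. The sketch below uses the test counts reported in the Results (93,744; 200,977; 108,955); the function name `bonferroni_threshold` is illustrative.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True when a p-value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_tests, alpha)


# Test counts as reported in the Results section.
for n in (93_744, 200_977, 108_955):
    print(f"n = {n}: threshold = {bonferroni_threshold(n):.2e}")
```

Because the threshold scales inversely with the number of tests, the dataset with the fewest tests has the least stringent cutoff.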

3. Results

To present the results, we summarize in the Supplemental Tables all the significant gene–tissue pairs that passed the multiple-testing corrected significance threshold. Furthermore, we organize the data to visualize the frequency of significant genes among multiple tissues, the average and median of p-values, and effect size (beta) values to show the magnitude of effect size. Because of our large datasets, we use the median p-value to evaluate the most statistically significant genes, so a large p-value does not disproportionately bias the results. We also do this for the various tissues to find out what tissues play a more significant role than others. Afterward, we display the statistically significant genes we find on a Manhattan plot to show their chromosomal locations. For each disease, we highlight genes that had exceptionally low p-values, large beta values, or genes that were notably present across several diseases (Supplemental Tables).
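The per-gene and per-tissue summaries described above could be computed along the following lines. This is a sketch under assumptions: the input is taken to be a list of (gene, tissue, p-value) records for significant gene–tissue pairs, with one record per pair, and the gene and tissue labels in the example are placeholders rather than results from this study.

```python
from collections import Counter, defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, float]  # (gene, tissue, p-value)


def summarise(records: Iterable[Record]):
    """Return per-gene tissue counts with median p-values, and per-tissue counts."""
    p_by_gene: Dict[str, list] = defaultdict(list)
    tissue_counts: Counter = Counter()
    for gene, tissue, p_value in records:
        p_by_gene[gene].append(p_value)
        tissue_counts[tissue] += 1
    gene_summary = {
        gene: {"n_tissues": len(ps), "median_p": median(ps)}
        for gene, ps in p_by_gene.items()
    }
    return gene_summary, tissue_counts


# Placeholder records for illustration only.
example = [
    ("GENE_A", "Tissue_1", 1e-10),
    ("GENE_A", "Tissue_2", 1e-8),
    ("GENE_A", "Tissue_3", 1e-9),
    ("GENE_B", "Tissue_1", 1e-12),
]
gene_summary, tissue_counts = summarise(example)
```

Using the median rather than the mean means that a single comparatively large p-value in one tissue does not dominate the summary for a gene.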

3.1. Crohn’s Disease (CD)

Ten eQTL variants and six genes (Figure 2) were found to be associated with CD risk with p-values below a threshold of p < 5.33 × 10−7. The threshold was decided with 0.05/n (n = 93,744) using a Bonferroni approach. IL12RB2 expression in Esophagus Mucosa (p = 3.82 × 10−15, β = −0.004) has the lowest p-value, while the ATG16L1 (p = 7.43 × 10−8, Brain Cerebellum; p = 1.08 × 10−7, Testis), NOD2 (p = 1.35 × 10−7, Skin; p = 1.60 × 10−7, Esophagus Mucosa), and CYLD (p = 1.35 × 10−7, Skin; p = 1.77 × 10−7, Esophagus Mucosa) genes are notable for having statistically significant associations in multiple tissues. More importantly, Esophagus Mucosa is the most frequent tissue for CD, involving three genes, and is known to be relevant for disease pathogenesis [29].

3.2. Ulcerative Colitis (UC)

A total of 196 eQTL variants of 57 genes (Figure 3) were found to be associated with UC with p-values below a threshold of p < 2.49 × 10−7. The threshold was decided with 0.05/n (n = 200,977). Most of these genes are located in or near the human major histocompatibility complex (MHC) region on Chromosome 6 (Figure 3). Among all significant gene–tissue pairs, HLA-DRB1 (p = 4.72 × 10−20 across 23 tissues), HLA-DQA1 (median p = 5.59 × 10−14 across 26 tissues), HLA-DQB2 (p = 6.15 × 10−14 across 16 tissues), and HLA-DQA2 (p = 2.64 × 10−13 across 26 tissues) showed the most significant pleiotropic associations with UC in more than 15 tissues. The results also show that Nerve Tibial is the most frequent tissue among the significant associations with UC.

3.3. Rheumatoid Arthritis (RA)

A total of 528 variants of 117 genes (Figure 4) were found to be causally associated with RA with p-values below a threshold of p < 2.49 × 10−7. The threshold was decided with 0.05/n (n = 200,977). Similar to UC, the majority of significant genes are located in or near the human MHC region on Chromosome 6 (Figure 4). Among all significant results, HLA-DRB6 (median p = 2.61 × 10−67, 29 tissues), HLA-DQA1 (median p = 2.68 × 10−17, 26 tissues), HLA-DRB5 (median p = 3.49 × 10−17, 20 tissues), HLA-DQA2 (median p = 2.3 × 10−16, 25 tissues), SKIV2L (median p = 1.8 × 10−13, 28 tissues), and PSORS1C3 (median p = 1.06 × 10−13, 26 tissues) showed the most significant pleiotropic associations with RA in at least 20 tissues. Among the significant results, the most frequent tissues include Testis, Cells Transformed Fibroblasts, Skin, Lung, Artery Tibial, and Adipose Subcutaneous.

3.4. Multiple Sclerosis (MS)

A total of 563 variants of 145 genes (Figure 5) were found to be causally associated with MS with p-values below a threshold of p < 4.59 × 10−7. The threshold was decided with 0.05/n (n = 108,955). Across all tissues, HLA-DQB1 (median p = 1.99 × 10−68, 11 tissues), HLA-DRB1 (median p = 7.08 × 10−62, 25 tissues), HLA-DRB6 (median p = 9.54 × 10−41, 21 tissues), CYP21A1P (median p = 7.84 × 10−29, 14 tissues), HLA-J (median p = 2.82 × 10−18, 30 tissues), HLA-DQA1 (median p = 1.53 × 10−27, 19 tissues), and ATF6B (median p = 3.1 × 10−27, 16 tissues) showed the most significant pleiotropic associations with MS. More importantly, the most frequent tissues are Brain Cerebellum, Esophagus Mucosa, Testis, Esophagus Muscularis, Nerve Tibial, Whole Blood, Colon Transverse, and Adipose Visceral Omentum. Among these, Brain Cerebellum and Nerve Tibial are nervous system tissues, and the cerebellum is part of the central nervous system that is attacked in MS [30].

3.5. Type 1 Diabetes (T1D)

A total of 2940 eQTL variants of 508 genes (Figure 6) were found to be causally associated with T1D with p-values below a threshold of p < 4.59 × 10−7. The threshold was decided with 0.05/n (n = 108,955). The associated genes are distributed on several chromosomes in addition to a major peak in the MHC region. Due to the extremely small p-values from the original GWAS studies, our results are highly significant, and the smallest p-values were rounded to 1.53 × 10−305. Across all tissues, HLA-DRB1 (median p = 1.53 × 10−305, 21 tissues), C4A (median p = 6.91 × 10−263, 31 tissues), HLA-DOB (median p = 8.74 × 10−232, 27 tissues), TAP2 (median p = 1.7 × 10−231, 23 tissues), SKIV2L (median p = 6.51 × 10−148, 28 tissues), PSORS1C3 (median p = 1.8 × 10−119, 29 tissues), MICB (median p = 2.84 × 10−88, 28 tissues), LY6G5B (median p = 5.26 × 10−81, 28 tissues), ATP6V1G2 (median p = 1.43 × 10−68, 21 tissues), HLA-DRB6 (median p = 1.13 × 10−65, 29 tissues), HLA-DQA1 (median p = 6.87 × 10−63, 26 tissues), NOTCH4 (median p = 4.01 × 10−62, 24 tissues), and HLA-DQB2 (median p = 3.2 × 10−58, 22 tissues) showed the most significant pleiotropic associations with T1D. The most frequent significant tissues include Testis, Skin, Adipose Subcutaneous, Thyroid, Whole Blood, Artery Tibial, and Nerve Tibial.

3.6. Systemic Lupus Erythematosus (SLE)

A total of 119 variants of 43 genes (Figure 7) were found to be causally associated with SLE with p-values below a threshold of p < 4.59 × 10−7. The threshold was decided with 0.05/n (n = 108,955). Across all tissues, the C4A (median p = 1.2 × 10−15, 28 tissues), CYP21A1P (median p = 3.3 × 10−15, 12 tissues), and C4B (median p = 2.7 × 10−15, 6 tissues) genes were the top pleiotropic associations with SLE. The most frequent significant tissues are Nerve Tibial, Whole Blood, Pancreas, Adrenal Gland, Artery Aorta, Thyroid, and Muscle Skeletal.

3.7. Shared Risk Genes Across Multiple Autoimmune Diseases and Related Tissues

The MHC region showed the most significant signal for five of the six autoimmune diseases in this study, the exception being Crohn’s disease, for which the MHC region is still one of the major association signals. Many genes within the MHC region are related to multiple diseases in multiple tissues, with the top ones shown in Table 2. The HLA-DQA1 gene has been previously associated with several autoimmune diseases [31,32], and is related to UC, RA, MS, and T1D in this study across many tissues. The HLA-DQA2 gene has been associated with RA previously [33], and is related to UC, RA, MS, and T1D in many tissues in this study. The CYP21A1P gene is associated with all five of these autoimmune diseases, with evidence from multiple tissues.

4. Discussion

While many studies have explored the effect of various environmental factors as exposures for autoimmune diseases, this study uses gene expression levels in conjunction with Mendelian randomization to identify genes involved in the pathogenesis of several common autoimmune diseases. Across all tissues and six autoimmune diseases, we find that type 1 diabetes mellitus, rheumatoid arthritis, multiple sclerosis, and ulcerative colitis share several common genetic factors that may increase the risk of developing one another. Almost all of the identified genes were on the sixth chromosome across all diseases. Within this chromosome, HLA-DQA1, HLA-DRB6, HLA-DQA2, HLA-DQB2, HLA-DRB1, and HLA-DQB1-AS1 were notable for having causal associations among several autoimmune diseases. This selection of genes is located in the MHC class II region. The MHC region is one of the largest collections of immune regulation genes. Because of this, genetic factors associated with the development of autoimmunity can be traced back to this region. Specifically, this region is among the most polymorphic within the genome and contains extreme linkage disequilibrium, meaning that alleles in this region are often inherited together, and it shows a diverse range of genetic variation across ethnicities [34]. While this diversity allows for a strong adaptive immune system, it also leads to certain alleles being associated with autoimmunity. Some genes identified in this study, such as SKIV2L, exist outside of the MHC region; they are few and suggest other independent risk factors for autoimmunity. Although MS is found to have a low rate of co-occurrence with other autoimmune diseases [2,3], the present study shows that MS shares several genetic factors with them.
In our results, we also find a variety of pseudogenes (HLA-DRB6, CYP21A1P, PSORS1C3) that have a significant effect on the risk of autoimmune diseases. Pseudogenes are imperfect copies of functional genes that cannot be translated into a protein. They are structurally very similar to their parent genes, but often lack exons and introns, or the promoters and stop codons crucial for coding proteins. These pseudogenes are believed to exist for various reasons, such as accumulated mutations that render genes dysfunctional due to a lack of selection pressure [35]. Although pseudogenes cannot be properly translated into proteins, some are still transcribed into noncoding RNA (ncRNA). While ncRNA cannot code for a protein, it has been found to affect the regulation of gene expression by influencing epigenetics. This occurs through histone modification, effects on heterochromatin formation, and DNA methylation [36]. Because this study focuses specifically on eQTLs as IVs for the exposure variable, we can find results related to pseudogenes that may not be found in other types of studies. Furthermore, this paper reveals that factors beyond mutations in protein-coding genes play a significant role in the risk of autoimmune disease and can widen the scope for early diagnosis based on family history. These pseudogenes may also reveal additional information about the evolutionary history of autoimmune diseases. Pseudogenes also serve as a record of ancient genes that have lost their function. They can be used to estimate the rate of gene duplication and are extremely helpful in phylogenetic studies. Given our finding that pseudogenes play a significant role in regulating disease risk, future studies may be able to trace some autoimmune diseases to ancient selective pressures and discover the origin of autoimmunity.
Our study has limitations. The studies used as outcomes for our MR analysis were chosen based on the number of SNPs and sample size to maximize the power of the study and find as many significant genes as possible. However, some studies lacked significant SNPs that could be harmonized with GTEx data, such as the Crohn’s disease dataset, which led to fewer findings than for the other diseases. Additionally, this study only used SNP data from individuals of European ancestry, which limits our ability to apply these results to other ethnicities and geographical regions, given that the MHC region is highly polymorphic and varies across ethnicities. Lastly, the lack of testing for linkage disequilibrium undermines some of the results, as alleles may be inherited together, which invalidates the assumption that all genes are inherited randomly and independently.

5. Conclusions

In conclusion, the use of two-sample MR and eQTL data allows for the discovery of causal associations between gene expression and autoimmunity. Additionally, some autoimmune diseases were found to share certain genetic predispositions that may increase the risk of inheriting or developing other immune-related conditions, in particular those located in or near the MHC region.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cimb46110731/s1, Table S1: Results of MR analysis for all six diseases. Table S2: Results of MR analysis for Lupus. Table S3: Results of MR analysis for MS. Table S4: Results of MR analysis for CD. Table S5: Results of MR analysis for T1D. Table S6: Results of MR analysis for RA. Table S7: Results of MR analysis for UC.

Author Contributions

L.M. and R.C. conceived the study; R.C. and L.M. analyzed and interpreted data; L.M. and R.C. wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the USDA National Institute of Food and Agriculture (NIFA) Agriculture and Food Research Initiative (AFRI), grant number 2021-67015-33409. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable as the study used publicly accessible data.

Data Availability Statement

GWAS datasets were downloaded from the IEU GWAS catalog (https://gwas.mrcieu.ac.uk/, accessed on 20 June 2024) with GWAS IDs (CD: ukb-b-8210, T1D: ebi-a-GCST90014023, Lupus: ebi-a-GCST90018917, RA: ebi-a-GCST90038685, MS: ukb-b-17670, UC: ebi-a-GCST90038684). GTEx data are publicly available (https://gtexportal.org, accessed on 20 June 2024) and were downloaded through the MR-Base program.

Acknowledgments

We thank Liu Yang for assistance and for reviewing the manuscript. We acknowledge the support from the Bronx High School of Science for this research project. We also thank the reviewers for their constructive comments that have improved the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, L.; Wang, F.S.; Gershwin, M.E. Human autoimmune diseases: A comprehensive update. J. Intern. Med. 2015, 278, 369–395.
  2. Conrad, N.; Misra, S.; Verbakel, J.Y.; Verbeke, G.; Molenberghs, G.; Taylor, P.N.; Mason, J.; Sattar, N.; McMurray, J.J.; McInnes, I.B. Incidence, prevalence, and co-occurrence of autoimmune disorders over time and by age, sex, and socioeconomic status: A population-based cohort study of 22 million individuals in the UK. Lancet 2023, 401, 1878–1890.
  3. Baranzini, S.E. The genetics of autoimmune diseases: A networked perspective. Curr. Opin. Immunol. 2009, 21, 596–605.
  4. Sartor, R.B. Mechanisms of disease: Pathogenesis of Crohn’s disease and ulcerative colitis. Nat. Clin. Pract. Gastroenterol. Hepatol. 2006, 3, 390–407.
  5. Aletaha, D.; Smolen, J.S. Diagnosis and management of rheumatoid arthritis: A review. JAMA 2018, 320, 1360–1372.
  6. Kurkó, J.; Besenyei, T.; Laki, J.; Glant, T.T.; Mikecz, K.; Szekanecz, Z. Genetics of rheumatoid arthritis - A comprehensive review. Clin. Rev. Allergy Immunol. 2013, 45, 170–179.
  7. Kuhlmann, T.; Moccia, M.; Coetzee, T.; Cohen, J.A.; Correale, J.; Graves, J.; Marrie, R.A.; Montalban, X.; Yong, V.W.; Thompson, A.J. Multiple sclerosis progression: Time for a new mechanism-driven framework. Lancet Neurol. 2023, 22, 78–88.
  8. Dobson, R.; Giovannoni, G. Multiple sclerosis - A review. Eur. J. Neurol. 2019, 26, 27–40.
  9. Stys, P.K.; Tsutsui, S.; Gafson, A.R.; ‘t Hart, B.A.; Belachew, S.; Geurts, J.J. New views on the complex interplay between degeneration and autoimmunity in multiple sclerosis. Front. Cell. Neurosci. 2024, 18, 1426231.
  10. Baranzini, S.E.; Wang, J.; Gibson, R.A.; Galwey, N.; Naegelin, Y.; Barkhof, F.; Radue, E.-W.; Lindberg, R.L.; Uitdehaag, B.M.; Johnson, M.R. Genome-wide association analysis of susceptibility and clinical phenotype in multiple sclerosis. Hum. Mol. Genet. 2009, 18, 767–778.
  11. DiMeglio, L.A.; Evans-Molina, C.; Oram, R.A. Type 1 diabetes. Lancet 2018, 391, 2449–2462.
  12. Yu, C.; Gershwin, M.E.; Chang, C. Diagnostic criteria for systemic lupus erythematosus: A critical review. J. Autoimmun. 2014, 48, 10–13.
  13. Niu, Z.; Zhang, P.; Tong, Y. Value of HLA-DR genotype in systemic lupus erythematosus and lupus nephritis: A meta-analysis. Int. J. Rheum. Dis. 2015, 18, 17–28.
  14. Harroud, A.; Hafler, D.A. Common genetic factors among autoimmune diseases. Science 2023, 380, 485–490.
  15. Pisetsky, D.S. Pathogenesis of autoimmune disease. Nat. Rev. Nephrol. 2023, 19, 509–524.
  16. de Vries, R.R.; Van Rood, J. HLA and autoimmunity. In Perspectives on Autoimmunity; CRC Press: Boca Raton, FL, USA, 2020; pp. 1–17.
  17. Tizaoui, K.; Terrazzino, S.; Cargnin, S.; Lee, K.H.; Gauckler, P.; Li, H.; Shin, J.I.; Kronbichler, A. The role of PTPN22 in the pathogenesis of autoimmune diseases: A comprehensive review. Semin. Arthritis Rheum. 2021, 51, 513–522.
  18. Boehm, F.J.; Zhou, X. Statistical methods for Mendelian randomization in genome-wide association studies: A review. Comput. Struct. Biotechnol. J. 2022, 20, 2338–2351.
  19. Lawlor, D.A. Commentary: Two-sample Mendelian randomization: Opportunities and challenges. Int. J. Epidemiol. 2016, 45, 908–915.
  20. Richardson, T.G.; Hemani, G.; Gaunt, T.R.; Relton, C.L.; Davey Smith, G. A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome. Nat. Commun. 2020, 11, 185.
  21. Chen, C.; Wang, P.; Zhang, R.-D.; Fang, Y.; Jiang, L.-Q.; Fang, X.; Zhao, Y.; Wang, D.-G.; Ni, J.; Pan, H.-F. Mendelian randomization as a tool to gain insights into the mosaic causes of autoimmune diseases. Autoimmun. Rev. 2022, 21, 103210.
  22. Xu, Q.; Ni, J.-J.; Han, B.-X.; Yan, S.-S.; Wei, X.-T.; Feng, G.-J.; Zhang, H.; Zhang, L.; Li, B.; Pei, Y.-F. Causal relationship between gut microbiota and autoimmune diseases: A two-sample Mendelian randomization study. Front. Immunol. 2022, 12, 746998.
  23. Hemani, G.; Zheng, J.; Elsworth, B.; Wade, K.H.; Haberland, V.; Baird, D.; Laurin, C.; Burgess, S.; Bowden, J.; Langdon, R. The MR-Base platform supports systematic causal inference across the human phenome. eLife 2018, 7, e34408.
  24. Consortium, G. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020, 369, 1318–1330.
  25. Sanderson, E.; Glymour, M.M.; Holmes, M.V.; Kang, H.; Morrison, J.; Munafò, M.R.; Palmer, T.; Schooling, C.M.; Wallace, C.; Zhao, Q. Mendelian randomization. Nat. Rev. Methods Primers 2022, 2, 6.
  26. Sollis, E.; Mosaku, A.; Abid, A.; Buniello, A.; Cerezo, M.; Gil, L.; Groza, T.; Güneş, O.; Hall, P.; Hayhurst, J. The NHGRI-EBI GWAS Catalog: Knowledgebase and deposition resource. Nucleic Acids Res. 2023, 51, D977–D985.
  27. Gilad, Y.; Rifkin, S.A.; Pritchard, J.K. Revealing the architecture of gene regulation: The promise of eQTL studies. Trends Genet. 2008, 24, 408–415. [Google Scholar] [CrossRef]
  28. Burgess, S.; Butterworth, A.; Thompson, S.G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 2013, 37, 658–665. [Google Scholar] [PubMed]
  29. Decker, G.A.G.; Loftus Jr, E.V.; Pasha, T.M.; Tremaine, W.J.; Sandborn, W.J. Crohn’s disease of the esophagus: Clinical features and outcomes. Inflamm. Bowel Dis. 2001, 7, 113–119. [Google Scholar] [CrossRef]
  30. Ford, H. Clinical presentation and diagnosis of multiple sclerosis. Clin. Med. 2020, 20, 380–383. [Google Scholar] [CrossRef]
  31. Megiorni, F.; Pizzuti, A. HLA-DQA1 and HLA-DQB1 in Celiac disease predisposition: Practical implications of the HLA molecular typing. J. Biomed. Sci. 2012, 19, 88. [Google Scholar] [CrossRef]
  32. Badenhoop, K.; Walfish, P.G.; Rau, H.; Fischer, S.; Nicolay, A.; Bogner, U.; Schleusener, H.; Usadel, K. Susceptibility and resistance alleles of human leukocyte antigen (HLA) DQA1 and HLA DQB1 are shared in endocrine autoimmune disease. J. Clin. Endocrinol. Metab. 1995, 80, 2112–2117. [Google Scholar] [PubMed]
  33. Andreasi, R.B.; Khan, M.; Galuppi, E.; Govoni, M.; Rubini, M. THU0022 Replication analysis of gene-gene interaction between HLA-DQA2 and HLA-DQB2 variants in italian rheumatoid arthritis patients. Ann. Rheum. Dis. 2017, 76, 207. [Google Scholar]
  34. Simmonds, M.; Gough, S. The HLA region and autoimmune disease: Associations and mechanisms of action. Curr. Genom. 2007, 8, 453–465. [Google Scholar] [CrossRef] [PubMed]
  35. Statello, L.; Guo, C.-J.; Chen, L.-L.; Huarte, M. Gene regulation by long non-coding RNAs and its biological functions. Nat. Rev. Mol. Cell Biol. 2021, 22, 96–118. [Google Scholar] [CrossRef] [PubMed]
  36. Kaikkonen, M.U.; Lam, M.T.; Glass, C.K. Non-coding RNAs as regulators of gene expression and epigenetics. Cardiovasc. Res. 2011, 90, 430–440. [Google Scholar] [CrossRef]
Figure 1. Overview of the Mendelian randomization analysis.
Figure 2. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance level for Crohn’s disease. Horizontal dashed line represents the p-value threshold for significant SNPs.
Figure 3. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance levels for ulcerative colitis. Horizontal dashed line represents the p-value threshold for significant SNPs.
Figure 4. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance levels for rheumatoid arthritis. Horizontal dashed line represents the p-value threshold for significant SNPs.
Figure 5. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance levels for multiple sclerosis. Horizontal dashed line represents the p-value threshold for significant SNPs.
Figure 6. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance levels for Type 1 diabetes. Horizontal dashed line represents the p-value threshold for significant SNPs.
Figure 7. Manhattan plot representing the distribution of eQTL variants across chromosomal location and significance levels for systemic lupus erythematosus. Horizontal dashed line represents the p-value threshold for significant SNPs.
Table 1. Summary information of GWAS datasets of six diseases.
GWAS Study ID | Disease | Population | Sample Size | Number of SNPs
ukb-b-8210 | CD | European | 462,933 | 9,851,867
ebi-a-GCST90014023 | T1D | European | 520,580 | 59,999,551
ebi-a-GCST90018917 | Lupus | European | 482,911 | 24,198,877
ebi-a-GCST90038685 | RA | European | 484,598 | 9,587,836
ukb-b-17670 | MS | European | 462,933 | 9,851,867
ebi-a-GCST90038684 | UC | European | 484,598 | 9,587,836
Table 2. Top genes associated with multiple autoimmune diseases in multiple tissues.
Gene | Disease
Ulcerative Colitis | Rheumatoid Arthritis | Multiple Sclerosis | Type 1 Diabetes | Systemic Lupus Erythematosus
HLA-DQA1 | 26 tissues | 26 tissues | 19 tissues | 26 tissues
HLA-DQA2 | 26 tissues | 25 tissues | 10 tissues | 27 tissues
HLA-DRB6 | 2 tissues | 29 tissues | 21 tissues | 29 tissues
HLA-DRB1 | 23 tissues | 6 tissues | 25 tissues | 21 tissues
HLA-DQB2 | 16 tissues | 18 tissues | 12 tissues | 22 tissues
C4A | 1 tissue | 1 tissue | 31 tissues | 28 tissues
CYP21A1P | 1 tissue | 13 tissues | 14 tissues | 19 tissues | 12 tissues
HLA-DQB1-AS1 | 11 tissues | 18 tissues | 3 tissues | 18 tissues
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chiu, R.; Ma, L. Two-Sample Mendelian Randomization Study Identifies Tissue-Dependent Risk Genes in Autoimmune Diseases. Curr. Issues Mol. Biol. 2024, 46, 12311-12321. https://doi.org/10.3390/cimb46110731
