Article

The Liability Threshold Model for Predicting the Risk of Cardiovascular Disease in Patients with Type 2 Diabetes: A Multi-Cohort Study of Korean Adults

1
Molecular Neurogenetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
2
Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
3
Medical and Population Genetics Program, the Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA
4
Yonsei Cancer Institute, College of Medicine, Yonsei University, Seoul 03722, Korea
5
Department of Medical Genetics, College of Medicine, Hallym University, Chuncheon, Gangwon-do 24252, Korea
*
Author to whom correspondence should be addressed.
Metabolites 2021, 11(1), 6; https://doi.org/10.3390/metabo11010006
Submission received: 31 October 2020 / Revised: 21 December 2020 / Accepted: 22 December 2020 / Published: 24 December 2020
(This article belongs to the Section Endocrinology and Clinical Metabolic Research)

Abstract

Personalized risk prediction for diabetic cardiovascular disease (DCVD) is at the core of precision medicine in type 2 diabetes (T2D). We first identified three marker sets consisting of 15, 47, and 231 tagging single nucleotide polymorphisms (tSNPs) associated with DCVD using a linear mixed model in 2378 T2D patients obtained from four population-based Korean cohorts. Using the genetic variants with even modest effects on phenotypic variance, we observed improved risk stratification accuracy beyond traditional risk factors (AUC, 0.63 to 0.97). With a cutoff point of 0.21, the discrete genetic liability threshold model consisting of 231 SNPs (GLT231) correctly classified 87.7% of 2378 T2D patients as high or low risk of DCVD. For the same set of SNP markers, the GLT and polygenic risk score (PRS) models showed similar predictive performance, and both showed much-improved predictability when based on a larger number of SNP markers. In silico gene expression analysis provided additional information on the functional role of the genes identified in this study. In particular, HDAC4, CDKN2B, CELSR2, and MRAS appear to be major hubs in the functional gene network for DCVD. The proposed risk prediction approach based on the liability threshold model may help identify T2D patients at high CVD risk in East Asian populations, pending further external validation.


1. Introduction

The prevalence of type 2 diabetes (T2D) is steadily increasing in developed countries, and its complications, such as cardiovascular and renal diseases, constitute the leading cause of disease burden worldwide. The mortality rate in T2D patients with cardiovascular disease (CVD) is two to four times higher than in those with T2D only. In the United States, the majority of elderly patients with T2D die from heart disease (68%) and stroke (15%), even when their glucose levels are well controlled. CVD encompasses a broad spectrum of subphenotypes affecting the heart and blood vessels, including coronary artery disease (CAD), cerebrovascular disease (CVA), and peripheral arterial disease (PAD) [1]. The prevalence of diabetes among Korean adults aged 30 years or more increased from 12.4% in 2011 to 14.4% in 2016, and the highest prevalence estimate was seen in older adults aged 65 and over (i.e., 29.8%) [2]. Likewise, the prevalence of macrovascular complications in T2D patients estimated in 2011, such as CAD (10.3%), CVA (6.7%), and PAD (0.19%), is expected to increase further as the T2D prevalence increases in South Korea [3].
During the last decade, genome-wide association studies (GWAS) based on the “common disease common variant hypothesis” have successfully identified approximately 153 variants mapping to more than 120 T2D loci, including PPARG, KCNJ11, and TCF7L2, in multiethnic populations [4]. Although there is some overlap in susceptibility genes, previous studies have reported differences in genetic factors associated with CVD risk between diabetic and nondiabetic individuals. Multiple genes, such as CDKN2A/2B, HNF1A, PCSK9, CELSR2-PSRC1-SORT1, and PHACTR1, have been suggested to be associated with diabetic cardiovascular disease (DCVD) [5]. However, only one single-nucleotide polymorphism (SNP) of the GLUL gene, rs10911021, passed a threshold of genome-wide statistical significance for coronary heart disease (CHD) in non-Hispanic Caucasian patients with T2D (OR = 1.36, p = 2 × 10−8), and such a significant association was not observed in nondiabetic individuals [6].
To date, numerous statistical methods have been proposed to dissect the genetic architecture of complex traits. In particular, the use of a linear mixed model (LMM) in GWAS improves the statistical power to detect genetic associations by removing redundant SNPs [7]. Polygenic risk scoring (PRS) improves disease risk predictability by estimating the cumulative effect of multiple susceptibility variants [8]. However, a complex model that combines conventional risk factors, such as hypertension, obesity, and smoking, with a polygenic model may further enhance the predictive power for CVD risk in diabetic patients [9]. Another useful method for disease prediction, the liability threshold (LT) model, also called a probit model, gave the highest predictive accuracy compared to both the Risch risk model and the logit model using the same dataset. Here, liability refers to an individual’s innate tendency to develop a disease determined by the combinatory effects of genetic and environmental factors on the disease incidence [10]. While recent meta-analyses of GWAS have discovered many new T2D loci by increasing sample size, large-scale sequencing studies, contrary to expectations, have identified very few rare variants despite having sufficient statistical power [11].
The development of reliable prediction models for complex diseases, such as DCVD, is of the utmost importance in the era of precision medicine. To the best of our knowledge, risk prediction based on a multifactorial liability threshold model (MLT) that combines the effects of multiple genes and conventional nongenetic factors has not been applied to DCVD yet. In this study, we initially constructed genetic LT models with three different sets of DCVD-associated variants using data obtained from four Korean population-based cohort studies. Subsequently, we compared the discriminatory performance of three polygenic LT models for cardiovascular risk stratification in diabetic patients with the corresponding PRS models. In addition, we evaluated the degree of improvement in predictive performance for DCVD risk classification by adding genetic risk information to a phenotype-based risk model.

2. Results

2.1. Nongenetic Risk Factors for DCVD

Of the 21 nongenetic variables tested in this study, age was the most significant risk factor for DCVD (Table 1). Compared to T2D patients under the age of 50, the risk of developing CVD increased significantly in the 50s and 60s (OR = 2.28 and 3.75, p = 0.007 and 5.1×10−6, respectively) (Table S1). The mean serum creatinine level in the DCVD patient group (1.01 mg/dL) was significantly higher than that of the T2D control group (0.91 mg/dL) (OR = 2.62, p = 5.3 × 10−5). The effect of systolic blood pressure (SBP) on DCVD (OR = 1.01, p = 0.032) turned out to be statistically insignificant in multivariate analysis, whether treated as a continuous variable or as a categorical variable. In addition, past alcohol and tobacco consumption (OR = 1.96 and 1.43, p = 0.005 and 0.062, respectively), higher income (OR = 0.69, p = 0.027), total cholesterol (TC), triglycerides (TG), and gamma-glutamyl transpeptidase (GGT) were associated with DCVD risk in univariate analysis. However, only three variables, age (OR = 1.06, p = 6.5 × 10−6), BMI (OR = 1.09, p = 0.005), and blood creatinine level (OR = 2.02, p = 0.028), remained in the multivariate logistic regression (MLR) model after backward stepwise elimination.

2.2. Genetic Risk Factors for DCVD

In the current LMM analysis, after adjusting for age, sex, BMI, and creatinine level, two SNPs, rs4538911 (LOC392180-MCPH1, 8p23.2) and rs9982069 (PPIAL3-SLC6A6P, 21q21.1), showed the most significant associations with DCVD (p = 5.0 × 10−7 and 9.1 × 10−7, respectively) (Table 2). The regional association plots showed additional SNPs that were not in high LD (r2 < 0.8) but yielded suggestive associations with DCVD (p < 0.05) in the vicinity of those SNPs (Figure S1).
Among the 169 genotyped tSNPs that were also previously reported to be associated with CVD and/or DCVD (r2 < 0.8), 15 tSNPs yielded replicated associations with CVD in Korean T2D patients (0.001 < p < 0.05) (Table 2). The detailed LMM analysis results for 32 SNPs (p < 1 × 10−4) and 216 SNPs (p < 1 × 10−3) are provided in Table 2 and Table S2, respectively.

2.3. Gene Function Prediction

After filtering out 14 genes that did not appear in the DAVID database from 200 genes harboring 231 SNPs, we identified 92 significantly enriched GO terms associated with 118 genes (p < 0.05 and FDR < 0.1, data not shown). The most enriched GO term, GO:0007399~nervous system development, was associated with 45 genes, including HDAC4, FGF9, and EPHA5 (p = 8.2 × 10−8, FDR = 1.5 × 10−4). Five genes, EPHB2, EPHA3, EPHA5, EFNA5, and SLIT3, were significantly enriched in axon guidance in the KEGG pathway, essential for neuronal network formation (hsa04360, p = 0.03, data not shown).
Of the 41 genes harboring 47 SNPs, four genes did not appear in DAVID. We identified 10 GO terms significantly enriched in 19 genes, three of which, HDAC4, NOX4, and NRP1, were shown to play an important role in smooth muscle cell migration (GO:0014911, p = 0.001, FDR = 0.02) (Table 3).
Three other genes, MYL2, FGF9, and MRAS, were shown to be involved in the KEGG pathway hsa04810~regulation of actin cytoskeleton (p = 0.042, Figure S2). In particular, genes such as HDAC4, CDKN2B, CELSR2, and MRAS are major hubs in both functional networks for 31 and 170 genes that harbor 47 (p < 1 × 10−4) and 231 SNP sets (p < 1 × 10−3), respectively (Figure 1 and Figure S3).

2.4. DCVD Risk Prediction

2.4.1. Genetic Risk Prediction

The disease-free mortality of Koreans aged 40 to 69 was higher in men than in women (19.5 vs. 7.3 per 10,000 people). However, the incidence rate (IR) of DCVD in T2D patients was higher in women than in men (15.67 vs. 13.47 per 1000 person-years) (Table S3). A model consisting of the 15 previously reported SNPs that also showed nominally significant associations (p < 0.05) in this Korean study did not achieve sufficient predictability for DCVD (AUC, 53.7%). On the other hand, adding SNPs that were less significantly associated with DCVD gave the genetic liability threshold (GLT) model substantially better predictability than the model using a more stringent p-value threshold for SNP selection (AUCs: 73.2% and 99.2% for GLT47 and GLT231, respectively). As the number of SNPs included in the model increased, the mean difference (MD) in liability to DCVD between cases and controls increased (MDs: 0.006, 0.044, and 0.216 for GLT15, GLT47, and GLT231, respectively). For every 1-point increase in normalized genetic liability on a scale of 0 to 10, the risk of developing DCVD also increased accordingly (ORs: 1.05, 1.54, and 14.13 for GLT15, GLT47, and GLT231, respectively) (Table 4).
When comparing the performance of the GLT and PRS models in predicting DCVD risk in T2D patients, PRS47 performed better than GLT47 (ΔAUC = 11%); however, there was no significant difference between the two methods when predicting genetic risk based on a set of 15 or 231 SNP markers. In particular, the two risk measures were consistent in that the model based on a larger number of SNP markers showed much-improved predictability (AUCs: 99.21% and 99.18% for GLT231 and PRS231, respectively) (Table 4). When we assigned each participant a percentile based on the GLT231 or PRS231 value, all DCVD patients had liability or risk scores above the 90th percentile in the risk distribution (Figure S4).
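The AUC values compared above can be read as the probability that a randomly chosen DCVD case has a higher score than a randomly chosen T2D control. The following is a minimal sketch of that rank-based (Mann-Whitney) calculation, not the Stata routine used in the study; the function name `auc` and the toy scores are illustrative assumptions.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Rank-based AUC: share of case/control pairs in which the case scores higher.

    Ties count as half a win, as in the Mann-Whitney U statistic.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy scores (hypothetical, not study data): 3 of 4 pairs are ordered correctly.
toy_auc = auc([0.9, 0.8], [0.1, 0.85])
```

With 168 cases and 2210 controls the pairwise loop covers about 371,000 pairs, so the quadratic form is still practical at the scale of this study.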
We further evaluated the predictive ability of discrete models to identify T2D patients at high CVD risk and found that GLT231 outperformed PRS231. Using a cutoff point of 0.21 or greater, the GLT231 model correctly classified 87.7% of 2378 individuals as high or low risk for DCVD with high sensitivity and specificity of 100% and 86.8%, respectively. Since there were no DCVD cases in the low-risk group, we could not estimate the ORs in the discrete GLT and PRS models (Table 4). When we stratified the liability and genetic risk scores into four risk quartiles and compared the highest (Q4) to the lowest quartile (Q1), the OR of each was large, possibly due to the small number of cases in the first quartile (OR = 20.2 and p = 6.3 × 10−9 for GLT47; OR = 30.5 and p = 1.2 × 10−13 for PRS47) (Table S4).
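The discrete classification described above can be sketched as follows, assuming each participant has a liability value and a case/control label. The function name and toy inputs are hypothetical; only the 0.21 cutoff, the group sizes, and the reported sensitivity and specificity come from the text. The last lines check that the reported figures are mutually consistent.

```python
from typing import Dict, Sequence


def classify_by_cutoff(
    liabilities: Sequence[float],
    is_case: Sequence[bool],
    cutoff: float = 0.21,
) -> Dict[str, float]:
    """Label scores >= cutoff as high risk and summarize the confusion matrix.

    Assumes at least one case and one control are present.
    """
    tp = fp = tn = fn = 0
    for score, case in zip(liabilities, is_case):
        high_risk = score >= cutoff
        if high_risk and case:
            tp += 1
        elif high_risk and not case:
            fp += 1
        elif case:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Consistency check on the reported figures: overall accuracy is the
# case/control-weighted average of sensitivity and specificity.
n_cases, n_controls = 168, 2210
reported_accuracy = (1.0 * n_cases + 0.868 * n_controls) / (n_cases + n_controls)
# reported_accuracy is about 0.877, matching the 87.7% stated in the text.
```

Because the controls outnumber the cases by roughly 13 to 1, overall accuracy is dominated by specificity, which is why it sits close to 86.8% despite a sensitivity of 100%.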

2.4.2. Multifactorial Risk Prediction

We observed a much higher performance of risk stratification for CVD in T2D patients in the genetic models than in the nongenetic model (AUCs: 0.63 for nGLT vs. 0.99 for GLT231), whereas the GLT model, which includes a family history of DCVD, slightly improved the predictive performance (e.g., ΔAUC = 2% for GLT231). By adding four nongenetic risk factors, the predictability of the 47 SNP-based genetic models improved (ΔAUC = 4%), whereas the predictability of the 231 SNP-model slightly decreased (ΔAUC = −2%) (Figure 2). Specifically, the combined effect of the four nongenetic factors was weaker than that of the susceptibility SNPs (ORs: 1.23 for nGLT vs. 1.54 and 14.13 for GLT47 and GLT231, respectively), and these results were similar to those of the PRS models (ORs: 1.21 for nGRS vs. 2.72 and 18.41 for PRS47 and PRS231, respectively) (Table 4). As in the continuous model, the predictability for an individual’s DCVD risk increased in the quartile liability-based model by integrating nongenetic factors and 47 SNP information (ΔAUC = 2%). Contrary to expectations, the predictive performance of the PRS model was higher than that of the multifactorial model (ΔAUC = −11%) (Table S4).
We observed similar predictive performance in each of the four cohorts, although the case-control data from the Health2 Study showed the highest AUC values (Table S5). In 10-fold cross-validation, we also demonstrated consistency in the predictive performance of the models (Table S6). Since net reclassification improvement (NRI) has become a widely used measure to assess the predictive performance of risk models, we estimated the degree of improvement in continuous NRI achieved by adding genetic information to the nongenetic risk model. By adding 47 SNPs to the nGLT model, the enhanced model correctly assigned 12% of DCVD patients to higher predicted risk (event NRI, NRIe) and 32% of the control group to lower risk (non-event NRI, NRIne). The overall NRI, calculated as the sum of NRIe and NRIne, was as large as 0.441, but the continuous NRI of the risk score-based model was greater than that of the liability-based model (NRI = 1.017 for PRS47). Adding the 231 SNPs improved the nongenetic model far more than adding the 47 SNPs did (NRIs: 1.824 and 1.837 for GLT231 and PRS231, respectively) (Table S7).
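For readers unfamiliar with the continuous (category-free) NRI, the sketch below shows how NRIe, NRIne, and their sum are commonly computed from the predicted risks of a baseline and an enhanced model. It illustrates the standard definition and is not the `incrisk` implementation used in the study; the function name and toy risks are assumptions.

```python
from typing import Sequence, Tuple


def continuous_nri(
    old_risk: Sequence[float],
    new_risk: Sequence[float],
    is_case: Sequence[bool],
) -> Tuple[float, float, float]:
    """Return (NRIe, NRIne, overall NRI) for the category-free NRI.

    NRIe  = P(risk up | event)       - P(risk down | event)
    NRIne = P(risk down | non-event) - P(risk up | non-event)
    Assumes at least one event and one non-event are present.
    """
    up_e = down_e = n_e = 0
    up_ne = down_ne = n_ne = 0
    for old, new, case in zip(old_risk, new_risk, is_case):
        if case:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_ne += 1
            up_ne += new > old
            down_ne += new < old
    nri_e = (up_e - down_e) / n_e
    nri_ne = (down_ne - up_ne) / n_ne
    return nri_e, nri_ne, nri_e + nri_ne


# Toy example (hypothetical): both cases move up, controls split evenly.
toy_nri = continuous_nri(
    [0.2, 0.2, 0.2, 0.2], [0.3, 0.3, 0.1, 0.3], [True, True, False, False]
)
```

Each component lies between -1 and 1, so the overall continuous NRI is bounded by -2 and 2. NRI values above 1.8, as reported for the 231-SNP models, are therefore close to the ceiling of the statistic.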

3. Discussion

We validated the impact of traditional CVD risk factors, such as age, obesity, elevated blood pressure, cigarette smoking, and alcohol drinking, on the development of DCVD in Korean T2D patients [12]. Interestingly, a significant association between elevated serum creatinine, a clinical marker of renal dysfunction, and DCVD was observed in the case-control study, while the association with hypertension became more significant through the 10-year follow-up (data not shown). These results are consistent with the previous findings that diabetic nephropathy rarely occurred in patients with diabetes duration less than ten years and that diabetic patients with CVD complications were more likely to take antihypertensive drugs than those with T2D alone [13,14]. In this study, serum lipid or GGT levels were lower in the DCVD patient group than in the control group. Previous studies have reported that elevated GGT and CRP levels increased the risks of dyslipidemia, metabolic syndrome, and CVD, yet their prognostic values of CVD events in T2D patients remain controversial [15,16]. As reported in a general population-based cohort study, increased CVD risk due to low education and wealth levels has also been observed in Korean T2D patients [17].
In this LMM-based genetic association study, four SNPs, rs4538911, rs7946015, rs17465734, and rs9982069, did not achieve a genome-wide threshold of p < 5 × 10−8 but exhibited suggestive associations with DCVD risk (5 × 10−7 < p < 1 × 10−5). The SNP rs10911021, located near the GLUL gene (1q25), which had been associated with CHD in Caucasian T2D patients, revealed no significant association in this East Asian study [6,18]. On the other hand, we found significant associations between DCVD and 15 reported SNPs located in or near CVD candidate genes, including CELSR2-PSRC1, CDKN2B, TFCP2L1, HDAC4, MRAS, SPSB4, KCNN2, MYL2, and ZFHX3 (0.001 < p < 0.05) [5,19]. Whereas the minor G allele of rs599839, located 500 bp from the 3′-untranslated region (UTR) of the PSRC1 gene (1p13.3), is a well-replicated variant in various subtypes of CVD and an intronic SNP, rs12801636, of the PCNXL3 gene (11q13.1), is a validated SNP for lipid levels [20,21], others have not been implicated yet as trait-associated SNPs. However, the genes harboring intronic SNPs, ALK (rs4575680), EPHA3 (rs1512909), and TULP4, also known as TUSP (rs341137), have been implicated in CVD-related traits, such as blood pressure, atrial fibrillation, and systemic sclerosis [22,23,24]. Among the genes near the intergenic SNPs identified in Korean patients with DCVD, LINC01853, a long intergenic noncoding RNA (lncRNA) gene and the nearest gene to rs1401939 (2q22.1), was recently reported to be associated with coronary artery calcified atherosclerotic plaque in African-American T2D patients. Moreover, a nearby gene, LRP1B, a member of the LDL receptor gene family, has previously been implicated in CHD and heart failure [25]. The ZWINT gene neighboring rs1503908 has also been reported to be related to cardiac hypertrophy; however, intergenic SNPs such as rs6750818, rs1154846, and rs9586032 have never been implicated in DCVD-related traits [26].
In silico functional analysis provides additional evidence to support the role of these genes in DCVD pathogenesis, particularly in the migration and proliferation of smooth muscle cells that occur after vascular damage. In particular, network analysis highlighted the hub genes in the PPI network, such as CDKN2B, HDAC4, CELSR2, and MYL2.
Predicting individual disease risk is at the core of precision medicine to prevent disease progression in susceptible individuals through early intervention and lifestyle management. The GRS model, which combines a small number of susceptibility SNPs identified by GWAS, has been replaced by the PRS model that incorporates the effects of a larger number of SNPs passing a less stringent association p-value threshold to improve statistical power [27]. The PRS method has shown the potential to improve risk stratification accuracy beyond traditional risk factors [28]. According to a European study, the AUCs of CHD prediction models consisting of a five-SNP GRS, seven clinical predictors, or the GRS plus the clinical predictors were 0.577, 0.699, and 0.715, respectively [19]. In a large-scale study of CAD risk prediction in T2D patients, adding a weighted GRS comprising 204 CAD candidate SNPs to a model of 13 clinical predictors, such as age, sex, history of CAD, smoking habits, and SBP, led to an 8% improvement in risk classification. However, the AUCs of the models did not appear to be good enough to distinguish high- and low-risk individuals (i.e., genetic 0.567, clinical 0.675, combined 0.681), and all participants were of European ancestry [29]. Recently, the issue of limited generalizability of European-derived PRS has been raised, and the importance of developing PRS specific to non-European populations has been emphasized [30,31].
In the current study, we constructed non-logit probit models, also called liability threshold models, to predict DCVD risk by combining effects of a set of 47- or 231-tSNPs selected according to the level of statistical significance and observed much-improved model performance in GLT231 compared to GLT47 (i.e., AUC, 0.99 vs. 0.73). By including 231 tSNPs and family history information, the predictability of the nongenetic model comprising age, sex, BMI, and blood creatinine level greatly improved in AUC from 0.63 to 0.97; however, the predictability of the genetic model was higher than that of the multifactorial model (ΔAUC = 2%). We found similar prediction estimates for DCVD risk in each of the four cohorts and validated the performance of these models in 10-fold cross-validation. These results were consistent with observations from discrete- and quantile-based analyses. Besides, we observed consistency between the two risk measures, liability- and risk score-based models, in that the model based on a larger number of SNP markers showed much-improved predictability (AUCs: 99.21% and 99.18% for GLT231 and PRS231, respectively).
Previous studies have raised concerns about the interpretation of the clinical significance of a small change in AUC and the tendency of NRI to make uninformative markers appear predictive [32,33]. Although we analyzed 2378 T2D patients obtained from the four largest population-based cohorts in Korea, 168 DCVD cases may not be enough to develop a risk model specific to CVD subtypes such as CAD. Moreover, the high AUC and NRI statistics observed in the GLT231 and PRS231 models might represent an overfitting issue that often occurs when analyzing a large number of SNP markers in a relatively small number of samples. However, CAD itself consists of heterogeneous subtypes, and the shared genetic factors may underlie the pervasive pleiotropy among CVD subtypes [34]. Furthermore, the lack of statistical significance does not necessarily preclude the presence of an association of a risk factor with the disease. Additional efforts are necessary to implement a risk prediction model in clinical practice, such as developing a set of genetic markers with excellent DCVD risk classification performance, improving the predictive performance of risk models, and validating the promising model in independent datasets.

4. Materials and Methods

4.1. Study Populations

To explore potential risk factors for DCVD, we first identified 2378 T2D patients from 16,147 participants with comparable genetic and clinical data collected from the initial surveys of four population-based Korean cohort studies established by the Center for Genome Science at the Korean National Institute of Health: the Korea Association Resource Study (KARE), Health Examinees (HEXA) Study, Korean Healthy Twin Study (HT), and Health2 Study. Based on the International Diabetes Federation guidelines (https://www.idf.org/), T2D cases were defined as fasting plasma glucose (FPG) ≥ 126 mg/dL, 2-h plasma glucose after 75 g oral glucose tolerance test (2 h OGTT) ≥ 200 mg/dL or with a medical history of T2D.
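The case definition above amounts to a simple rule over three fields. A minimal sketch, assuming measurements in mg/dL and `None` for a missing measurement; the function and parameter names are hypothetical.

```python
from typing import Optional

FPG_THRESHOLD_MG_DL = 126.0      # fasting plasma glucose cutoff from the text
OGTT_2H_THRESHOLD_MG_DL = 200.0  # 2-h plasma glucose cutoff after 75 g OGTT


def is_t2d_case(
    fpg_mg_dl: Optional[float],
    ogtt_2h_mg_dl: Optional[float],
    has_t2d_history: bool,
) -> bool:
    """Return True if any of the three criteria in the case definition is met.

    How the study treated missing measurements is not stated; here a missing
    value (None) simply cannot satisfy its criterion.
    """
    if has_t2d_history:
        return True
    if fpg_mg_dl is not None and fpg_mg_dl >= FPG_THRESHOLD_MG_DL:
        return True
    if ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl >= OGTT_2H_THRESHOLD_MG_DL:
        return True
    return False
```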
We identified 168 T2D patients with a medical history of myocardial infarction (MI), CAD, congestive heart failure (CHF), PAD, or CVA (mean age 61.1 ± 0.5 years) and 2210 T2D patients without any history of CVD (mean age 56.9 ± 0.2 years) at the baseline survey conducted from 2001 to 2002. All participants provided written informed consent, and details of each cohort are described elsewhere [35,36,37,38]. This study also obtained Institutional Review Board approval of Hallym University (HIRB-2014-109).

4.2. Genotyping and Quality Controls

Genomic DNA derived from the peripheral blood of participants was genotyped using Genome-wide Human SNP array 5.0 in the KARE study and SNP array 6.0 in the other three cohort studies (Affymetrix Inc., Santa Clara, CA, USA). We found 352,228, 516,610, 606,876, and 627,659 SNPs that passed the quality control filters (i.e., genotyping call rate ≥ 95%, minor allele frequency ≥ 1%, and Hardy-Weinberg equilibrium p-value ≥ 1 × 10−6) in the KARE, HEXA, Twin-family, and Health2 studies, respectively [35,36,37,38]. We computed linkage disequilibrium (LD), represented as r2, between SNP pairs using Haploview software [39]. To fill in both missing genotypes and untyped markers, we imputed genotypes at an additional > 4.4 million SNP loci using the East Asian reference panel of the 1000 Genomes Project with IMPUTE2 [40].
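The three quality control filters can be sketched per SNP from its genotype counts. This is an illustration only: the function name is hypothetical, and the Hardy-Weinberg test here is an asymptotic 1-df chi-square test, which is an assumption (the test actually used in each cohort is not stated, and exact tests are common in practice).

```python
import math


def snp_passes_qc(
    n_aa: int,
    n_ab: int,
    n_bb: int,
    n_missing: int,
    min_call_rate: float = 0.95,
    min_maf: float = 0.01,
    min_hwe_p: float = 1e-6,
) -> bool:
    """Apply call-rate, minor-allele-frequency, and HWE filters to one SNP."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (
        n_called * freq_a ** 2,
        2 * n_called * freq_a * (1.0 - freq_a),
        n_called * (1.0 - freq_a) ** 2,
    )
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    return hwe_p >= min_hwe_p
```

The minor allele frequency filter runs before the Hardy-Weinberg step, so all expected counts are strictly positive by the time the chi-square statistic is computed.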

4.3. Statistical Analysis

4.3.1. Association of Conventional Risk Factors with the Development of DCVD

To identify nongenetic risk factors associated with DCVD, we initially conducted univariate logistic regression analyses to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for age, sex, family history of T2D or DCVD, four environmental, and thirteen clinical variables by comparing 168 DCVD patients with 2210 T2D patients at baseline. We then selected a set of informative covariates by a backward elimination procedure (Table 1). All analyses were performed using STATA software package v.11.2 (Stata Corp., College Station, TX, USA).

4.3.2. Genetic Association Analysis of DCVD Based on Generalized Linear Mixed Model

We initially performed genome-wide GLMM analysis under an additive genetic model after adjusting for age, sex, BMI, and creatinine level using 210,830 autosomal tagging SNPs (tSNPs) after removing redundant SNPs (r2 > 0.8) in 168 DCVD cases and 2210 T2D controls as implemented in Genome-wide Complex Trait Analysis (GCTA) v1.24 [41]. We generated Manhattan plots using the R package ‘qqman’ (https://cran.r-project.org/web/packages/qqman) and further explored the ±500-kb regions adjacent to the significant SNPs using a web-based program, LocusZoom v1.3 (http://locuszoom.org/).
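Removing redundant SNPs at r2 > 0.8 can be illustrated with a greedy filter over allele dosages (0, 1, or 2). This is a simplified sketch under two stated assumptions: r2 is approximated by the squared Pearson correlation of genotype dosages rather than the haplotype-based estimate reported by Haploview, and SNPs are visited in input order. The SNP identifiers below are placeholders.

```python
from typing import Dict, List, Sequence


def genotype_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as unlinked
    return sxy * sxy / (sxx * syy)


def prune_redundant(
    genotypes: Dict[str, Sequence[float]], max_r2: float = 0.8
) -> List[str]:
    """Keep a SNP only if its r2 with every already-kept SNP is <= max_r2."""
    kept: List[str] = []
    for snp, dosages in genotypes.items():
        if all(genotype_r2(dosages, genotypes[k]) <= max_r2 for k in kept):
            kept.append(snp)
    return kept


# Placeholder data: snpB duplicates snpA and is dropped; snpC is retained.
toy = {
    "snpA": [0, 1, 2, 1, 0, 2],
    "snpB": [0, 1, 2, 1, 0, 2],
    "snpC": [2, 0, 1, 1, 0, 2],
}
tag_snps = prune_redundant(toy)
```

In a genome-wide setting the comparison would normally be restricted to SNPs within a physical window, since LD between distant markers is negligible and an all-pairs scan over hundreds of thousands of SNPs is impractical.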
We also identified the SNPs associated with CVD or DCVD by searching for review articles in PubMed and web databases, such as GWAS Catalog (https://www.ebi.ac.uk/gwas/) and HuGe Navigator (https://phgkb.cdc.gov/HuGENavigator/home.do), until 6 September 2018. For 231 SNPs, including 216 SNPs identified here at p < 1 × 10−3 plus 15 reported SNPs replicated at p < 0.05 in the present study, we conducted GLMM analyses using STATA after adjusting for the four covariates shown above.

4.3.3. Gene Functional Enrichment, Pathway, and Network Analyses

We analyzed the enrichment of gene ontology (GO) terms and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways to group the candidate SNPs into functionally annotated gene sets using the web application of the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 [42]. Biological functions with a false discovery rate (FDR) < 10% were considered to be strongly enriched in the annotation categories. Furthermore, we displayed the protein–protein interaction (PPI) network for the selected gene list using STRING v11 [43].
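To make the FDR < 10% criterion concrete, the sketch below applies the Benjamini-Hochberg adjustment to a list of enrichment p-values. This is an assumption for illustration: DAVID computes its own FDR column internally, and its values may differ from a plain Benjamini-Hochberg adjustment. The p-values are made up.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p-values for four GO terms.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_fdr = benjamini_hochberg(toy_p)
enriched = [i for i, q in enumerate(toy_fdr) if q < 0.10]
```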

4.3.4. Risk Prediction of DCVD in T2D Patients Using

Incidence-Based Liability Threshold Models

The lifetime risk (i.e., incidence-based risk) of developing DCVD in an individual with T2D was estimated based on the measured liabilities, age- and sex-specific incidence rates (IRs), and disease-free mortalities obtained from the Korean Statistical Information System (KOSIS, www.kosis.kr) (Table S3).
The liability to DCVD was estimated based on the prevalence, additive relative risk (RR), heritability (h2), and family history of DCVD using the method proposed by So et al. (2011) [44]. We applied the CVD prevalence among Korean adults with T2D (17%) and h2 of 0.5 [3,45]. Genetic and nongenetic factors were categorized into 0, 1, or 2, and measurable liability for each individual was estimated from the equation $L = \sum_{i} \beta_i x_i + \sum_{j} \beta_j x_j + e$, where $x_i$ and $x_j$ denote the risk allele count at the ith susceptibility locus and the risk score of the jth nongenetic variable, respectively, and $\beta_i$ and $\beta_j$ are the corresponding effect sizes. The residual, $e$, represents the liability contributed by the unknown risk factors. The detailed procedure is described elsewhere [44,46]. Based on the individual lifetime risk of DCVD, we constructed genetic, nongenetic, and multifactorial LT models (i.e., GLT, nGLT, and MLT) for 47- and 231-SNP sets selected at p < 10−4 and p < 10−3, respectively. We further examined the predictive performance of discrete (L: low risk; H: high risk) and quartile models (Q1: lowest risk; Q2: low risk; Q3: high risk; Q4: highest risk) for DCVD risk prediction. We also compared the predictability of each risk model on DCVD observed in each of the four cohort studies and validated them using 10-fold cross-validation.
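The general idea of a liability threshold model can be sketched as follows: total liability is assumed to be standard normal, disease occurs when liability exceeds a threshold set by the prevalence, and an individual's measured liability shifts the probability of crossing that threshold. This is a textbook-style illustration, not the incidence-based procedure of So et al. [44]; the effect sizes, the explained-variance figure, and all function names are hypothetical. Only the 17% prevalence comes from the text.

```python
from statistics import NormalDist
from typing import Sequence

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def liability_threshold(prevalence: float) -> float:
    """Threshold T on the standard normal scale such that P(L > T) = prevalence."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def measured_liability(betas: Sequence[float], values: Sequence[float]) -> float:
    """Sum of beta * x over measured factors (SNP allele counts or risk scores).

    The result is assumed to be centered on the population mean before use.
    """
    return sum(beta * x for beta, x in zip(betas, values))


def risk_given_liability(
    measured: float, prevalence: float, var_explained: float
) -> float:
    """P(total liability > T | measured part), with residual variance 1 - V."""
    threshold = liability_threshold(prevalence)
    residual_sd = (1.0 - var_explained) ** 0.5
    return 1.0 - STD_NORMAL.cdf((threshold - measured) / residual_sd)


# With no measured information, the risk reduces to the prevalence (17%).
baseline = risk_given_liability(0.0, prevalence=0.17, var_explained=0.0)

# Hypothetical individual: three SNPs with made-up effect sizes.
example_l = measured_liability([0.10, 0.05, 0.20], [2, 1, 0])
example_risk = risk_given_liability(example_l, prevalence=0.17, var_explained=0.1)
```

For a prevalence of 17% the threshold lies at roughly 0.95 standard deviations above the mean liability, so modest shifts in measured liability translate into noticeable changes in predicted risk.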

Polygenic Risk Scores

To compare the predictive performance for the DCVD risk with the GLT models, the PRS for each of the three SNP sets, PRS15, PRS47, and PRS231, was constructed based on the formula $\mathrm{PRS} = \sum_{i=1}^{m} (\log \mathrm{OR}_i \times x_i)$, where $\mathrm{OR}_i$ and $x_i$ denote the odds ratio and the risk allele count at the ith of $m$ susceptibility loci, respectively [34]. Two measures of risk, GLT and PRS, each with values in different ranges, were converted into a common scale of 0–10 using a formula for min-max normalization, $X_N = \frac{X - X_{min}}{X_{range}} \times 10$, where $X_N$ is the normalized value, $X$ is the original value, $X_{min}$ is the minimum value on the original scale, and $X_{range}$ is the difference between the maximum score and the minimum score on the original scale [47]. We compared the predictability of the GLT models with the PRS models based on the interpretation of the AUC and continuous NRI using the STATA commands ‘roccomp’ and ‘incrisk’, respectively [48]. All analyses were conducted using two statistical software packages, Stata and R.
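The two formulas in this subsection translate directly into code. A minimal sketch, assuming the logarithm is the natural log (the text does not state the base) and using made-up odds ratios and allele counts; the function names are illustrative.

```python
import math
from typing import List, Sequence


def polygenic_risk_score(
    odds_ratios: Sequence[float], risk_allele_counts: Sequence[int]
) -> float:
    """PRS = sum over loci of log(OR_i) * x_i, with x_i in {0, 1, 2}."""
    return sum(math.log(o) * x for o, x in zip(odds_ratios, risk_allele_counts))


def min_max_0_10(values: Sequence[float]) -> List[float]:
    """Rescale values to 0-10: X_N = (X - X_min) / X_range * 10."""
    x_min, x_max = min(values), max(values)
    x_range = x_max - x_min
    if x_range == 0:
        raise ValueError("all values are identical; range is zero")
    return [(v - x_min) / x_range * 10 for v in values]


# Hypothetical cohort of three people scored on two SNPs (ORs are made up).
ors = [1.2, 1.5]
cohort_counts = [[0, 0], [1, 1], [2, 2]]
raw_scores = [polygenic_risk_score(ors, counts) for counts in cohort_counts]
scaled_scores = min_max_0_10(raw_scores)
```

Because min-max scaling depends on the observed minimum and maximum, normalized scores are comparable only within the sample used to compute them; applying the model to a new cohort would require reusing the original minimum and range.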

5. Conclusions

We validated the impact of traditional CVD risk factors such as age, obesity, elevated blood pressure, cigarette smoking, and alcohol drinking on the development of DCVD in Korean T2D patients. We also replicated significant associations with DCVD for 15 previously reported SNPs located in or near CVD candidate genes. In silico gene expression analysis lent further support to the functional roles of these genes in DCVD pathogenesis, particularly in the migration and proliferation of smooth muscle cells that occur after vascular damage, and highlighted the hub genes in the PPI network, such as CDKN2B, HDAC4, CELSR2, and MYL2. For the same set of SNP markers, the GLT and PRS models showed similar predictive performance. Using the genetic variants that have even modest effects on phenotypic variance, it is possible to improve risk stratification accuracy beyond traditional risk factors. In conclusion, the polygenic LT model developed in an ethnically homogeneous Korean population may help identify T2D patients at high risk of CVD in East Asians genetically similar to Koreans.

Supplementary Materials

The following are available online at https://www.mdpi.com/2218-1989/11/1/6/s1, Figure S1: Regional association plots for regions containing each of two SNPs, rs4538911 (LOC392180-MCPH1, 8p23.2) and rs9982069 (PPIAL3-SLC6A6P, 21q21.1), Figure S2: Schematic diagram of the regulation of actin cytoskeleton pathway (KEGG pathway, hsa04810), Figure S3: Protein-protein interaction network of 170 candidate genes for diabetic cardiovascular disease, Figure S4: The frequencies of cases and controls and their risks of DCVD by risk percentiles for genetic (GLT), nongenetic (nGLT), and multifactorial liability threshold (MLT) models of 168 DCVD cases and 2210 T2D controls, Table S1: Logistic regression analysis for association between diabetic cardiovascular disease and five nongenetic factors transformed into categorical variables, Table S2: Results of the linear mixed model analysis of 231 candidate SNPs for diabetic cardiovascular disease, Table S3: Incidence rates of cardiovascular disease in patients with type 2 diabetes and disease-free mortality rates in Korea, Table S4: Association results of risk prediction models for diabetic cardiovascular disease after stratification into risk quartiles in the case-control study, Table S5: Predictability of genetic and nongenetic liability threshold models on diabetic cardiovascular disease in KARE, HEXA, Health2, and Twin-family Studies, Table S6: Predictability of genetic and nongenetic liability threshold models on diabetic cardiovascular disease after 10-fold cross validation test, Table S7: Reclassification improvement achieved by adding genetic markers to the multifactorial models for diabetic cardiovascular disease.

Author Contributions

Conceptualization, J.W.P.; methodology, J.W.P., E.P.H., and S.G.H.; validation, J.W.P. and E.P.H.; formal analysis, E.P.H.; investigation, J.W.P. and E.P.H.; resources, J.W.P.; data curation, E.P.H.; writing—original draft preparation, E.P.H. and J.W.P.; writing—review and editing, J.W.P. and S.G.H.; visualization, E.P.H.; supervision, J.W.P.; funding acquisition, J.W.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Hallym University Research Fund, 2017 (HRF-201711-015).

Institutional Review Board Statement

This study was approved by the Institutional Review Board of Hallym University (HIRB-2014-109).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study and details of each cohort are described elsewhere [35,36,37,38].

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from the National Biobank of Korea, the Center for Disease Control and Prevention (KCDC), and are available from https://is.cdc.go.kr/ with the permission of the KCDC.

Acknowledgments

The genotype data (the Korean Genome Analysis Project, 4845-301) and the phenotype data (the Korean Genome Epidemiology Study, 4851-302) were provided by the National Biobank of Korea, the Center for Disease Control and Prevention (KCDC), Republic of Korea (4851-307, KBP-2015-012 and KBP-2015-043). As part of a research project entitled ‘Simulation study for early warning model using linear mixed model and construction for early warning system contents based on Korean genome information’, funded by KCDC (2013E7300300), Saangyong Uhmm of Hallym University coded scripts to estimate an individual’s liability and impute genotype data using software such as Python and IMPUTE2. Suyeon Park and Juyoung Lee of the Korea National Institute of Health (KNIH) contributed to data acquisition and statistical methodology. The Korea Institute of Science and Technology Information (KISTI) also provided computing resources through the Global Science experimental Data hub Center (GSDC) and internet service at speeds of 1 Gbps through the Korea Research Environment Open Network (KREONET). The authors thank the participants and the investigative staff involved in data generation at each local institute of the KARE, Twin-family, HEXA, and Health2 studies.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Centers for Disease Control and Prevention. National Diabetes Fact Sheet: National Estimates and General Information on Diabetes and Prediabetes in the United States; Department of Health and Human Services, Centers for Disease Control and Prevention: Atlanta, GA, USA, 2017. Available online: https://www.cdc.gov/diabetes/pubs (accessed on 23 December 2017).
  2. Kim, B.Y.; Won, J.C.; Lee, J.H.; Kim, H.S.; Park, J.W.; Ha, K.H.; Won, K.C.; Kim, D.J.; Park, K.S. Diabetes Fact Sheets in Korea, 2018: An Appraisal of Current Status. Diabetes Metab. J. 2019, 43, 487–494.
  3. Koo, B.K.; Lee, C.H.; Yang, B.R.; Hwang, S.S.; Choi, N.K. The incidence and prevalence of diabetes mellitus and related atherosclerotic complications in Korea: A National Health Insurance Database Study. PLoS ONE 2014, 9, e110650.
  4. Flannick, J.; Florez, J.C. Type 2 diabetes: Genetic data sharing to advance complex disease research. Nat. Rev. Genet. 2016, 17, 535–549.
  5. Ma, R.C. Genetics of cardiovascular and renal complications in diabetes. J. Diabetes Investig. 2016, 7, 139–154.
  6. Qi, L.; Qi, Q.; Prudente, S.; Mendonca, C.; Andreozzi, F.; di Pietro, N.; Sturma, M.; Novelli, V.; Mannino, G.C.; Formoso, G.; et al. Association between a genetic variant related to glutamic acid metabolism and coronary heart disease in individuals with type 2 diabetes. JAMA 2013, 310, 821–828.
  7. Yang, J.; Zaitlen, N.A.; Goddard, M.E.; Visscher, P.M.; Price, A.L. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 2014, 46, 100–106.
  8. Chatterjee, N.; Shi, J.; Garcia-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 2016, 17, 392–406.
  9. Murea, M.; Ma, L.; Freedman, B.I. Genetic and environmental factors associated with type 2 diabetes and diabetic vascular complications. Rev. Diabet. Stud. 2012, 9, 6–22.
  10. Lee, S.H.; Wray, N.R.; Goddard, M.E.; Visscher, P.M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 2011, 88, 294–305.
  11. Flannick, J. The Contribution of Low-Frequency and Rare Coding Variation to Susceptibility to Type 2 Diabetes. Curr. Diab. Rep. 2019, 19, 25.
  12. Martin-Timon, I.; Sevillano-Collantes, C.; Segura-Galindo, A.; Del Canizo-Gomez, F.J. Type 2 diabetes and cardiovascular disease: Have all risk factors the same strength? World J. Diabetes 2014, 5, 444–470.
  13. Gheith, O.; Farouk, N.; Nampoory, N.; Halim, M.A.; Al-Otaibi, T. Diabetic kidney disease: World wide difference of prevalence and risk factors. J. Nephropharmacol. 2015, 5, 49–56.
  14. Petrie, J.R.; Guzik, T.J.; Touyz, R.M. Diabetes, Hypertension, and Cardiovascular Disease: Clinical Insights and Vascular Mechanisms. Can. J. Cardiol. 2018, 34, 575–584.
  15. Yoon, H.E.; Mo, E.Y.; Shin, S.J.; Moon, S.D.; Han, J.H.; Kim, E.S. Serum gamma-glutamyltransferase is not associated with subclinical atherosclerosis in patients with type 2 diabetes. Cardiovasc. Diabetol. 2016, 15, 108.
  16. Cardoso, C.R.; Leite, N.C.; Salles, G.F. Prognostic Importance of C-Reactive Protein in High Cardiovascular Risk Patients With Type 2 Diabetes Mellitus: The Rio de Janeiro Type 2 Diabetes Cohort Study. J. Am. Heart Assoc. 2016, 5, e004554.
  17. Rosengren, A.; Smyth, A.; Rangarajan, S.; Ramasundarahettige, C.; Bangdiwala, S.I.; AlHabib, K.F.; Avezum, A.; Bengtsson Boström, K.; Chifamba, J.; Gulec, S.; et al. Socioeconomic status and risk of cardiovascular disease in 20 low-income, middle-income, and high-income countries: The Prospective Urban Rural Epidemiologic (PURE) study. Lancet Glob. Health 2019, 7, e748–e760.
  18. Doria, A. Leveraging Genetics to Improve Cardiovascular Health in Diabetes: The 2018 Edwin Bierman Award Lecture. Diabetes 2019, 68, 479–489.
  19. Qi, L.; Parast, L.; Cai, T.; Powers, C.; Gervino, E.V.; Hauser, T.H.; Hu, F.B.; Doria, A. Genetic susceptibility to coronary heart disease in type 2 diabetes: 3 independent studies. J. Am. Coll. Cardiol. 2011, 58, 2675–2682.
  20. Angelakopoulou, A.; Shah, T.; Sofat, R.; Shah, S.; Berry, D.J.; Cooper, J.; Palmen, J.; Tzoulaki, I.; Wong, A.; Jefferis, B.J.; et al. Comparative analysis of genome-wide association studies signals for lipids, diabetes, and coronary heart disease: Cardiovascular Biomarker Genetics Collaboration. Eur. Heart J. 2012, 33, 393–407.
  21. Willer, C.J.; Schmidt, E.M.; Sengupta, S.; Peloso, G.M.; Gustafsson, S.; Kanoni, S.; Ganna, A.; Chen, J.; Buchkovich, M.L.; Mora, S.; et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 2013, 45, 1274–1283.
  22. Parmar, P.G.; Taal, H.R.; Timpson, N.J.; Thiering, E.; Lehtimaki, T.; Marinelli, M.; Lind, P.A.; Howe, L.D.; Verwoert, G.; Aalto, V.; et al. International Genome-Wide Association Study Consortium Identifies Novel Loci Associated With Blood Pressure in Children and Adolescents. Circ. Cardiovasc. Genet. 2016, 9, 266–278.
  23. Nielsen, J.B.; Thorolfsdottir, R.B.; Fritsche, L.G.; Zhou, W.; Skov, M.W.; Graham, S.E.; Herron, T.J.; McCarthy, S.; Schmidt, E.M.; Sveinbjornsson, G.; et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat. Genet. 2018, 50, 1234–1239.
  24. Gorlova, O.Y.; Li, Y.; Gorlov, I.; Ying, J.; Chen, W.V.; Assassi, S.; Reveille, J.D.; Arnett, F.C.; Zhou, X.; Bossini-Castillo, L.; et al. Gene-level association analysis of systemic sclerosis: A comparison of African-Americans and White populations. PLoS ONE 2018, 13, e0189498.
  25. Divers, J.; Palmer, N.D.; Langefeld, C.D.; Brown, W.M.; Lu, L.; Hicks, P.J.; Smith, S.C.; Xu, J.; Terry, J.G.; Register, T.C.; et al. Genome-wide association study of coronary artery calcified atherosclerotic plaque in African Americans with type 2 diabetes. BMC Genet. 2017, 18, 105.
  26. Parsa, A.; Chang, Y.P.; Kelly, R.J.; Corretti, M.C.; Ryan, K.A.; Robinson, S.W.; Gottlieb, S.S.; Kardia, S.L.R.; Shuldiner, A.R.; Liggett, S.B. Hypertrophy-associated polymorphisms ascertained in a founder cohort applied to heart failure risk and mortality. Clin. Transl. Sci. 2011, 4, 17–23.
  27. Wray, N.R.; Yang, J.; Hayes, B.J.; Price, A.L.; Goddard, M.E.; Visscher, P.M. Pitfalls of predicting complex traits from SNPs. Nat. Rev. Genet. 2013, 14, 507–515.
  28. Torkamani, A.; Wineinger, N.E.; Topol, E.J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 2018, 19, 581–590.
  29. Morieri, M.L.; Gao, H.; Pigeyre, M.; Shah, H.S.; Sjaarda, J.; Mendonca, C.; Hastings, T.; Buranasupkajorn, P.; Motsinger-Reif, A.A.; Rotroff, D.M.; et al. Genetic Tools for Coronary Risk Assessment in Type 2 Diabetes: A Cohort Study From the ACCORD Clinical Trial. Diabetes Care 2018, 41, 2404–2413.
  30. Martin, A.R.; Kanai, M.; Kamatani, Y.; Okada, Y.; Neale, B.M.; Daly, M.J. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019, 51, 584–591.
  31. Duncan, L.; Shen, H.; Gelaye, B.; Meijsen, J.; Ressler, K.; Feldman, M.; Peterson, R.; Domingue, B. Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 2019, 10, 3328.
  32. Pencina, M.J.; D’Agostino, R.B.; Pencina, K.M.; Janssens, A.C.J.W.; Greenland, P. Interpreting incremental value of markers added to risk prediction models. Am. J. Epidemiol. 2012, 176, 473–481.
  33. Pepe, M.S.; Janes, H.; Li, C.I. Net risk reclassification p values: Valid or misleading? J. Natl. Cancer Inst. 2014, 106, dju041.
  34. Moorthie, S.; de Villiers, C.B.; Brigden, T.; Gaynor, L.; Hall, A.; Johnson, E.; Sanderson, S.; Kroese, M. Polygenic Scores, Risk and Cardiovascular Disease; PHG Foundation: Cambridge, UK, 2019; Available online: https://www.phgfoundation.org/documents/prs-report-final-web.pdf (accessed on 19 October 2019).
  35. Cho, Y.S.; Go, M.J.; Kim, Y.J.; Heo, J.Y.; Oh, J.H.; Ban, H.J.; Yoon, D.; Lee, M.H.; Kim, D.J.; Park, M.; et al. A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nat. Genet. 2009, 41, 527–534.
  36. Kim, Y.J.; Go, M.J.; Hu, C.; Hong, C.B.; Kim, Y.K.; Lee, J.Y.; Hwang, J.Y.; Oh, J.H.; Kim, D.J.; Kim, N.H.; et al. Large-scale genome-wide association studies in East Asians identify new genetic loci influencing metabolic traits. Nat. Genet. 2011, 43, 990–995.
  37. Suh, Y.J.; Kim, S.; Kim, S.H.; Park, J.; Lim, H.A.; Park, H.J.; Choi, H.; Ng, D.; Lee, M.K.; Nam, M. Combined genome-wide linkage and association analyses of fasting glucose level in healthy twins and families of Korea. J. Korean Med. Sci. 2013, 28, 415–423.
  38. Wen, W.; Kato, N.; Hwang, J.Y.; Guo, X.; Tabara, Y.; Li, H.; Dorajoo, R.; Yang, X.; Tsai, F.J.; Li, S.; et al. Genome-wide association studies in East Asians identify new loci for waist-hip ratio and waist circumference. Sci. Rep. 2016, 6, 17958.
  39. Barrett, J.C.; Fry, B.; Maller, J.; Daly, M.J. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 2005, 21, 263–265.
  40. Howie, B.N.; Donnelly, P.; Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009, 5, e1000529.
  41. Yang, J.; Lee, S.H.; Goddard, M.E.; Visscher, P.M. GCTA: A tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 2011, 88, 76–82.
  42. Huang da, W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44–57.
  43. Szklarczyk, D.; Gable, A.L.; Lyon, D.; Junge, A.; Wyder, S.; Huerta-Cepas, J.; Simonovic, M.; Doncheva, N.T.; Morris, J.H. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019, 47, D607–D613.
  44. So, H.C.; Kwan, J.S.; Cherny, S.S.; Sham, P.C. Risk prediction of complex diseases from family history and known susceptibility loci, with applications for cancer screening. Am. J. Hum. Genet. 2011, 88, 548–565.
  45. Barakat, K.; Hitman, G.A. Genetic susceptibility to macrovascular complications of type 2 diabetes mellitus. Best Pract. Res. Clin. Endocrinol. Metab. 2001, 15, 359–370.
  46. So, H.C.; Sham, P.C. Exploring the predictive power of polygenic scores derived from genome-wide association studies: A study of 10 complex traits. Bioinformatics 2017, 33, 886–892.
  47. Suarez-Alvarez, M.M.; Pham, D.T.; Prostov, M.Y.; Prostov, Y.I. Statistical approach to normalization of feature vectors and clustering of mixed datasets. Proc. R. Soc. A 2012, 468, 2630–2651.
  48. Longton, G.; Pepe, M. Current Methods for Evaluating Prediction Performance of Biomarkers and Tests. Available online: http://research.fhcrc.org/content/dam/stripe/diagnostic-biomarkers-statistical-center/files/incrisk.pdf (accessed on 1 February 2020).
Figure 1. Protein–protein interaction network of 31 candidate genes for diabetic cardiovascular disease: Light-green line indicates the presence of co-publications found through text mining; light purple, evidence of homology; purple line, experimental evidence of coexpression; black line, evidence of mRNA coexpression (confidence score of STRING, 0.25).
Figure 2. Comparison of the areas under the ROC curves (AUCs) of three liability threshold (LT) models: nongenetic (nGLT), genetic (GLT), and multifactorial (MLT). (A) Bar graph with standard error bars comparing AUC values of LT models with or without DCVD family history (grey-filled bars and transparent bars, respectively). (B,C) AUC statistics estimated for genetic (solid green lines), nongenetic (blue dashed lines), and multifactorial liability threshold (red tight-dotted lines) models, including a family history of DCVD, for the 47- and 231-SNP sets, respectively.
Table 1. Risk of cardiovascular disease in Korean patients with type 2 diabetes according to environmental and clinical characteristics.
| Characteristics * | DCVD (N = 168) | T2D Only (N = 2210) | Logistic regression OR (95% CI) | Logistic regression p |
|---|---|---|---|---|
| Men, N (%) | 91 (54.2) | 1159 (52.4) | 1.07 (0.78–1.47) | 0.666 |
| Age, years (%) | 61.1 ± 0.5 | 56.9 ± 0.2 | 1.07 (1.05–1.09) | 3.2 × 10⁻⁹ ‡ |
| Income, N (%) | | | | |
| <1 million won | 82 (48.8) | 908 (41.1) | Reference | |
| ≥1 million won | 69 (41.1) | 1110 (50.2) | 0.69 (0.49–0.96) | 0.027 |
| Education, N (%) | | | | |
| <High school | 108 (64.3) | 1444 (65.3) | Reference | |
| ≥High school | 58 (34.5) | 752 (34.0) | 1.03 (0.74–1.44) | 0.856 |
| Smoking status, N (%) | | | | |
| Nonsmoker | 93 (55.4) | 1255 (56.8) | Reference | |
| Current smoker | 30 (17.9) | 510 (23.1) | 0.79 (0.52–1.21) | 0.286 |
| Ex-smoker | 45 (26.8) | 426 (19.3) | 1.43 (0.98–2.07) | 0.062 |
| Drinking status, N (%) | | | | |
| Nondrinker | 75 (44.6) | 1042 (47.2) | Reference | |
| Current drinker | 65 (38.7) | 973 (44.0) | 0.93 (0.66–1.31) | 0.670 |
| Ex-drinker | 26 (15.5) | 184 (8.3) | 1.96 (1.22–3.15) | 0.005 |
| Family history, N (%) | | | | |
| T2D, yes | 112 (66.7) | 1367 (61.9) | 0.96 (0.66–1.39) | 0.824 |
| DCVD, yes | 13 (11.7) | 82 (5.5) | 1.22 (0.66–2.25) | 0.530 |
| BMI, kg/m² | 25.9 ± 0.3 | 25.2 ± 0.1 | 1.06 (1.01–1.12) | 0.012 |
| SBP, mm Hg | 131.1 ± 1.5 | 127.9 ± 0.4 | 1.01 (1.00–1.02) | 0.032 |
| DBP, mm Hg | 80.8 ± 0.9 | 79.8 ± 0.2 | 1.01 (0.99–1.02) | 0.256 |
| TC, mg/dL | 192.9 ± 3.3 | 201.1 ± 0.9 | 0.995 (0.992–0.999) | 0.018 |
| TG, mg/dL | 176.0 ± 7.5 | 199.7 ± 3.2 | 0.998 (0.997–0.999) | 0.042 |
| GGT, IU/L | 38.4 ± 2.5 | 57.8 ± 2.7 | 0.996 (0.992–1.000) | 0.027 |
| AST, IU/L | 29.8 ± 1.3 | 32.2 ± 0.6 | 0.995 (0.985–1.004) | 0.246 |
| ALT, IU/L | 30.9 ± 1.5 | 33.7 ± 0.8 | 0.996 (0.988–1.003) | 0.263 |
| Creatinine, mg/dL | 1.01 ± 0.03 | 0.91 ± 0.01 | 2.62 (1.64–4.19) | 5.3 × 10⁻⁵ ‡ |
| CRP, mg/L | 2.73 ± 0.49 | 2.57 ± 0.10 | 1.01 (0.98–1.04) | 0.689 |
| FPG, mg/dL | 124.7 ± 3.3 | 137.0 ± 1.2 | 0.99 (0.99–1.00) | 0.004 |
| 2 h PG, mg/dL | 200.3 ± 10.7 | 240.9 ± 3.0 | 0.99 (0.99–1.00) | 0.001 |
| Hemoglobin A1c | 7.09 ± 1.49 | 7.44 ± 1.74 | 0.87 (0.74–1.03) | 0.101 |
ALT, alanine transaminase; AST, aspartate aminotransferase; BMI, body mass index; CI, confidence interval; CRP, C-reactive protein; DBP, diastolic blood pressure; DCVD, diabetic cardiovascular disease; FPG, fasting plasma glucose; GGT, gamma-glutamyl transpeptidase; OR, odds ratio; SBP, systolic blood pressure; TC, total cholesterol; TG, triglyceride; T2D, type 2 diabetes mellitus; 2 h PG, 2-h plasma glucose after 75 g oral glucose tolerance test. * Data are shown as the number of subjects (percentage) for categorical variables and mean ± standard deviation for continuous variables. The mean values of 2 h PG and hemoglobin A1c were estimated from the KARE data. ORs, 95% CIs, and p-values were estimated by comparing 168 DCVD cases to 2210 T2D controls selected from the initial surveys of four cohort studies using univariate logistic regression analysis. ‡ The variables remained statistically significant at p < 0.05 after backward stepwise selection in the multivariate logistic regression model.
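As a worked check on how the univariate odds ratios in Table 1 relate to the reported counts, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. For a single categorical exposure, the univariate logistic-regression estimate coincides with this cross-product ratio. The function name `odds_ratio_wald` is hypothetical; the counts are those listed in Table 1 for ex-drinkers versus nondrinkers.

```python
import math


def odds_ratio_wald(exposed_cases, exposed_controls,
                    reference_cases, reference_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z = 1.96 for 95%)."""
    odds_ratio = (exposed_cases * reference_controls) / (
        exposed_controls * reference_cases
    )
    # Standard error of ln(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_controls
        + 1 / reference_cases + 1 / reference_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Table 1 counts: ex-drinkers (26 DCVD, 184 T2D only)
# versus nondrinkers (75 DCVD, 1042 T2D only).
or_est, ci_low, ci_high = odds_ratio_wald(26, 184, 75, 1042)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
# prints "OR = 1.96 (95% CI 1.22-3.15)"
```

The result agrees with the ex-drinker row of Table 1. The same function applied to the income row (69 and 1110 versus 82 and 908) gives an odds ratio of about 0.69.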
Table 2. Results of linear mixed model analysis of 47 candidate SNPs for diabetic cardiovascular disease.
| Gene | Chr | SNP | Function | N/R | RAF (Ca/Co) | OR (LMM) | p (LMM) |
|---|---|---|---|---|---|---|---|
| 15 previously reported SNPs (p < 0.05) | | | | | | | |
| LOC107986441 (KCNN2) * | 5q22.2 | rs4621553 | intron | A/G | 0.09/0.05 | 1.05 | 0.002 |
| MRAS * | 3q22.3 | rs9818870 | 3′ UTR | C/T | 0.03/0.01 | 1.08 | 0.011 |
| CELSR2, PSRC1 * | 1p13.3 | rs599839 | 500bp~3′ UTR | A/G | 0.09/0.06 | 1.04 | 0.015 |
| IBTK * | 6q14.1 | rs16893526 | intergenic | G/A | 0.15/0.11 | 1.03 | 0.017 |
| ZFHX3 * | 16q22.3 | rs879324 | intron | A/G | 0.67/0.62 | 1.02 | 0.022 |
| CDKN2B * | 9p21.3 | rs1333042 | intron | A/G | 0.71/0.65 | 1.02 | 0.025 |
| MREGP1 | 12p11.21 | rs11610422 | intergenic | A/G | 0.07/0.05 | 1.04 | 0.027 |
| LOC100288146 | 4q24 | rs17035270 | intron | C/T | 0.99/0.04 | 1.04 | 0.028 |
| SPSB4 * | 3q23 | rs16851055 | intron (ncRNA) | G/A | 0.23/0.18 | 1.02 | 0.036 |
| ILRUN (C6orf106) * | 6p21.31 | rs2814993 | intron | G/A | 0.03/0.01 | 1.07 | 0.037 |
| MTAP * | 9p21.3 | rs7865618 | intron | G/A | 0.90/0.86 | 1.02 | 0.037 |
| PCNXL3 | 11q13.1 | rs12801636 | intron | A/G | 0.56/0.49 | 1.02 | 0.038 |
| HDAC4 * | 2q37.3 | rs6706785 | intergenic | G/T | 0.32/0.27 | 1.02 | 0.040 |
| TFCP2L1 * | 2q14.2 | rs17006292 | intron | C/A | 0.04/0.03 | 1.05 | 0.043 |
| MYL2 * | 12q24.11 | rs3782889 | intron | G/A | 0.88/0.83 | 1.02 | 0.046 |
| 32 SNPs associated with DCVD (p < 10⁻⁴) | | | | | | | |
| MCPH1 * | 8p23.2 | rs4538911 | intergenic | C/G | 0.13/0.06 | 1.08 | 5.0 × 10⁻⁷ |
| LOC100505973 | 21q21.1 | rs9982069 | intergenic | G/A | 0.49/0.38 | 1.04 | 9.1 × 10⁻⁷ |
| CDH11 * | 16q21 | rs17465734 | intergenic | T/A | 0.05/0.01 | 1.14 | 8.0 × 10⁻⁶ |
| CD82 * | 11p11.2 | rs7946015 | intergenic | A/T | 0.26/0.17 | 1.04 | 8.2 × 10⁻⁶ |
| FAM19A5 (TAFA5) * | 22q13.31 | rs5768165 | intergenic | G/T | 0.11/0.05 | 1.07 | 1.3 × 10⁻⁵ |
| | | rs2338258 | intergenic | T/C | 0.13/0.07 | 1.06 | 3.6 × 10⁻⁵ |
| | | rs5768143 | intergenic | C/T | 0.13/0.07 | 1.05 | 9.1 × 10⁻⁵ |
| MGC45800 | 4q34.3 | rs17072597 | intron | C/T | 0.22/0.14 | 1.05 | 1.5 × 10⁻⁵ |
| KCNE4 * | 2q36.1 | rs16864293 | intergenic | T/A | 0.09/0.04 | 1.08 | 1.6 × 10⁻⁵ |
| SLC9A3 * | 5p15.33 | rs1053226 | intron | C/T | 0.05/0.02 | 1.11 | 1.8 × 10⁻⁵ |
| SP3 * | 2q31.1 | rs41326844 | intergenic | T/C | 0.47/0.36 | 1.03 | 2.5 × 10⁻⁵ |
| AHRR * | 5p15.33 | rs6555242 | intron | T/G | 0.07/0.03 | 1.09 | 3.1 × 10⁻⁵ |
| VAPA * | 18p11.22 | rs16956185 | intergenic | G/A | 0.15/0.08 | 1.06 | 3.2 × 10⁻⁵ |
| ZWINT *, MIR3924 | 10q21.1 | rs1503908 | intergenic | A/G | 0.19/0.12 | 1.05 | 3.9 × 10⁻⁵ |
| NOX4 * | 11q14.3 | rs319025 | intron | T/C | 0.67/0.56 | 1.03 | 4.1 × 10⁻⁵ |
| SPOCK1 * | 5q31.2 | rs6893667 | intergenic | C/T | 0.06/0.02 | 1.10 | 4.2 × 10⁻⁵ |
| C14orf64 (LINC01550) | 14q32.2 | rs877455 | intergenic | G/A | 0.10/0.05 | 1.07 | 4.8 × 10⁻⁵ |
| LDLRAD3 * | 11p13 | rs1001715 | intron | G/A | 0.44/0.33 | 1.03 | 4.9 × 10⁻⁵ |
| | | rs12276510 | intron | G/A | 0.43/0.33 | 1.03 | 5.5 × 10⁻⁵ |
| ST18 * | 8q11.23 | rs2450153 | intergenic | G/A | 0.63/0.52 | 1.03 | 5.3 × 10⁻⁵ |
| | | rs3843918 | intergenic | T/C | 0.46/0.44 | 1.03 | 7.0 × 10⁻⁵ |
| CYP2B6 * | 19q13.2 | rs1872125 | intron | T/C | 0.24/0.16 | 1.04 | 5.7 × 10⁻⁵ |
| FGF9 * | 13q12.11 | rs9506827 | intergenic | T/C | 0.29/0.20 | 1.04 | 5.9 × 10⁻⁵ |
| MIRN656 | 14q32.31 | rs8016145 | intergenic | G/A | 0.09/0.04 | 1.08 | 6.4 × 10⁻⁵ |
| DLG2 * | 11q14.1 | rs349083 | intron | G/A | 0.47/0.36 | 1.03 | 6.5 × 10⁻⁵ |
| LOC646700 | 9p21.1 | rs10968749 | intergenic | A/G | 0.19/0.12 | 1.04 | 7.4 × 10⁻⁵ |
| METTL21EP, SLC10A2 * | 13q33.1 | rs9586032 | intergenic | G/A | 0.23/0.15 | 1.04 | 7.4 × 10⁻⁵ |
| PPIAL3 | 21q21.1 | rs2825256 | intergenic | T/A | 0.67/0.55 | 1.03 | 7.4 × 10⁻⁵ |
| HMP19 * | 5q35.2 | rs2913472 | intergenic | A/C | 0.05/0.02 | 1.11 | 7.9 × 10⁻⁵ |
| ALK * | 2p23.2 | rs4575680 | intron | G/C | 0.08/0.04 | 1.07 | 9.0 × 10⁻⁵ |
| MIR1261 | 11q14.3 | rs10501726 | intergenic | A/T | 0.08/0.04 | 1.08 | 9.5 × 10⁻⁵ |
| NRP1 * | 10p11.22 | rs767164 | intergenic | T/A | 0.30/0.21 | 1.04 | 9.8 × 10⁻⁵ |
bp, base pair; Ca/Co, cases/controls; Chr., chromosome; LMM, linear mixed model; ncRNA, noncoding RNA; N/R, non-risk/risk allele; OR, odds ratio; RAF, risk allele frequency; SNP, single nucleotide polymorphism; UTR, untranslated region. * Genes linked to more than one Gene Ontology term. The risk allele frequencies were estimated for cases (left) and controls (right). ORs and p-values were estimated in linear mixed models after adjusting for age, sex, body mass index, and serum creatinine level. 15 previously reported SNPs that were replicated in the current LMM analysis (p < 0.05).
Table 3. Gene Ontology functional enrichment analyses of 31 differentially expressed genes in diabetic cardiovascular disease.
| Biological Function * | Gene, N | p | FDR, % | Gene Set |
|---|---|---|---|---|
| GO:0014911~positive regulation of smooth muscle cell migration | 3 | 0.0012 | 2.0 | NOX4, HDAC4, NRP1 |
| GO:0048731~system development | 16 | 0.0014 | 2.3 | NOX4, NRP1, MYL2, FGF9, MRAS, TFCP2L1, SPOCK1, CELSR2, ALK, APCDD1, HDAC4, CDKN2B, SP3, MCPH1, ZFHX3, DLG2 |
| GO:0061061~muscle structure development | 6 | 0.0026 | 4.2 | NOX4, HDAC4, MYL2, FGF9, MRAS, ZFHX3 |
| GO:0048513~animal organ development | 13 | 0.0027 | 4.3 | NOX4, NRP1, MYL2, FGF9, MRAS, TFCP2L1, CELSR2, APCDD1, HDAC4, CDKN2B, SP3, MCPH1, ZFHX3 |
| GO:0007517~muscle organ development | 5 | 0.0030 | 4.8 | HDAC4, MYL2, FGF9, MRAS, ZFHX3 |
| GO:0014910~regulation of smooth muscle cell migration | 3 | 0.0035 | 5.6 | NOX4, HDAC4, NRP1 |
| GO:0014909~smooth muscle cell migration | 3 | 0.0040 | 6.4 | NOX4, HDAC4, NRP1 |
| GO:0048523~negative regulation of cellular process | 15 | 0.0048 | 7.5 | NOX4, NRP1, MYL2, FGF9, TFCP2L1, SPOCK1, APCDD1, HDAC4, AHRR, CDKN2B, SP3, ZWINT, MCPH1, ZFHX3, DLG2 |
| GO:0007275~multicellular organism development | 16 | 0.0055 | 8.7 | NOX4, NRP1, MYL2, FGF9, MRAS, TFCP2L1, SPOCK1, CELSR2, ALK, APCDD1, HDAC4, CDKN2B, SP3, MCPH1, ZFHX3, DLG2 |
| GO:0014812~muscle cell migration | 3 | 0.0055 | 8.7 | NOX4, HDAC4, NRP1 |
FDR, false discovery rate; GO, Gene Ontology. * Categories of GO terms. Fisher’s exact p-values and FDRs for each GO term were estimated using the DAVID tool.
Table 4. Comparison of predictive performance between genetic liability threshold model and polygenic risk score model for predicting diabetic cardiovascular disease in T2D patients.
| Model | Ca/Co, N * | Ca/Co, Mean (Range) | OR (95% CI) | p-Value | AUC |
|---|---|---|---|---|---|
| Nongenetic | | | | | |
| nGLT | 167/2195 | 0.24 (0.06–0.40)/0.20 (0.06–0.40) | 1.23 (1.15–1.32) | 4.8 × 10⁻⁹ | 0.63 (0.59–0.67) |
| nGRS | 167/2195 | 2.32 (0.00–4.25)/1.78 (0.00–4.41) | 1.21 (1.13–1.29) | 8.9 × 10⁻⁹ | 0.64 (0.60–0.68) |
| Genetic | | | | | |
| GLT15 | 164/2172 | 0.15 (0.11–0.23)/0.15 (0.10–0.26) | 1.05 (0.99–1.10) | 0.089 | 0.54 (0.49–0.58) |
| GLT47 | 163/2076 | 0.25 (0.14–0.42)/0.20 (0.12–0.39) | 1.54 (1.41–1.68) | 7.3 × 10⁻²² | 0.73 (0.70–0.77) |
| GLT231 | 114/1558 | 0.38 (0.21–0.66)/0.16 (0.06–0.45) | 14.13 (9.08–21.97) | 7.4 × 10⁻³² | 0.99 (0.99–0.99) |
| L: <0.21 | | 0 (0.0)/1911 (86.5) | Reference | NA | 0.93 (0.93–0.94) |
| H: ≥0.21 | | 168 (100.0)/299 (13.5) | NA | NA | 100/86.8/87.7 § |
| PRS15 | 164/2172 | 0.27 (0.16–0.42)/0.26 (0.12–0.48) | 1.16 (1.02–1.30) | 0.019 | 0.55 (0.50–0.60) |
| PRS47 | 163/2076 | 0.93 (0.51–1.50)/0.69 (0.29–1.31) | 2.72 (2.38–3.10) | 3.0 × 10⁻⁴⁹ | 0.84 (0.81–0.87) |
| PRS231 | 114/1558 | 5.40 (4.57–6.82)/4.19 (3.28–5.81) | 18.41 (11.17–30.35) | 3.2 × 10⁻³⁰ | 0.99 (0.99–0.99) |
| L: <4.57 | | 0 (0.0)/1369 (62.0) | Reference | NA | 0.81 (0.80–0.82) |
| H: ≥4.57 | | 168 (100.0)/841 (38.1) | NA | NA | 100/61.9/64.6 § |
| Multifactorial | | | | | |
| MLT47 | 162/2062 | 0.23 (0.06–0.38)/0.17 (0.04–0.38) | 1.84 (1.65–2.04) | 1.9 × 10⁻²⁹ | 0.76 (0.72–0.80) |
| MLT231 | 113/1552 | 0.41 (0.08–0.72)/0.15 (0.02–0.51) | 7.79 (5.67–10.68) | 4.9 × 10⁻³⁷ | 0.97 (0.95–0.99) |
| MRS47 | 162/2062 | 3.26 (0.68–5.43)/2.47 (0.37–5.40) | 1.39 (1.28–1.51) | 2.5 × 10⁻¹⁵ | 0.71 (0.67–0.76) |
| MRS231 | 113/1552 | 7.71 (4.57–9.99)/5.92 (3.48–8.74) | 2.98 (2.48–3.58) | 3.0 × 10⁻³¹ | 0.86 (0.82–0.89) |
AUC, area under the receiver operating characteristic curve; Ca/Co, Case/Control; CI, confidence interval; DCVD, diabetic cardiovascular disease; GLT, genetic liability threshold model; MLT, multifactorial liability threshold model; MRS, multifactorial risk score model; N, number; nGLT, nongenetic liability threshold model; nGRS, nongenetic risk score model; OR, odds ratio; PRS, polygenic risk score. * The number of cases and controls for each PRS model was the same as the GLT model based on the same number of SNPs. Mean and range of liability or risk score groups using three sets of single nucleotide polymorphism markers (i.e., 15, 47, and 231 SNPs) for each of the case and control groups. For the discrete GLT231 and PRS231 models, the numbers and percentages of cases and controls were shown. ORs, 95% CIs, and p-values were estimated using logistic regression analysis for every 1-point increase in the standardized values of liability and polygenic risk score, respectively. § The AUCs of three liability threshold models were computed with a family history of DCVD. Sensitivity/Specificity/Percentage of persons correctly classified for DCVD status based on each categorical model.
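The sensitivity/specificity/percentage-correctly-classified triple reported for the discrete models can be derived from the case and control counts on either side of the cutoff. The following minimal sketch does this for the discrete PRS231 model using the counts shown in Table 4 (cutoff 4.57). The function name `classification_summary` is hypothetical and chosen for illustration.

```python
def classification_summary(cases_high, cases_low, controls_high, controls_low):
    """Return sensitivity, specificity, and overall accuracy as percentages."""
    total = cases_high + cases_low + controls_high + controls_low
    sensitivity = 100 * cases_high / (cases_high + cases_low)
    specificity = 100 * controls_low / (controls_high + controls_low)
    accuracy = 100 * (cases_high + controls_low) / total
    return sensitivity, specificity, accuracy


# Table 4, discrete PRS231 model:
# high-risk group (score >= 4.57): 168 cases, 841 controls
# low-risk group  (score <  4.57):   0 cases, 1369 controls
sens, spec, acc = classification_summary(
    cases_high=168, cases_low=0, controls_high=841, controls_low=1369
)
print(f"{sens:.1f}/{spec:.1f}/{acc:.1f}")
# prints "100.0/61.9/64.6"
```

A sensitivity of 100% here reflects that no case fell below the cutoff, so the differences between models in this summary come entirely from how many controls are placed in the high-risk group.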
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Hong, E.P.; Heo, S.G.; Park, J.W. The Liability Threshold Model for Predicting the Risk of Cardiovascular Disease in Patients with Type 2 Diabetes: A Multi-Cohort Study of Korean Adults. Metabolites 2021, 11, 6. https://doi.org/10.3390/metabo11010006
