Predicting the Risk of Incident Type 2 Diabetes Mellitus in Chinese Elderly Using Machine Learning Techniques
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Participants
2.2. Candidate Predictors
2.3. Outcome
2.4. Machine Learning Algorithms
2.4.1. Logistic Regression (LR)
2.4.2. Decision Tree (DT)
2.4.3. Random Forest (RF)
2.4.4. Extreme Gradient Boosting (XGBoost)
2.5. Model Development
2.6. Model Evaluation
2.7. Model Interpretation
2.8. Statistical Analysis
3. Results
3.1. Baseline Characteristics
3.2. Features Selected by LASSO Regression
3.3. Comparison of the Model Performance
3.4. Feature Importance
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Characteristics | Total (n = 127,031) | Incident T2DM: Yes (n = 8298) | Incident T2DM: No (n = 118,733) | p-Value |
---|---|---|---|---|
Age, mean (SD), years | 71.94 (5.10) | 72.39 (5.31) | 71.91 (5.08) | <0.001 |
Gender, n (%) | | | | <0.001 |
Men | 56,774 (44.69) | 4114 (7.25) | 52,660 (92.75) | |
Women | 70,257 (55.31) | 4184 (5.96) | 66,073 (94.04) | |
Education, n (%) | | | | <0.001 |
Elementary school and below | 75,828 (59.69) | 5597 (7.38) | 70,231 (92.62) | |
Junior high school | 28,298 (22.28) | 1522 (5.38) | 26,776 (94.62) | |
Technical secondary school or high school | 13,742 (10.82) | 695 (5.06) | 13,047 (94.94) | |
Junior college and above | 9163 (7.21) | 484 (5.28) | 8679 (94.72) | |
Marital status, n (%) | | | | <0.001 |
Married | 98,131 (77.25) | 6046 (6.16) | 92,085 (93.84) | |
Divorced | 656 (0.52) | 48 (7.32) | 608 (92.68) | |
Widowed | 27,350 (21.53) | 2082 (7.61) | 25,268 (92.39) | |
Single | 894 (0.70) | 122 (13.65) | 772 (86.35) | |
Hypertension, n (%) | | | | <0.001 |
Yes | 56,847 (44.75) | 4347 (7.65) | 52,500 (92.35) | |
No | 70,184 (55.25) | 3951 (5.63) | 66,233 (94.37) | |
Myocardial infarction, n (%) | | | | 0.621 |
Yes | 686 (0.54) | 48 (7.00) | 638 (93.00) | |
No | 126,345 (99.46) | 8250 (6.53) | 118,095 (93.47) | |
Coronary heart disease, n (%) | | | | 0.413 |
Yes | 7471 (5.88) | 505 (6.76) | 6966 (93.24) | |
No | 119,560 (94.12) | 7793 (6.52) | 111,767 (93.48) | |
Angina pectoris, n (%) | | | | 0.711 |
Yes | 506 (0.40) | 31 (6.13) | 475 (93.87) | |
No | 126,525 (99.60) | 8267 (6.53) | 118,258 (93.47) | |
Fatty liver, n (%) | | | | 0.020 |
Yes | 2279 (1.79) | 176 (7.72) | 2103 (92.28) | |
No | 124,752 (98.21) | 8122 (6.51) | 116,630 (93.49) | |
Exercise, n (%) | | | | <0.001 |
Yes | 74,741 (58.84) | 4323 (5.78) | 70,418 (94.22) | |
No | 52,290 (41.16) | 3975 (7.60) | 48,315 (92.40) | |
Current smoking, n (%) | | | | <0.001 |
Yes | 20,498 (16.14) | 1515 (7.39) | 18,983 (92.61) | |
No | 106,533 (83.86) | 6783 (6.37) | 99,750 (93.63) | |
Current drinking, n (%) | | | | 0.908 |
Yes | 21,429 (16.87) | 1396 (6.51) | 20,033 (93.49) | |
No | 105,602 (83.13) | 6902 (6.54) | 98,700 (93.46) | |
BMI, mean (SD), kg/m² | 23.70 (3.26) | 24.47 (3.51) | 23.65 (3.24) | <0.001 |
WC, mean (SD), cm | 84.12 (9.16) | 86.30 (9.62) | 83.97 (9.10) | <0.001 |
SBP, mean (SD), mm Hg | 137.12 (20.00) | 140.63 (20.38) | 136.87 (19.95) | <0.001 |
DBP, mean (SD), mm Hg | 80.09 (11.20) | 81.63 (11.42) | 79.99 (11.18) | <0.001 |
FPG, mean (SD), mmol/L | 5.12 (0.69) | 5.71 (0.79) | 5.08 (0.66) | <0.001 |
TC, median (IQR), mmol/L | 4.81 (4.20–5.45) | 4.84 (4.20–5.49) | 4.81 (4.20–5.44) | 0.034 |
TG, median (IQR), mmol/L | 1.17 (0.85–1.63) | 1.28 (0.90–1.79) | 1.16 (0.85–1.62) | <0.001 |
HDL-C, median (IQR), mmol/L | 1.36 (1.15–1.62) | 1.32 (1.11–1.58) | 1.37 (1.15–1.62) | <0.001 |
LDL-C, median (IQR), mmol/L | 2.60 (2.08–3.17) | 2.64 (2.11–3.24) | 2.60 (2.07–3.16) | <0.001 |
ALT, median (IQR), U/L | 16.00 (12.00–21.00) | 17.00 (13.00–23.00) | 16.00 (12.00–20.90) | <0.001 |
AST, median (IQR), U/L | 21.50 (18.00–26.00) | 22.00 (18.00–26.00) | 21.50 (18.00–26.00) | 0.004 |
TBIL, median (IQR), µmol/L | 11.90 (9.17–15.30) | 12.40 (9.50–15.90) | 11.90 (9.10–15.30) | <0.001 |
Scr, mean (SD), µmol/L | 76.82 (19.93) | 79.21 (20.94) | 76.66 (19.85) | <0.001 |
BUN, median (IQR), mmol/L | 5.71 (4.76–6.82) | 5.67 (4.70–6.80) | 5.71 (4.77–6.83) | 0.037 |
SUA, mean (SD), µmol/L | 323.80 (91.90) | 333.01 (94.31) | 323.15 (91.70) | <0.001 |
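
The p-values for the two-level categorical characteristics in the table above can be approximately reproduced from the printed counts. The sketch below assumes they come from a Pearson chi-square test without continuity correction; the table itself does not name the test, and the helper name `chi2_2x2` is illustrative only. It uses just the Python standard library.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom with no
    continuity correction (an assumption; the table does not name the test).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts copied from the table, as (incident T2DM yes, no) for each group.
gender_stat, gender_p = chi2_2x2(4114, 52660, 4184, 66073)      # men, women
drinking_stat, drinking_p = chi2_2x2(1396, 20033, 6902, 98700)  # drinkers, non-drinkers

print(f"Gender: chi2 = {gender_stat:.1f}, p = {gender_p:.2e}")
print(f"Current drinking: chi2 = {drinking_stat:.3f}, p = {drinking_p:.3f}")
```

Under this assumption the gender comparison falls well below 0.001 and the current-drinking comparison comes out at approximately 0.908, in line with the values printed in the table.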
Predictors | Coefficient |
---|---|
Age | 0.012 |
Gender | −0.026 |
Education | −0.027 |
Marital status | 0.023 |
Hypertension | 0.010 |
Exercise | −0.035 |
Current smoking | 0.017 |
Current drinking | −0.010 |
WC | 0.033 |
SBP | 0.014 |
FPG | 0.219 |
TC | −0.022 |
TG | 0.020 |
HDL-C | 0.006 |
LDL-C | 0.009 |
ALT | 0.037 |
AST | −0.026 |
TBIL | 0.006 |
Scr | 0.004 |
BUN | −0.017 |
SUA | −0.002 |
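
One way to read the coefficient table is to compare it with the characteristics listed in the baseline table. The sketch below assumes that the baseline table lists the full candidate set and that any predictor absent from the coefficient table was shrunk to exactly zero by LASSO; neither point is stated in the tables themselves. Ranking by absolute coefficient is only meaningful if the predictors were standardized before fitting, which is likewise an assumption here.

```python
# Candidate predictors, taken from the rows of the baseline table
# (assumed here to be the full candidate set).
candidates = [
    "Age", "Gender", "Education", "Marital status", "Hypertension",
    "Myocardial infarction", "Coronary heart disease", "Angina pectoris",
    "Fatty liver", "Exercise", "Current smoking", "Current drinking",
    "BMI", "WC", "SBP", "DBP", "FPG", "TC", "TG", "HDL-C", "LDL-C",
    "ALT", "AST", "TBIL", "Scr", "BUN", "SUA",
]

# Non-zero LASSO coefficients, copied from the table above.
lasso_coef = {
    "Age": 0.012, "Gender": -0.026, "Education": -0.027,
    "Marital status": 0.023, "Hypertension": 0.010, "Exercise": -0.035,
    "Current smoking": 0.017, "Current drinking": -0.010, "WC": 0.033,
    "SBP": 0.014, "FPG": 0.219, "TC": -0.022, "TG": 0.020,
    "HDL-C": 0.006, "LDL-C": 0.009, "ALT": 0.037, "AST": -0.026,
    "TBIL": 0.006, "Scr": 0.004, "BUN": -0.017, "SUA": -0.002,
}

# Candidates with no reported coefficient are treated as removed by LASSO.
dropped = [name for name in candidates if name not in lasso_coef]

# Rank retained predictors by coefficient magnitude, largest first.
ranked = sorted(lasso_coef.items(), key=lambda kv: abs(kv[1]), reverse=True)

print("Retained:", len(lasso_coef), "of", len(candidates))
print("Dropped:", ", ".join(dropped))
for name, coef in ranked[:5]:
    print(f"{name:10s} {coef:+.3f}")
```

Read this way, 21 of 27 candidates are retained, and FPG has by far the largest coefficient in magnitude.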
Model | AUC | Sensitivity | Specificity | Accuracy |
---|---|---|---|---|
LR | 0.7601 | 0.6320 | 0.7636 | 0.7550 |
DT | 0.7280 | 0.5821 | 0.7633 | 0.7514 |
RF | 0.7772 | 0.6428 | 0.7524 | 0.7453 |
XGBoost | 0.7805 | 0.6452 | 0.7577 | 0.7503 |
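
The accuracy column can be cross-checked against sensitivity and specificity, because accuracy is their prevalence-weighted mean. The sketch below assumes the evaluation set has the same incidence as the whole cohort (8298 of 127,031); the tables do not report the test-set incidence, so small discrepancies are expected. The function name `implied_accuracy` is illustrative.

```python
# Overall incidence in the cohort, from the baseline table. The test-set
# incidence is assumed to be the same, which the tables do not state.
prevalence = 8298 / 127031

# Sensitivity, specificity, and reported accuracy per model (AUC omitted).
reported = {
    "LR": (0.6320, 0.7636, 0.7550),
    "DT": (0.5821, 0.7633, 0.7514),
    "RF": (0.6428, 0.7524, 0.7453),
    "XGBoost": (0.6452, 0.7577, 0.7503),
}


def implied_accuracy(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


for model, (sens, spec, acc) in reported.items():
    implied = implied_accuracy(sens, spec, prevalence)
    print(f"{model:8s} reported {acc:.4f}  implied {implied:.4f}")
```

Because roughly 93% of participants did not develop T2DM, accuracy tracks specificity closely for every model. This is why the models differ more in AUC and sensitivity than in accuracy.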
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Q.; Zhang, M.; He, Y.; Zhang, L.; Zou, J.; Yan, Y.; Guo, Y. Predicting the Risk of Incident Type 2 Diabetes Mellitus in Chinese Elderly Using Machine Learning Techniques. J. Pers. Med. 2022, 12, 905. https://doi.org/10.3390/jpm12060905