Article

All-Cause Mortality Prediction in Subjects with Diabetes Mellitus Using a Machine Learning Model and Shapley Values

1 Department of Cardiology, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
2 Research and Development Department, Med Fanavaran Plus Co., Karaj 3187411213, Iran
3 Department of Anatomy, “George Emil Palade” University of Medicine, Pharmacy, Sciences and Technology, 540142 Târgu Mureș, Romania
4 Department of Cardiovascular Surgery, Emergency Institute for Cardiovascular Diseases and Transplantation, 540136 Târgu Mureș, Romania
5 Department of Cardiovascular Surgery, University of Medicine and Pharmacy of Craiova, 200349 Craiova, Romania
* Author to whom correspondence should be addressed.
Diabetology 2025, 6(1), 5; https://doi.org/10.3390/diabetology6010005
Submission received: 23 September 2024 / Revised: 24 December 2024 / Accepted: 30 December 2024 / Published: 7 January 2025

Abstract

Background/Objectives: Diabetes mellitus (DM) is a prevalent disease with an increased risk of complications. Identifying risk factors for mortality in these patients is crucial, as early recognition can facilitate prompt therapeutic intervention. Machine learning (ML) models have proved to be valuable tools in different scenarios of healthcare decision making. We aimed to develop and test an ML model to predict all-cause mortality in a large cohort of subjects with DM. Methods: We included 1969 consecutive patients with DM type 1 (T1DM, n = 255) and type 2 (T2DM, n = 1714). eXtreme Gradient Boosting (XGBoost) was used to predict all-cause mortality in this cohort, and the Shapley additive explanation (SHAP) was used to assess the importance of each feature of the classifier. Missing values were imputed using the MissForest methodology. Results: The all-cause mortality rate was 21% during 5.5 ± 1.1 years of follow-up. The ML model achieved 90% sensitivity and 87% specificity with an AUC of 0.88 and an accuracy of 88% for predicting all-cause mortality. The SHAP analysis identified a lower estimated glomerular filtration rate (eGFR), the duration of insulin therapy, and a lower level of hemoglobin as the three leading contributors to higher mortality. Conclusions: ML models can become valuable tools in clinical practice due to their ability to simultaneously assess the cumulative influence of multiple parameters and discover high-order interactions. The application of such models in clinical practice could improve the early identification of subjects at risk for complications and mortality and prompt early therapeutic interventions.

1. Introduction

According to the International Diabetes Federation (IDF), as of 2021, over 500 million people worldwide are living with diabetes mellitus (DM). Furthermore, this number is projected to increase significantly, with estimates suggesting that, by 2045, nearly 700 million individuals could be affected by diabetes [1]. This anticipated rise is concerning, and it justifies the urgency of taking proactive measures to address this global health challenge.
Type 2 diabetes mellitus (T2DM) is the most prevalent form of diabetes, accounting for nearly 90% of all cases, while type 1 diabetes mellitus (T1DM) is comparatively rare [2]. Individuals with diabetes mellitus (DM) experience a significantly higher mortality rate, largely attributable to long-term complications associated with the disease. These include cardiovascular disease, kidney disease, and neuropathy, which collectively contribute to the increased burden of morbidity and mortality in this population [3,4,5,6].
In recent years, computational modeling and machine learning (ML) algorithms have emerged as powerful tools in healthcare, demonstrating significant potential for improving the predictive modeling of disease outcomes [7,8]. These techniques can be used to analyze large and complex datasets, favoring the development of more accurate and individualized predictive models.
However, the foremost challenge associated with ML models, particularly in the healthcare setting, is the difficulty of interpreting the ‘black box’ results and translating them into clinical practice, which limits their applicability. To address this limitation, several approaches to interpreting model predictions have been proposed, and the Shapley additive explanation (SHAP) has proven particularly advantageous by providing a framework to explain model outputs [9]. The advantage of SHAP plots lies in their ability to rank the contribution of each feature. This ranking provides clinicians with an interpretable view of the rationale behind the model’s predictions, reducing the gap between complex ML outputs and practical decisions.
Therefore, in this study, we integrated commonly available clinical and biological characteristics of subjects with DM using an ML algorithm, tested its accuracy in predicting all-cause mortality in a large cohort, and used Shapley values to clarify the importance of each feature.

2. Materials and Methods

2.1. Data Collection

We retrospectively selected all consecutive patients aged over 18 years with a documented diagnosis of diabetes mellitus who were admitted to the Emergency County Hospital of Craiova, Romania, a reference hospital for the region, between January 2016 and December 2018. The dataset was obtained from the digital database of the hospital. All patients admitted to the hospital are required to sign a general informed consent form upon admission, which includes their agreement to the use of medical data for scientific purposes.
The study was approved by the Ethics Committee of Emergency County Hospital of Craiova (no. 74 from 7 September 2020) and conducted in accordance with local legislation and institutional requirements.

2.2. Choice of Parameters

We selected a subset of 14 demographic/clinical parameters, 17 biological tests, and 5 parameters relevant to DM characterization. All parameters are listed in Table 1. We specifically selected for the study parameters that are commonly obtained during admission in our hospital.
Arterial hypertension was assessed from the medical records and defined as a history of blood pressure above 140/90 mmHg and/or current antihypertensive treatment [10]. The diagnosis of atrial fibrillation (AFib) was made either if the patient had a documented history of AFib or if it was present on the resting electrocardiogram obtained during hospital admission.

2.3. Machine Learning Model

For the analysis, we used eXtreme Gradient Boosting (XGBoost), a modified version of the gradient boosting classifier [11]. XGBoost is an ensemble classifier that combines many weak predictive models, usually decision trees, to create a more powerful and efficient model. This approach is iterative: at each iteration, a new weak learner is fitted to the errors of the current ensemble and added to it, while the learners added earlier are kept fixed. This iterative process continues until the model’s performance converges, resulting in a robust predictive model. The input for the model consisted of 36 measurements. The dataset was split according to the 80/20 rule, i.e., 80% of the dataset was used to train the model, and 20% was used to test it.
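The iterative scheme and the 80/20 split described above can be illustrated with a minimal sketch. It is not XGBoost itself, which adds regularization and second-order gradient information, and it is not the authors' pipeline: it is a plain least-squares gradient boosting loop over one-split decision stumps, written with the Python standard library only. The single toy feature, the rule that generates the labels, and the names `fit_stump` and `boost` are all assumptions made for illustration.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # candidate split points
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Add one stump per round, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the mean outcome
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))  # earlier stumps stay fixed
    return predict

# Toy data: one synthetic feature and an arbitrary labelling rule (not study data).
rng = random.Random(42)
feature = [rng.uniform(15, 120) for _ in range(200)]
outcome = [1.0 if value < 60 else 0.0 for value in feature]

# 80/20 split: shuffle the indices, train on the first 80%, test on the rest.
indices = list(range(200))
rng.shuffle(indices)
train_idx, test_idx = indices[:160], indices[160:]

model = boost([feature[i] for i in train_idx], [outcome[i] for i in train_idx])
correct = sum((model(feature[i]) >= 0.5) == (outcome[i] == 1.0) for i in test_idx)
test_accuracy = correct / len(test_idx)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

In this sketch each round only appends a learner; nothing fitted in earlier rounds is revisited, which is the sense in which boosting is additive.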

2.4. Imputation and Imbalance Treatment

The dataset was obtained from a clinical setting and thus contained missing measurements. Additionally, the dataset was inspected, and outliers (values that were non-physiological for the given parameters) due to typing or unit of measurement errors were excluded from the analysis. The missing values were imputed using the MissForest methodology [12]. The percentage of imputation for each measurement is listed in Supplementary Table S1. As class imbalance can bias the model toward the larger class, we used a combination of SMOTE and Tomek’s links, known as the SMOTE–Tomek approach [13]. In this approach, artificial samples were synthesized using the synthetic minority oversampling technique (SMOTE) algorithm, and some pairs of observations from the two classes were removed according to the Tomek link rules.
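The two ingredients of the SMOTE–Tomek approach can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used in the study: real SMOTE picks the partner among the k nearest minority neighbours, whereas here the partner is passed in directly, and the two-dimensional points, labels, and function names are hypothetical.

```python
import math
import random

def smote_sample(point, neighbour, rng):
    """Create a synthetic sample on the segment between two minority samples."""
    gap = rng.random()  # random position along the segment, in [0, 1)
    return [p + gap * (n - p) for p, n in zip(point, neighbour)]

def nearest(index, points):
    """Index of the nearest other point (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != index]
    return min(others, key=lambda j: math.dist(points[index], points[j]))

def tomek_links(points, labels):
    """Pairs of mutual nearest neighbours that belong to different classes."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        if i < j and labels[i] != labels[j] and nearest(j, points) == i:
            links.append((i, j))
    return links

# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
points = [[0.0, 0.0], [0.2, 0.1], [0.5, 0.4], [1.0, 1.0],
          [1.1, 1.0], [3.0, 3.0], [3.2, 3.1]]
labels = [0, 0, 0, 0, 1, 1, 1]

rng = random.Random(0)
synthetic = smote_sample(points[5], points[6], rng)  # new minority sample

links = tomek_links(points, labels)
# Drop both members of every Tomek link to clean the class boundary.
kept = [i for i in range(len(points)) if all(i not in pair for pair in links)]
print(links, kept)
```

Oversampling adds minority samples by interpolation, while Tomek-link removal deletes the closest cross-class pairs, so the two steps pull in opposite directions on sample count but both aim at a cleaner decision boundary.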

2.5. Feature Importance in the Model

The Shapley additive explanation was used to assess the importance of each feature of the classifier. The advantage of the SHAP summary plot is that it combines feature importance with feature effect, allowing for the interpretation of the underlying relationships between features and outcomes. A positive Shapley value indicates that the feature increases the likelihood of a positive outcome, whereas a negative one indicates that the feature decreases it. The distribution of the Shapley values per feature is shown by overlapping points that are jittered in the y-axis direction. Blue dots indicate values below the cut-off of each measurement, whereas red dots indicate values above the specific cut-off.
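To make the sign convention concrete, the sketch below computes exact Shapley values by brute force for a small toy model. It is an illustration only: SHAP libraries use much faster tree-specific algorithms for models such as XGBoost, and the linear risk function, its coefficients, the feature values, and the baseline are all hypothetical rather than taken from the study.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)

    def value(subset):
        mixed = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi

def toy_risk(features):
    """Hypothetical linear risk score (coefficients are made up)."""
    egfr, hemoglobin, ggt = features
    return 0.9 - 0.005 * egfr - 0.02 * hemoglobin + 0.001 * ggt

instance = [40.0, 10.0, 120.0]  # hypothetical patient
baseline = [90.0, 14.0, 30.0]   # hypothetical reference values

phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in zip(["eGFR", "hemoglobin", "GGT"], phi):
    print(f"{name}: {contribution:+.3f}")
```

All three contributions are positive here because each feature moves the toy score above its baseline, and their sum equals the difference between the prediction for the instance and the prediction for the baseline.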

2.6. Outcome

The status of alive/deceased was obtained either from the records of our hospital or, when not available, from the National Registry of Persons. The status of each subject (dead/alive) was updated in December 2022.

2.7. Statistical Analysis

The statistical analysis was performed using SPSS (ver. 20.0) software (IBM, Chicago, IL, USA). Continuous variables are reported as the mean ± standard deviation (SD). A Student’s t-test was used to compare continuous data between groups. For all tests, p < 0.05 was deemed statistically significant.
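For readers unfamiliar with the test, the sketch below computes the pooled-variance Student's t statistic, which compares the means of a continuous variable between two independent groups. The study used SPSS; this standard-library version is only an illustration, the two samples are hypothetical, and the p-value lookup is omitted because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical ages (years) in two small groups, not study data.
ages_a = [30, 35, 40, 45, 50]
ages_b = [50, 55, 60, 65, 70]

t_statistic, degrees_of_freedom = student_t(ages_a, ages_b)
print(t_statistic, degrees_of_freedom)
```

The statistic is then compared with the t distribution with the returned degrees of freedom to obtain the p-value.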

3. Results

3.1. Study Population

A total of 1969 subjects were included in this study (males, n = 936). The average age was 59 ± 13 years. Most subjects had T2DM (n = 1714, 87%), 692 of whom (40%) had insulin in their treatment protocol. Subjects with T1DM were younger (p < 0.001) and had a longer history of DM (p < 0.001). The characteristics of the two groups are listed in Table 2.

3.2. Outcome

The mean follow-up period was 5.5 ± 1.1 years. The overall mortality rate in the population was 21%. Subjects with T1DM showed a lower mortality rate when compared to subjects with T2DM (8% vs. 23%, p < 0.01). Overall, subjects with poor survival were older, with a longer duration of DM and more impaired kidney function (Table 3).

3.3. Imputation Rate

The imputation rate ranged from 0% to 18.1%. The imputation percentage for each parameter used in the analysis is listed in Supplemental Table S1.

3.4. Machine Learning Model and Feature Importance

The ML model demonstrated good results in predicting all-cause mortality in our study cohort, with an accuracy of 89% and an area under the curve of 88%. The sensitivity was 90% and the specificity was 87%, while the F1-score was 88%.
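The reported metrics all derive from the four cells of a confusion matrix, as the short sketch below shows. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the confusion matrix of the study, which is not reported here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true events that were detected
    specificity = tn / (tn + fp)   # share of non-events correctly ruled out
    precision = tp / (tp + fp)     # share of positive predictions that were right
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=13, tn=87, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy depends on the class balance of the evaluated sample, whereas sensitivity and specificity do not, which is why the metrics are best read together.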
The feature importance of the 20 highest-ranked parameters is shown in Figure 1. eGFR had the highest ranking, followed by the duration of insulin therapy (years), hemoglobin, GGT, and urea.
Regarding the direction of contribution, a lower eGFR, lower hemoglobin, higher GGT, and lower urea values were associated with a higher mortality rate. The interpretation of the duration of insulin treatment is challenging, as both short and prolonged insulin therapy appear to have an impact on mortality.

4. Discussion

Diabetes mellitus is a major public health challenge of the twenty-first century. It affects a large and growing number of people, impacting their quality of life and inducing serious multi-organ complications, which significantly increase the risk of mortality. Integrating machine learning and artificial intelligence models in the care of patients with DM could prove beneficial on an individual and societal level.
For this study, we used a machine learning (ML) model to integrate clinical, anthropometric, and biological parameters that are widely available and commonly obtained during hospital admissions or routine evaluations of individuals with diabetes mellitus to predict all-cause mortality. Moreover, we employed Shapley values to illustrate the contribution of each variable to the model, indicating both the magnitude and the direction (positive or negative) of their impact on the prediction of all-cause mortality. A single larger-scale study investigated the combination of ML models and SHAP to predict one-year mortality in subjects with DM, showing promising results. However, the model included mostly anthropometric and administrative parameters, and the performance was below that reported in our study, in the range of 0.78–0.799 in terms of AUC [14].

4.1. Study Population

Most of the subjects included in this study had T2DM, while only a small proportion had T1DM. Subjects with T1DM were significantly younger, with a lower BMI, and a longer history of diabetes and insulin intake.
A large percentage of patients had arterial hypertension (72%). Our results are similar to those reported in the literature [15,16] and reflect the common coexistence of these two pathologies.
Due to improvements in medical care and treatment, overall mortality in subjects with diabetes has declined significantly in recent decades. There has been up to a 70% reduction in the all-cause death rate in subjects with T2DM and roughly a 30% reduction in all-cause mortality in subjects with T1DM [17]. In our population, subjects with T2DM had a significantly higher mortality rate, a finding not supported by larger-scale studies [18,19], which instead reported higher mortality rates in subjects with T1DM. The difference may be due to the imbalanced study population, in which subjects with T1DM represented only a small fraction of the cohort. Nonetheless, the overall mortality rate was comparable to that reported in previous studies [20,21].

4.2. Machine Learning Model and Feature Importance

The ML model demonstrated good accuracy in predicting mortality in subjects with diabetes mellitus.
A low eGFR value was associated with the highest mortality risk in our model. The relationship between diabetes and the onset and progression of kidney disease is well established, with diabetes being the leading cause of kidney transplants in developed countries [22]. Moreover, all-cause mortality is also higher in diabetic subjects with impaired kidney function [23,24], highlighting the crucial need to identify subjects with reduced renal function in order to provide early targeted treatment and prevent progression to end-stage renal disease and cardiovascular complications [25].
The duration of insulin treatment ranked as the second most important feature in the model. Insulin therapy [26] and escalating doses [27] have been associated with a significant increase in all-cause and cardiac death in patients with diabetes compared to those receiving oral hypoglycemic medications alone.
Interestingly, the SHAP values did not show a clear direction. A shorter period of insulin therapy was associated both negatively and positively with mortality. On the one hand, patients in whom insulin was recently introduced might show a lower mortality rate due to better glycemic control, resulting in a lower rate of complications, or due to a shorter evolution of the disease with fewer complications. On the other hand, a shorter period of insulin therapy could also imply that insulin was introduced recently because of more severe disease with poorer glycemic control or diabetes-related complications, and thus a higher risk of death.
Next, the SHAP analysis indicated that a lower level of hemoglobin was associated with increased all-cause mortality. In both diabetic and non-diabetic patients, the presence of anemia has been shown to significantly increase all-cause mortality risk [28,29], supporting the results of our analysis. Moreover, we found that a lower platelet count was also associated with mortality. Previous research suggested that platelets might play a significant role in the pathogenesis of micro- and macrovascular complications of DM [30]. Nonetheless, the risk for mortality has been consistently associated with a higher mean platelet volume, a higher platelet count, and a pro-coagulative state, while the data regarding the impact of thrombocytopenia on DM are scarce. Larger studies are warranted to understand the interplay between platelet count and diabetes complications.
A high level of gamma-glutamyl transferase was associated with a higher risk for mortality in our cohort. GGT has been identified as a marker of oxidative stress, and the association between increased plasma levels of GGT and the risk of developing DM [31,32], as well as the risk for mortality in already diagnosed patients, has been described [33].
Interestingly, in our analysis, treatment with biguanides showed a clear positive association with mortality. The most commonly used representative of the biguanide class is metformin, which is usually the first-line oral anti-hyperglycemic agent for treating patients with non-insulin-dependent diabetes mellitus. In previous large-scale studies [34,35], biguanides were found to have a negative effect on diabetic patients, particularly those with associated ischemic heart disease. Since our data were limited regarding the ischemic status of the patients, we could not verify the association between the treatment with biguanides, underlying ischemic heart disease, and the risk of mortality.
Although the literature is controversial [36,37], in our study, treatment with sulfonylureas showed a clear protective association with all-cause mortality.
Obesity is closely linked with T2DM etiology. The ‘obesity paradox’ observed in diabetic patients describes a particular association between BMI and mortality, with higher mortality in patients with a low BMI and lower mortality seen in overweight and obese individuals [38]. Subjects with a lower BMI (<25 kg/m²) were more likely to require treatment with insulin compared to diabetic patients with a higher BMI, and treatment with insulin was associated with a higher risk for complications and mortality [39]. Among the suggested hypotheses are certain underlying genetic abnormalities of subjects who have DM and a low-to-normal BMI or the late recognition of DM in thinner subjects, resulting in later treatment initiation.
Lastly, glycated hemoglobin (HbA1c) is the most commonly used measure of diabetes control in clinical practice. The association between a high HbA1c value and mortality in diabetic patients remains debated, as results from studies are contradictory [40,41]. In our study, a higher HbA1c was associated with higher all-cause mortality.

5. Study Limitations

Our study has several limitations. First, the data were collected from a single center, a large regional emergency hospital that treats more severely ill patients with poorer DM control and a longer evolution of disease, who require admission for treatment adjustments or DM-associated complications. This may influence the mortality rate, which is expected to be higher in this population. Moreover, our results were not tested on an external database to validate their reproducibility. Second, we did not have any data regarding the underlying cause of death in order to perform a subanalysis. Third, this study was retrospective; thus, we did not have any quality control over the data existing in the medical records. Lastly, the study involved an imbalanced cohort with a smaller representation of patients with T1DM, and a separate subgroup analysis is warranted to better delineate the specific risk factors and predictors for T1DM and T2DM populations.
These limitations should be considered when interpreting our study findings, underscoring the need for sustained efforts to enhance the robustness and applicability of the models.

6. Conclusions

We developed and tested a machine learning model that identified individuals with diabetes mellitus who are at high risk for all-cause mortality, and the SHAP plot provided an overview of the impact of each parameter in the model. Key findings included the strong predictive value of low eGFR, highlighting the critical role of impaired renal function in mortality among diabetic patients. Duration of insulin therapy emerged as the second most important predictor, with its dual-directional impact suggesting complex underlying dynamics between disease progression, glycemic control, and mortality risk.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diabetology6010005/s1, Table S1. Imputation percentage for each measurement.

Author Contributions

Conceptualization, O.M., M.G.O., E.Ț., O.N., M.B., I.D., L.M. and V.R.; methodology, O.M., M.G.O., I.D. and V.R.; supervision, O.M., M.G.O., E.Ț., O.N., M.B., I.D., L.M. and V.R.; validation, O.M., M.G.O., E.Ț., O.N., M.B., I.D., L.M. and V.R.; writing—review and editing, O.M., M.G.O., E.Ț., O.N., M.B., I.D., L.M. and V.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of Emergency County Hospital of Craiova (no. 74 from 7 September 2020).

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Mostafa Ghelich Oghli is employed by Med Fanavaran Plus Co., Karaj 3187411213, Iran. The company had no involvement in data collection or analysis or in the writing of the manuscript.

References

  1. Sun, H.; Saeedi, P.; Karuranga, S.; Pinkepank, M.; Ogurtsova, K.; Duncan, B.B.; Stein, C.; Basit, A.; Chan, J.C.; Mbanya, J.C.; et al. IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res. Clin. Pract. 2022, 183, 109119.
  2. Ahmad, E.; Lim, S.; Lamptey, R.; Webb, D.R.; Davies, M.J. Type 2 diabetes. Lancet 2022, 400, 1803–1820.
  3. Matheus, A.S.; Tannus, L.R.; Cobas, R.A.; Palma, C.C.; Negrato, C.A.; Gomes, M.B. Impact of diabetes on cardiovascular disease: An update. Int. J. Hypertens. 2013, 2013, 653789.
  4. Sattar, N. Revisiting the links between glycaemia, diabetes and cardiovascular disease. Diabetologia 2013, 56, 686–695.
  5. Schena, F.P.; Gesualdo, L. Pathogenetic mechanisms of diabetic nephropathy. J. Am. Soc. Nephrol. 2005, 16 (Suppl. S1), S30–S33.
  6. Chiles, N.S.; Phillips, C.L.; Volpato, S.; Bandinelli, S.; Ferrucci, L.; Guralnik, J.M.; Patel, K.V. Diabetes, peripheral neuropathy, and lower-extremity function. J. Diabetes Complicat. 2014, 28, 91–95.
  7. Quesada, J.A.; Lopez-Pineda, A.; Gil-Guillén, V.F.; Durazo-Arvizu, R.; Orozco-Beltrán, D.; López-Domenech, A.; Carratalá-Munuera, C. Machine learning to predict cardiovascular risk. Int. J. Clin. Pract. 2019, 73, e13389.
  8. Adler, E.D.; Voors, A.A.; Klein, L.; Macheret, F.; Braun, O.O.; Urey, M.A.; Zhu, W.; Sama, I.; Tadel, M.; Campagnari, C.; et al. Improving risk prediction in heart failure using machine learning. Eur. J. Heart Fail. 2020, 22, 139–147.
  9. Shapley, L. A value for n-person games. In Contributions to the Theory of Games, Volume II; Kuhn, H., Tucker, A., Eds.; Princeton University Press: Princeton, NJ, USA, 1953; pp. 307–318.
  10. Mancia, G.; Kreutz, R.; Brunström, M.; Burnier, M.; Grassi, G.; Januszewicz, A.; Muiesan, M.L.; Tsioufis, K.; Agabiti-Rosei, E.; Algharably, E.A.E.; et al. 2023 ESH Guidelines for the management of arterial hypertension The Task Force for the management of arterial hypertension of the European Society of Hypertension: Endorsed by the International Society of Hypertension (ISH) and the European Renal Association (ERA). J. Hypertens. 2023, 41, 1874–2071.
  11. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  12. Stekhoven, D.J.; Bühlmann, P. MissForest--non-parametric missing value imputation for mixed-type data. Bioinformatics 2012, 28, 112–118.
  13. Swana, E.F.; Doorsamy, W.; Bokoro, P. Tomek Link and SMOTE Approaches for Machine Fault Classification with an Imbalanced Dataset. Sensors 2022, 22, 3246.
  14. Alimbayev, A.; Zhakhina, G.; Gusmanov, A.; Sakko, Y.; Yerdessov, S.; Arupzhanov, I.; Kashkynbayev, A.; Zollanvari, A.; Gaipov, A. Predicting 1-year mortality of patients with diabetes mellitus in Kazakhstan based on administrative health data using machine learning. Sci. Rep. 2023, 13, 8412.
  15. The Hypertension in Diabetes Study Group. Prevalence of hypertension in newly presenting type 2 diabetic patients and the association with risk factors for cardiovascular and diabetic complications. J. Hypertens. 1993, 11, 309–317.
  16. Tarnow, L.; Rossing, P.; Gall, M.A.; Nielsen, F.S.; Parving, H.H. Prevalence of arterial hypertension in diabetic patients before and after the JNC-V. Diabetes Care 1994, 17, 1247–1251.
  17. Rawshani, A.; Rawshani, A.; Gudbjörnsdottir, S. Mortality and Cardiovascular Disease in Type 1 and Type 2 Diabetes. N. Engl. J. Med. 2017, 377, 300–301.
  18. Carstensen, B.; Rønn, P.F.; Jørgensen, M.E. Prevalence, incidence and mortality of type 1 and type 2 diabetes in Denmark 1996–2016. BMJ Open Diabetes Res. Care 2020, 8, e001071.
  19. de Fine Olivarius, N.; Andreasen, A.H. Five-year all-cause mortality of 1323 newly diagnosed middle-aged and elderly diabetic patients. Data from the population-based study, diabetes care in general practice, Denmark. J. Diabetes Complicat. 1997, 11, 83–89.
  20. Hansen, L.J.; Olivarius Nde, F.; Siersma, V. 16-year excess all-cause mortality of newly diagnosed type 2 diabetic patients: A cohort study. BMC Public Health 2009, 9, 400.
  21. Colhoun, H.M.; the WHO Multinational Study Group; Lee, E.T.; Bennett, P.H.; Lu, M.; Keen, H.; Wang, S.-L.; Stevens, L.K.; Fuller, J.H. Risk factors for renal failure: The WHO Multinational Study of Vascular Disease in Diabetes. Diabetologia 2001, 44 (Suppl. S2), S46–S53.
  22. Groop, P.-H.; Thomas, M.C.; Moran, J.L.; Wadèn, J.; Thorn, L.M.; Mäkinen, V.-P.; Rosengård-Bärlund, M.; Saraheimo, M.; Hietala, K.; Heikkilä, O.; et al. The presence and severity of chronic kidney disease predicts all-cause mortality in type 1 diabetes. Diabetes 2009, 58, 1651–1658.
  23. Afkarian, M.; Sachs, M.C.; Kestenbaum, B.; Hirsch, I.B.; Tuttle, K.R.; Himmelfarb, J.; De Boer, I.H. Kidney disease and increased mortality risk in type 2 diabetes. J. Am. Soc. Nephrol. 2013, 24, 302–308.
  24. Bruno, G.; Merletti, F.; Bargero, G.; Novelli, G.; Melis, D.; Soddu, A.; Perotto, M.; Pagano, G.; Cavallo-Perin, P. Estimated glomerular filtration rate, albuminuria and mortality in type 2 diabetes: The Casale Monferrato study. Diabetologia 2007, 50, 941–948.
  25. Donoiu, I.; Târtea, G.; Sfredel, V.; Raicea, V.; Țucă, A.M.; Preda, A.N.; Cozma, D.; Vătășescu, R. Dapagliflozin Ameliorates Neural Damage in the Heart and Kidney of Diabetic Mice. Biomedicines 2023, 11, 3324.
  26. Xu, S.; Wang, B.; Liu, W.; Wu, C.; Huang, J. The effects of insulin therapy on mortality in diabetic patients undergoing percutaneous coronary intervention. Ann. Transl. Med. 2021, 9, 1294.
  27. Gamble, J.-M.; Chibrikov, E.; Twells, L.K.; Midodzi, W.K.; Young, S.W.; MacDonald, D.; Majumdar, S.R. Association of insulin dosage with mortality or major adverse cardiovascular events: A retrospective cohort study. Lancet Diabetes Endocrinol. 2017, 5, 43–52.
  28. Kengne, A.P.; Czernichow, S.; Hamer, M.; Batty, G.D.; Stamatakis, E. Anaemia, haemoglobin level and cause-specific mortality in people with and without diabetes. PLoS ONE 2012, 7, e41875.
  29. McFarlane, S.I.; Salifu, M.O.; Makaryus, J.; Sowers, J.R. Anemia and cardiovascular disease in diabetic nephropathy. Curr. Diabetes Rep. 2006, 6, 213–218.
  30. Zvetkova, E.; Ivanov, I.; Koytchev, E.; Antonova, N.; Gluhcheva, Y.; Alexandrova-Watanabe, A.; Kostov, G. Hematological and Hemorheological Parameters of Blood Platelets as Biomarkers in Diabetes Mellitus Type 2: A Comprehensive Review. Appl. Sci. 2024, 14, 4684.
  31. Perry, I.J.; Wannamethee, S.G.; Shaper, A.G. Prospective study of serum gamma-glutamyltransferase and risk of NIDDM. Diabetes Care 1998, 21, 732–737.
  32. André, P.; Balkau, B.; Born, C.; Charles, M.A.; Eschwège, E.; DESIR Study Group. Three-year increase of gamma-glutamyltransferase level and development of type 2 diabetes in middle-aged men and women: The DESIR cohort. Diabetologia 2006, 49, 2599–2603.
  33. Sluik, D.; Beulens, J.W.; Weikert, C.; van Dieren, S.; Spijkerman, A.M.; van der A, D.L.; Fritsche, A.; Joost, H.; Boeing, H.; Nöthlings, U. Gamma-glutamyltransferase, cardiovascular disease and mortality in individuals with diabetes mellitus. Diabetes Metab. Res. Rev. 2012, 28, 284–288.
  34. Mannucci, E.; Monami, M.; Masotti, G.; Marchionni, N. All-cause mortality in diabetic patients treated with combinations of sulfonylureas and biguanides. Diabetes Metab. Res. Rev. 2004, 20, 44–47.
  35. Monami, M.; Marchionni, N.; Masotti, G.; Mannucci, E. Effect of combined secretagogue/biguanide treatment on mortality in type 2 diabetic patients with and without ischemic heart disease. Int. J. Cardiol. 2008, 126, 247–251.
  36. Varvaki Rados, D.; Catani Pinto, L.; Reck Remonti, L.; Bauermann Leitão, C.; Gross, J.L. The Association between Sulfonylurea Use and All-Cause and Cardiovascular Mortality: A Meta-Analysis with Trial Sequential Analysis of Randomized Clinical Trials. PLoS Med. 2016, 13, e1001992.
  37. Costanzo, P.; Cleland, J.G.; Pellicori, P.; Clark, A.L.; Hepburn, D.; Kilpatrick, E.S.; Perrone-Filardi, P.; Zhang, J.; Atkin, S.L. The obesity paradox in type 2 diabetes mellitus: Relationship of body mass index to prognosis: A cohort study. Ann. Intern. Med. 2015, 162, 610–618.
  38. Lajous, M.; Bijon, A.; Fagherazzi, G.; Boutron-Ruault, M.C.; Balkau, B.; Clavel-Chapelon, F.; Hernán, M.A. Body mass index, diabetes, and mortality in French women: Explaining away a “paradox”. Epidemiology 2014, 25, 10–14.
  39. Andersson, C.; van Gaal, L.; Caterson, I.D.; Weeke, P.; James, W.P.T.; Couthino, W.; Finer, N.; Sharma, A.M.; Maggioni, A.P.; Torp-Pedersen, C. Relationship between HbA1c levels and risk of cardiovascular adverse outcomes and all-cause mortality in overweight and obese cardiovascular high-risk women and men with type 2 diabetes. Diabetologia 2012, 55, 2348–2355.
  40. UK Prospective Diabetes Study (UKPDS) Group. Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). Lancet 1998, 352, 854–865.
  41. Patel, A.; MacMahon, S.; Chalmers, J.; Neal, B.; Billot, L.; Woodward, M.; Marre, M.; Cooper, M.; Glasziou, P.; Grobbee, D.; et al. Intensive blood glucose control and vascular outcomes in patients with type 2 diabetes. N. Engl. J. Med. 2008, 358, 2560–2572.
Figure 1. Shapley plot showing the contribution of each study parameter to all-cause mortality. ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; DM, diabetes mellitus; DBP, diastolic blood pressure; eGFR, estimated glomerular filtration rate calculated using the Modification of Diet in Renal Disease (MDRD) equation; GGT, gamma-glutamyl transferase; HbA1c, glycated hemoglobin; HDL, high-density lipoprotein.
Table 1. Clinical, biological, and diabetes mellitus characteristics included in the study.
Demographic and Clinical
  • Gender (male/female)
  • Age (years)
  • Height (cm)
  • Current weight (kg)
  • Maximum weight (kg)
  • Current BMI (kg/m²)
  • Maximum BMI (kg/m²)
  • Abdominal circumference (cm)
  • Hip circumference (cm)
  • Systolic blood pressure (mmHg)
  • Diastolic blood pressure (mmHg)
  • Heart rate (bpm)
  • Arterial hypertension (yes/no)
  • Atrial fibrillation (yes/no)

Biological Data
  • Uric acid (mg/dL)
  • ALT (U/L)
  • AST (U/L)
  • GGT (U/L)
  • Serum amylase (U/L)
  • Creatinine (mg/dL)
  • eGFR (mL/min/1.73 m²)
  • Urea (mg/dL)
  • Alkaline phosphatase (U/L)
  • Total cholesterol (mg/dL)
  • HDL cholesterol (mg/dL)
  • LDL cholesterol (mg/dL)
  • Triglycerides (mg/dL)
  • Hemoglobin (g/dL)
  • Platelet count (number/µL)

DM Related
  • Glycated hemoglobin (%)
  • Maximum glycemia (mg/dL)
  • Type of DM (type 1/2)
  • Duration of DM (years)
  • Duration of insulin treatment (years)
  • DM medication:
      • Biguanides
      • Sulfonylureas
      • Alpha-glucosidase inhibitors
      • SGLT2 inhibitors
      • Dipeptidyl peptidase-4 inhibitors
      • GLP-1 receptor agonists
  • Number of classes of DM medication
ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; DM, diabetes mellitus; eGFR, estimated glomerular filtration rate calculated using the Modification of Diet in Renal Disease (MDRD) equation; GGT, gamma-glutamyl transferase; GLP-1, glucagon-like peptide 1; HDL, high-density lipoprotein; LDL, low-density lipoprotein; SGLT2, sodium–glucose cotransporter 2.
Table 2. Characteristics of the study population, according to the type of diabetes mellitus.
| Parameter | Unit | All (n = 1969) | T1DM (n = 255) | T2DM (n = 1714) | p Value * |
|---|---|---|---|---|---|
| Anthropometric data | | | | | |
| Age | years | 59 ± 13 | 37 ± 13 | 62 ± 10 | <0.01 |
| Current weight | kg | 81 ± 20 | 68 ± 14 | 83 ± 20 | <0.01 |
| Maximum weight | kg | 91 ± 21 | 78 ± 18 | 93 ± 21 | <0.01 |
| Current BMI | kg/m² | 29 ± 7 | 24 ± 4 | 30 ± 7 | <0.01 |
| Maximum BMI | kg/m² | 33 ± 7 | 27 ± 5 | 34 ± 6 | <0.01 |
| Abdominal circumference | cm | 102 ± 16 | 85 ± 12 | 105 ± 15 | <0.01 |
| Hip circumference | cm | 105 ± 13 | 96 ± 9 | 106 ± 12 | <0.01 |
| DM characteristics | | | | | |
| Duration of DM | years | 10.0 ± 8.5 | 13 ± 11 | 10 ± 8 | <0.01 |
| Duration of insulin treatment | years | 4.4 ± 7.1 | 13 ± 11 | 4 ± 5 | <0.01 |
| HbA1c | % | 10.1 ± 2.6 | 10.1 ± 3 | 10.0 ± 2.5 | <0.01 |
| Maximum glycemia | mg/dL | 309 ± 98 | 341 ± 98 | 305 ± 98 | <0.01 |
| SBP | mmHg | 135 ± 19 | 124 ± 18 | 137 ± 19 | <0.01 |
| DBP | mmHg | 80 ± 12 | 77 ± 11 | 80 ± 12 | <0.01 |
| Heart rate | bpm | 83 ± 14 | 87 ± 14 | 83 ± 13 | ns |
| Laboratory data | | | | | |
| Hemoglobin | g/dL | 13.5 ± 1.9 | 13.6 ± 2.0 | 13.5 ± 1.8 | <0.01 |
| Platelet count | number/µL | 221,000 ± 87,000 | 234,800 ± 78,000 | 219,200 ± 88,000 | <0.01 |
| Creatinine | mg/dL | 1.0 ± 2.0 | 1.0 ± 1.8 | 1.0 ± 2.0 | <0.01 |
| Urea | mg/dL | 44 ± 25 | 35 ± 19 | 45 ± 26 | <0.01 |
| eGFR | mL/min/1.73 m² | 82 ± 25 | 100 ± 24 | 79 ± 25 | <0.01 |
| Uric acid | mg/dL | 5.1 ± 1.8 | 4.3 ± 1.7 | 7.8 ± 1.8 | <0.01 |
| Alkaline phosphatase | U/L | 80 ± 44 | 86 ± 53 | 79 ± 42 | <0.01 |
| GGT | U/L | 80 ± 181 | 56 ± 165 | 84 ± 183 | <0.01 |
| AST | U/L | 31 ± 27 | 27 ± 27 | 31 ± 27 | <0.01 |
| ALT | U/L | 36 ± 33 | 28 ± 27 | 37 ± 34 | <0.01 |
| Total cholesterol | mg/dL | 190 ± 61 | 190 ± 56 | 190 ± 62 | ns |
| HDL cholesterol | mg/dL | 41 ± 14 | 50 ± 18 | 40 ± 13 | <0.01 |
| LDL cholesterol | mg/dL | 111 ± 45 | 113 ± 42 | 111 ± 46 | <0.01 |
| Triglycerides | mg/dL | 201 ± 280 | 144 ± 140 | 210 ± 300 | <0.01 |
Values represent mean ± standard deviation; * p value represents the statistical significance of a 2-sided t-test, T1DM parameters vs. T2DM parameters. ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; DM, diabetes mellitus; DBP, diastolic blood pressure; eGFR, estimated glomerular filtration rate calculated using the Modification of Diet in Renal Disease (MDRD) equation; GGT, gamma-glutamyl transferase; HbA1c, glycated hemoglobin; HDL, high-density lipoprotein; LDL, low-density lipoprotein; ns, not significant; SBP, systolic blood pressure; T1DM, type 1 diabetes mellitus; T2DM, type 2 diabetes mellitus.
Table 3. Characteristics of the entire study population and separated according to outcome.
| Parameter | Unit | All (n = 1969) | Alive (n = 1560) | Deceased (n = 409) | p Value * |
|---|---|---|---|---|---|
| Anthropometric data | | | | | |
| Age | years | 59 ± 14 | 57 ± 14 | 67 ± 11 | <0.001 |
| Current weight | kg | 81 ± 20 | 81 ± 19 | 79 ± 21 | ns |
| Maximum weight | kg | 91 ± 21 | 91 ± 21 | 90 ± 22 | ns |
| Current BMI | kg/m² | 29 ± 7 | 29 ± 7 | 29 ± 7 | ns |
| Maximum BMI | kg/m² | 33 ± 7 | 33 ± 7 | 33 ± 7 | ns |
| Abdominal circumference | cm | 102 ± 16 | 102 ± 16 | 104 ± 17 | ns |
| Hip circumference | cm | 105 ± 13 | 105 ± 12 | 105 ± 14 | ns |
| DM characteristics | | | | | |
| Duration of DM | years | 10.0 ± 8.5 | 9.5 ± 8.3 | 11.4 ± 9.1 | <0.001 |
| Duration of insulin treatment | years | 4.4 ± 7.1 | 4.3 ± 7.1 | 4.3 ± 7.4 | ns |
| HbA1c | % | 10.1 ± 2.6 | 10.0 ± 2.5 | 10.2 ± 2.9 | ns |
| Maximum glycemia | mg/dL | 309 ± 98 | 302 ± 95 | 333 ± 105 | <0.001 |
| SBP | mmHg | 135 ± 19 | 135 ± 19 | 136 ± 21 | ns |
| DBP | mmHg | 80 ± 12 | 80 ± 11 | 78 ± 13 | <0.001 |
| Heart rate | bpm | 83 ± 14 | 84 ± 13 | 83 ± 15 | ns |
| Laboratory data | | | | | |
| Hemoglobin | g/dL | 13.5 ± 1.9 | 13.7 ± 1.7 | 12.7 ± 2.2 | <0.001 |
| Platelet count | number/µL | 221,000 ± 87,000 | 222,000 ± 75,000 | 218,000 ± 121,000 | ns |
| Creatinine | mg/dL | 1.0 ± 2.0 | 0.9 ± 0.4 | 1.4 ± 4.2 | <0.001 |
| Urea | mg/dL | 44 ± 25 | 40 ± 21 | 56 ± 35 | <0.001 |
| eGFR | mL/min/1.73 m² | 82 ± 25 | 85.5 ± 24.2 | 68.4 ± 27.6 | <0.001 |
| Uric acid | mg/dL | 5.1 ± 1.8 | 5.0 ± 1.9 | 5.5 ± 2.0 | <0.001 |
| Alkaline phosphatase | U/L | 80 ± 44 | 78 ± 39 | 87 ± 57 | <0.001 |
| GGT | U/L | 80 ± 181 | 72 ± 150 | 111 ± 268 | <0.001 |
| AST | U/L | 31 ± 27 | 29 ± 23 | 36 ± 38 | <0.001 |
| ALT | U/L | 36 ± 33 | 36 ± 30 | 38 ± 44 | ns |
| Total cholesterol | mg/dL | 190 ± 61 | 192 ± 60 | 180 ± 66 | <0.001 |
| HDL cholesterol | mg/dL | 41 ± 14 | 41 ± 13 | 38 ± 15 | <0.001 |
| LDL cholesterol | mg/dL | 111 ± 45 | 113 ± 45 | 103 ± 44 | <0.001 |
| Triglycerides | mg/dL | 201 ± 280 | 202 ± 290 | 199 ± 273 | ns |
Values represent mean ± standard deviation; * p value represents the statistical significance of a 2-sided t-test, alive vs. deceased. ALT, alanine aminotransferase; AST, aspartate aminotransferase; BMI, body mass index; DM, diabetes mellitus; DBP, diastolic blood pressure; eGFR, estimated glomerular filtration rate calculated using the Modification of Diet in Renal Disease (MDRD) equation; GGT, gamma-glutamyl transferase; HbA1c, glycated hemoglobin; HDL, high-density lipoprotein; LDL, low-density lipoprotein; ns, not significant; SBP, systolic blood pressure.
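The group comparisons in Table 3 can be approximately checked from the reported summary statistics alone. The sketch below is illustrative only and is not the authors' analysis code: it assumes a Welch (unequal-variance) form of the 2-sided t-test, which the article does not specify, and it approximates the p value with the normal distribution, which is reasonable only because both groups are large. The function names are hypothetical, and the inputs are the rounded means and standard deviations from the table, so the results can differ slightly from values computed on patient-level data.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    var1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    var2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(var1 + var2)
    df = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t, df


def two_sided_p_normal_approx(t):
    """Two-sided p value under a normal approximation (assumes large df)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Group sizes from Table 3
N_ALIVE, N_DECEASED = 1560, 409

# Age: alive 57 ± 14 vs. deceased 67 ± 11 (reported as p < 0.001)
t_age, df_age = welch_t_from_summary(57, 14, N_ALIVE, 67, 11, N_DECEASED)
p_age = two_sided_p_normal_approx(t_age)

# Current weight: alive 81 ± 19 vs. deceased 79 ± 21 (reported as ns)
t_weight, df_weight = welch_t_from_summary(81, 19, N_ALIVE, 79, 21, N_DECEASED)
p_weight = two_sided_p_normal_approx(t_weight)

print(f"age:    t = {t_age:.2f}, df = {df_age:.0f}, p = {p_age:.3g}")
print(f"weight: t = {t_weight:.2f}, df = {df_weight:.0f}, p = {p_weight:.3g}")
```

Under these assumptions, the age difference falls well below the 0.001 threshold and the weight difference stays above 0.05, in line with the table. For an exact t-distribution p value, a statistics library routine that accepts summary statistics would be the more appropriate tool.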

Share and Cite

MDPI and ACS Style

Mirea, O.; Oghli, M.G.; Neagoe, O.; Berceanu, M.; Țieranu, E.; Moraru, L.; Raicea, V.; Donoiu, I. All-Cause Mortality Prediction in Subjects with Diabetes Mellitus Using a Machine Learning Model and Shapley Values. Diabetology 2025, 6, 5. https://doi.org/10.3390/diabetology6010005