Using Machine Learning and Feature Importance to Identify Risk Factors for Mortality in Pediatric Heart Surgery
Abstract
1. Introduction
2. Material and Methods
2.1. Ethics Statement and Study Sample
2.2. Data Collection
2.3. Data Preprocessing and Feature Engineering
2.4. Machine Learning Experiments
2.5. Model Explainability Using SHAP
- Let $x_i$ be the $i$-th observation in the survival dataset.
- Let $f_j$ be the $j$-th feature in the dataset.
- Let $\phi_{i,j}$ be the SHAP value of the $j$-th feature of observation $x_i$.
- Let $m_k$ be the $k$-th repeated CV model.
- Let $t_1, \dots, t_T$ be the times to the event of interest.
- As defined by Krzyziński et al. [37], $\phi_t(x_i, f_j)$ is the SurvSHAP(t) value of the $j$-th feature of observation $x_i$ at time point $t$.
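
The notation above indexes SHAP values by observation, feature, and repeated CV model. As a minimal sketch of how such values are commonly condensed into one importance score per feature, the following Python code averages absolute SHAP values over all observations and models and then normalizes the result. This aggregation rule, the function names `mean_abs_shap` and `normalize`, and the toy numbers are assumptions made for illustration; they are not taken from the article.

```python
def mean_abs_shap(shap_values):
    """Average |SHAP| per feature over all models and observations.

    shap_values[k][i][j] is the SHAP value of feature j for observation i
    under the k-th repeated CV model (hypothetical layout).
    """
    n_models = len(shap_values)
    n_obs = len(shap_values[0])
    n_features = len(shap_values[0][0])
    totals = [0.0] * n_features
    for model_values in shap_values:
        for obs_values in model_values:
            for j, phi in enumerate(obs_values):
                totals[j] += abs(phi)
    return [total / (n_models * n_obs) for total in totals]


def normalize(values):
    """Rescale importances so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


# Toy data: 2 models, 2 observations, 2 features (made-up numbers).
toy = [
    [[0.2, -0.1], [-0.4, 0.1]],
    [[0.3, -0.1], [-0.3, 0.1]],
]
importance = mean_abs_shap(toy)
relative_importance = normalize(importance)
print(importance, relative_importance)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, which is why this form is often used for global rankings.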
2.6. Statistical Analysis
3. Results
3.1. Sample Characteristics
3.2. Feature Engineering and Feature Selection
3.3. Machine Learning Experiments
3.3.1. Feature Importance
3.3.2. Comparison with CPH
4. Discussion
4.1. Limitations
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lindinger, A.; Schwedler, G.; Hense, H.-W. Prevalence of congenital heart defects in newborns in Germany: Results of the first registration year of the PAN study (July 2006 to June 2007). Klin. Padiatr. 2010, 222, 321–326. [Google Scholar] [CrossRef]
- Sifrim, A.; Hitz, M.P.; Wilsdon, A.; Breckpot, J.; Turki, S.H.; Thienpont, B.; McRae, J.; Fitzgerald, T.W.; Singh, T.; Swaminathan, G.J.; et al. Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing. Nat. Genet. 2016, 48, 1060–1065. [Google Scholar] [CrossRef]
- Patel, S.S.; Burns, T.L. Nongenetic risk factors and congenital heart defects. Pediatr. Cardiol. 2013, 34, 1535–1555. [Google Scholar] [CrossRef]
- Dittrich, S.; Arenz, C.; Krogmann, O.; Tengler, A.; Meyer, R.; Bauer, U.; Hofbeck, M.; Beckmann, A.; Horke, A. German registry for cardiac operations and interventions in patients with congenital heart disease: Report 2021 and 9 years’ longitudinal observations on Fallot and coarctation patients. Thorac. Cardiovasc. Surg. 2022, 70, e21–e33. [Google Scholar] [CrossRef]
- Gilboa, S.M.; Salemi, J.L.; Nembhard, W.N.; Fixler, D.E.; Correa, A. Mortality resulting from congenital heart disease among children and adults in the United States, 1999 to 2006. Circulation 2010, 122, 2254–2263. [Google Scholar] [CrossRef]
- Marino, B.S.; Cassedy, A.; Drotar, D.; Wray, J. The Impact of Neurodevelopmental and Psychosocial Outcomes on Health-Related Quality of Life in Survivors of Congenital Heart Disease. J. Pediatr. 2016, 174, 11–22.e2. [Google Scholar] [CrossRef]
- Tsao, C.W.; Aday, A.W.; Almarzooq, Z.I.; Alonso, A.; Beaton, A.Z.; Bittencourt, M.S.; Boehme, A.K.; Buxton, A.E.; Carson, A.P.; Commodore-Mensah, Y.; et al. Heart Disease and Stroke Statistics—2022 Update: A Report From the American Heart Association. Circulation 2022, 145, e153–e639. [Google Scholar]
- Bertsimas, D.; Zhuo, D.; Dunn, J.; Levine, J.; Zuccarelli, E.; Smyrnakis, N.; Tobota, Z.; Maruszewski, B.; Fragata, J.; Sarris, G.E. Adverse Outcomes Prediction for Congenital Heart Surgery: A Machine Learning Approach. World J. Pediatr. Congenit. Heart Surg. 2021, 12, 453–460. [Google Scholar] [CrossRef]
- GBD 2017 Congenital Heart Disease Collaborators. Global, regional, and national burden of congenital heart disease, 1990–2017: A systematic analysis for the Global Burden of Disease Study 2017. Lancet Child Adolesc. Health 2020, 4, 185–200. [Google Scholar] [CrossRef]
- Jacobs, M.L.; O’Brien, S.M.; Jacobs, J.P.; Mavroudis, C.; Lacour-Gayet, F.; Pasquali, S.K.; Welke, K.; Pizarro, C.; Tsai, F.; Clarke, D.R. An empirically based tool for analyzing morbidity associated with operations for congenital heart disease. J. Thorac. Cardiovasc. Surg. 2013, 145, 1046–1057.e1. [Google Scholar] [CrossRef]
- Hickey, P.A.; Connor, J.A.; Cherian, K.M.; Jenkins, K.; Doherty, K.; Zhang, H.; Gaies, M.; Pasquali, S.; Tabbutt, S.; St. Louis, J.D.; et al. International quality improvement initiatives. Cardiol. Young 2017, 27, S61–S68. [Google Scholar] [CrossRef] [PubMed]
- Pace, N.D.; Oster, M.E.; Forestieri, N.E.; Enright, D.; Knight, J.; Meyer, R.E. Sociodemographic Factors and Survival of Infants with Congenital Heart Defects. Pediatrics 2018, 142, e20180302. [Google Scholar] [CrossRef]
- Fogel, A.L.; Kvedar, J.C. Artificial intelligence powers digital medicine. NPJ Digital Med. 2018, 1, 5. [Google Scholar] [CrossRef] [PubMed]
- Bruckert, S.; Finzel, B.; Schmid, U. The next generation of medical decision support: A roadmap toward transparent expert companions. Front. Artif. Intell. 2020, 3, 507973. [Google Scholar] [CrossRef] [PubMed]
- Holzinger, A.; Goebel, R.; Fong, R.; Moon, T.; Müller, K.-R.; Samek, W. (Eds.) xxAI—Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, 18 July 2020, Vienna, Austria, Revised and Extended Papers; Springer International Publishing: Cham, Switzerland, 2022; Volume 13200. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar] [CrossRef]
- Moncada-Torres, A.; van Maaren, M.C.; Hendriks, M.P.; Siesling, S.; Geleijnse, G. Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival. Sci. Rep. 2021, 11, 6968. [Google Scholar] [CrossRef]
- Lundberg, S.; Lee, S.-I. A unified approach to interpreting model predictions. arXiv 2017. [Google Scholar] [CrossRef]
- Du, X.; Wang, H.; Wang, S.; He, Y.; Zheng, J.; Zhang, H.; Hao, Z.; Chen, Y.; Xu, Z.; Lu, Z. Machine Learning Model for Predicting Risk of In-Hospital Mortality after Surgery in Congenital Heart Disease Patients. Rev. Cardiovasc. Med. 2022, 23, 376. [Google Scholar] [CrossRef]
- Semler, S.; Wissing, F.; Heyder, R. German medical informatics initiative: A national approach to integrating health data from patient care and medical research. Methods Inf. Med. 2018, 57, e50–e56. [Google Scholar] [CrossRef]
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024. [Google Scholar]
- Erikssen, G.; Liestøl, K.; Seem, E.; Birkeland, S.; Saatvedt, K.J.; Hoel, T.N.; Døhlen, G.; Skulstad, H.; Svennevig, J.L.; Thaulow, E.; et al. Achievements in Congenital Heart Defect Surgery. Circulation 2015, 131, 337–346. [Google Scholar] [CrossRef]
- Jacobs, J.P.; O’Brien, S.M.; Pasquali, S.K.; Kim, S.; Gaynor, J.W.; Tchervenkov, C.I.; Karamlou, T.; Welke, K.F.; Lacour-Gayet, F.; Mavroudis, C.; et al. The Importance of Patient-Specific Preoperative Factors: An Analysis of The Society of Thoracic Surgeons Congenital Heart Surgery Database. Ann. Thorac. Surg. 2014, 98, 1653–1659. [Google Scholar] [CrossRef]
- Van Buuren, S.; Groothuis-Oudshoorn, K. Mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 2011, 45, 1–67. [Google Scholar] [CrossRef]
- Wright, M.N.; Ziegler, A. Ranger: A fast implementation of random forests for high dimensional data in C++ and R. arXiv 2017, arXiv:1508.04409. [Google Scholar] [CrossRef]
- Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T.; et al. Xgboost: Extreme Gradient Boosting. R Package Version 1.7.8.1. 2022. Available online: https://CRAN.R-project.org/package=xgboost (accessed on 25 September 2024).
- Shortliffe, E.H.; Sepúlveda, M.J. Clinical decision support in the era of artificial intelligence. JAMA 2018, 320, 2199–2200. [Google Scholar] [CrossRef] [PubMed]
- James, G.; Witten, D.; Hastie, T.; Tibshirani, R. Tree-Based Methods. In An Introduction to Statistical Learning. Springer Texts in Statistics; Springer: New York, NY, USA, 2021; pp. 327–352. [Google Scholar] [CrossRef]
- Harrell, F.E.; Lee, K.L.; Mark, D.B. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 1996, 15, 361–387. [Google Scholar] [CrossRef]
- Wilson, S. ParBayesianOptimization: Parallel Bayesian Optimization of Hyperparameters. 2022. Available online: https://cran.r-project.org/web/packages/ParBayesianOptimization/index.html (accessed on 25 September 2024).
- Mayer, M. SplitTools: Tools for Data Splitting. 2022. Available online: https://cran.r-project.org/web/packages/splitTools/index.html (accessed on 25 September 2024).
- Therneau, T.M. A Package for Survival Analysis in R. 2022. Available online: https://cran.r-project.org/web/packages/survival/index.html (accessed on 25 September 2024).
- Shapley, L.S. A value for n-person games. Contrib. Theory Games 1953, 2, 307–317. [Google Scholar]
- Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef]
- Komisarczyk, K.; Kozminski, P.; Maksymiuk, S.; Biecek, P. Treeshap: Fast SHAP Values Computation for Tree Ensemble Models. arXiv 2023, arXiv:2109.09847. [Google Scholar] [CrossRef]
- James, G.; Witten, D.; Hastie, T.; Tibshirani, R. Resampling Methods. In An Introduction to Statistical Learning. Springer Texts in Statistics; Springer: New York, NY, USA, 2021; pp. 197–223. [Google Scholar] [CrossRef]
- Krzyziński, M.; Spytek, M.; Baniecki, H.; Biecek, P. SurvSHAP(t): Time-dependent explanations of machine learning survival models. Knowl.-Based Syst. 2023, 262, 110234. [Google Scholar] [CrossRef]
- Meis, J.; Baumann, L.; Pilz, M.; Sauer, L. DescrTab2: Publication Quality Descriptive Statistics Tables. 2022. Available online: https://cran.r-project.org/web/packages/DescrTab2/index.html (accessed on 25 September 2024).
- Mayer, M. Shapviz: SHAP Visualizations. 2022. Available online: https://cran.r-project.org/web/packages/shapviz/index.html (accessed on 25 September 2024).
- Spytek, M.; Krzyziński, M.; Baniecki, H.; Biecek, P. Survex: Explainable Machine Learning in Survival Analysis. 2022. Available online: https://cran.r-project.org/web/packages/survex/index.html (accessed on 25 September 2024).
- Jalali, A.; Lonsdale, H.; Do, N.; Peck, J.; Gupta, M.; Kutty, S.; Ghazarian, S.R.; Jacobs, J.P.; Rehman, M.; Ahumada, L.M. Deep Learning for Improved Risk Prediction in Surgical Outcomes. Sci. Rep. 2020, 10, 9289. [Google Scholar] [CrossRef]
- Volkova, A.; Ruggles, K.V. Predictive Metagenomic Analysis of Autoimmune Disease Identifies Robust Autoimmunity and Disease Specific Microbial Signatures. Front. Microbiol. 2021, 12, 621310. [Google Scholar] [CrossRef]
- Triedman, J.K.; Newburger, J.W. Trends in Congenital Heart Disease: The Next Decade. Circulation 2016, 133, 2716–2733. [Google Scholar] [CrossRef]
- Beckmann, A.; Dittrich, S.; Arenz, C.; Krogmann, O.; Horke, A.; Tengler, A.; Meyer, R.; Bauer, U.M.M.; Hofbeck, M.; German Quality Assurance/Competence Network for Congenital Heart Defects Investigators. German Registry for Cardiac Operations and Interventions in Patients with Congenital Heart Disease: Report 2020-Comprehensive Data from 6 Years of Experience. Thorac. Cardiovasc. Surg. 2021, 69, e21–e31. [Google Scholar] [CrossRef]
- O’Brien, S.M.; Jacobs, J.P.; Pasquali, S.K.; Gaynor, J.W.; Karamlou, T.; Welke, K.F.; Filardo, G.; Han, J.M.; Kim, S.; Shahian, D.M.; et al. The Society of Thoracic Surgeons Congenital Heart Surgery Database Mortality Risk Model: Part 1-Statistical Methodology. Ann. Thorac. Surg. 2015, 100, 1054–1062. [Google Scholar] [CrossRef]
- Kalfa, D.; Krishnamurthy, G.; Duchon, J.; Najjar, M.; Levasseur, S.; Chai, P.; Chen, J.; Quaegebeur, J.; Bacha, E. Outcomes of cardiac surgery in patients weighing <2.5 kg: Affect of patient-dependent and -independent variables. J. Thorac. Cardiovasc. Surg. 2014, 148, 2499–2506.e1. [Google Scholar]
- Kempny, A.; Dimopoulos, K.; Uebing, A.; Diller, G.-P.; Rosendahl, U.; Belitsis, G.; Gatzoulis, M.A.; Wort, S.J. Outcome of cardiac surgery in patients with congenital heart disease in England between 1997 and 2015. PLoS ONE 2017, 12, e0178963. [Google Scholar] [CrossRef]
- Gritti, M.N.; Farid, P.; Manlhiot, C.; Noone, D.; Sakha, S.; Ali, S.; Bernknopf, B.; McCrindle, B.W. Factors Associated with Acute Kidney Injury After Cardiopulmonary Bypass in Children. CJC Pediatr. Congenit. Heart Dis. 2023, 2, 20–29. [Google Scholar] [CrossRef]
- Li, S.; Krawczeski, C.D.; Zappitelli, M.; Devarajan, P.; Thiessen-Philbrook, H.; Coca, S.G.; Kim, R.W.; Parikh, C.R. Incidence, risk factors, and outcomes of acute kidney injury after pediatric cardiac surgery: A prospective multicenter study. Crit. Care Med. 2011, 39, 1493. [Google Scholar] [CrossRef]
- Zappitelli, M.; Bernier, P.-L.; Saczkowski, R.S.; Tchervenkov, C.I.; Gottesman, R.; Dancea, A.; Hyder, A.; Alkandari, O. A small post-operative rise in serum creatinine predicts acute kidney injury in children undergoing cardiac surgery. Kidney Int. 2009, 76, 885–892. [Google Scholar] [CrossRef]
- Brown, K.L.; Ridout, D.; Pagel, C.; Wray, J.; Anderson, D.; Barron, D.J.; Cassidy, J.; Davis, P.J.; Rodrigues, W.; Stoica, S.; et al. Incidence and risk factors for important early morbidities associated with pediatric cardiac surgery in a UK population. J. Thorac. Cardiovasc. Surg. 2019, 158, 1185–1196.e7. [Google Scholar] [CrossRef] [PubMed]
- Zürn, C.; Hübner, D.; Ziesenitz, V.C.; Höhn, R.; Schuler, L.; Schlange, T.; Gorenflo, M.; Kari, F.A.; Kroll, J.; Loukanov, T.; et al. Model-driven survival prediction after congenital heart surgery. Interdiscip. CardioVascular Thorac. Surg. 2023, 37, ivad089. [Google Scholar] [CrossRef] [PubMed]
- Agarwal, H.S.; Wolfram, K.B.; Saville, B.R.; Donahue, B.S.; Bichell, D.P. Postoperative complications and association with outcomes in pediatric cardiac surgery. J. Thorac. Cardiovasc. Surg. 2014, 148, 609–616.e1. [Google Scholar] [CrossRef] [PubMed]
- Boehne, M.; Sasse, M.; Karch, A.; Dziuba, F.; Horke, A.; Kaussen, T.; Mikolajczyk, R.; Beerbaum, P.; Jack, T. Systemic inflammatory response syndrome after pediatric congenital heart surgery: Incidence, risk factors, and clinical outcome. J. Card. Surg. 2017, 32, 116–125. [Google Scholar] [CrossRef]
- Soares, L.C.d.C.; Ribas, D.; Spring, R.; Silva, J.M.F.d.; Miyague, N.I. Clinical profile of systemic inflammatory response after pediatric cardiac surgery with cardiopulmonary bypass. Arq. Bras. Cardiol. 2010, 94, 127–133. [Google Scholar] [CrossRef]
- Güvener, M.; Korun, O.; Demirtürk, O.S. Risk Factors for Systemic Inflammatory Response After Congenital Cardiac Surgery. J. Card. Surg. 2015, 30, 92–96. [Google Scholar] [CrossRef]
- MacCallum, N.S.; Finney, S.J.; Gordon, S.E.; Quinlan, G.J.; Evans, T.W. Modified Criteria for the Systemic Inflammatory Response Syndrome Improves Their Utility Following Cardiac Surgery. Chest 2014, 145, 1197–1203. [Google Scholar] [CrossRef]
- Maglogiannis, I.; Iliadis, L.; Macintyre, J.; Cortez, P. (Eds.) Artificial Intelligence Applications and Innovations: 18th IFIP WG 12.5 International Conference, AIAI 2022, Hersonissos, Crete, Greece, 17–20 June 2022, Proceedings, Part I; Springer International Publishing: Cham, Switzerland, 2022; Volume 646. [Google Scholar]
- Liu, Y.; Liu, Y.; Liu, Z.; Liang, Y.; Meng, C.; Zhang, J.; Zheng, Y. Federated forest. IEEE Trans. Big Data 2022, 8, 843–854. [Google Scholar] [CrossRef]
- Hauschild, A.-C.; Lemanczyk, M.; Matschinske, J.; Frisch, T.; Zolotareva, O.; Holzinger, A.; Baumbach, J.; Heider, D. Federated random forests can improve local performance of predictive models for various healthcare applications. Bioinformatics 2022, 38, 2278–2286. [Google Scholar] [CrossRef]
- Leung, C.; Law, A.; Sima, O. Towards Privacy-Preserving Collaborative Gradient Boosted Decision Trees; UC Berkeley: Berkeley, CA, USA, 2019. [Google Scholar]
- Le, N.K.; Liu, Y.; Nguyen, Q.M.; Liu, Q.; Liu, F.; Cai, Q.; Hirche, S. FedXGBoost: Privacy-Preserving XGBoost for Federated Learning. arXiv 2021, arXiv:2106.10662. [Google Scholar] [CrossRef]
- Jones, K.; Ong, Y.J.; Zhou, Y.; Baracaldo, N. Federated XGBoost on Sample-Wise Non-IID Data. arXiv 2022, arXiv:2209.01340. [Google Scholar] [CrossRef]
- Andreux, M.; Manoel, A.; Menuet, R.; Saillard, C.; Simpson, C. Federated Survival Analysis with Discrete-Time Cox Models. arXiv 2020, arXiv:2006.08997. [Google Scholar] [CrossRef]
- Wang, X.; Zhang, H.G.; Xiong, X.; Hong, C.; Weber, G.M.; Brat, G.A.; Bonzel, C.-L.; Luo, Y.; Duan, R.; Palmer, N.P.; et al. SurvMaximin: Robust federated approach to transporting survival risk prediction models. J. Biomed. Inform. 2022, 134, 104176. [Google Scholar] [CrossRef]
- Rahimian, S.; Kerkouche, R.; Kurth, I.; Fritz, M. Practical challenges in differentially-private federated survival analysis of medical data. In Proceedings of the Conference on Health, Inference, and Learning (CHIL), Virtual, 7–8 April 2022; pp. 411–425. [Google Scholar]
- Archetti, A.; Matteucci, M. Federated Survival Forests. arXiv 2023, arXiv:2302.02807. [Google Scholar] [CrossRef]
- Ben Saad, S.; Brik, B.; Ksentini, A. A trust and explainable federated deep learning framework in zero touch B5G networks. In Proceedings of the GLOBECOM 2022—2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022; IEEE: Rio de Janeiro, Brazil, 2022; pp. 1037–1042. [Google Scholar] [CrossRef]
- Rahman, A.; Hossain, S.; Muhammad, G.; Kundu, D.; Debnath, T.; Rahman, M.; Khan, S.I.; Tiwari, P.; Band, S.S. Federated learning-based AI approaches in smart healthcare: Concepts, taxonomies, challenges and open issues. Cluster Comput. 2022, 26, 2271–2311. [Google Scholar] [CrossRef]
- Renda, A.; Ducange, P.; Marcelloni, F.; Sabella, D.; Filippou, M.C.; Nardini, G.; Stea, G.; Virdis, A.; Micheli, D.; Rapone, D.; et al. Federated learning of explainable AI models in 6G systems: Towards secure and automated vehicle networking. Information 2022, 13, 395. [Google Scholar] [CrossRef]
- Bárcena, J.L.C.; Daole, M.; Ducange, P.; Marcelloni, F.; Renda, A.; Ruffini, F.; Schiavo, A. Fed-XAI: Federated learning of explainable artificial intelligence models. In Proceedings of the XAI.it 2022: 3rd Italian Workshop on Explainable Artificial Intelligence, Udine, Italy, 28 November–2 December 2022. [Google Scholar]
- Rumesh, Y.; Senevirathna, T.; Porambage, P.; Liyanage, M.; Ylianttila, M. Comprehensive Analysis over Centralized and Federated Learning-Based Anomaly Detection in Networks with Explainable AI (XAI). 2023. Available online: https://cris.vtt.fi/en/publications/comprehensive-analysis-over-centralized-and-federated-learning-ba (accessed on 25 September 2024).
- Bogdanova, A.; Imakura, A.; Sakurai, T. DC-SHAP method for consistent explainability in privacy-preserving distributed machine learning. Hum.-Cent. Intell. Syst. 2023, 3, 197–210. [Google Scholar] [CrossRef]
| Category | Variable | UVHD I | UVHD II | BVHD Cmplx. | BVHD Smpl. | Total Cohort |
|---|---|---|---|---|---|---|
| Demographics | N (%) | 50 (3.84) | 111 (8.53) | 291 (22.35) | 850 (65.28) | 1302 (100.00) |
| | Median age at adm. (IQR) [days] | 0.50 (0 to 17.25) | 1097 (219 to 1332) | 20 (0 to 232.50) | 169 (112 to 504.75) | 159 (63 to 502.25) |
| | Median height (IQR) [cm] | 51.50 (49.25 to 53.75) | 92 (67.50 to 100) | 54 (50 to 67.50) | 65 (59 to 78) | 64 (55 to 78) |
| | Missing height (%) | 0 (0.00) | 0 (0.00) | 0 (0.00) | 1 (0.08) | 1 (0.08) |
| | Median weight (IQR) [kg] | 3.30 (2.96 to 3.88) | 13 (6.93 to 15) | 3.96 (3.29 to 6.93) | 69 (4.69 to 9) | 5.81 (3.96 to 9.10) |
| | Missing weight (%) | 0 (0.00) | 5 (0.38) | 3 (0.23) | 21 (1.61) | 29 (2.23) |
| | Sex: male (%) | 30 (60.00) | 73 (65.77) | 187 (64.26) | 441 (51.88) | 731 (56.14) |
| | Sex: female (%) | 20 (40.00) | 38 (34.23) | 104 (35.74) | 409 (48.12) | 571 (43.86) |
| Encounter-related | Median duration of stay (IQR) [days] | 25 (14 to 55.50) | 18 (9.50 to 32.50) | 15 (9 to 23.50) | 7 (6 to 11) | 9 (6 to 17) |
| | Median time between adm. and surgery (IQR) [days] | 6 (2.25 to 9.50) | 1 (1 to 1) | 3 (1 to 7) | 1 (1 to 1) | 1 (1 to 3) |
| | Median time from surgery to disch. (IQR) [days] | 17 (11 to 48.75) | 14 (8 to 30.50) | 10 (7 to 17) | 6 (5 to 8) | 7 (5 to 12) |
| Disease-related | Median no. of previous adm. (IQR) | 0 (0 to 0) | 2 (1 to 2) | 0 (0 to 1) | 0 (0 to 0) | 0 (0 to 0) |
| | Deceased status (%) | 31 (62.00) | 8 (7.21) | 28 (9.62) | 6 (0.71) | 73 (5.61) |
| | Malformations (%) | 16 (32.00) | 22 (19.82) | 39 (13.40) | 88 (10.35) | 165 (12.67) |
| | Chrom. alterations (%) | 3 (6.00) | 2 (1.80) | 26 (8.93) | 180 (21.18) | 211 (16.21) |
| | Pulm. hypertension (%) | 0 (0.00) | 2 (1.80) | 4 (1.37) | 9 (1.06) | 15 (1.15) |
| Category | Variables | | Censored | Deceased | Total | p |
|---|---|---|---|---|---|---|
| Demographics | Sex | m | 686 (56%) | 45 (62%) | 731 (56%) | 0.263 ¹ |
| | | f | 544 (44%) | 27 (38%) | 571 (44%) | |
| | Weight <2500 g | No | 1201 (98%) | 62 (86%) | 1263 (97%) | <0.001 ¹ |
| | | Yes | 29 (2%) | 10 (14%) | 39 (3%) | |
| Disease-related | Chrom. alterations | No | 1027 (83%) | 64 (89%) | 1091 (84%) | 0.227 ¹ |
| | | Yes | 203 (17%) | 8 (11%) | 211 (16%) | |
| | Disease group | BVHD cmplx. | 263 (21%) | 28 (39%) | 291 (22%) | <0.001 ¹ |
| | | BVHD smpl. | 844 (69%) | 6 (8%) | 850 (65%) | |
| | | UVHD I | 19 (2%) | 31 (43%) | 50 (4%) | |
| | | UVHD II | 104 (8%) | 7 (10%) | 111 (9%) | |
| | Heart disease history | No previous hospitalization | 956 (78%) | 56 (78%) | 1012 (78%) | 0.360 ¹ |
| | | BVHD smpl. | 52 (4%) | 0 (0%) | 52 (4%) | |
| | | UVHD II/UVHD III | 5 (0%) | 0 (0%) | 5 (0%) | |
| | | BVHD cmplx. | 123 (10%) | 9 (12%) | 132 (10%) | |
| | | UVHD Ib | 35 (3%) | 4 (6%) | 39 (3%) | |
| | | UVHD Ia | 59 (5%) | 3 (4%) | 62 (5%) | |
| | Malformations | No | 1092 (89%) | 45 (62%) | 1137 (87%) | <0.001 ¹ |
| | | Yes | 138 (11%) | 27 (38%) | 165 (13%) | |
| | Pulm. hypertension | No | 1215 (99%) | 72 (100%) | 1287 (99%) | 0.346 ¹ |
| | | Yes | 15 (1%) | 0 (0%) | 15 (1%) | |
| Encounter-related | Days between admission and surgery | mean ± sd | 3.5 ± 9.8 | 6.9 ± 7.5 | 3.7 ± 9.7 | <0.001 ² |
| | | min–max | 0–182 | 0–34 | 0–182 | |
| | Days until discharge after surgery | mean ± sd | 12 ± 16 | 27 ± 28 | 13 ± 18 | <0.001 ² |
| | | min–max | 1–130 | 0–122 | 0–130 | |
| | No. of previous admissions | mean ± sd | 0.32 ± 0.66 | 0.32 ± 0.69 | 0.32 ± 0.66 | 0.967 ² |
| | | min–max | 0–4 | 0–3 | 0–4 | |
| Laboratory analytes | C-reactive protein (maximum) | mean ± sd | 58 ± 44 | 68 ± 56 | 58 ± 45 | 0.326 ² |
| | | min–max | 1–356 | 0.1–259 | 0.1–356 | |
| | Leukocytes (minimum) | mean ± sd | 9.7 ± 3.5 | 6.9 ± 3.6 | 9.6 ± 3.6 | <0.001 ² |
| | | min–max | 1.5–26 | 1.6–16 | 1.5–26 | |
| | Serum creatinine (maximum) | mean ± sd | 0.44 ± 0.26 | 0.85 ± 0.39 | 0.47 ± 0.28 | <0.001 ² |
| | | min–max | 0.17–3.2 | 0.25–2.2 | 0.17–3.2 | |
| | Urea (maximum) | mean ± sd | 32 ± 17 | 48 ± 21 | 33 ± 17 | <0.001 ² |
| | | min–max | 5–152 | 12–107 | 5–152 | |
| Surgery-related | Age at surgery | mean ± sd | 422 ± 568 | 152 ± 344 | 407 ± 561 | <0.001 ² |
| | | min–max | 0–3466 | 0–1597 | 0–3466 | |
| | Aortic cross clamp time | mean ± sd | 72 ± 57 | 68 ± 61 | 72 ± 57 | 0.437 ² |
| | | min–max | 0–280 | 0–284 | 0–284 | |
| | Circulatory arrest during surgery | No | 1091 (89%) | 44 (61%) | 1135 (87%) | <0.001 ¹ |
| | | Yes | 139 (11%) | 28 (39%) | 167 (13%) | |
| | Heart lung machine during surgery | 0 min | 165 (13%) | 10 (14%) | 175 (13%) | 0.282 ¹ |
| | | ≥1 min–<90 min | 207 (17%) | 7 (10%) | 214 (16%) | |
| | | ≥90 min | 858 (70%) | 55 (76%) | 913 (70%) | |
| | Hypothermia during surgery | >32 °C | 308 (25%) | 19 (26%) | 327 (25%) | <0.001 ¹ |
| | | ≥28 °C–≤32 °C | 455 (37%) | 12 (17%) | 467 (36%) | |
| | | <28 °C | 467 (38%) | 41 (57%) | 508 (39%) | |
| | Open thorax | No | 1162 (94%) | 31 (43%) | 1193 (92%) | <0.001 ¹ |
| | | Yes | 68 (6%) | 41 (57%) | 109 (8%) | |
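
To make the categorical comparisons in the table above easier to check, here is a small worked example that recomputes the p-value for Sex from the four counts in that row. It assumes the value was obtained with Pearson's chi-squared test without continuity correction; the footnote explaining the test is not present in this copy, so that choice is a guess rather than the authors' stated method. The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 count table.

    Assumption: no continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: censored, deceased. Columns: male, female (counts from the Sex row).
sex_counts = [[686, 544], [45, 27]]
chi2, p = pearson_chi2_2x2(sex_counts)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under this assumption the result lands close to the reported 0.263, which suggests, but does not prove, that an uncorrected chi-squared test was used for the categorical rows.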
| Algorithm | Hyperparameter | Minimum | Maximum | Step Size | Bounds | Optimized Value |
|---|---|---|---|---|---|---|
| XGB | colsample_bytree | 0.5 | 0.8 | 0.3 | [0.3, 1] | 0.8 |
| | learning_rate | 0.01 | 0.11 | 0.05 | [0.001, 0.2] | 0.11 |
| | max_depth | 1 | 9 | 4 | [1, 40] | 5 |
| | min_child_weight | 1 | 9 | 4 | [0, 10] | 1 |
| | subsample | 0.5 | 0.8 | 0.3 | [0.3, 1] | 0.5 |
| RSF | max.depth | 1 | 9 | 4 | [1, 40] | 40 |
| | min.node.size | 1 | 9 | 4 | [1, 20] | 20 |
| | mtry | 2 | 6 | 2 | [2, 9] | 2 |
| | num.trees | 500 | 1000 | 500 | [100, 1000] | 100 |
| | sample.fraction | 0.5 | 0.8 | 0.3 | [0.3, 1] | 0.63 |
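
One plausible reading of the Minimum, Maximum, and Step Size columns is that they define an initial grid of candidate settings, with the Bounds column limiting the later optimization. The table itself does not say this, so the sketch below is an interpretation, not the authors' procedure. It expands the XGB rows into the grid they would imply; `grid_values` and `expand_grid` are hypothetical helper names.

```python
from itertools import product

# (minimum, maximum, step size) copied from the XGB rows of the table.
XGB_GRID_SPEC = {
    "colsample_bytree": (0.5, 0.8, 0.3),
    "learning_rate": (0.01, 0.11, 0.05),
    "max_depth": (1, 9, 4),
    "min_child_weight": (1, 9, 4),
    "subsample": (0.5, 0.8, 0.3),
}


def grid_values(minimum, maximum, step):
    """Evenly spaced values from minimum to maximum, both ends included."""
    n_steps = round((maximum - minimum) / step)
    # Rounding removes floating-point noise such as 0.060000000000000005.
    return [round(minimum + k * step, 10) for k in range(n_steps + 1)]


def expand_grid(spec):
    """Cartesian product of the per-parameter values, as a list of dicts."""
    names = list(spec)
    axes = [grid_values(*spec[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*axes)]


xgb_grid = expand_grid(XGB_GRID_SPEC)
print(grid_values(0.01, 0.11, 0.05))
print(len(xgb_grid))
```

Read this way, several optimized values (for example `num.trees` = 100 for RSF) fall outside the initial range but inside the bounds, which is consistent with a search that starts from the grid and then explores the wider interval.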
Top Features | XGB | RSF |
---|---|---|
Age at surgery | 0.12 | 0.18 |
Aortic cross clamp time | n/a | 0.19 |
Days between admission and surgery | n/a | 0.24 |
Disease group | 0.20 | 0.19 |
Open thorax | 0.13 | n/a |
Serum creatinine (maximum) | 0.38 | 0.31 |
Urea (maximum) | 0.15 | n/a |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kapsner, L.A.; Feißt, M.; Purbojo, A.; Prokosch, H.-U.; Ganslandt, T.; Dittrich, S.; Mang, J.M.; Wällisch, W. Using Machine Learning and Feature Importance to Identify Risk Factors for Mortality in Pediatric Heart Surgery. Diagnostics 2024, 14, 2587. https://doi.org/10.3390/diagnostics14222587