Article

Prediction of Hemorrhagic Transformation after Ischemic Stroke Using Machine Learning

1 Department of Convergence Software, Hallym University, Chuncheon 24252, Korea
2 Institute of New Frontier Research Team, Hallym University College of Medicine, Chuncheon 24252, Korea
3 Department of Neurology, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
4 Department of Otorhinolaryngology and Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
5 Department of Anesthesiology and Pain Medicine, Chuncheon Sacred Heart Hospital, Chuncheon 24253, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
J. Pers. Med. 2021, 11(9), 863; https://doi.org/10.3390/jpm11090863
Submission received: 1 August 2021 / Revised: 25 August 2021 / Accepted: 28 August 2021 / Published: 30 August 2021
(This article belongs to the Section Methodology, Drug and Device Discovery)

Abstract

Hemorrhagic transformation (HT) is one of the leading causes of poor prognosis after acute ischemic stroke (AIS). We compared the performance of several machine learning (ML) algorithms in predicting HT after AIS using only structured data. A total of 2028 patients with AIS who were admitted within seven days of symptom onset were included in this analysis. HT was defined based on the criteria of the European Co-operative Acute Stroke Study-II trial. The whole dataset was randomly divided into a training and a test dataset at a 7:3 ratio. Binary logistic regression, support vector machine, extreme gradient boosting, and artificial neural network (ANN) algorithms were used to predict the occurrence of HT after AIS. Five-fold cross validation and a grid search technique were used to optimize the hyperparameters of each ML model, and performance was measured by the area under the receiver operating characteristic (AUROC) curve. Among the included AIS patients, the mean age was 69.6 years and 1183 (58.3%) were male. HT was observed in 318 subjects (15.7%). There were no significant differences in the corresponding variables between the training and test datasets. Among all the ML algorithms, the ANN algorithm showed the best performance in predicting the occurrence of HT in our dataset (0.844). Feature scaling, including standardization and normalization, and the resampling strategy did not further improve the ANN’s performance. The ANN-based prediction of HT after AIS showed better performance than the conventional ML algorithms. Deep learning may be useful for predicting important outcomes from structured data.

1. Introduction

According to the Global Burden of Stroke in the World Health Organization’s 2016 report, stroke is the leading cause of death and disability worldwide [1], with the incidence of ischemic stroke exceeding that of hemorrhagic stroke [2]. Hemorrhagic transformation (HT) is one of the major potential complications after acute ischemic stroke (AIS). It is associated with the natural recanalization of occluded cerebral arteries, thrombolysis, or mechanical thrombectomy, and is a major barrier to antithrombotic treatment after AIS [3,4,5]. Therefore, predicting the occurrence of HT during treatment in these patients is an important issue for stroke practitioners [6,7]. However, previous studies reported relatively poor predictive power for HT, with a C-statistic of 0.70 [8].
Recently, machine learning (ML) or deep learning (DL) algorithms have been widely used in medical practice as a clinical decision support system [9,10]. In several studies, the usefulness of the ML strategy to predict the risk of HT following AIS was assessed [11,12,13,14,15]. Wang et al. reported that the neural network model showed the best performance (AUROC = 0.82) to predict symptomatic intracerebral hemorrhage (ICH) following thrombolysis in patients with AIS [11]. In another multicenter trial using the Observational Medical Outcomes Partnership Common Data model, the least absolute shrinkage and selection operator regression model showed an AUROC of 0.75 to predict HT [12]. Asadi et al. studied the usefulness of ML algorithms to predict poor outcomes in patients with AIS who received endovascular intervention [13]. They suggested that the support vector machine (SVM) successfully predicted poor outcomes, and that post-infarct ICH was an important factor in a poor prognosis. However, this study had a relatively small number of study participants (107 subjects) and, thus, the study result may require additional validation. Other studies reported a high accuracy rate (~84%) for predicting HT in their stroke cohort, with only the radiologic markers of an MRI used to perform the ML tasks [14,15]. In this regard, different ML algorithms were used to improve the prediction of HT after AIS. However, there are no studies showing high prediction performance using clinical variables in ML tasks.
HT can be divided into symptomatic and asymptomatic cases. Previous studies reported that not only symptomatic HT, but also asymptomatic HT can affect clinical outcomes after AIS [7,16]. There are cases where intracranial hemorrhage occurring after cerebral reperfusion could be asymptomatic [17], and, in these cases, it is difficult to determine when to begin antithrombotic treatment [18]. Therefore, we limited HT to a radiological definition rather than a clinical definition. We hypothesized that DL algorithms could better predict HT after AIS than conventional prediction models. Thus, we aimed to identify the important predictors of HT in several ML algorithms and to determine how to improve the prediction performance of the ML models used in this study.

2. Materials and Methods

2.1. Population and Study Design

This study is a cross-sectional, retrospective case-control study using a prospectively collected stroke database from a tertiary teaching hospital. In this registry, patients’ demographics, stroke mechanisms, and clinical, laboratory, and radiological results were collected by stroke practitioners and regularly audited by external researchers [19]. From January 2015 to December 2020, a total of 2555 patients admitted to this hospital were included in the registry. Among them, patients with diffusion-restricted lesions on brain MRI scans and relevant focal neurologic deficits were included in the analysis. We excluded patients admitted to the hospital more than seven days after stroke onset and those with missing clinical or laboratory variables (Figure 1). This study was approved by the Chuncheon Sacred Heart Hospital Institutional Review Board/Ethics Committee (IRB No. 2019-11-017). Written informed consent for the registry enrollment was provided by the participants or their guardians.

2.2. Data Information

Clinicodemographic variables, including age, sex, body mass index, and cardiovascular disease risk profile at hospital admission, were included in the ML model. Stroke-related information, including a history of taking antithrombotics, symptom onset to hospital arrival time, stroke subtype according to the Trial of ORG 10172 in Acute Stroke Treatment classification, and laboratory parameters at hospital admission, was also included. HT was originally classified as hemorrhagic infarct type 1 or 2, parenchymal hemorrhage type 1 or 2, or symptomatic ICH according to the European Co-operative Acute Stroke Study-II criteria [20]. We defined HT as the presence of any of these subtypes on follow-up brain CT or gradient-echo MRI 48 h after the initial evaluation of AIS.

2.3. Machine Learning Algorithm

As described earlier, we aimed to assess the classification performance (HT or no HT) of several ML algorithms using different optimization techniques. First, we randomly divided the whole dataset into training and test datasets at a 7:3 ratio, maintaining a similar proportion of HT in both. For the input variables, categorical variables were encoded using one-hot encoding. Continuous variables were used either as raw values (crude method) or after applying one of several scaling methods, including normalization, min-max scaling, standardization, and robust scaling (Figure 2a) [21,22]. Binary logistic regression (BLR), SVM, extreme gradient boosting (XGB), and artificial neural network (ANN) algorithms were used to predict HT in our dataset, and the performance of each algorithm was assessed. The ANN was composed of an input layer, four fully connected hidden layers, and one output layer (Figure 2b). In the training process, we used five-fold cross validation to reduce overfitting and the grid search technique to select the best combination of hyperparameters for each ML algorithm. Detailed information on the parameter settings is presented in Table S1. For each ML task, we extracted the variable importance of the input variables to identify which variables were important in predicting HT in the training dataset. We used the sklearn and keras Python packages for these ML processes, and model training was performed with the TensorFlow interface using NVIDIA GeForce GTX 1080 Ti graphics processing units.
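The splitting and tuning steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`stratified_split`, `k_folds`, `grid_search`), the random seed, and the `fit_and_score` callback are hypothetical, and the folds here are shuffled but not stratified. In practice the sklearn package mentioned above provides equivalent utilities.

```python
import itertools
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with a similar class ratio in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, val_idx) pairs; each index is validated exactly once."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]


def grid_search(param_grid, train_idx, fit_and_score, k=5):
    """Try every parameter combination and keep the best mean validation score.

    fit_and_score(params, fit_idx, val_idx) is a user-supplied (hypothetical)
    callback that trains on fit_idx and returns a score on val_idx.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, fit, val) for fit, val in k_folds(train_idx, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Cohort composition from the text: 318 HT cases among 2028 patients.
labels = [1] * 318 + [0] * 1710
train_idx, test_idx = stratified_split(labels)
```

With the cohort size reported in this study, this split yields 1420 training and 608 test patients, with HT in roughly 15.7% and 15.6% of them, respectively. Only the training indices are passed to `grid_search`, so the test set stays untouched until the final evaluation.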

2.4. Statistical Analysis

The baseline characteristics of the patients in the training and test datasets were compared using the Student’s t-test or the Mann–Whitney U test for continuous variables, and Pearson’s χ2-test for categorical variables, as appropriate. When we obtained the probability of HT from each ML classifier, patients with values of >0.5 were assigned positive HT status. The performance of each ML model was measured with the area under the receiver operating characteristic (AUROC) curve. All statistical analyses were performed with R version 3.6.1 (the R Foundation for Statistical Computing) and Python version 3.7.7 in the Anaconda environment.
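The two evaluation steps above, thresholding the predicted probability at 0.5 and summarizing performance by the AUROC, can be illustrated with a short sketch. It uses the rank-based interpretation of the AUROC (the probability that a randomly chosen HT case receives a higher score than a randomly chosen non-HT case). The function names and the toy scores are hypothetical and not taken from the study.

```python
def classify(probabilities, threshold=0.5):
    """Assign positive HT status (1) when the probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: two HT cases (label 1) and three non-HT cases (label 0).
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2, 0.1]
toy_predictions = classify(toy_scores)   # [1, 0, 1, 0, 0]
toy_auroc = auroc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

The AUROC depends only on how the scores rank the patients, whereas the predicted labels change with the chosen threshold.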

3. Results

A total of 2028 patients were included in the final ML tasks. The mean age was 69.6 years, and 58.3% of the patients were male. In the whole dataset, HT was observed in 318 patients (15.7%). The comparison of baseline characteristics between the training and test datasets is presented in Table 1. Stroke subtype and stroke severity were equally distributed in the training and test datasets. In addition, the proportions of patients who had been taking antithrombotics before the index stroke or who received thrombolysis for the index stroke were also equally distributed between the training and test datasets. Therefore, there was no significant difference in the input variables for the HT prediction model.
Table 2 shows the overall performance of HT prediction for each of the ML classifiers. The grid search-based ANN algorithm was the best classifier for predicting HT (accuracy = 87.8%, F1-score = 93.2%), followed by the SVM algorithm. In addition, we present the most important variables in each of the ML classifiers (Table 3). Although the variable importance rankings differed between ML algorithms, sex, age, prior antithrombotic usage, stroke severity, white blood cell count, stroke subtype, and fasting blood sugar were identified as important factors in the models’ classification.
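As a reference for reading the accuracy and F1-score above, the sketch below computes both metrics from predicted and true labels. It is an illustration under stated assumptions: the function names are hypothetical, the "always no HT" classifier is a made-up reference point rather than a model from this study, and the calculation uses the whole cohort (318 HT cases among 2028 patients) rather than the test dataset.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for the chosen positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is zero."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Cohort composition from the text: 318 HT cases among 2028 patients.
y_true = [1] * 318 + [0] * 1710
# Hypothetical reference classifier that always predicts "no HT".
always_negative = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_negative)        # 1710 / 2028
baseline_f1_ht = f1_score(y_true, always_negative, 1)        # no HT case detected
baseline_f1_no_ht = f1_score(y_true, always_negative, 0)     # 3420 / 3738
```

Under these assumptions the reference classifier reaches an accuracy of about 84.3%, with an F1-score of 0 when HT is the positive class and about 0.915 when "no HT" is the positive class. Both metrics therefore depend strongly on class balance and on which class is treated as positive.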
Figure 3 shows the performance of each ML classifier with five-fold cross validation and grid search hyperparameter optimization. The ANN algorithm was the best performing algorithm on the test dataset (AUROC = 0.842, Figure 3a,b). We additionally performed ANN modelling with scaling of the input variables, and there was no additional improvement in the model’s performance (Figure 3b). We also tested whether scaling of the input parameters or a resampling technique could improve the performance of the other ML algorithms. Implementing these techniques in our BLR, SVM, and XGB algorithms did not improve the models’ performance (Figures S1 and S2).

4. Discussion

In this comparison of HT prediction performance in AIS patients, we found that the ANN’s prediction performance was better than that of the other ML algorithms. In addition, the grid search hyperparameter optimization technique was useful for improving the performance of ML algorithms using structured numerical data, but the scaling strategy did not show any additional improvement.
For AIS patients who have completed emergent treatment, it is important to reduce the incidence of complications, such as pneumonia, deep vein thrombosis, or HT, which is known to be associated with a worse stroke prognosis. The incidence of HT ranges from 11.0% to 37.5% in patients with AIS, depending on the clinical setting [23,24,25,26,27,28]. Antithrombotic therapy to prevent additional ischemia immediately after the index stroke is associated with the development of HT or intracranial hemorrhage. On the other hand, the risk of stroke recurrence is high during the acute stage, and more than half of relapsed patients have a recurrence within 30 days of the index stroke [29]. Therefore, to minimize the impact of HT after a stroke, we should consider which variables have a causal relationship with HT development using conventional statistical models, and which ML models are effective at improving the prediction of subsequent HT development.
The ANN algorithm had several advantages compared to traditional ML algorithms. First, the ANN algorithm is quite robust to noise in the training data [30]. If the training data contains errors, they do not significantly affect the final result of the algorithm. Second, ANN is resilient to long duration training processes due to the considerable number of parameter weights and training examples [31]. Third, the higher the number of hidden layers stacked in the ANN algorithm, the more chance that vanishing gradient problems could develop. However, ANN algorithms with few hidden layers, utilizing structured numerical data, can overcome these problems [32]. There is no exact information about how many layers can be stacked to overcome the disadvantage of ANN algorithms falling into the local minima. However, in the case of using medical structured data, the performance of ANN algorithms is reported to be superior when using 3–5 hidden layers, as in this study [33,34]. In addition, the ANN algorithm can perform complex non-linear fitting of high dimensional data, and has well-developed architecture selection methods to prevent the overfitting of training models [35].
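To make the architecture from Section 2.3 concrete (an input layer, four fully connected hidden layers, and one output layer), the following plain-Python sketch runs a forward pass through such a network. The hidden layer widths, the ReLU and sigmoid activations, and the random weight initialization are assumptions chosen for illustration; the source reports its parameter settings in Table S1, which is not reproduced here.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def build_network(n_features, hidden_sizes=(32, 16, 8, 4), seed=0):
    """Create randomly initialized (weights, biases) pairs for every layer.

    hidden_sizes is a hypothetical choice; the final layer has one output unit.
    """
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, 1]
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def predict_probability(network, features):
    """Forward pass: ReLU after each hidden layer, sigmoid on the output unit."""
    activations = list(features)
    for weights, biases in network[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = network[-1]
    return sigmoid(dense(activations, weights, biases)[0])


def count_parameters(network):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


network = build_network(n_features=10)
probability = predict_probability(network, [0.5] * 10)
```

With 10 input features and the assumed widths, the network has 1057 trainable parameters, which is small relative to the roughly 1400 training patients in this study. The untrained output shown here is only a structural example, not a prediction.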
Wang et al. studied the usefulness of ML algorithms to predict symptomatic ICH after thrombolysis in 2237 patients with hyperacute ischemic stroke [11]. Of these ML models, the three-layered ANN model showed the best performance in predicting symptomatic ICH in this cohort. Their study was conducted on a different stroke population and, thus, a direct comparison of the algorithm’s performance may be difficult. However, the difference in the ANN model’s performance could be explained by the following reasons. First, Wang et al.’s study used imputation for missing parameters. Since missing values were replaced with representative values, such as mean or median values, during multiple imputation, the information carried by the variables may have been flattened during this process and, as a result, the performance of the ML classifier could be underestimated [36]. Second, the absence of stroke-related information, such as stroke subtype or laboratory parameters, in Wang et al.’s study might be associated with lower resolution for HT prediction. Indeed, stroke subtype classified with the Trial of Org 10172 in Acute Stroke Treatment classification is associated with the occurrence of HT, with cardioembolic stroke etiology having a causal relationship with HT development [37]. Third, we used five-fold cross validation and the grid search technique for hyperparameter optimization, which enabled us to obtain tuned performance for each ML algorithm by reducing the overfitting of training data.
In general, scaling methods such as standardization or normalization reduce the variability of the weights or errors across variables, thereby reducing learning failures caused by exploding gradients during neural network training and improving model performance [38]. However, in our study, the additional ANN models trained on scaled input variables did not outperform the crude ANN model. Other DL studies relating to stroke did not mention the effect of input scaling on DL performance [11,39]. Ahsan et al. examined the effect of scaling on performance in various ML methods [40] and concluded that it varied depending on the characteristics of the data. Our results therefore provide evidence that scaling the inputs of the ANN did not improve model performance on numerical data from stroke patients.
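The scaling methods named in this study can be written out in a few lines each. The sketch below is a plain-Python illustration, not the authors' code. It assumes that "normalization" refers to rescaling each patient's feature vector to unit Euclidean length, which the source does not specify, and it assumes non-constant columns so that no denominator is zero. The function names and the example values are hypothetical.

```python
import math
import statistics


def standardize(column):
    """Standardization: subtract the mean, divide by the population SD."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


def min_max_scale(column):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


def robust_scale(column):
    """Robust scaling: subtract the median, divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in column]


def l2_normalize(row):
    """Per-sample normalization to unit Euclidean length (assumed meaning)."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row]


# Hypothetical laboratory values with one extreme outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]
standardized = standardize(values)
min_maxed = min_max_scale(values)   # the four typical values are squeezed near 0
robust = robust_scale(values)       # [-1.0, -0.5, 0.0, 0.5, 48.5]
unit_row = l2_normalize([3.0, 4.0])
```

In this example the outlier compresses the other min-max values into the range 0 to about 0.03, while robust scaling keeps them spread around zero. This illustrates one way in which the effect of scaling can depend on the characteristics of the data.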
In our ML models, age, sex, stroke severity, stroke subtype, prior antithrombotic usage, white blood cell count, and fasting blood sugar were important variables in predicting HT (Table 3). Age and stroke severity are important prognostic markers for AIS [41], and were also identified as predictors of HT in other studies [42]. Andrade et al. summarized the important predictors of HT in clinical trials [43]. Considering that the important variables in our ML models match the variables presented in that report, we suggest that clinically important variables have a significant influence on the performance of the classifier, even in ML classification.
There are several limitations to our study. First, we only evaluated the numerical or categorical data related to the patients’ clinicodemographic factors and laboratory variables at admission. HT can be affected by a variety of post-stroke management treatments, such as blood pressure management and concurrent post-stroke antithrombotic medications [44,45]. Therefore, we could not evaluate the impact of post-stroke care on the development of HT. Second, we did not assess radiologic markers of HT. Apart from our research on HT, image-based DL studies using CT or MRI are being actively conducted. In future studies, we expect that an ensemble learning method that combines the patient’s clinical variables and imaging variables will further enhance the predictive power of the DL model.

5. Conclusions

The ANN algorithm was more effective at predicting HT in AIS patients than the conventional ML algorithms and showed the best performance for the prediction of HT in our dataset (0.844) without additional feature scaling. In later trials, an ensemble strategy using DL on both numerical data and unstructured imaging data could be useful to predict HT after AIS.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/jpm11090863/s1, Table S1: Grid search parameters in each machine learning classifier.; Figure S1: Result of the receiver operating characteristic curve of binary logistic regression (a) and support vector machine (b), extreme gradient boosting (c), and artificial neural network algorithm (d) before and after resampling strategy.; Figure S2: Result of the receiver operating characteristic curve of binary logistic regression (a) and support vector machine (b), and extreme gradient boosting (c) before and after different scaling methods.

Author Contributions

Conceptualization, C.K. and J.-M.C.; data curation, J.-M.C., S.-Y.S., S.-H.L. and P.-J.K.; formal analysis, C.K., S.-Y.S. and J.-M.C.; funding acquisition, C.K.; methodology, J.-M.C., S.-Y.S., Y.-S.K. and D.-K.K.; resources, C.K.; supervision, C.K., J.-M.C. and J.-J.L.; visualization, P.-J.K., S.-Y.S., J.-H.S. and J.-J.L.; writing—original draft, S.-Y.S., J.-M.C. and C.K.; writing—review and editing, C.K. and S.-Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Research Fund of Korea (NRF-2019R1G1A1097707), Hallym University Research Fund 2019 (HURF-2019-54), the Chong Ken Dang Co. (2019-11-07), by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HR21C0198), and carried out with the support of R&D Program for Forest Science Technology (project no. 2021397C10-2123-0107, 2021397B10-2123-0107) provided by the Korea Forest Service (Korea Forestry Promotion Institute). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of Chuncheon Sacred Heart Hospital (protocol code 2019-11-017 and date of approval: 2019-11-17).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the policy of our IRB.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Global Burden of Disease Stroke Expert Group. Global, regional, and country-specific lifetime risks of stroke, 1990 and 2016. N. Engl. J. Med. 2018, 379, 2429–2437.
  2. Krishnamurthi, R.V.; Barker-Collo, S.; Parag, V.; Parmar, P.; Witt, E.; Jones, A.; Mahon, S.; Anderson, C.S.; Barber, P.A.; Feigin, V.L. Stroke incidence by major pathological type and ischemic subtypes in the Auckland regional community stroke studies: Changes between 2002 and 2011. Stroke 2018, 49, 3–10.
  3. Álvarez-Sabín, J.; Maisterra, O.; Santamarina, E.; Kase, C.S. Factors influencing haemorrhagic transformation in ischaemic stroke. Lancet Neurol. 2013, 12, 689–705.
  4. Fagan, S.C.; Lapchak, P.A.; Liebeskind, D.S.; Ishrat, T.; Ergul, A. Recommendations for preclinical research in hemorrhagic transformation. Transl. Stroke Res. 2013, 4, 322–327.
  5. Yaghi, S.; Willey, J.Z.; Cucchiara, B.; Goldstein, J.N.; Gonzales, N.R.; Khatri, P.; Kim, L.J.; Mayer, S.A.; Sheth, K.N.; Schwamm, L.H. Treatment and outcome of hemorrhagic transformation after intravenous alteplase in acute ischemic stroke: A scientific statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke 2017, 48, e343–e361.
  6. Dzialowski, I.; Pexman, J.W.; Barber, P.A.; Demchuk, A.M.; Buchan, A.M.; Hill, M.D. Asymptomatic hemorrhage after thrombolysis may not be benign: Prognosis by hemorrhage type in the Canadian alteplase for stroke effectiveness study registry. Stroke 2007, 38, 75–79.
  7. Park, J.; Ko, Y.; Kim, W.-J.; Jang, M.; Yang, M.; Han, M.-K.; Oh, C.-W.; Park, S.; Lee, J.; Lee, J. Is asymptomatic hemorrhagic transformation really innocuous? Neurology 2012, 78, 421–426.
  8. Liu, J.; Wang, Y.; Jin, Y.; Guo, W.; Song, Q.; Wei, C.; Li, J.; Zhang, S.; Liu, M. Prediction of Hemorrhagic Transformation After Ischemic Stroke: Development and Validation Study of a Novel Multi-biomarker Model. Front. Aging Neurosci. 2021, 13, 257.
  9. Zihni, E.; Madai, V.I.; Livne, M.; Galinovic, I.; Khalil, A.A.; Fiebach, J.B.; Frey, D. Opening the black box of artificial intelligence for clinical decision support: A study predicting stroke outcome. PLoS ONE 2020, 15, e0231166.
  10. Strohm, L.; Hehakaya, C.; Ranschaert, E.R.; Boon, W.P.; Moors, E.H. Implementation of artificial intelligence (AI) applications in radiology: Hindering and facilitating factors. Eur. Radiol. 2020, 30, 5525–5532.
  11. Wang, F.; Huang, Y.; Xia, Y.; Zhang, W.; Fang, K.; Zhou, X.; Yu, X.; Cheng, X.; Li, G.; Wang, X. Personalized risk prediction of symptomatic intracerebral hemorrhage after stroke thrombolysis using a machine-learning model. Ther. Adv. Neurol. Disord. 2020, 13, 1756286420902358.
  12. Wang, Q.; Reps, J.M.; Kostka, K.F.; Ryan, P.B.; Zou, Y.; Voss, E.A.; Rijnbeek, P.R.; Chen, R.; Rao, G.A.; Morgan Stewart, H. Development and validation of a prognostic model predicting symptomatic hemorrhagic transformation in acute ischemic stroke at scale in the OHDSI network. PLoS ONE 2020, 15, e0226718.
  13. Asadi, H.; Dowling, R.; Yan, B.; Mitchell, P. Machine learning for outcome prediction of acute ischemic stroke post intra-arterial therapy. PLoS ONE 2014, 9, e88225.
  14. Scalzo, F.; Alger, J.R.; Hu, X.; Saver, J.L.; Dani, K.A.; Muir, K.W.; Demchuk, A.M.; Coutts, S.B.; Luby, M.; Warach, S. Multi-center prediction of hemorrhagic transformation in acute ischemic stroke using permeability imaging features. Magn. Reson. Imaging 2013, 31, 961–969.
  15. Yu, Y.; Guo, D.; Lou, M.; Liebeskind, D.; Scalzo, F. Prediction of hemorrhagic transformation severity in acute stroke from source perfusion MRI. IEEE Trans. Biomed. Eng. 2017, 65, 2058–2065.
  16. Kent, D.M.; Hinchey, J.; Price, L.L.; Levine, S.R.; Selker, H.P. In acute ischemic stroke, are asymptomatic intracranial hemorrhages clinically innocuous? Stroke 2004, 35, 1141–1146.
  17. Schlegel, D.J.; Tanne, D.; Demchuk, A.M.; Levine, S.R.; Kasner, S.E. Prediction of hospital disposition after thrombolysis for acute ischemic stroke using the National Institutes of Health Stroke Scale. Arch. Neurol. 2004, 61, 1061–1064.
  18. Kim, J.-T.; Heo, S.-H.; Park, M.-S.; Chang, J.; Choi, K.-H.; Cho, K.-H. Use of antithrombotics after hemorrhagic transformation in acute ischemic stroke. PLoS ONE 2014, 9, e89798.
  19. Kim, T.J.; Lee, J.S.; Oh, M.-S.; Kim, J.-W.; Yoon, J.S.; Lim, J.-S.; Lee, C.-H.; Mo, H.; Jeong, H.-Y.; Kim, Y. Predicting functional outcome based on linked data after acute ischemic stroke: S-SMART Score. Transl. Stroke Res. 2020, 11, 1296–1305.
  20. Hacke, W.; Kaste, M.; Fieschi, C.; Von Kummer, R.; Davalos, A.; Meier, D.; Larrue, V.; Bluhmki, E.; Davis, S.; Donnan, G. Randomised double-blind placebo-controlled trial of thrombolytic therapy with intravenous alteplase in acute ischaemic stroke (ECASS II). Lancet 1998, 352, 1245–1251.
  21. Patro, S.; Sahu, K.K. Normalization: A preprocessing stage. arXiv 2015, arXiv:1503.06462.
  22. Dhahri, H.; Al Maghayreh, E.; Mahmood, A.; Elkilani, W.; Faisal Nagi, M. Automated breast cancer diagnosis based on machine learning algorithms. J. Healthc. Eng. 2019, 2019, 4253641.
  23. Paciaroni, M.; Bandini, F.; Agnelli, G.; Tsivgoulis, G.; Yaghi, S.; Furie, K.L.; Tadi, P.; Becattini, C.; Zedde, M.; Abdul-Rahim, A.H. Hemorrhagic transformation in patients with acute ischemic stroke and atrial fibrillation: Time to initiation of oral anticoagulant therapy and outcomes. J. Am. Heart Assoc. 2018, 7, e010133.
  24. Pande, S.; Win, M.; Khine, A.; Zaw, E.; Manoharraj, N.; Lolong, L.; Tin, A. Haemorrhagic transformation following ischaemic stroke: A retrospective study. Sci. Rep. 2020, 10, 1–9.
  25. Jaillard, A.; Cornu, C.; Durieux, A.; Moulin, T.; Boutitie, F.; Lees, K.R.; Hommel, M. Hemorrhagic transformation in acute ischemic stroke: The MAST-E study. Stroke 1999, 30, 1326–1332.
  26. Sun, J.; Meng, D.; Liu, Z.; Hua, X.; Xu, Z.; Zhu, J.; Qian, Z.; Xu, X. Neutrophil to Lymphocyte Ratio Is a Therapeutic Biomarker for Spontaneous Hemorrhagic Transformation. Neurotox. Res. 2020, 38, 219–227.
  27. Suh, C.H.; Jung, S.C.; Cho, S.J.; Woo, D.-C.; Oh, W.Y.; Lee, J.G.; Kim, K.W. MRI for prediction of hemorrhagic transformation in acute ischemic stroke: A systematic review and meta-analysis. Acta Radiol. 2020, 61, 964–972.
  28. Bang, O.Y.; Buck, B.H.; Saver, J.L.; Alger, J.R.; Yoon, S.R.; Starkman, S.; Ovbiagele, B.; Kim, D.; Ali, L.K.; Sanossian, N. Prediction of hemorrhagic transformation after recanalization therapy using T2*-permeability magnetic resonance imaging. Ann. Neurol. 2007, 62, 170–176.
  29. Seiffge, D.J.; Paciaroni, M.; Wilson, D.; Koga, M.; Macha, K.; Cappellari, M.; Schaedelin, S.; Shakeshaft, C.; Takagi, M.; Tsivgoulis, G. Direct oral anticoagulants versus vitamin K antagonists after recent ischemic stroke in patients with atrial fibrillation. Ann. Neurol. 2019, 85, 823–834.
  30. Zhang, W.; Li, C.; Peng, G.; Chen, Y.; Zhang, Z. A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load. Mech. Syst. Signal Process. 2018, 100, 439–453.
  31. Qian, Z.; Wu, C.; Chen, H.; Chen, M. Diabetic Retinopathy Grading Using Attention based Convolution Neural Network. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 March 2021; pp. 2652–2655.
  32. Jha, D.; Gupta, V.; Ward, L.; Yang, Z.; Wolverton, C.; Foster, I.; Liao, W.-k.; Choudhary, A.; Agrawal, A. Enabling deeper learning on big data for materials informatics applications. Sci. Rep. 2021, 11, 1–12.
  33. Zhang, D.; Yin, C.; Zeng, J.; Yuan, X.; Zhang, P. Combining structured and unstructured data for predictive models: A deep learning approach. BMC Med. Inform. Decis. Mak. 2020, 20, 1–11.
  34. Holmgren, G.; Andersson, P.; Jakobsson, A.; Frigyesi, A. Artificial neural networks improve and simplify intensive care mortality prognostication: A national cohort study of 217,289 first-time intensive care unit admissions. J. Intensive Care 2019, 7, 1–8.
  35. Livingstone, D.J.; Manallack, D.T.; Tetko, I.V. Data modelling with neural networks: Advantages and limitations. J. Comput.-Aided Mol. Des. 1997, 11, 135–142.
  36. Li, P.; Stuart, E.A.; Allison, D.B. Multiple imputation: A flexible tool for handling missing data. JAMA 2015, 314, 1966–1967.
  37. Molina, C.A.; Montaner, J.; Abilleira, S.; Ibarra, B.; Romero, F.; Arenillas, J.F.; Alvarez-Sabín, J. Timing of spontaneous recanalization and risk of hemorrhagic transformation in acute cardioembolic stroke. Stroke 2001, 32, 1079–1084.
  38. Shanker, M.; Hu, M.Y.; Hung, M.S. Effect of data standardization on neural network training. Omega 1996, 24, 385–397.
  39. Chung, C.-C.; Chan, L.; Bamodu, O.A.; Hong, C.-T.; Chiu, H.-W. Artificial neural network based prediction of postthrombolysis intracerebral hemorrhage and death. Sci. Rep. 2020, 10, 1–10.
  40. Ahsan, M.M.; Mahmud, M.; Saha, P.K.; Gupta, K.D.; Siddique, Z. Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance. Technologies 2021, 9, 52.
  41. Saposnik, G.; Guzik, A.K.; Reeves, M.; Ovbiagele, B.; Johnston, S.C. Stroke prognostication using age and NIH stroke scale: SPAN-100. Neurology 2013, 80, 21–28.
  42. Ge, W.-Q.; Chen, J.; Pan, H.; Chen, F.; Zhou, C.-Y. Analysis of risk factors increased hemorrhagic transformation after acute ischemic stroke. J. Stroke Cerebrovasc. Dis. 2018, 27, 3587–3590.
  43. Andrade, J.B.C.d.; Mohr, J.P.; Lima, F.O.; Barros, L.C.M.; Nepomuceno, C.R.; Portela, L.B.; Silva, G.S. Predictors of hemorrhagic transformation after acute ischemic stroke based on the experts’ opinion. Arq. Neuro-Psiquiatr. 2020, 78, 390–396.
  44. Butcher, K.; Christensen, S.; Parsons, M.; De Silva, D.A.; Ebinger, M.; Levi, C.; Jeerakathil, T.; Campbell, B.C.; Barber, P.A.; Bladin, C. Postthrombolysis blood pressure elevation is associated with hemorrhagic transformation. Stroke 2010, 41, 72–77.
  45. Chen, Z.; Ding, Y.; Ji, X.; Yin, X.; Meng, R. Advance of antithrombotic treatment in patients with cerebral microbleed. J. Thromb. Thrombolysis 2021, 51, 530–535.
Figure 1. Flowchart of the study participants. DWI: diffusion weighted image; HT: hemorrhagic transformation.
Figure 2. Schematic representation of the machine learning model: (a) preprocessing of the categorical and continuous variables and (b) structure of the artificial neural network. BLR: binary logistic regression; SVM: support vector machine; XGB: extreme gradient boosting; ANN: artificial neural network; HTf: hemorrhagic transformation; and relu: rectified linear unit.
Figure 3. Receiver operating characteristic curves of (a) the binary logistic regression, extreme gradient boosting, and support vector machine algorithms and (b) the artificial neural network algorithm before and after input parameter scaling. BLR: binary logistic regression; XGB: extreme gradient boosting; SVM: support vector machine; ANN: artificial neural network; and AUROC: area under the receiver operating characteristic curve.
Table 1. Comparison of baseline characteristics between training and test datasets.
| Variables | Training (n = 1419) | Test (n = 609) | Whole Dataset (n = 2028) | p Value |
|---|---|---|---|---|
| Male | 815 (57.4%) | 368 (60.4%) | 1183 (58.3%) | 0.229 |
| Age, year | 69.7 ± 12.9 | 69.3 ± 12.4 | 69.6 ± 12.8 | 0.451 |
| Onset to arrival time, hours | 29.1 ± 44.5 | 32.2 ± 45.8 | 30.6 ± 48.2 | 0.183 |
| BMI, kg/m² | 24.1 ± 3.6 | 24.1 ± 3.4 | 24.1 ± 3.6 | 0.606 |
| Initial NIHSS, score | 5.1 ± 5.7 | 4.9 ± 5.6 | 5.0 ± 5.6 | 0.562 |
| Stroke subtype | | | | 0.313 |
| LAA | 491 (34.6%) | 222 (36.5%) | 713 (35.2%) | |
| SVO | 410 (28.9%) | 185 (30.4%) | 595 (29.3%) | |
| CE | 270 (19.0%) | 111 (18.2%) | 381 (18.8%) | |
| SOE | 51 (3.6%) | 12 (2.0%) | 63 (3.1%) | |
| SUE | 197 (13.9%) | 79 (13.0%) | 276 (13.6%) | |
| Past medical history | | | | |
| Prior stroke | 359 (25.3%) | 146 (24.0%) | 505 (24.9%) | 0.564 |
| Hypertension | 921 (64.9%) | 398 (65.4%) | 1319 (65.0%) | 0.834 |
| Diabetes | 250 (17.6%) | 118 (18.3%) | 368 (18.1%) | 0.167 |
| Dyslipidemia | 495 (34.9%) | 208 (34.2%) | 703 (34.7%) | 0.979 |
| Current smoking | 319 (22.5%) | 140 (23.0%) | 459 (22.6%) | 0.847 |
| Atrial fibrillation | 273 (19.2%) | 105 (17.2%) | 378 (18.6%) | 0.319 |
| Prior antithrombotics treatment | 529 (37.3%) | 222 (36.5%) | 751 (37.0%) | 0.762 |
| Thrombolysis | 188 (13.2%) | 76 (12.5%) | 264 (13.0%) | 0.689 |
| Laboratory parameter | | | | |
| WBC, 10³/μL | 7.8 ± 2.9 | 7.9 ± 3.0 | 7.8 ± 2.9 | 0.414 |
| Hemoglobin, g/dL | 13.6 ± 2.0 | 13.8 ± 1.8 | 13.7 ± 2.0 | 0.140 |
| Platelet count, 10³/μL | 233.6 ± 74.9 | 234.5 ± 80.9 | 233.9 ± 76.8 | 0.820 |
| Total cholesterol, mg/dL | 168.1 ± 63.7 | 168.2 ± 41.5 | 168.2 ± 57.9 | 0.994 |
| TG, mg/dL | 128.8 ± 85.5 | 133.1 ± 81.3 | 130.1 ± 84.3 | 0.288 |
| HDL, mg/dL | 45.7 ± 11.5 | 44.9 ± 10.6 | 45.5 ± 11.3 | 0.158 |
| LDL, mg/dL | 100.3 ± 35.4 | 102.4 ± 34.9 | 100.9 ± 35.2 | 0.225 |
| BUN, mg/dL | 17.7 ± 9.4 | 17.6 ± 9.3 | 17.7 ± 9.4 | 0.860 |
| Creatinine, mg/dL | 1.0 ± 0.8 | 1.0 ± 0.7 | 1.0 ± 0.7 | 0.956 |
| FBS, mg/dL | 126.7 ± 52.8 | 126.0 ± 49.0 | 126.5 ± 51.6 | 0.759 |
| A1c, % | 6.3 ± 1.4 | 6.3 ± 1.4 | 6.3 ± 1.4 | 0.848 |
| INR | 1.1 ± 0.4 | 1.0 ± 0.2 | 1.1 ± 0.3 | 0.235 |
| SBP, mmHg | 146.0 ± 26.5 | 145.6 ± 26.4 | 145.9 ± 26.5 | 0.768 |
| DBP, mmHg | 84.0 ± 13.9 | 83.9 ± 14.1 | 84.0 ± 13.9 | 0.522 |
| Hemorrhagic transformation | 221 (15.6%) | 97 (15.9%) | 318 (15.7%) | 0.893 |
Categorical variables are presented as number (percent), and continuous variables are presented as mean ± standard deviation. BMI: body mass index; NIHSS: National Institute of Health Stroke Scale; LAA: large artery atherosclerosis; SVO: small vessel occlusion; CE: cardioembolism; SOE: stroke of other determined etiology; SUE: stroke of undetermined etiology; WBC: white blood cell; TG: triglycerides; HDL: high-density lipoprotein; LDL: low-density lipoprotein; BUN: blood urea nitrogen; FBS: fasting blood sugar; A1c: glycated hemoglobin; INR: international normalized ratio; SBP: systolic blood pressure; and DBP: diastolic blood pressure.
Table 2. Results of several performance parameters of machine learning algorithms to predict hemorrhagic transformation in the test dataset.
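| Model | TP | FP | FN | TN | Total | Precision (%) | Recall (%) | Accuracy (%) | F1-Score (%) |
|---|---|---|---|---|---|---|---|---|---|
| BLR | 486 | 28 | 71 | 24 | 609 | 87.3 | 94.6 | 83.7 | 90.8 |
| SVM | 504 | 10 | 78 | 17 | 609 | 86.6 | 98.1 | 85.6 | 92.0 |
| XGB | 486 | 28 | 73 | 22 | 609 | 86.9 | 94.6 | 83.4 | 90.6 |
| ANN_crude | 506 | 17 | 57 | 29 | 609 | 89.9 | 96.7 | 87.8 | 93.2 |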
TP: true positive; FP: false positive; FN: false negative; TN: true negative; BLR: binary logistic regression; SVM: support vector machine; XGB: extreme gradient boosting; and ANN_crude: artificial neural network crude model.
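The derived columns of Table 2 can be recomputed from the four confusion-matrix counts. The following is a minimal Python sketch, not the authors' code: the function name `confusion_metrics` and the dictionary `TABLE_2_COUNTS` are illustrative assumptions, and the counts are copied in the column order printed in the table (TP, FP, FN, TN). Under the standard definitions used here (precision = TP/(TP + FP), recall = TP/(TP + FN)), the accuracy and F1-score columns are reproduced, whereas the two ratios come out in the opposite order from the printed Precision and Recall columns (for BLR, 94.6 and 87.3), so the FP/FN or the Precision/Recall labels may be transposed in the published table.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics, in percent, from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "precision": round(100 * tp / (tp + fp), 1),        # TP / (TP + FP)
        "recall": round(100 * tp / (tp + fn), 1),           # TP / (TP + FN)
        "accuracy": round(100 * (tp + tn) / total, 1),      # (TP + TN) / all
        "f1": round(100 * 2 * tp / (2 * tp + fp + fn), 1),  # harmonic mean of the two ratios
    }


# Counts copied from Table 2 in the printed column order (TP, FP, FN, TN).
# The dictionary name is an illustrative assumption, not from the paper.
TABLE_2_COUNTS = {
    "BLR": (486, 28, 71, 24),
    "SVM": (504, 10, 78, 17),
    "XGB": (486, 28, 73, 22),
    "ANN_crude": (506, 17, 57, 29),
}

if __name__ == "__main__":
    for model, counts in TABLE_2_COUNTS.items():
        print(model, confusion_metrics(*counts))
```

Accuracy and F1-score are symmetric in FP and FN, which is why they agree with the table regardless of how those two columns are labelled.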
Table 3. Most important input variables for predicting hemorrhagic transformation after acute ischemic stroke in each machine learning classifier.
NoVariableBLRSVMXGBANN
1Age3rd7th1st
2Male1st5th8th
3Onset to arrival time
4BMI
5NIHSS 1st3rd1st
6Previous mRS7th
7TOAST_1
8TOAST_2 2nd
9TOAST_3 2nd 5th
10TOAST_48th9th2nd
11TOAST_5 8th
12Previous stroke10th
13Hypertension
14Diabetes4th
15Dyslipidemia 6th9th
16Current smoking
17Atrial fibrillation 7th
18Prior antithrombotic usage2nd 4th
19Thrombolysis9th10th
20WBC 3rd 6th
21Hemoglobin 5th10th
22Platelet count 8th 9th
23Total cholesterol
24Triglycerides
25High density lipoprotein 6th
26Low density lipoprotein5th
27Blood urea nitrogen 4th
28Creatinine
29Fasting blood sugar 3rd
30Glycated hemoglobin 7th
31INR
32BPsys 10th4th
33BPdia6th
BLR: binary logistic regression; SVM: support vector machine; XGB: extreme gradient boosting; ANN: artificial neural network; BMI: body mass index; NIHSS: National Institute of Health Stroke Scale; mRS: modified Rankin Scale; TOAST: Trial of ORG 10172 in Acute Stroke Treatment; WBC: white blood cell; INR: international normalized ratio; BPsys: systolic blood pressure; and BPdia: diastolic blood pressure.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Choi, J.-M.; Seo, S.-Y.; Kim, P.-J.; Kim, Y.-S.; Lee, S.-H.; Sohn, J.-H.; Kim, D.-K.; Lee, J.-J.; Kim, C. Prediction of Hemorrhagic Transformation after Ischemic Stroke Using Machine Learning. J. Pers. Med. 2021, 11, 863. https://doi.org/10.3390/jpm11090863
