Machine-Learning Algorithm for Predicting Fatty Liver Disease in a Taiwanese Population
Abstract
1. Introduction
2. Materials and Methods
2.1. Patients and Data Preparation
2.2. Diagnosis of Fatty Liver Disease (FLD)
2.3. Machine-Learning Model Construction and Validation
2.4. Performance Metrics
2.5. Statistical Analysis
3. Results
3.1. Characteristics of the Participant Population
3.2. Results of Different Model Performance Metrics
3.3. Comparison of the Performance of Machine-Learning Models and the Fatty Liver Index
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Li, J.; Zou, B.; Yeo, Y.H.; Feng, Y.; Xie, X.; Lee, D.H.; Fujii, H.; Wu, Y.; Kam, L.Y.; Ji, F.; et al. Prevalence, incidence, and outcome of non-alcoholic fatty liver disease in Asia, 1999–2019: A systematic review and meta-analysis. Lancet Gastroenterol. Hepatol. 2019, 4, 389–398.
- McGowan, C.E.; Jones, P.; Long, M.D.; Barritt, A.S.T. Changing shape of disease: Nonalcoholic fatty liver disease in Crohn’s disease—A case series and review of the literature. Inflamm. Bowel Dis. 2012, 18, 49–54.
- Arieira, C.; Monteiro, S.; Xavier, S.; Dias de Castro, F.; Magalhaes, J.; Moreira, M.J.; Marinho, C.; Cotter, J. Hepatic steatosis and patients with inflammatory bowel disease: When transient elastography makes the difference. Eur. J. Gastroenterol. Hepatol. 2019, 31, 998–1003.
- Lin, Y.J.; Lin, C.H.; Wang, S.T.; Lin, S.Y.; Chang, S.S. Noninvasive and Convenient Screening of Metabolic Syndrome Using the Controlled Attenuation Parameter Technology: An Evaluation Based on Self-Paid Health Examination Participants. J. Clin. Med. 2019, 8, 1775.
- Saroli Palumbo, C.; Restellini, S.; Chao, C.Y.; Aruljothy, A.; Lemieux, C.; Wild, G.; Afif, W.; Lakatos, P.L.; Bitton, A.; Cocciolillo, S.; et al. Screening for Nonalcoholic Fatty Liver Disease in Inflammatory Bowel Diseases: A Cohort Study Using Transient Elastography. Inflamm. Bowel Dis. 2019, 25, 124–133.
- Yen, Y.H.; Kuo, F.Y.; Lin, C.C.; Chen, C.L.; Chang, K.C.; Tsai, M.C.; Hu, T.H. Predicting Hepatic Steatosis in Living Liver Donors via Controlled Attenuation Parameter. Transpl. Proc. 2018, 50, 3533–3538.
- Han, B.; Lee, G.B.; Yim, S.Y.; Cho, K.H.; Shin, K.E.; Kim, J.H.; Park, Y.G.; Han, K.D.; Kim, Y.H. Non-Alcoholic Fatty Liver Disease Defined by Fatty Liver Index and Incidence of Heart Failure in the Korean Population: A Nationwide Cohort Study. Diagnostics 2022, 12, 663.
- Bedogni, G.; Bellentani, S.; Miglioli, L.; Masutti, F.; Passalacqua, M.; Castiglione, A.; Tiribelli, C. The Fatty Liver Index: A simple and accurate predictor of hepatic steatosis in the general population. BMC Gastroenterol. 2006, 6, 33.
- Ji, W.; Xue, M.; Zhang, Y.; Yao, H.; Wang, Y. A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population. Front. Public Health 2022, 10, 846118.
- Noureddin, M.; Ntanios, F.; Malhotra, D.; Hoover, K.; Emir, B.; McLeod, E.; Alkhouri, N. Predicting NAFLD prevalence in the United States using National Health and Nutrition Examination Survey 2017–2018 transient elastography data and application of machine learning. Hepatol. Commun. 2022.
- Ghandian, S.; Thapa, R.; Garikipati, A.; Barnes, G.; Green-Saxena, A.; Calvert, J.; Mao, Q.; Das, R. Machine learning to predict progression of non-alcoholic fatty liver to non-alcoholic steatohepatitis or fibrosis. JGH Open 2022, 6, 196–204.
- Shafiha, R.; Bahcivanci, B.; Gkoutos, G.V.; Acharjee, A. Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers. Biomedicines 2021, 9, 1636.
- Atsawarungruangkit, A.; Laoveeravat, P.; Promrat, K. Machine learning models for predicting non-alcoholic fatty liver disease in the general United States population: NHANES database. World J. Hepatol. 2021, 13, 1417–1427.
- Liu, Y.X.; Liu, X.; Cen, C.; Li, X.; Liu, J.M.; Ming, Z.Y.; Yu, S.F.; Tang, X.F.; Zhou, L.; Yu, J.; et al. Comparison and development of advanced machine learning tools to predict nonalcoholic fatty liver disease: An extended study. Hepatobiliary Pancreat. Dis. Int. 2021, 20, 409–415.
- Wu, C.C.; Yeh, W.C.; Hsu, W.D.; Islam, M.M.; Nguyen, P.A.A.; Poly, T.N.; Wang, Y.C.; Yang, H.C.; Li, Y.C.J. Prediction of fatty liver disease using machine learning algorithms. Comput. Methods Programs Biomed. 2019, 170, 23–29.
- Chang, Y.Y.; Li, P.C.; Chang, R.F.; Chang, Y.Y.; Huang, S.P.; Chen, Y.Y.; Chang, W.Y.; Yen, H.H. Development and validation of a deep learning-based algorithm for colonoscopy quality assessment. Surg. Endosc. 2022.
- Demšar, J.; Curk, T.; Erjavec, A.; Gorup, Č.; Hočevar, T.; Milutinovič, M.; Možina, M.; Polajnar, M.; Toplak, M.; Starič, A.; et al. Orange: Data mining toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353.
- Yen, H.H.; Su, P.Y.; Zeng, Y.H.; Liu, I.L.; Huang, S.P.; Hsu, Y.C.; Chen, Y.Y.; Yang, C.W.; Wu, S.S.; Chou, K.C. Glecaprevir-pibrentasvir for chronic hepatitis C: Comparing treatment effect in patients with and without end-stage renal disease in a real-world setting. PLoS ONE 2020, 15, e0237582.
- Yen, H.-H.; Su, P.-Y.; Liu, I.-L.; Zeng, Y.-Y.; Huang, S.-P.; Hsu, Y.-C.; Yang, C.-W.; Chen, Y.-Y. Direct-acting antiviral treatment for Hepatitis C Virus in geriatric patients: A real-world retrospective comparison between early and late elderly patients. PeerJ 2021, 9, e10944.
- Yen, H.H.; Su, P.Y.; Liu, I.I.; Zeng, Y.H.; Huang, S.P.; Hsu, Y.C.; Hsu, P.K.; Chen, Y.Y. Retrieval of lost patients in the system for hepatitis C microelimination: A single-center retrospective study. BMC Gastroenterol. 2021, 21, 209.
- Zhou, F.; Zhou, J.; Wang, W.; Zhang, X.J.; Ji, Y.X.; Zhang, P.; She, Z.G.; Zhu, L.; Cai, J.; Li, H. Unexpected Rapid Increase in the Burden of NAFLD in China From 2008 to 2018: A Systematic Review and Meta-Analysis. Hepatology 2019, 70, 1119–1133.
- Chalasani, N.; Younossi, Z.; Lavine, J.E.; Charlton, M.; Cusi, K.; Rinella, M.; Harrison, S.A.; Brunt, E.M.; Sanyal, A.J. The diagnosis and management of nonalcoholic fatty liver disease: Practice guidance from the American Association for the Study of Liver Diseases. Hepatology 2018, 67, 328–357.
- Cusi, K.; Isaacs, S.; Barb, D.; Basu, R.; Caprio, S.; Garvey, W.T.; Kashyap, S.; Mechanick, J.I.; Mouzaki, M.; Nadolsky, K.; et al. American Association of Clinical Endocrinology Clinical Practice Guideline for the Diagnosis and Management of Nonalcoholic Fatty Liver Disease in Primary Care and Endocrinology Clinical Settings: Co-Sponsored by the American Association for the Study of Liver Diseases (AASLD). Endocr. Pract. 2022, 28, 528–562.
- Procino, F.; Misciagna, G.; Veronese, N.; Caruso, M.G.; Chiloiro, M.; Cisternino, A.M.; Notarnicola, M.; Bonfiglio, C.; Bruno, I.; Buongiorno, C.; et al. Reducing NAFLD-screening time: A comparative study of eight diagnostic methods offering an alternative to ultrasound scans. Liver Int. 2019, 39, 187–196.
- Amal, S.; Safarnejad, L.; Omiye, J.A.; Ghanzouri, I.; Cabot, J.H.; Ross, E.G. Use of Multi-Modal Data and Machine Learning to Improve Cardiovascular Disease Care. Front. Cardiovasc. Med. 2022, 9, 840262.
- Yen, H.H.; Wu, P.Y.; Chen, M.F.; Lin, W.C.; Tsai, C.L.; Lin, K.P. Current Status and Future Perspective of Artificial Intelligence in the Management of Peptic Ulcer Bleeding: A Review of Recent Literature. J. Clin. Med. 2021, 10, 3527.
- Yen, H.-H.; Wu, P.-Y.; Su, P.-Y.; Yang, C.-W.; Chen, Y.-Y.; Chen, M.-F.; Lin, W.-C.; Tsai, C.-L.; Lin, K.-P. Performance Comparison of the Deep Learning and the Human Endoscopist for Bleeding Peptic Ulcer Disease. J. Med. Biol. Eng. 2021, 41, 504–513.
- Dundar, T.T.; Yurtsever, I.; Pehlivanoglu, M.K.; Yildiz, U.; Eker, A.; Demir, M.A.; Mutluer, A.S.; Tektaş, R.; Kazan, M.S.; Kitis, S.; et al. Machine Learning-Based Surgical Planning for Neurosurgery: Artificial Intelligent Approaches to the Cranium. Front. Surg. 2022, 9, 863633.
- Sakatani, K.; Oyama, K.; Hu, L.; Warisawa, S. Estimation of Human Cerebral Atrophy Based on Systemic Metabolic Status Using Machine Learning. Front. Neurol. 2022, 13, 869915.
- Azizi, Z.; Shiba, Y.; Alipour, P.; Maleki, F.; Raparelli, V.; Norris, C.; Forghani, R.; Pilote, L.; El Emam, K. Importance of sex and gender factors for COVID-19 infection and hospitalisation: A sex-stratified analysis using machine learning in UK Biobank data. BMJ Open 2022, 12, e050450.
- Ma, H.; Xu, C.-F.; Shen, Z.; Yu, C.-H.; Li, Y.-M. Application of machine learning techniques for clinical predictive modeling: A cross-sectional study on nonalcoholic fatty liver disease in China. BioMed Res. Int. 2018, 2018, 4304376.
- Pei, X.; Deng, Q.; Liu, Z.; Yan, X.; Sun, W. Machine Learning Algorithms for Predicting Fatty Liver Disease. Ann. Nutr. Metab. 2021, 77, 38–45.
- Zhao, M.; Song, C.; Luo, T.; Huang, T.; Lin, S. Fatty Liver Disease Prediction Model Based on Big Data of Electronic Physical Examination Records. Front. Public Health 2021, 9, 668351.
Variable | No Fatty Liver (n = 23,625) | Fatty Liver Disease (n = 8305) | p-Value |
---|---|---|---|
Categorical variable | N (%) | N (%) | |
Male sex | 13,484 (57.1%) | 6293 (75.8%) | <0.0001 |
Continuous variables | Mean ± SD | Mean ± SD | |
Age (years) | 48.63 ± 10.92 | 50.48 ± 9.93 | <0.0001 |
Weight (kg) | 63.28 ± 10.6 | 75.011 ± 12.13 | <0.0001 |
Height (cm) | 164.7 ± 8.02 | 166.34 ± 7.95 | <0.0001 |
BMI (kg/m²) | 23.244 ± 2.91 | 27.044 ± 3.47 | <0.0001 |
Waist (cm) | 63.27 ± 10.6 | 75.01 ± 12.13 | <0.0001 |
SBP (mmHg) | 121.11 ± 16.14 | 130.17 ± 15.77 | <0.0001 |
DBP (mmHg) | 77.25 ± 10.42 | 83.45 ± 10.62 | <0.0001 |
ALT (IU/L) | 23.31 ± 20.42 | 39.64 ± 25.71 | <0.0001 |
AST (IU/L) | 23.86 ± 19.344 | 30.39 ± 16.33 | <0.0001 |
Cr (mg/dL) | 0.811 ± 0.23 | 0.86 ± 0.23 | <0.0001 |
Sugar (mg/dL) | 93.88 ± 16.01 | 104.44 ± 25.56 | <0.0001 |
T-Cho (mg/dL) | 191.666 ± 34.5 | 197.31 ± 36.39 | <0.0001 |
HDL (mg/dL) | 52.588 ± 13.53 | 43.07 ± 9.39 | <0.0001 |
LDL (mg/dL) | 118.4 ± 30.38 | 124.08 ± 32.47 | <0.0001 |
TG (mg/dL) | 98.52 ± 69.62 | 162.64 ± 110.33 | <0.0001 |
r-GT (U/L) | 21.91 ± 24.66 | 35.08 ± 36.76 | <0.0001 |
WBC (×10⁹/L) | 5.4 ± 1.45 | 6.18 ± 1.56 | <0.0001 |
Hb (g/dL) | 13.99 ± 1.53 | 14.7 ± 1.31 | <0.0001 |
MCH (pg) | 30.11 ± 2.98 | 30.17 ± 2.71 | 0.1025 |
MCHC (g/dL) | 33.455 ± 0.95 | 33.61 ± 0.94 | <0.0001 |
MCV (fL) | 41.8 ± 4.21 | 43.71 ± 3.63 | <0.0001 |
RBC-RDW (%) | 13.53 ± 1.27 | 13.39 ± 1.04 | <0.0001 |
RBC Count (10⁶/μL) | 4.67 ± 0.52 | 4.9 ± 0.51 | <0.0001 |
RBC Volume (fL) | 89.89 ± 7.53 | 89.67 ± 6.81 | 0.0166 |
Platelet (10³/μL) | 222.64 ± 53.55 | 229.82 ± 52.68 | <0.0001 |
FIB-4 | 1.21 ± 0.64 | 1.17 ± 0.56 | <0.0001 |
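The table above includes the four inputs of the Fatty Liver Index (TG, BMI, r-GT, and waist circumference), the score that the machine-learning models are compared against. A minimal sketch of the index follows, using the coefficients published by Bedogni et al. (listed in the references). The function name and the example patient values are hypothetical and are not taken from the study data.

```python
import math


def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, waist_cm):
    """Fatty Liver Index on a 0-100 scale (Bedogni et al., 2006).

    Inputs: triglycerides in mg/dL, BMI in kg/m^2, gamma-GT in U/L,
    waist circumference in cm.
    """
    # Linear predictor of the published logistic model.
    z = (
        0.953 * math.log(tg_mg_dl)
        + 0.139 * bmi
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform, scaled to 0-100.
    return 100.0 / (1.0 + math.exp(-z))


# Hypothetical patient, for illustration only.
example_fli = fatty_liver_index(tg_mg_dl=150, bmi=27, ggt_u_l=35, waist_cm=90)
print(f"FLI = {example_fli:.1f}")
```

In the original publication, scores below 30 were proposed to rule out and scores of 60 or above to rule in hepatic steatosis; whether this study applied the same cut-offs is not stated in the tables here.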
Variable | Testing Population (n = 6386) | | Training Population (n = 25,544) | | p-Value |
---|---|---|---|---|---|
Categorical variable | N | % | N | % | |
Male sex | 3920 | 61.4% | 15,857 | 62.1% | 0.3077 |
Fatty liver disease | 1647 | 25.8% | 6658 | 26.1% | 0.6552 |
Continuous variables | Mean | SD | Mean | SD | |
Age (years) | 49.0338 | 10.6897 | 49.1273 | 10.7045 | 0.5325 |
Weight (kg) | 66.2086 | 11.8944 | 66.3478 | 12.2305 | 0.4133 |
Height (cm) | 165.1039 | 8.0304 | 165.1287 | 8.0323 | 0.8254 |
BMI (kg/m²) | 24.1912 | 3.3860 | 24.2331 | 3.5129 | 0.3896 |
Waist (cm) | 81.3276 | 9.4895 | 81.4931 | 9.6282 | 0.2178 |
SBP (mmHg) | 123.3472 | 16.2762 | 123.4969 | 16.5927 | 0.5173 |
DBP (mmHg) | 78.7839 | 10.8325 | 78.8869 | 10.8106 | 0.4962 |
ALT (IU/L) | 27.6682 | 20.3123 | 27.5262 | 23.6926 | 0.6599 |
AST (IU/L) | 25.4887 | 11.4994 | 25.5720 | 20.2435 | 0.7520 |
Cr (mg/dL) | 0.8184 | 0.2498 | 0.8216 | 0.2228 | 0.3200 |
Sugar (mg/dL) | 96.8274 | 21.5711 | 96.5744 | 18.9770 | 0.3543 |
T-Cho (mg/dL) | 192.5857 | 34.7903 | 193.2651 | 35.1615 | 0.1663 |
HDL (mg/dL) | 50.2668 | 13.2300 | 50.0619 | 13.2686 | 0.2693 |
LDL (mg/dL) | 119.4998 | 30.8285 | 119.9765 | 31.0895 | 0.2723 |
TG (mg/dL) | 113.8447 | 101.3611 | 115.5403 | 82.8250 | 0.1629 |
r-GT (U/L) | 25.2388 | 28.2466 | 25.3584 | 29.0509 | 0.7674 |
WBC (×10⁹/L) | 5.6059 | 1.5110 | 5.6049 | 1.5196 | 0.9632 |
Hb (g/dL) | 14.1648 | 1.5050 | 14.1758 | 1.5102 | 0.6019 |
MCH (pg) | 30.1294 | 2.8629 | 30.1250 | 2.9260 | 0.9135 |
MCHC (g/dL) | 33.4929 | 0.9346 | 33.4895 | 0.9532 | 0.7981 |
MCV (fL) | 42.2640 | 4.1367 | 42.3027 | 4.1600 | 0.5058 |
RBC-RDW (%) | 13.5007 | 1.2364 | 13.4978 | 1.2132 | 0.8666 |
RBC Count (10⁶/μL) | 4.7246 | 0.5119 | 4.7314 | 0.5298 | 0.3560 |
RBC volume (fL) | 89.8414 | 7.2438 | 89.8342 | 7.3755 | 0.9443 |
Platelet (10³/μL) | 225.1690 | 52.8910 | 224.3424 | 53.5484 | 0.2688 |
FIB-4 | 1.1890 | 0.6050 | 1.1966 | 0.6242 | 0.3836 |
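The testing and training sizes in the table above (6386 and 25,544) correspond to a 20%/80% partition of the 31,930 participants. The sketch below shows one way such a hold-out split could be produced. The random shuffling, the seed, and the function name are assumptions for illustration; the table does not state how participants were assigned to each set.

```python
import random


def split_indices(n_total, test_fraction, seed=0):
    """Shuffle row indices and hold out a fraction as the test set.

    Returns (train_indices, test_indices). The seed is arbitrary.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    n_test = round(n_total * test_fraction)
    return indices[n_test:], indices[:n_test]


# 31,930 participants in total, 20% held out for testing.
train_idx, test_idx = split_indices(31_930, 0.20)
print(len(train_idx), len(test_idx))
```

A plain random split keeps the outcome prevalence similar in both sets only on average; a stratified split would enforce it exactly.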
Model | AUROC | Accuracy | Recall | F1 | Specificity | Precision |
---|---|---|---|---|---|---|
xgBoost | 0.882 | 0.833 | 0.833 | 0.829 | 0.683 | 0.827 |
Neural network | 0.874 | 0.824 | 0.824 | 0.820 | 0.683 | 0.818 |
Logistic regression | 0.870 | 0.825 | 0.825 | 0.815 | 0.629 | 0.816 |
Random forest | 0.849 | 0.818 | 0.818 | 0.809 | 0.629 | 0.808 |
SVM | 0.551 | 0.569 | 0.569 | 0.595 | 0.536 | 0.656 |
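The threshold-based metrics in the table above can all be derived from a confusion matrix. The sketch below uses the standard binary definitions with hypothetical counts; it is not the study's evaluation code, and the counts are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # sensitivity
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


def weighted_recall(tp, fp, fn, tn):
    """Recall averaged over both classes, weighted by class size."""
    pos, neg = tp + fn, tn + fp
    total = pos + neg
    return (pos / total) * (tp / pos) + (neg / total) * (tn / neg)


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=80, fp=20, fn=20, tn=180)
avg_recall = weighted_recall(tp=80, fp=20, fn=20, tn=180)
print(metrics, avg_recall)
```

Class-weighted recall is algebraically identical to accuracy. Since the Recall and Accuracy columns agree in every row of the table, the reported values may be class-weighted averages rather than positive-class values, although the table itself does not say so.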
Difference between Areas (p-Value) | Neural Network | Logistic Regression | Random Forest | SVM | Fatty Liver Index |
---|---|---|---|---|---|
xgBoost | 0.0076 (p = 0.0105) | 0.0114 (p = 0.0001) | 0.0327 (p < 0.0001) | 0.330 (p < 0.0001) | 0.0347 (p < 0.0001) |
Neural network | | 0.00382 (p = 0.2303) | 0.0251 (p < 0.0001) | 0.323 (p < 0.0001) | 0.00204 (p = 0.5978) |
Logistic regression | | | 0.0213 (p < 0.0001) | 0.319 (p < 0.0001) | 0.0233 (p < 0.0001) |
Random forest | | | | 0.298 (p < 0.0001) | 0.00204 (p = 0.5978) |
SVM | | | | | 0.295 (p < 0.0001) |
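Because every classifier is scored on the same test set, the pairwise AUROC differences are additive: the difference between any two methods follows from their differences to a common reference. The sketch below derives pairwise differences from the xgBoost row alone, which can be used to cross-check the other rows. It assumes the table reports magnitudes, and the dictionary and function names are illustrative.

```python
# Differences in AUROC relative to xgBoost, copied from the xgBoost row.
DIFF_FROM_XGBOOST = {
    "Neural network": 0.0076,
    "Logistic regression": 0.0114,
    "Random forest": 0.0327,
    "SVM": 0.330,
    "Fatty Liver Index": 0.0347,
}


def derived_difference(method_a, method_b):
    """Magnitude of AUROC(a) - AUROC(b), derived via the xgBoost reference.

    AUROC(a) - AUROC(b) = (xgb - b) - (xgb - a).
    """
    return abs(DIFF_FROM_XGBOOST[method_b] - DIFF_FROM_XGBOOST[method_a])


for a, b in [
    ("Neural network", "Logistic regression"),
    ("Logistic regression", "SVM"),
    ("Random forest", "Fatty Liver Index"),
]:
    print(f"{a} vs {b}: {derived_difference(a, b):.4f}")
```

The derived values carry the rounding of the xgBoost row, so agreement with the tabulated entries is expected only to about three decimal places. The p-values cannot be derived this way, since they depend on the correlation between paired predictions.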
Author/Year | Setting/Country | Fatty/Total Population (%) | Validation Method | ML Model | Accuracy (%) | Area under Curve (%) |
---|---|---|---|---|---|---|
Ma [31] 2018 | Hospital/China | 2522/10,508 (24%) | 10-fold cross validation | LR | 82.92% | N/A |
Wu [15] 2018 | Hospital/Taiwan | 377/577 (65.3%) | 10-fold cross validation | Random forest | 87.48% | 92.25% |
Liu [14] 2021 | Hospital/China | 5878/15,315 (38.4%) | 32% of dataset as testing data | xgBoost | 79.5% | 87.3% |
Atsawarungruangkit [13] 2021 | Population/USA | 817/3235 (25.3%) | 30% of dataset as testing data | Ensemble of subspace discriminant | 77.7% | 78% |
Pei [32] 2021 | Hospital/China | 845/3419 (24.7%) | 30% of dataset as testing data | xgBoost | 94.15% | 93.06% |
Zhao [33] 2021 | Hospital/China | 9173/39,884 (23%) | 30% of dataset as testing data | xgBoost | 89% | N/A |
Our Study 2022 | Hospital/Taiwan | 8375/31,930 (26.2%) | 20% of dataset as testing data | xgBoost | 83.3% | 88.2% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chen, Y.-Y.; Lin, C.-Y.; Yen, H.-H.; Su, P.-Y.; Zeng, Y.-H.; Huang, S.-P.; Liu, I.-L. Machine-Learning Algorithm for Predicting Fatty Liver Disease in a Taiwanese Population. J. Pers. Med. 2022, 12, 1026. https://doi.org/10.3390/jpm12071026