ELIPF: Explicit Learning Framework for Pre-Emptive Forecasting, Early Detection and Curtailment of Idiopathic Pulmonary Fibrosis Disease
Abstract
1. Introduction
2. Related Work
3. Methods and Materials
3.1. Data Source and Implementation Environment
3.2. Data Preprocessing
3.2.1. Feature Selection
3.2.2. Data Sampling
3.3. Proposed Machine-Learning Model
3.4. Performance Evaluation Metrics
3.5. Process Workflow
4. Experimental Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
No. | Features | Description | Values |
---|---|---|---|
1 | Ht | Height | Numerical |
2 | wt | Weight | Numerical |
3 | BMI | Body mass index | Numerical |
4 | Age | Age of patient | Numerical |
5 | Job | Oldest job | 1 = Housewife, 2 = Office worker, 3 = Commerce, 4 = Construction site |
6 | Smoking | Smoking status | 1 = Never smoked, 2 = Currently smoking, 3 = Quit smoking |
7 | DxAge | Age at diagnosis | Numerical |
8 | ABG_Done | Whether ABG was performed | 1 = performed, 0 = not performed |
9 | PFT_FEVm | Pulmonary function test: forced expiratory volume, measured value | Numerical |
10 | PFT_FEVpc | Pulmonary function test: forced expiratory volume, predicted value | Numerical |
11 | PFT_FVCm | Pulmonary function test: forced vital capacity, measured value | Numerical |
12 | PFT_FVCpc | Pulmonary function test: forced vital capacity, predicted value | Numerical |
13 | PFT_FFpc | Pulmonary function test free fluid | Numerical |
14 | PFT_DLCOm | Diffusing capacity of lungs for carbon monoxide, measured value | Numerical |
15 | PFT_DLCOpc | Diffusing capacity of lungs for carbon monoxide, predicted value | Numerical |
16 | CT_UIPPattern | CT finding | 1 = Definite, 2 = Probable, 3 = Inconsistent |
17 | CT_Honeycombing | Accompanying honeycombing | 1 = Yes, 0 = No |
18 | CT_GGO_1 | Accompanying GGO | 1 = Yes, 0 = No |
19 | BloodSampleYN | Presence of blood sample | 1 = Yes, 0 = No |
20 | PlasmaSampleYN | Presence of plasma sample | 1 = Yes, 0 = No |
21 | SerumSampleYN | Presence of serum sample | 1 = Yes, 0 = No |
22 | NTProBNPYN | Whether NT-proBNP was measured | 1 = performed, 0 = not performed |
23 | NTProBNP | NT-proBNP | Numerical |
24 | GAP_1 | Registered GAP stage | Categorical |
25 | EchoYN | Whether echo was performed | 1 = performed, 0 = not performed |
26 | GAPIdx_1 | Registered GAP index | Categorical |
27 | RxName | Drug name (ingredient) | 0 = None, 1 = Corticosteroid, 2 = N-acetylcysteine, 3 = Pirfenidone, 4 = Nintedanib |
28 | RxHomeYN | Whether home oxygen treatment is used | 1 = Yes, 0 = No |
29 | Survival_FUMonths | Follow-up period (months) | Categorical |
30 | State | Exacerbation stage | 0 = No exacerbation, 1 = First stage exacerbation, 2 = Second stage exacerbation, 3 = Third stage exacerbation |
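
The coded categorical columns in the feature table can be decoded into readable labels before exploratory analysis. The following is a minimal sketch, assuming each patient record is a plain Python dict keyed by the feature names; the names `CODEBOOK` and `decode_record` are hypothetical and not part of the original study, and only three of the coded features are shown.

```python
# Hypothetical codebook built from the feature table; only a subset is shown.
CODEBOOK = {
    "Smoking": {1: "Never smoked", 2: "Smoking", 3: "Quit smoking"},
    "CT_UIPPattern": {1: "Definite", 2: "Probable", 3: "Inconsistent"},
    "State": {
        0: "No exacerbation",
        1: "First stage exacerbation",
        2: "Second stage exacerbation",
        3: "Third stage exacerbation",
    },
}


def decode_record(record):
    """Return a copy of `record` with coded categorical values replaced by labels.

    Features without a codebook entry (e.g. numerical ones) pass through unchanged.
    An unknown code raises KeyError so data-entry errors are not silently hidden.
    """
    decoded = {}
    for name, value in record.items():
        mapping = CODEBOOK.get(name)
        decoded[name] = mapping[value] if mapping is not None else value
    return decoded


if __name__ == "__main__":
    # Illustrative record, not taken from the study data.
    sample = {"Age": 67, "Smoking": 3, "CT_UIPPattern": 1, "State": 0}
    print(decode_record(sample))
```

Raising on unknown codes is a deliberate choice in this sketch: a value outside the documented range is more likely a data-entry problem than a new category.
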
Classifier | Specification |
---|---|
Multi-layer Perceptron | hidden_layer_sizes = (500,500,500), max_iter = 1000, random_state = 42 |
Random Forest | n_estimators = 500, random_state = 0, criterion = 'gini', max_depth = 20, min_samples_split = 5, min_samples_leaf = 5 |
Support Vector Machine | kernel = 'linear', degree = 7, gamma = 5.9, C = 30, decision_function_shape = 'ovr', probability = True, random_state = 0 |
Gradient Boosting Machine | learning_rate = 0.1, n_estimators = 500, max_depth = 15, min_samples_split = 5, min_samples_leaf = 5, subsample = 1, max_features = 'sqrt', random_state = 10 |
XGBoost | random_state = 0, silent = False, scale_pos_weight = 2, learning_rate = 0.01, colsample_bytree = 0.4, subsample = 0.9, objective = 'binary:logistic', n_estimators = 500, reg_alpha = 0.1, max_depth = 15, gamma = 7 |
Classifiers | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
MLP | 0.7622 | 0.8016 | 0.8412 | 0.8250 |
Random Forest | 0.7803 | 0.8105 | 0.8611 | 0.8300 |
XGBoost | 0.7441 | 0.8624 | 0.7109 | 0.7864 |
SVM | 0.7648 | 0.8232 | 0.8105 | 0.8103 |
GBM | 0.7881 | 0.8135 | 0.8722 | 0.8401 |
SVE | 0.7958 | 0.8261 | 0.8637 | 0.8420 |
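
For reference, accuracy, precision, recall and F1-score follow the standard definitions based on true/false positives and negatives. The sketch below assumes a binary (one-vs-rest) setting; how the scores were averaged across the four `State` classes is not stated in this section, so the function name `classification_scores` and the example counts are purely illustrative.

```python
def classification_scores(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives and true negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts, not taken from the study.
    for name, value in classification_scores(tp=80, fp=20, fn=10, tn=40).items():
        print(f"{name}: {value:.4f}")
```

When scores are averaged per class in a multi-class problem, the reported F1-score need not equal the harmonic mean of the reported precision and recall.
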
Classifiers | Time (s) | MSE | MCC | K |
---|---|---|---|---|
MLP | 0.7154 | 0.2532 | 0.0107 | 0.5229 |
Random Forest | 0.2720 | 0.4677 | 0.0327 | 0.0428 |
XGBoost | 0.0339 | 0.2222 | 0.0160 | 0.0348 |
SVM | 2.3360 | 0.2144 | 0.0120 | 0.0662 |
GBM | 0.0166 | 0.3927 | 0.0197 | 0.0170 |
SVE | 0.4740 | 0.1925 | 0.5555 | 0.58484 |
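
The error and agreement measures in this table can be sketched in a few lines. The example assumes that MSE is the mean squared error between true and predicted class labels, that MCC is the Matthews correlation coefficient, and that the column K denotes Cohen's kappa; the last point is an assumption, since the abbreviation is not expanded here. The binary formulas are shown for brevity, and the function names and counts are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length label sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_expected == 1:
        return 0.0
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Illustrative values, not taken from the study.
    print(f"MSE:   {mse([0, 1, 2, 3], [0, 1, 1, 3]):.4f}")
    print(f"MCC:   {mcc(tp=80, fp=20, fn=10, tn=40):.4f}")
    print(f"Kappa: {cohen_kappa(tp=80, fp=20, fn=10, tn=40):.4f}")
```

Both MCC and kappa lie in the range -1 to 1, with values near 0 indicating agreement no better than chance.
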
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Theodore Armand, T.P.; Mozumder, M.A.I.; Carole, K.S.; Deji-Oloruntoba, O.; Kim, H.-C.; Ajakwe, S.O. ELIPF: Explicit Learning Framework for Pre-Emptive Forecasting, Early Detection and Curtailment of Idiopathic Pulmonary Fibrosis Disease. BioMedInformatics 2024, 4, 1807-1821. https://doi.org/10.3390/biomedinformatics4030099