Decoding Radiomics: A Step-by-Step Guide to Machine Learning Workflow in Hand-Crafted and Deep Learning Radiomics Studies
Abstract
1. Introduction
2. Machine Learning and Deep Learning
3. Bias–Variance Trade-Off
4. Step-by-Step Radiomic Workflow
4.1. Study Design and Data Collection
4.1.1. Eligibility
4.1.2. Reference Standard
4.1.3. Monocentric Versus Multicenter
4.1.4. Imaging Protocol
4.2. Image Preprocessing
4.2.1. Normalization and Standardization
4.2.2. Discretization
4.2.3. Co-Registration
4.2.4. Resampling
4.2.5. Image Filtering and Enhancement Techniques
4.3. Segmentation
4.4. Feature Extraction
4.4.1. Hand-Crafted Features
4.4.2. Deep Features
4.5. Tabular Data
4.6. Data Preparation: Missing Values, Data Scarcity, Confounding Factors, and Class Imbalance Problems
4.6.1. Missing Values
4.6.2. Data Scarcity
4.6.3. Confounding Factors
4.6.4. Class Imbalance Problems
4.7. Feature Robustness
4.8. Feature Selection and Regularization
4.8.1. The Need for Feature Selection and Regularization
4.8.2. Filter, Wrapper, and Embedded Methods
4.8.3. Dimensionality Reduction
4.9. Modeling
4.9.1. Supervised Versus Unsupervised Learning
4.9.2. Regression, Classification, and Clustering Problems
4.9.3. Model Selection
4.9.4. Data Partition for Training, Validation, and Test Phases
4.9.5. Validation
4.10. Model Testing and Performance Metrics
4.11. Model Uncertainty Assessment and Calibration
4.12. Model Comparison
5. Challenges and Future Perspectives: Collaborative Science and Clinical Translatability
Author Contributions
Funding
Conflicts of Interest
References
- Hillis, J.M.; Visser, J.J.; Cliff, E.R.S.; Aspers, K.v.d.G.; Bizzo, B.C.; Dreyer, K.J.; Adams-Prassl, J.; Andriole, K.P. The lucent yet opaque challenge of regulating artificial intelligence in radiology. NPJ Digit. Med. 2024, 7, 69. [Google Scholar] [CrossRef] [PubMed]
- Park, J.E.; Kim, D.; Kim, H.S.; Park, S.Y.; Kim, J.Y.; Cho, S.J.; Shin, J.H.; Kim, J.H. Quality of science and reporting of radiomics in oncologic studies: Room for improvement according to radiomics quality score and TRIPOD statement. Eur Radiol. 2020, 30, 523–536. [Google Scholar] [CrossRef] [PubMed]
- Vasey, B.; Nagendran, M.; Campbell, B.; Clifton, D.A.; Collins, G.S.; Denaxas, S.; Denniston, A.K.; Faes, L.; Geerts, B.; Ibrahim, M.; et al. Reporting guideline for the early stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. BMJ 2022, 377, e070904. [Google Scholar] [CrossRef] [PubMed]
- Rivera, S.C.; Liu, X.; Chan, A.-W.; Denniston, A.K.; Calvert, M.J. Guidelines for clinical trial protocols for interventions involving artificial intelligence: The SPIRIT-AI extension. Nat. Med. 2020, 26, 1351–1363. [Google Scholar] [CrossRef]
- Liu, X.; Rivera, S.C.; Moher, D.; Calvert, M.J.; Denniston, A.K. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: The CONSORT-AI extension. Nat. Med. 2020, 26, 1364–1374. [Google Scholar] [CrossRef]
- Korte, J.C.; Cardenas, C.; Hardcastle, N.; Kron, T.; Wang, J.; Bahig, H.; Elgohari, B.; Ger, R.; Court, L.; Fuller, C.D.; et al. Radiomics feature stability of open-source software evaluated on apparent diffusion coefficient maps in head and neck cancer. Sci. Rep. 2021, 11, 17633. [Google Scholar] [CrossRef]
- Bontempi, D.; Nuernberg, L.; Pai, S.; Krishnaswamy, D.; Thiriveedhi, V.; Hosny, A.; Mak, R.H.; Farahani, K.; Kikinis, R.; Fedorov, A.; et al. End-to-end reproducible AI pipelines in radiology using the cloud. Nat. Commun. 2024, 15, 6931. [Google Scholar] [CrossRef]
- Zaffino, P.; Marzullo, A.; Moccia, S.; Calimeri, F.; De Momi, E.; Bertucci, B.; Arcuri, P.P.; Spadea, M.F. An Open-Source COVID-19 CT Dataset with Automatic Lung Tissue Classification for Radiomics. Bioengineering 2021, 8, 26. [Google Scholar] [CrossRef]
- Prior, F.; Smith, K.; Sharma, A.; Kirby, J.; Tarbox, L.; Clark, K.; Bennett, W.; Nolan, T.; Freymann, J. The public cancer radiology imaging collections of The Cancer Imaging Archive. Sci. Data 2017, 4, 170124. [Google Scholar] [CrossRef]
- Clark, K.; Vendt, B.; Smith, K.; Freymann, J.; Kirby, J.; Koppel, P.; Moore, S.; Phillips, S.; Maffitt, D.; Pringle, M.; et al. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. J. Digit. Imaging 2013, 26, 1045–1057. [Google Scholar] [CrossRef]
- Woznicki, P.; Laqua, F.C.; Al-Haj, A.; Bley, T.; Baeßler, B. Addressing challenges in radiomics research: Systematic review and repository of open-access cancer imaging datasets. Insights Imaging 2023, 14, 216. [Google Scholar] [CrossRef]
Item/Condition | Description | Paragraph |
---|---|---|
Item 1 | Adherence to checklists | 4.1 Study design and data collection |
Item 2 | Eligibility criteria | 4.1.1 Eligibility |
Item 3 | High-quality reference standard | 4.1.2 Reference standard |
Item 4 | Multi-centric | 4.1.3 Monocentric versus multicenter |
Item 5 | Standardized imaging protocol | 4.1.4 Imaging protocol |
Item 6 | Acquisition parameters | 4.1.4 Imaging protocol |
Item 7 | Time interval imaging-ref.std. | 4.1.2 Reference standard |
Condition 1 | Segmentation? | 4.3 Segmentation |
Condition 2 | Fully automated segmentation? | 4.3 Segmentation |
Item 8 | Segmentation method | 4.3 Segmentation |
Item 9 | Formal evaluation segm. meth. | 4.3 Segmentation |
Item 10 | Test segmentation | 4.3 Segmentation |
Condition 3 | Hand-crafted features? | 4.4.1 Hand-crafted features |
Item 11 | Image preprocessing | 4.2 Image preprocessing |
Item 12 | Standardized feat. extraction soft. | 4.4.1 Hand-crafted features |
Item 13 | Extraction parameters | 4.4.1 Hand-crafted features |
Condition 4 | Tabular data? | 4.5 Tabular data |
Condition 5 | End-to-end deep learning? | 1. Introduction, Figure 1 |
Item 14 | Removal non-robust features | 4.7 Features robustness |
Item 15 | Removal redundant features | 4.8 Feature selection and regularization |
Item 16 | Dimensionality compared to data size | 4.8 Feature selection and regularization |
Item 17 | Robustness E2E DL pipeline | 4.7 Features robustness |
Item 18 | Data partitioning (train./val./test.) | 4.9.4 Data partition for training, validation, and test phases |
Item 19 | Confounding factors | 4.6.3 Confounding factors |
Item 20 | Appropriate performance metrics | 4.10 Model testing and performance metrics |
Item 21 | Uncertainty assessment | 4.11 Model uncertainty assessment and calibration |
Item 22 | Calibration | 4.11 Model uncertainty assessment and calibration |
Item 23 | Uni-parametric or proof of added value | 4.12 Model comparison |
Item 24 | Comparison with non-radiomics | 4.12 Model comparison |
Item 25 | Comparison with classic stat. model | 4.12 Model comparison |
Item 26 | Internal testing | 4.10 Model testing and performance metrics |
Item 27 | External testing | 4.10 Model testing and performance metrics |
Item 28 | Data availability | 5. Challenges and future perspectives |
Item 29 | Code availability | 5. Challenges and future perspectives |
Item 30 | Model availability | 5. Challenges and future perspectives |
Feature Categories | Example Radiomic Features | Description |
---|---|---|
First-order | Intensity statistics: mean, median, max/mean/min intensity, 10th–90th percentile, skewness, kurtosis, range, variance, root mean squared (RMS), standard deviation (SD), mean absolute deviation (MAD), … Shape-based: area, volume, maximum 3D diameter, major axis length, minor axis length, surface area, elongation, flatness, sphericity | First-order features include basic statistics on the distribution of the values of individual voxels, disregarding spatial relationships, as well as shape-based features. |
Second-order | Gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size-zone matrix (GLSZM), neighboring gray-tone difference matrix (NGTDM), gray-level dependence matrix (GLDM) | Second-order features describe the statistical relationships between pixels or voxels. |
High-order | Autoregressive model Haar wavelet | High-order features are usually based on matrices that consider relationships between three or more pixels or voxels. |
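As a hedged illustration of the first-order category in the table above, the following sketch computes several of the listed intensity statistics from a flat list of voxel values. Function and variable names are ours, not from any specific radiomics library; real pipelines would typically use a standardized package instead.

```python
import math

def first_order_features(voxels):
    """Basic first-order statistics over a flat list of voxel intensities."""
    n = len(voxels)
    mean = sum(voxels) / n
    var = sum((v - mean) ** 2 for v in voxels) / n
    sd = math.sqrt(var)
    rms = math.sqrt(sum(v * v for v in voxels) / n)
    mad = sum(abs(v - mean) for v in voxels) / n  # mean absolute deviation
    # Skewness and kurtosis describe asymmetry and tailedness of the histogram.
    skew = sum((v - mean) ** 3 for v in voxels) / (n * sd ** 3) if sd else 0.0
    kurt = sum((v - mean) ** 4 for v in voxels) / (n * sd ** 4) if sd else 0.0
    return {"mean": mean, "variance": var, "sd": sd, "rms": rms,
            "mad": mad, "skewness": skew, "kurtosis": kurt,
            "range": max(voxels) - min(voxels)}
```

Note that these statistics disregard voxel positions entirely, which is exactly what distinguishes first-order from second- and high-order features.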
Robustness Metric | What It Represents | When It Can Be Used
---|---|---|
Intraclass Correlation Coefficient (ICC) | The ICC is useful for evaluating the reproducibility or reliability of measurements between different repetitions or between different assessments made by different observers. A high ICC indicates that most variability is due to genuine differences between subjects, suggesting feature robustness. | If a test–retest is performed (the same image from the same patient and scanner obtained a few minutes apart), then the ICC can be calculated. |
Concordance Correlation Coefficient (CCC) | It combines measures of precision and accuracy to assess how well bivariate pairs of observations conform relative to a gold standard or another set. It is valuable for comparing the agreement between features extracted with different imaging techniques or acquisition parameters. | If multiple phantom images for the same and different scanners can be acquired, then the CCC for each feature can be calculated. |
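The CCC described above can be computed directly from its definition. The sketch below is our own minimal implementation of Lin's formula, applied to two sets of measurements of the same feature (e.g., extracted from two scanners); it is illustrative, not code from any radiomics package.

```python
def ccc(x, y):
    """Lin's concordance correlation coefficient:
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean_x - mean_y)^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)
```

Unlike the Pearson correlation, the CCC penalizes systematic shifts: two perfectly correlated but offset measurement series score below 1.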
Statistical Method | Exclusion Criteria |
---|---|
Missing Percentage | Disproportionate share of missing samples and difficult to fill |
Variance | Variance close to or equal to 0 |
Frequency | Features excessively concentrated in one category of values |
Correlation Coefficients (Spearman, Pearson, and Kendall) | Correlation coefficients close to or equal to 0 |
Analysis of Variance (ANOVA) | Too-low F-value or p-value above the significance threshold (e.g., p ≥ 0.05)
χ2 Test | Too-low χ2 value or p-value above the significance threshold (e.g., p ≥ 0.05)
Mutual Information | Mutual information close to or equal to 0 |
mRMR (Minimum Redundancy Maximum Relevance) | Features with low relevance (correlation with the outcome) and high redundancy with already selected features
Fisher Score | Large intraclass distances and small interclass distances (i.e., a low Fisher score)
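As a hedged sketch of two of the filter criteria above, the function below drops features with near-zero variance and features with near-zero Pearson correlation to the outcome. The thresholds `var_eps` and `corr_min` are illustrative assumptions, not values recommended by the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def filter_features(features, outcome, var_eps=1e-8, corr_min=0.1):
    """features: {name: [values per patient]}; outcome: [value per patient]."""
    kept = {}
    for name, values in features.items():
        n = len(values)
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        if var <= var_eps:  # near-zero variance: carries no information
            continue
        if abs(pearson(values, outcome)) < corr_min:  # irrelevant to outcome
            continue
        kept[name] = values
    return kept
```

Filter methods such as this one score each feature independently of any model, which makes them fast but blind to feature interactions.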
Subset Search Process | Subset Search Method | Criteria |
---|---|---|
Complete search | Breadth First Search, Best First Search | Iterates through all possible combinations of feature subsets, then selects the subset with the best model score; high computational cost
Heuristic search | Sequential Forward Selection, Sequential Backward Selection, Bidirectional Search, Plus-L Minus-R Selection, Sequential Floating Selection, Decision Tree Method | Uses rules or guided search strategies to find a good subset of features, without necessarily guaranteeing the optimal solution
Random search | Random Generation plus Sequential Selection, Simulated Annealing, Genetic Algorithms | A random subset of features is generated and then evaluated; guarantees neither optimality nor computational efficiency
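Sequential forward selection, from the heuristic-search row above, can be sketched as a greedy loop: repeatedly add the feature that most improves a model score until no candidate helps. Here `score_fn` is an assumed user-supplied callback (e.g., cross-validated accuracy of a model fit on the candidate subset).

```python
def sequential_forward_selection(all_features, score_fn):
    """Greedy forward selection: grow the subset while the score improves."""
    selected, best = [], float("-inf")
    candidates = set(all_features)
    while candidates:
        # Score every one-feature extension of the current subset.
        gains = {f: score_fn(selected + [f]) for f in candidates}
        f_best = max(gains, key=gains.get)
        if gains[f_best] <= best:
            break  # no candidate improves the score: stop
        best = gains[f_best]
        selected.append(f_best)
        candidates.remove(f_best)
    return selected, best
```

Like all heuristic searches, this does not guarantee the globally optimal subset, but it evaluates far fewer subsets than a complete search.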
Validation Methods | Advantages | Disadvantages
---|---|---
Holdout dataset | Simple and computationally cheap; the held-out set is used only once, mimicking truly unseen data | Performance estimate depends strongly on a single random split; wasteful when data are scarce
Cross-validation (CV) | Every sample is used for both training and validation; more stable performance estimate than a single holdout split | Higher computational cost; the estimate can still vary with the number and composition of folds
Leave-one-out CV (LOOCV) | Uses almost all available data for training at each iteration; nearly unbiased performance estimate | Very high computational cost for large datasets; the estimate can have high variance
Bootstrapping | Allows estimation of confidence intervals for performance metrics; useful with small samples | Resampling with replacement can produce optimistically biased estimates unless corrections are applied
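As an illustration of how cross-validation partitions data, the sketch below generates k train/test index splits so that every sample appears in exactly one test fold. It is a simplified, non-shuffled version of what libraries such as scikit-learn provide, written here only to make the mechanism explicit.

```python
def kfold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples)
                 if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds
```

In practice, patients (not individual images) should be assigned to folds, and shuffling or stratification by class label is usually added on top of this basic scheme.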
Task | Performance Metric | Meaning | Description
---|---|---|---
Regression | Residual standard error (RSE) | It represents the standard deviation of the residuals, which are the differences between the observed values and the values predicted by the model. It provides an estimate of the typical size of the prediction errors. | It is calculated as the square root of the sum of squared residuals divided by the degrees of freedom (n − p − 1), where n is the number of observations and p is the number of predictors.
 | R2 statistic | It represents the proportion of the variance in the dependent variable that is predictable from the independent variables. R2 values range from 0 to 1, with 0 indicating that the model does not explain any of the variability in the response data around its mean and 1 indicating that the model explains all the variability. | It is calculated as 1 minus the ratio of the sum of squared residuals to the total sum of squares.
 | F-statistic | It compares a model with no predictors (intercept only) to the model being evaluated. A higher F-statistic indicates that the model provides a better fit to the data than a model without any predictors. | It is calculated as the ratio of the mean regression sum of squares to the mean error sum of squares, and its significance is evaluated using the F-distribution.
Classification | Sensitivity | Sensitivity, or true-positive rate, is the ability of a test to correctly identify positive cases. | It is calculated as the ratio of true positives to the sum of true positives and false negatives (TP/(TP + FN)).
 | Specificity | Specificity, or true-negative rate, is the ability of a test to correctly identify negative cases. | It is calculated as the ratio of true negatives to the sum of true negatives and false positives (TN/(TN + FP)).
 | Accuracy | Accuracy is the overall correctness of a test. | It is calculated as the ratio of the number of correct predictions (true positives and true negatives) to the total number of cases examined ((TP + TN)/(TP + TN + FP + FN)).
 | Precision | Precision, or positive predictive value, indicates the proportion of correct positive results among all those identified as positive. | It is calculated as the ratio of true positives to the sum of true positives and false positives (TP/(TP + FP)).
 | Recall | Recall is synonymous with sensitivity. |
 | F1 score | The F1 score is the harmonic mean of precision and recall, providing a single measure that balances both aspects. | It is calculated as 2 × (Precision × Recall)/(Precision + Recall).
 | Receiver operating characteristic (ROC) curve and area under the curve (AUC) | ROC curves depict the trade-off between the true-positive rate (sensitivity) and the false-positive rate (1 − specificity) across various classification thresholds. | A higher area under the ROC curve signifies a better discrimination capability, with an AUC of 1 indicating a perfect classifier.
 | Confusion matrix | It presents the counts of true-positive, true-negative, false-positive, and false-negative predictions, providing insights into the model’s ability to correctly classify instances from each class. | From the confusion matrix, derived metrics, such as accuracy, precision, recall (sensitivity), specificity, and F1-score, can be calculated.
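The classification formulas in the table above can be collected into a small helper that derives every metric from the four confusion-matrix counts. This is an illustrative sketch of the standard definitions, not code from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive standard classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)      # recall / true-positive rate
    specificity = tn / (tn + fp)      # true-negative rate
    precision = tp / (tp + fp)        # positive predictive value
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy, "f1": f1}
```

Reporting several of these together is preferable to accuracy alone, since accuracy can be misleading under class imbalance.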
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Cè, M.; Chiriac, M.D.; Cozzi, A.; Macrì, L.; Rabaiotti, F.L.; Irmici, G.; Fazzini, D.; Carrafiello, G.; Cellina, M. Decoding Radiomics: A Step-by-Step Guide to Machine Learning Workflow in Hand-Crafted and Deep Learning Radiomics Studies. Diagnostics 2024, 14, 2473. https://doi.org/10.3390/diagnostics14222473