The Use of Artificial Intelligence to Analyze the Exposome in the Development of Chronic Diseases: A Review of the Current Literature
Abstract
1. Introduction
2. Materials and Methods
3. Results and Discussion
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Gonzaga-Jauregui, C.; Lupski, J.R.; Gibbs, R.A. Human Genome Sequencing in Health and Disease. Annu. Rev. Med. 2012, 63, 35–61.
2. Wild, C.P. Complementing the Genome with an “Exposome”: The Outstanding Challenge of Environmental Exposure Measurement in Molecular Epidemiology. Cancer Epidemiol. Biomark. Prev. 2005, 14, 1847–1850.
3. Baker, R.E.; Mahmud, A.S.; Miller, I.F.; Rajeev, M.; Rasambainarivo, F.; Rice, B.L.; Takahashi, S.; Tatem, A.J.; Wagner, C.E.; Wang, L.-F. Infectious Disease in an Era of Global Change. Nat. Rev. Microbiol. 2022, 20, 193–205.
4. Peters, R.; Ee, N.; Peters, J.; Beckett, N.; Booth, A.; Rockwood, K.; Anstey, K.J. Common Risk Factors for Major Noncommunicable Disease, a Systematic Overview of Reviews and Commentary: The Implied Potential for Targeted Risk Reduction. Ther. Adv. Chronic Dis. 2019, 10, 2040622319880392.
5. Budreviciute, A.; Damiati, S.; Sabir, D.K.; Onder, K.; Schuller-Goetzburg, P.; Plakys, G.; Katileviciute, A.; Khoja, S.; Kodzius, R. Management and Prevention Strategies for Non-Communicable Diseases (NCDs) and Their Risk Factors. Front. Public Health 2020, 8, 574111.
6. Chaker, L.; Falla, A.; van der Lee, S.J.; Muka, T.; Imo, D.; Jaspers, L.; Colpani, V.; Mendis, S.; Chowdhury, R.; Bramer, W.M. The Global Impact of Non-Communicable Diseases on Macro-Economic Productivity: A Systematic Review. Eur. J. Epidemiol. 2015, 30, 357–395.
7. Benziger, C.P.; Roth, G.A.; Moran, A.E. The Global Burden of Disease Study and the Preventable Burden of NCD. Glob. Heart 2016, 11, 393–397.
8. Vineis, P.; Robinson, O.; Chadeau-Hyam, M.; Dehghan, A.; Mudway, I.; Dagnino, S. What Is New in the Exposome? Environ. Int. 2020, 143, 105887.
9. Senier, L.; Brown, P.; Shostak, S.; Hanna, B. The Socio-Exposome: Advancing Exposure Science and Environmental Justice in a Postgenomic Era. Environ. Sociol. 2017, 3, 107–121.
10. Vermeulen, R.; Schymanski, E.L.; Barabási, A.-L.; Miller, G.W. The Exposome and Health: Where Chemistry Meets Biology. Science 2020, 367, 392–396.
11. Sillé, F.; Karakitsios, S.; Kleensang, A.; Koehler, K.; Maertens, A.; Miller, G.W.; Prasse, C.; Quiros-Alcala, L.; Ramachandran, G.; Hartung, T. The Exposome: A New Approach for Risk Assessment. ALTEX Altern. Anim. Exp. 2020, 37, 3–23.
12. Hu, H.; Liu, X.; Zheng, Y.; He, X.; Hart, J.; James, P.; Laden, F.; Chen, Y.; Bian, J. Methodological Challenges in Spatial and Contextual Exposome-Health Studies. Crit. Rev. Environ. Sci. Technol. 2022, 53, 827–846.
13. Santos, S.; Maitre, L.; Warembourg, C.; Agier, L.; Richiardi, L.; Basagaña, X.; Vrijheid, M. Applying the Exposome Concept in Birth Cohort Research: A Review of Statistical Approaches. Eur. J. Epidemiol. 2020, 35, 193–204.
14. Rowe, M. An Introduction to Machine Learning for Clinicians. Acad. Med. 2019, 94, 1433–1436.
15. Nuzzi, R.; Boscia, G.; Marolo, P.; Ricardi, F. The Impact of Artificial Intelligence and Deep Learning in Eye Diseases: A Review. Front. Med. 2021, 8, 710329.
16. Jiang, F.; Jiang, Y.; Zhi, H.; Dong, Y.; Li, H.; Ma, S.; Wang, Y.; Dong, Q.; Shen, H.; Wang, Y. Artificial Intelligence in Healthcare: Past, Present and Future. Stroke Vasc. Neurol. 2017, 2, 230–243.
17. Babel, A.; Taneja, R.; Mondello Malvestiti, F.; Monaco, A.; Donde, S. Artificial Intelligence Solutions to Increase Medication Adherence in Patients With Non-Communicable Diseases. Front. Digit. Health 2021, 3, 669869.
18. Lavanya, J.M.S.; Subbulakshmi, P. Machine Learning Techniques for the Prediction of Non-Communicable Diseases. In Proceedings of the 2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF), Chennai, India, 5–7 January 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–8.
19. Allegra, A.; Tonacci, A.; Sciaccotta, R.; Genovese, S.; Musolino, C.; Pioggia, G.; Gangemi, S. Machine Learning and Deep Learning Applications in Multiple Myeloma Diagnosis, Prognosis, and Treatment Selection. Cancers 2022, 14, 606.
20. Murdaca, G.; Banchero, S.; Tonacci, A.; Nencioni, A.; Monacelli, F.; Gangemi, S. Vitamin D and Folate as Predictors of MMSE in Alzheimer’s Disease: A Machine Learning Analysis. Diagnostics 2021, 11, 940.
21. Subramanian, M.; Wojtusciszyn, A.; Favre, L.; Boughorbel, S.; Shan, J.; Letaief, K.B.; Pitteloud, N.; Chouchane, L. Precision medicine in the era of artificial intelligence: Implications in chronic disease management. J. Transl. Med. 2020, 18, 472.
22. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; Prisma Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009, 6, e1000097.
23. Hartung, T. A Call for a Human Exposome Project. ALTEX Altern. Anim. Exp. 2023, 40, 4–33.
24. Home—The European Human Exposome Network (EHEN). Available online: https://www.humanexposome.eu/ (accessed on 5 December 2023).
25. Home—Hedimed. Available online: https://www.hedimed.eu/ (accessed on 5 December 2023).
26. Ronkainen, J.; Nedelec, R.; Atehortua, A.; Balkhiyarova, Z.; Cascarano, A.; Dang, V.N.; Elhakeem, A.; Van Enckevort, E.; Soares, A.G.; Haakma, S. LongITools: Dynamic Longitudinal Exposome Trajectories in Cardiovascular and Metabolic Noncommunicable Diseases. Environ. Epidemiol. 2022, 6, e184.
27. Benjdir, M.; Audureau, É.; Beresniak, A.; Coll, P.; Epaud, R.; Fiedler, K.; Jacquemin, B.; Niddam, L.; Pandis, S.N.; Pohlmann, G. Assessing the Impact of Exposome on the Course of Chronic Obstructive Pulmonary Disease and Cystic Fibrosis: The REMEDIA European Project Approach. Environ. Epidemiol. 2021, 5, e165.
28. Vrijheid, M.; Basagaña, X.; Gonzalez, J.R.; Jaddoe, V.W.V.; Jensen, G.; Keun, H.C.; McEachan, R.R.C.; Porcel, J.; Siroux, V.; Swertz, M.A.; et al. Advancing Tools for Human Early Lifecourse Exposome Research and Translation (ATHLETE): Project Overview. Environ. Epidemiol. 2021, 5, e166.
29. EPHOR—EPHOR Project. Available online: https://www.ephor-project.eu/ (accessed on 5 December 2023).
30. Ronsmans, S.; Hougaard, K.S.; Nawrot, T.S.; Plusquin, M.; Huaux, F.; Cruz, M.J.; Moldovan, H.; Verpaele, S.; Jayapala, M.; Tunney, M. The EXIMIOUS Project—Mapping Exposure-Induced Immune Effects: Connecting the Exposome and the Immunome. Environ. Epidemiol. 2022, 6, e193.
31. Van Kamp, I.; Waye, K.P.; Kanninen, K.; Gulliver, J.; Bozzon, A.; Psyllidis, A.; Boshuizen, H.; Selander, J.; van den Hazel, P.; Brambilla, M. Early Environmental Quality and Life-Course Mental Health Effects: The Equal-Life Project. Environ. Epidemiol. 2022, 6, e183.
32. Vlaanderen, J.; De Hoogh, K.; Hoek, G.; Peters, A.; Probst-Hensch, N.; Scalbert, A.; Melén, E.; Tonne, C.; De Wit, G.A.; Chadeau-Hyam, M.; et al. Developing the Building Blocks to Elucidate the Impact of the Urban Exposome on Cardiometabolic-Pulmonary Disease: The EU EXPANSE Project. Environ. Epidemiol. 2021, 5, e162.
33. Martinez, R.M.; Müller, H.; Negru, S.; Ormenisan, A.; Mühr, L.S.A.; Zhang, X.; Møller, F.T.; Clements, M.S.; Kozlakidis, Z.; Pimenoff, V.N. Human Exposome Assessment Platform. Environ. Epidemiol. 2021, 5, e182.
34. Pero-Gascon, R.; Hemeryck, L.Y.; Poma, G.; Falony, G.; Nawrot, T.S.; Raes, J.; Vanhaecke, L.; De Boevre, M.; Covaci, A.; De Saeger, S. FLEXiGUT: Rationale for Exposomics Associations with Chronic Low-Grade Gut Inflammation. Environ. Int. 2022, 158, 106906.
35. Fine, L.J.; Philogene, G.S.; Gramling, R.; Coups, E.J.; Sinha, S. Prevalence of Multiple Chronic Disease Risk Factors: 2001 National Health Interview Survey. Am. J. Prev. Med. 2004, 27, 18–24.
36. Jordan, C.O.; Slater, M.; Kottke, T.E. Preventing Chronic Disease Risk Factors: Rationale and Feasibility. Medicina 2008, 44, 745.
37. Ohanyan, H.; Portengen, L.; Huss, A.; Traini, E.; Beulens, J.W.J.; Hoek, G.; Lakerveld, J.; Vermeulen, R. Machine Learning Approaches to Characterize the Obesogenic Urban Exposome. Environ. Int. 2022, 158, 107015.
38. Ohanyan, H.; Portengen, L.; Kaplani, O.; Huss, A.; Hoek, G.; Beulens, J.W.J.; Lakerveld, J.; Vermeulen, R. Associations between the Urban Exposome and Type 2 Diabetes: Results from Penalised Regression by Least Absolute Shrinkage and Selection Operator and Random Forest Models. Environ. Int. 2022, 170, 107592.
39. Lee, E.Y.; Akhtari, F.; House, J.S.; Simpson Jr, R.J.; Schmitt, C.P.; Fargo, D.C.; Schurman, S.H.; Hall, J.E.; Motsinger-Reif, A.A. Questionnaire-Based Exposome-Wide Association Studies (ExWAS) Reveal Expected and Novel Risk Factors Associated with Cardiovascular Outcomes in the Personalized Environment and Genes Study. Environ. Res. 2022, 212, 113463.
40. Bae, W.D.; Alkobaisi, S.; Horak, M.; Park, C.-S.; Kim, S.; Davidson, J. Predicting Health Risks of Adult Asthmatics Susceptible to Indoor Air Quality Using Improved Logistic and Quantile Regression Models. Life 2022, 12, 1631.
41. Ren, X.; Mi, Z.; Georgopoulos, P.G. Socioexposomics of COVID-19 across New Jersey: A Comparison of Geostatistical and Machine Learning Approaches. J. Expo. Sci. Environ. Epidemiol. 2023, 34, 197–207.
42. Pries, L.-K.; Lage-Castellanos, A.; Delespaul, P.; Kenis, G.; Luykx, J.J.; Lin, B.D.; Richards, A.L.; Akdede, B.; Binbay, T.; Altinyazar, V. Estimating Exposome Score for Schizophrenia Using Predictive Modeling Approach in Two Independent Samples: The Results from the EUGEI Study. Schizophr. Bull. 2019, 45, 960–965.
43. Zhao, F.; Li, L.; Lin, P.; Chen, Y.; Xing, S.; Du, H.; Wang, Z.; Yang, J.; Huan, T.; Long, C. HExpPredict: In Vivo Exposure Prediction of Human Blood Exposome Using a Random Forest Model and Its Application in Chemical Risk Prioritization. Environ. Health Perspect. 2023, 131, 037009.
44. Matta, K.; Vigneau, E.; Cariou, V.; Mouret, D.; Ploteau, S.; Le Bizec, B.; Antignac, J.-P.; Cano-Sancho, G. Associations between Persistent Organic Pollutants and Endometriosis: A Multipollutant Assessment Using Machine Learning Algorithms. Environ. Pollut. 2020, 260, 114066.
45. Li, K.; Bertrand, K.; Naviaux, J.C.; Monk, J.M.; Wells, A.; Wang, L.; Lingampelly, S.S.; Naviaux, R.K.; Chambers, C. Metabolomic and Exposomic Biomarkers of Risk of Future Neurodevelopmental Delay in Human Milk. Pediatr. Res. 2023, 93, 1710–1720.
46. Louis, G.M.B.; Yeung, E.; Kannan, K.; Maisog, J.; Zhang, C.; Grantz, K.L.; Sundaram, R. Patterns and Variability of Endocrine Disrupting Chemicals during Pregnancy: Implications for Understanding the Exposome of Normal Pregnancy. Epidemiology 2019, 30, S65.
47. Nemkov, T.; Stefanoni, D.; Bordbar, A. Recipient Epidemiology and Donor Evaluation Study III Red Blood Cell–Omics (REDS-III RBC-Omics) Study. Blood Donor Exposome and Impact of Common Drugs on Red Blood Cell Metabolism. JCI Insight 2021, 6, 146175.
48. Buelow, E.; Rico, A.; Gaschet, M.; Lourenço, J.; Kennedy, S.P.; Wiest, L.; Ploy, M.-C.; Dagot, C. Hospital Discharges in Urban Sanitation Systems: Long-Term Monitoring of Wastewater Resistome and Microbiota in Relationship to Their Eco-Exposome. Water Res. X 2020, 7, 100045.
49. Loef, B.; Wong, A.; Janssen, N.A.H.; Strak, M.; Hoekstra, J.; Picavet, H.S.J.; Boshuizen, H.C.; Verschuren, W.M.; Herber, G.-C.M. Using Random Forest to Identify Longitudinal Predictors of Health in a 30-Year Cohort Study. Sci. Rep. 2022, 12, 10372.
50. Johnson, T.; Kanjo, E.; Woodward, K. DigitalExposome: Quantifying Impact of Urban Environment on Wellbeing Using Sensor Fusion and Deep Learning. Comput. Urban Sci. 2023, 3, 14.
51. Patella, V.; Florio, G.; Palmieri, M.; Bousquet, J.; Tonacci, A.; Giuliano, A.; Gangemi, S. Atopic Dermatitis Severity during Exposure to Air Pollutants and Weather Changes with an Artificial Neural Network (ANN) Analysis. Pediatr. Allergy Immunol. 2020, 31, 938–945.
52. Jaiswal, S.K.; Agarwal, S.M.; Thodum, P.; Sharma, V.K. SkinBug: An Artificial Intelligence Approach to Predict Human Skin Microbiome-Mediated Metabolism of Biotics and Xenobiotics. iScience 2021, 24, 101925.
53. Atehortúa, A.; Gkontra, P.; Camacho, M.; Diaz, O.; Bulgheroni, M.; Simonetti, V.; Chadeau-Hyam, M.; Felix, J.F.; Sebert, S.; Lekadir, K. Cardiometabolic risk estimation using exposome data and machine learning. Int. J. Med. Inform. 2023, 179, 105209.
54. Dong, Y.; Lau, H.X.; Suaini, N.H.A.; Kee, M.Z.L.; Ooi, D.S.Q.; Shek, L.P.; Lee, B.W.; Godfrey, K.M.; Tham, E.H.; Ong, M.E.H.; et al. A machine-learning exploration of the exposome from preconception in early childhood atopic eczema, rhinitis and wheeze development. Environ. Res. 2024, 250, 118523.
55. Marín, D.; Orozco, L.Y.; Narváez, D.M.; Ortiz-Trujillo, I.C.; Molina, F.J.; Ramos, C.D.; Rodriguez-Villamizar, L.; Bangdiwala, S.I.; Morales, O.; Cuellar, M. Characterization of the External Exposome and Its Contribution to the Clinical Respiratory and Early Biological Effects in Children: The PROMESA Cohort Study Protocol. PLoS ONE 2023, 18, e0278836.
56. De Vito, S.; Esposito, E.; Massera, E.; Formisano, F.; Fattoruso, G.; Ferlito, S.; Del Giudice, A.; D’Elia, G.; Salvato, M.; Polichetti, T. Crowdsensing IoT Architecture for Pervasive Air Quality and Exposome Monitoring: Design, Development, Calibration, and Long-Term Validation. Sensors 2021, 21, 5219.
57. Shamji, M.H.; Ollert, M.; Adcock, I.M.; Bennett, O.; Favaro, A.; Sarama, R.; Riggioni, C.; Annesi-Maesano, I.; Custovic, A.; Fontanella, S. EAACI Guidelines on Environmental Science in Allergic Diseases and Asthma–Leveraging Artificial Intelligence and Machine Learning to Develop a Causality Model in Exposomics. Allergy 2023, 78, 1742–1757.
58. Hawthorne, C.; Lopez-Campos, G.H. Integration of Annotated Phenotype, Gene and Chemical Text Data to Advance Exposome Informatics. Stud. Health Technol. Inform. 2022, 294, 870–871.
59. Hawthorne, C.; Simpson, D.A.; Devereux, B.; López-Campos, G. Phexpo: A Package for Bidirectional Enrichment Analysis of Phenotypes and Chemicals. JAMIA Open 2020, 3, 173–177.
Exposome | Factor(s) | Biological Plausibility | Ability to Derive Individual Level of Exposure |
---|---|---|---|
General external exposome | |||
Meteorological Factors | Climate Change, Wind, Temperature | High | Medium/High |
Air Pollution | Pollen, Traffic, NO2, SO2, PM, CO | High | Medium |
Urban/Rural Environment | Population Density, Green Space, Accessibility to Resources | Medium | Medium |
Home Environment | PM, NO2, CO, Metals, Plastic, Pets, Dust | High | Low/Medium |
Food and Water Contaminants | Pesticides, Metals, Fertilizers | High | Medium |
Specific external exposome | |||
Occupational Exposures | Plants, Chemicals, Animal Proteins, Dust | High | Medium |
Medications | Medicines, Surgeries | High | Low |
Personal Behavior | Diet, Physical Activity, Smoking, Alcohol | High | Low/Medium |
Internal exposome | |||
Metabolic Factors, Microbiome, Inflammation Factors, Oxidative Stress Factors, Aging, Genetic and Epigenetic Factors | | Medium/High | Medium/High |
Socio-exposome | |||
Social Factors | Education, Occupation, Psychological Stress, Access to Food, Racial Inequality | Low-to-High | Low/Medium |
Economic Factors | Economic Status, Occupation | Low-to-High | Low/Medium |
Author and Year | AI Type | ML Model | Topic | Aims | Study Area | Patients | Results |
---|---|---|---|---|---|---|---|
Ohanyan et al., 2022a [37] | Supervised ML | PLS, Bayesian Model Averaging, penalized regression using Minimax Concave Penalty, RF, XGBoost, MLR | Environmental factors and BMI | To explore which factors of the urban exposome are related to body mass index (BMI) and evaluate the consistency of the results across multiple statistical approaches | The Netherlands | 14,829 | Factors associated with BMI: average neighborhood home value, oxidative potential (OP) of particulate matter air pollution, healthy food outlets in the neighborhood, low-income neighborhoods, and one-person households in the neighborhood. BMI was higher in low-income neighborhoods and in neighborhoods with lower average home values, a lower share of one-person households, fewer healthy food retailers, and higher OP levels |
Ohanyan et al., 2022b [38] | Supervised ML | LASSO, ANN, RF | Urban exposome and T2D | To examine the associations of 85 urban exposure factors and the prevalence of T2D and evaluate how the obtained results compare with data on established T2D risk factors | The Netherlands | 14,829 | Lower average home values, higher share of non-Western immigrants, and higher surface temperatures related to higher risk of T2D in LASSO, RF. Some risk factors (air pollutants) appeared in LASSO but were not among most important factors in RF. Other factors (green space) did not appear in LASSO, but appeared in RF. LASSO outperformed both RF and ANN |
Lee et al., 2022 [39] | Supervised ML | Knockoff Boosted Tree | Exposome and cardiovascular risk | To explore the relationship between the exposome and various cardiovascular outcomes with different and shared pathophysiologies in an adult population in the USA | USA | 5015 | Analyses revealed new associations between blood type A (Rh-) with heart attack, paint exposure with stroke, exposure to biohazardous materials with arrhythmia, and higher level of paternal education with reduced risk of cardiovascular disease. Sleep disorders and smoking remained important risk factors. |
Bae et al., 2022 [40] | Deep learning and ML | Logistic regression | Indoor air quality and asthma | To provide methods for assessing indoor air quality on a patient-specific basis with significant control regarding the level of exposure to each agent | Republic of Korea | 19 | Application of deep learning led to improvement in classification accuracy (11.5–18.4%) of logistic regression model, with low relative errors, ranging between 0.018 and 0.160 |
Ren et al., 2023 [41] | Supervised ML | RF, XGBoost | Socioexposome and COVID-19 | To identify socioexposomic associations with COVID-19 outcomes in New Jersey and evaluate the consistency of findings from multiple modeling approaches | USA | Data from 565 municipalities of New Jersey | Positive associations of COVID-19 mortality with historic exposures to NO2, population density, percentage of minorities, and below-high school education, and other social and environmental factors. ML methods detected consistent nonlinear associations not captured by geostatistical models |
Pries et al., 2019 [42] | Supervised ML | LR, GNB, LASSO, Ridge | Schizophrenia and exposome | To demonstrate how predictive modeling approaches can be used to construct an exposome score for schizophrenia | The Netherlands, Turkey, Spain, Serbia | 3316 | Machine learning approaches perform well, especially LR, LASSO, and Ridge. For example, exposure score (LR) distinguished patients from controls (odds ratio [OR] = 1.94, p < 0.001), patients from siblings (OR = 1.58, p < 0.001), and siblings from controls (OR = 1.21, p = 0.001) |
Zhao et al., 2023 [43] | Supervised ML | RF, ANN, SVM | Human blood exposome | To develop an ML model to predict blood concentrations of chemicals and prioritize chemicals potentially hazardous to health | USA | N/A | RF outperformed ANN and SVM models. The most active compounds are food additives and pesticides rather than widely monitored environmental pollutants |
Matta et al., 2020 [44] | Supervised ML | LR, ANN, SVM, Adaptive Boosting, Partial Least Squares Discriminant Analysis | Endometriosis and persistent pollutants in adipose tissue | To apply different ML techniques to explore associations between mixtures of persistent organic pollutants and deep endometriosis | France | 99 | Deep endometriosis associated with octachlorodibenzofuran, cis-heptachlor epoxide, polychlorinated biphenyl 77, or trans-nonachlor, among others. Regularized logistic regression provided good compromise between interpretability of traditional statistical approaches and classification capacity of machine learning approaches |
Li et al., 2022 [45] | Supervised ML | PLS-DA, RF, KNN | Risk of neurodevelopmental delay and the human milk metabolome/exposome | To examine the prognostic value of the human milk metabolome and exposome in children at risk for neurodevelopmental delay | USA | 82 | Changes in deoxysphingolipids, phospholipids, glycosphingolipids, plasmalogens, and acylcarnitines in milk of mothers with children at risk for future delay. Predictive classifier had diagnostic accuracy of 0.81 (95% CI: 0.66–0.96) for females and 0.79 (95% CI: 0.62–0.94) for males |
Louis et al., 2019 [46] | Unsupervised ML | Linear mixed-effects model | Exposome and pregnancy outcomes | To better understand the complexity of the exposome and the temporal changes in endocrine-disrupting chemicals during pregnancy and their interaction with pregnancy outcomes | USA | 50 | Four chemical clusters comprising 80 compounds, of which six consistently increased, 63 consistently decreased, and 11 reflected inconsistent patterns over pregnancy. Overall, concentrations tended to decrease over pregnancy for persistent endocrine-disrupting chemicals; inverse pattern was seen for many non-persistent chemicals. Explained variance was highest for five persistent chemicals: polybrominated diphenyl ethers #191 and #126, hexachlorobenzene, p,p’-dichloro-diphenyl-dichloroethylene, and o,p’-dichloro-diphenyl-dichloroethane |
Nemkov et al., 2021 [47] | Supervised and unsupervised ML | Not specified | Blood donor exposome and drug impact on red blood cell metabolism | To evaluate the impact of the exposome that can alter erythrocyte energy and redox metabolism and the possibility of influencing red blood cell storage quality and efficacy post-transfusion | USA | 250 | Impact of drugs (65% of 1366 tested) on RBC metabolism; ranitidine as potential additive |
Buelow et al., 2020 [48] | Supervised ML | RF | Eco-exposome and water pollution | To use ML to evaluate the resistome, microbiota, and eco-exposome signatures of hospital wastewater, as compared to urban wastewater | N/A | N/A | Analysis demonstrated significant impact of pharmaceuticals and surfactants on resistome and microbiota of both hospital and municipal wastewater |
Loef et al., 2022 [49] | Supervised ML | RF | Exposome and perceived health | To study the relationship between exposure and perceived health based on data extracted from the 30-year Doetinchem cohort study | The Netherlands | 3419 | RF model’s ability to discriminate poor from good self-perceived health was acceptable (area under curve = 0.707). Nine exposures from different exposome-related domains were largely responsible for model’s performance, while 87 exposures seemed to contribute little to performance |
Johnson et al., 2023 [50] | Deep learning and supervised ML | CNN, RF, SVM, decision tree, Gaussian Naive Bayes, LR, XGBoost | Digital exposome and human wellbeing | To use “DigitalExposome” to better understand the relationship between environment and mental health along with perceived environmental responses | UK | 40 | Electrodermal activity and heart rate variability impacted by level of particulate matter in environment. Self-reported wellbeing classified from multimodal dataset with F1-score of 0.76 |
Patella et al., 2020 [51] | Unsupervised Artificial Neural Networks | Kohonen Self-Organizing Map | Atmospheric and climatic factors’ effects on signs and symptoms of atopic dermatitis | To use AI to understand the possible relationships between climate variables and atopic disorder likelihood | Italy | 60 | Good predictivity of disease severity based on environmental pollution data, lower predictivity for weather-related factors |
Jaiswal et al., 2021 [52] | Supervised machine learning and neural networks (plus chemoinformatics) | kNN, Recursive Partitioning, SVM, XGBoost, Perceptive Neural Network, Naive Bayes, Random Forest | Skin microbiome-mediated metabolism of biotics and xenobiotics | To test a tool to predict the metabolic reaction, enzymes, species, and skin sites of the skin microbiome potentially metabolizing biotic/xenobiotic molecules, through chemoinformatics, machine learning, and neural networks | India | N/A (1,094,153 metabolic enzymes) | Multiclass multilabel accuracy: 82.4%; binary accuracy: 90.0% |
Atehortúa et al., 2023 [53] | Supervised machine learning ensemble method | XGBoost | Relationship between exposome and cardiometabolic risk | To develop a model for cardiovascular disease (CVD) and type 2 diabetes (T2D) risk prediction based on exposome factors | UK | 13,764 (equally divided into cases and controls) | ROC AUC of 0.78 ± 0.01 and 0.77 ± 0.01 for CVD and T2D, respectively |
Dong et al., 2024 [54] | Machine learning combination of models | XGBoost, genetic algorithm and logistic regression models. Final multiple logistic regression model | Preconceptional exposome and atopic problems | To apply a machine learning approach to explore the role of the exposome in the preconception phase of atopic problems | Singapore | 1151 mother–child pairs | Pre-conception alcohol consumption and maternal depressive symptoms during pregnancy increase eczema and rhinitis risk. Higher maternal blood neopterin and child blood dimethylglycine protect against early childhood wheeze. After birth, early infection is key driver of atopy |
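Several of the studies summarized above report discrimination as an area under the ROC curve (for example, 0.707 in Loef et al. [49] and 0.78 in Atehortúa et al. [53]). As a reading aid, the following minimal Python sketch shows one common way to compute that quantity: the fraction of case/control pairs in which the case receives the higher predicted score. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; this is not the code used in any of the reviewed studies.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # case ranked above control
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (assumption): 1 = poor perceived health, 0 = good.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is why a value near 0.7 is described in the table as acceptable rather than strong discrimination.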
Model | Application(s) | Advantages | Drawbacks |
---|---|---|---|
Partial Least Squares Regression | Investigating relationships between continuous-like variables | Handling large numbers of variables, non-orthogonal descriptors, low risk of retrieving correlations by chance | Risk of overlooking real correlations, suboptimal sensitivity |
Random Forest | Regression and classification tasks | High accuracy, robustness to noise, handling missing values and numerical and categorical data, resistance to overfitting | Poor interpretability, significant computational effort |
Extreme Gradient Boosting | Regression and classification tasks | Good accuracy, computational speed, flexibility, robustness to overfitting | Complexity, lack of transparency, memory usage, not fully immune to overfitting |
Multiple Linear Regression | Predictions, explanations of relationships between variables, variable-importance ranking | Ability to determine relative influence of predictors of criterion value, capability of identifying outliers | Requires high-quality data |
LASSO | Regression tasks | Reduces overfitting, performs feature selection, fast to implement and run | Relatively poor stability, not particularly intuitive |
Boosted Trees | Regression and classification tasks | Excellent performance with high-quality data | Poor with noisy data, tendency to overfit |
Logistic Regression | Classification tasks | Easy implementation, interpretable, fast, no assumptions about data distribution | Tendency to overfit, linearity assumption (rarely met in real-world data) |
Gaussian Naïve Bayes | Symptom-based diagnosis | Performs well with normally distributed data | Unsuitable for non-Gaussian data |
Ridge Regression | Regression tasks | Robust to overfitting, performs well with large data featuring more observations than predictors, low complexity | Does not perform feature selection, trades variance for bias |
Support Vector Machine | Pattern recognition, reliability evaluation, bioinformatics, survival time estimation, assessment of disease severity | Effective in high-dimensional spaces, memory efficient, versatile | Unable to provide direct probability estimates, tendency to overfit |
Adaptive Boosting | Regression, clustering, data and text mining | Optimal with noisy data or with many non-relevant features | Need for high quality datasets |
k-Nearest Neighbors | Classification and regression tasks, e.g., pattern recognition, data mining | Simple to implement | Not optimal with large datasets or high-dimensional data, sensitive to noisy and missing data |
Linear Mixed-Effects | Describing how repeated measurements evolve over time | Prevents false positives, statistical power can be increased | Computational issues, limited interpretation |
Decision Trees | Regression and classification tasks | Interpretability, ability to handle unbalanced data, variable selection, handling missing values, non-parametric nature | Overfitting, sensitivity to small variations, biased learning |
Convolutional Neural Networks | Image segmentation, disease classification and grading | Robust to noise and distortion in input data, automatic feature extraction, no need for supervision | Time consumption, subjectivity |
Artificial Neural Networks | Prediction, data and image interpretation, data mining | Parallel operation, reliable with noisy data, easy to update with new data, good performance on complex problems | Limited output interpretability, computational burden |
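The contrast drawn in the table between LASSO (performs feature selection) and Ridge Regression (does not) can be made concrete with a small sketch. It assumes the textbook special case of an orthonormal design matrix, where both penalized solutions have a closed form in terms of the ordinary least squares coefficient: LASSO, minimizing ½‖y − Xβ‖² + λ‖β‖₁, applies soft-thresholding, while ridge, minimizing ½‖y − Xβ‖² + (λ/2)‖β‖², rescales every coefficient by 1/(1 + λ). The function names, the coefficient values, and the penalty λ = 0.5 are hypothetical and serve only as an illustration; real exposome data are correlated and would require an iterative solver.

```python
def lasso_shrink(beta_ols: float, lam: float) -> float:
    """LASSO coefficient under an orthonormal design: soft-thresholding.

    Coefficients with |beta_ols| <= lam are set exactly to zero, which is
    what makes LASSO act as a feature selector.
    """
    magnitude = max(abs(beta_ols) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0
    return magnitude if beta_ols > 0 else -magnitude


def ridge_shrink(beta_ols: float, lam: float) -> float:
    """Ridge coefficient under an orthonormal design: proportional shrinkage.

    Every coefficient is pulled toward zero but none becomes exactly zero.
    """
    return beta_ols / (1.0 + lam)


# Hypothetical OLS coefficients for four exposures (assumption, not real data).
ols = [2.0, -0.3, 0.1, -1.5]
lam = 0.5  # hypothetical penalty strength

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]
selected = [i for i, b in enumerate(lasso) if b != 0.0]

print("LASSO:", lasso)
print("Ridge:", [round(b, 3) for b in ridge])
print("Exposures kept by LASSO:", selected)
```

In this toy setting LASSO discards the two weakest exposures outright, whereas ridge keeps all four with reduced magnitude, which mirrors the "performs feature selection" and "does not perform feature selection" entries in the table.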
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Isola, S.; Murdaca, G.; Brunetto, S.; Zumbo, E.; Tonacci, A.; Gangemi, S. The Use of Artificial Intelligence to Analyze the Exposome in the Development of Chronic Diseases: A Review of the Current Literature. Informatics 2024, 11, 86. https://doi.org/10.3390/informatics11040086