Multi-Modal Fusion of Routine Care Electronic Health Records (EHR): A Scoping Review
Abstract
1. Introduction
- Early fusion (data fusion) first combines the modalities into a unified representation which is then used to train a single model.
- Late fusion (decision fusion) combines the outcomes of submodels trained independently for each modality.
- Hybrid fusion (intermediate fusion) is the combination of both early and late fusion through gradual intermediate unified representations.
- Encoding fusion (Figure 1a): The encodings of all the input modalities are combined and submitted to a single representation learning model. The latent representation generated by this layer is then used to train the decision layer.
- Decision fusion (Figure 1b): Independent encoding and representation layers are used for each modality. The latent representations of the modalities are then combined and processed with a single decision layer.
- Representation learning fusion (Figure 1c): Multiple latent representations are produced using subsets of the modalities. These latent representations are then combined and submitted to the decision layer.
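The three fusion types in Figure 1 differ mainly in where the modalities are combined. The following minimal sketch illustrates that difference under simplifying assumptions: the function names, the `mean_pool` stand-in for a representation learning model, and the `total` stand-in for a decision layer are all hypothetical and are not taken from any of the reviewed models.

```python
from typing import Callable, Dict, List

Vector = List[float]


def concat(vectors: List[Vector]) -> Vector:
    """Join several vectors end to end."""
    return [x for v in vectors for x in v]


def encoding_fusion(encodings: Dict[str, Vector],
                    represent: Callable[[Vector], Vector],
                    decide: Callable[[Vector], float]) -> float:
    """Figure 1a: combine all encodings, apply one representation model, then decide."""
    return decide(represent(concat(list(encodings.values()))))


def decision_fusion(encodings: Dict[str, Vector],
                    represent_by_modality: Dict[str, Callable[[Vector], Vector]],
                    decide: Callable[[Vector], float]) -> float:
    """Figure 1b: one representation model per modality, latents combined at the decision layer."""
    latents = [represent_by_modality[name](enc) for name, enc in encodings.items()]
    return decide(concat(latents))


def representation_learning_fusion(encodings: Dict[str, Vector],
                                   subsets: List[List[str]],
                                   represent_by_subset: List[Callable[[Vector], Vector]],
                                   decide: Callable[[Vector], float]) -> float:
    """Figure 1c: one latent per subset of modalities, latents combined before deciding."""
    latents = [
        represent(concat([encodings[name] for name in subset]))
        for subset, represent in zip(subsets, represent_by_subset)
    ]
    return decide(concat(latents))


def mean_pool(v: Vector) -> Vector:
    """Toy stand-in for a representation learning model (assumption)."""
    return [sum(v) / len(v)]


def total(v: Vector) -> float:
    """Toy stand-in for a decision layer (assumption)."""
    return sum(v)


encodings = {"codes": [1.0, 0.0, 1.0], "notes": [0.5, 0.5]}
a = encoding_fusion(encodings, mean_pool, total)
b = decision_fusion(encodings, {"codes": mean_pool, "notes": mean_pool}, total)
c = representation_learning_fusion(
    encodings, [["codes"], ["codes", "notes"]], [mean_pool, mean_pool], total
)
```

With the same inputs, the three calls produce different outputs only because the concatenation happens at a different stage, which is the distinction the taxonomy draws.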
2. Methods
2.1. Search Queries
- Clinical: EHR, medical, clinical, biomedical, phenotyping, disease, healthcare, “health record”;
- Technical: multimodal, “multi modal”, transformer, BERT, unstructured, embedding, deep, attention;
- General Exclusion: image, imaging, scan, segmentation, leaf.
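As an illustration only, the three term groups above could be combined into a single Boolean query string. The sketch below assumes that clinical and technical terms are each OR-ed, that the two groups are AND-ed, and that the exclusion terms are negated; the actual query syntax used in the review is not given in this section, and the helper names `or_group` and `build_query` are hypothetical.

```python
from typing import List

CLINICAL = ["EHR", "medical", "clinical", "biomedical", "phenotyping",
            "disease", "healthcare", "health record"]
TECHNICAL = ["multimodal", "multi modal", "transformer", "BERT",
             "unstructured", "embedding", "deep", "attention"]
EXCLUDED = ["image", "imaging", "scan", "segmentation", "leaf"]


def or_group(terms: List[str]) -> str:
    """OR the terms together, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(clinical: List[str], technical: List[str], excluded: List[str]) -> str:
    """Assumed combination: clinical AND technical AND NOT excluded."""
    return f"{or_group(clinical)} AND {or_group(technical)} AND NOT {or_group(excluded)}"


query = build_query(CLINICAL, TECHNICAL, EXCLUDED)
```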
2.2. Study Selection
3. Routine Care EHR Data
3.1. Demographics
3.2. Disease Conditions
3.3. Medications
3.4. Clinical Notes
3.5. Vital Signs and Laboratory Results
4. Temporal and Semantic Information
4.1. Temporal Information
4.2. Semantic and Concept Information
5. Multi-Modal Fusion
5.1. Encoding Fusion
5.1.1. Examples
- ICD codes (1131);
- Procedure codes in the Current Procedural Terminology (CPT) format (7048);
- Medications (4181);
- Demographic information (Age, Sex);
- Clinical notes represented as 100 topics generated from a topic model using Latent Dirichlet Allocation (LDA).
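To make the size of such a fused input concrete, the sketch below concatenates the modalities listed above into one vector. It assumes that the numbers in parentheses are the counts of distinct codes, that each code modality is multi-hot encoded, and that age and sex are each encoded as a single number; none of these encoding choices is confirmed by the list itself, and the function names are hypothetical.

```python
from typing import Iterable, List, Sequence

# Sizes taken from the list above; treating them as vocabulary sizes is an assumption.
ICD_DIM, CPT_DIM, MED_DIM, DEMO_DIM, TOPIC_DIM = 1131, 7048, 4181, 2, 100


def multi_hot(indices: Iterable[int], size: int) -> List[float]:
    """Return a vector of the given size with 1.0 at each listed index."""
    vec = [0.0] * size
    for i in indices:
        vec[i] = 1.0
    return vec


def fuse_encodings(icd_idx: Iterable[int], cpt_idx: Iterable[int], med_idx: Iterable[int],
                   age: float, sex: int, topic_weights: Sequence[float]) -> List[float]:
    """Concatenate all modality encodings into a single input vector."""
    if len(topic_weights) != TOPIC_DIM:
        raise ValueError(f"expected {TOPIC_DIM} topic weights")
    return (multi_hot(icd_idx, ICD_DIM)
            + multi_hot(cpt_idx, CPT_DIM)
            + multi_hot(med_idx, MED_DIM)
            + [float(age), float(sex)]
            + [float(w) for w in topic_weights])


# Hypothetical patient: code indices and topic weights are made up for illustration.
patient = fuse_encodings(icd_idx=[5, 42], cpt_idx=[0], med_idx=[10],
                         age=67, sex=1, topic_weights=[0.01] * TOPIC_DIM)
input_dim = len(patient)  # 1131 + 7048 + 4181 + 2 + 100 = 12462
```

Under these assumptions the code modalities account for 12,360 of the 12,462 input dimensions, while demographics contribute only 2.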
5.1.2. Strengths
5.1.3. Limitations
5.2. Decision Fusion
5.2.1. Examples
5.2.2. Strengths
- First, best-fit encodings and latent representations are learned for each modality independently of the other modalities. For example, one modality can be modeled using a fine-tuned pretrained LM with a limited number of training epochs (i.e., few-shot learning), whereas another modality can use an ML model over tabular data, allowing for a large number of training epochs [100].
- Second, decision fusion is more resilient to incomplete data as the latent representations of the modalities are asynchronous [101].
- Third, modalities are weighted in the decision layer irrespective of the dimensions of their latent representations, thereby preventing high-dimensional modalities from overshadowing low-dimensional ones [74].
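The second and third strengths can be illustrated with a small sketch. It is a deliberate simplification and not the method of any reviewed model: each latent representation is reduced to one score by a hypothetical `modality_score` function (a plain mean here), and the decision layer is a weighted average renormalized over the modalities that are present.

```python
from typing import Dict, List, Optional


def modality_score(latent: List[float]) -> float:
    """Toy per-modality head (assumption): reduce a latent vector to one score."""
    return sum(latent) / len(latent)


def fuse(latents: Dict[str, Optional[List[float]]], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality scores, skipping missing modalities."""
    available = {name: v for name, v in latents.items() if v is not None}
    if not available:
        raise ValueError("at least one modality must be present")
    total_weight = sum(weights[name] for name in available)
    weighted = sum(weights[name] * modality_score(v) for name, v in available.items())
    return weighted / total_weight


latents = {
    "notes": [0.2] * 768,   # high-dimensional latent
    "labs": [0.8, 0.6],     # low-dimensional latent
    "codes": None,          # modality missing for this patient
}
weights = {"notes": 1.0, "labs": 1.0, "codes": 1.0}
risk = fuse(latents, weights)
```

In this sketch the 768-dimensional and 2-dimensional latents contribute equally because each is reduced to one score before weighting, and the missing modality is dropped without affecting the others.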
5.2.3. Limitations
5.3. Representation Learning Fusion
5.3.1. Examples
5.3.2. Strengths
5.3.3. Limitations
6. Discussion
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
ATC | Anatomical Therapeutic Chemical |
BERT | Bidirectional Encoder Representations from Transformers |
DC | Document Classification |
GPI | Generic Product Identifier |
ICD | International Classification of Diseases |
LM | Language Model |
LOS | Length of Stay |
NER | Named Entity Recognition |
QA | Question Answering |
RE | Relation Extraction |
RL | Representation Learning |
References
- Al-Aiad, A.; Duwairi, R.; Fraihat, M. Survey: Deep learning concepts and techniques for electronic health record. In Proceedings of the 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA), Aqaba, Jordan, 28 October–1 November 2018; pp. 1–5. [Google Scholar]
- Seinen, T.M.; Fridgeirsson, E.A.; Ioannou, S.; Jeannetot, D.; John, L.H.; Kors, J.A.; Markus, A.F.; Pera, V.; Rekkas, A.; Williams, R.D.; et al. Use of unstructured text in prognostic clinical prediction models: A systematic review. J. Am. Med. Inform. Assoc. 2022, 29, 1292–1302. [Google Scholar] [CrossRef] [PubMed]
- Poongodi, T.; Sumathi, D.; Suresh, P.; Balusamy, B. Deep learning techniques for electronic health record (EHR) analysis. In Bio-Inspired Neurocomputing; Springer: Singapore, 2021; pp. 73–103. [Google Scholar]
- Eloranta, S.; Boman, M. Predictive models for clinical decision making: Deep dives in practical machine learning. J. Intern. Med. 2022, 292, 278–295. [Google Scholar] [CrossRef] [PubMed]
- Egger, J.; Gsaxner, C.; Pepe, A.; Pomykala, K.L.; Jonske, F.; Kurz, M.; Li, J.; Kleesiek, J. Medical deep learning—A systematic meta-review. Comput. Methods Programs Biomed. 2022, 221, 106874. [Google Scholar] [CrossRef] [PubMed]
- Behrad, F.; Abadeh, M.S. An overview of deep learning methods for multimodal medical data mining. Expert Syst. Appl. 2022, 200, 117006. [Google Scholar] [CrossRef]
- Si, Y.; Du, J.; Li, Z.; Jiang, X.; Miller, T.; Wang, F.; Zheng, W.J.; Roberts, K. Deep representation learning of patient data from Electronic Health Records (EHR): A systematic review. J. Biomed. Inform. 2021, 115, 103671. [Google Scholar] [CrossRef]
- Peng, X.; Long, G.; Pan, S.; Jiang, J.; Niu, Z. Attentive dual embedding for understanding medical concepts in electronic health records. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8. [Google Scholar]
- Li, I.; Pan, J.; Goldwasser, J.; Verma, N.; Wong, W.P.; Nuzumlalı, M.Y.; Rosand, B.; Li, Y.; Zhang, M.; Chang, D.; et al. Neural natural language processing for unstructured data in electronic health records: A review. Comput. Sci. Rev. 2022, 46, 100511. [Google Scholar] [CrossRef]
- Wornow, M.; Xu, Y.; Thapa, R.; Patel, B.; Steinberg, E.; Fleming, S.; Pfeffer, M.A.; Fries, J.; Shah, N.H. The shaky foundations of large language models and foundation models for electronic health records. NPJ Digit. Med. 2023, 6, 135. [Google Scholar] [CrossRef]
- Kalyan, K.S.; Rajasekharan, A.; Sangeetha, S. AMMU: A survey of transformer-based biomedical pretrained language models. J. Biomed. Inform. 2022, 126, 103982. [Google Scholar] [CrossRef] [PubMed]
- Stahlschmidt, S.R.; Ulfenborg, B.; Synnergren, J. Multimodal deep learning for biomedical data fusion: A review. Briefings Bioinform. 2022, 23, bbab569. [Google Scholar] [CrossRef] [PubMed]
- Liu, Z.; Zhang, J.; Hou, Y.; Zhang, X.; Li, G.; Xiang, Y. Machine learning for multimodal electronic health records-based research: Challenges and perspectives. In China Health Information Processing Conference; Springer: Singapore, 2022; pp. 135–155. [Google Scholar]
- Xu, P.; Zhu, X.; Clifton, D.A. Multimodal learning with transformers: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 12113–12132. [Google Scholar] [CrossRef]
- Halevi, G.; Moed, H.; Bar-Ilan, J. Suitability of Google Scholar as a source of scientific information and as a source of data for scientific evaluation—Review of the literature. J. Inf. 2017, 11, 823–834. [Google Scholar] [CrossRef]
- Martín-Martín, A.; Orduna-Malea, E.; Thelwall, M.; López-Cózar, E.D. Google Scholar, Web of Science, and Scopus: A systematic comparison of citations in 252 subject categories. J. Inf. 2018, 12, 1160–1177. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Caroprese, L.; Veltri, P.; Vocaturo, E.; Zumpano, E. Deep learning techniques for electronic health record analysis. In Proceedings of the 2018 9th International Conference on Information, Intelligence, Systems and Applications (IISA), Zakynthos, Greece, 23–25 July 2018; pp. 1–4. [Google Scholar]
- Shamshirband, S.; Fathi, M.; Dehzangi, A.; Chronopoulos, A.T.; Alinejad-Rokny, H. A review on deep learning approaches in healthcare systems: Taxonomies, challenges, and open issues. J. Biomed. Inform. 2021, 113, 103627. [Google Scholar] [CrossRef] [PubMed]
- Amal, S.; Safarnejad, L.; Omiye, J.A.; Ghanzouri, I.; Cabot, J.H.; Ross, E.G. Use of multi-modal data and machine learning to improve cardiovascular disease care. Front. Cardiovasc. Med. 2022, 9, 840262. [Google Scholar] [CrossRef] [PubMed]
- Kline, A.; Wang, H.; Li, Y.; Dennis, S.; Hutch, M.; Xu, Z.; Wang, F.; Cheng, F.; Luo, Y. Multimodal machine learning in precision health: A scoping review. NPJ Digit. Med. 2022, 5, 171. [Google Scholar] [CrossRef] [PubMed]
- Baltrušaitis, T.; Ahuja, C.; Morency, L.P. Multimodal machine learning: A survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 423–443. [Google Scholar] [CrossRef]
- Amirahmadi, A.; Ohlsson, M.; Etminani, K. Deep learning prediction models based on EHR trajectories: A systematic review. J. Biomed. Inform. 2023, 144, 104430. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Han, X.; Qin, Y.; Tan, F.; Chen, Y.; Wang, Z.; Song, H.; Zhou, X.; Zhang, Y.; Hu, L.; et al. Artificial intelligence accelerates multi-modal biomedical process: A Survey. Neurocomputing 2023, 558, 126720. [Google Scholar] [CrossRef]
- Miotto, R.; Wang, F.; Wang, S.; Jiang, X.; Dudley, J.T. Deep learning for healthcare: Review, opportunities and challenges. Briefings Bioinform. 2018, 19, 1236–1246. [Google Scholar] [CrossRef] [PubMed]
- Centers for Medicare & Medicaid Services. ICD Code Lists. 2023. Available online: https://www.cms.gov/medicare/coordination-benefits-recovery-overview/icd-code-lists (accessed on 6 January 2025).
- Wolters Kluwer. Medi-Span Generic Product Identifier (GPI). 2023. Available online: https://www.wolterskluwer.com/en/solutions/medi-span/about/gpi (accessed on 6 January 2025).
- World Health Organization. Anatomical Therapeutic Chemical (ATC) Classification. 2023. Available online: https://www.who.int/tools/atc-ddd-toolkit/atc-classification (accessed on 6 January 2025).
- Charlson, M.E.; Pompei, P.; Ales, K.L.; MacKenzie, C.R. A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. J. Chronic Dis. 1987, 40, 373–383. [Google Scholar] [CrossRef]
- Elixhauser, A.; Steiner, C.; Harris, D.R.; Coffey, R.M. Comorbidity measures for use with administrative data. Med. Care 1998, 36, 8–27. [Google Scholar] [CrossRef] [PubMed]
- Bodenreider, O. The unified medical language system (UMLS): Integrating biomedical terminology. Nucleic Acids Res. 2004, 32, D267–D270. [Google Scholar] [CrossRef] [PubMed]
- Xiao, C.; Ma, T.; Dieng, A.B.; Blei, D.M.; Wang, F. Readmission prediction via deep contextual embedding of clinical concepts. PLoS ONE 2018, 13, e0195024. [Google Scholar] [CrossRef] [PubMed]
- Peng, X.; Long, G.; Shen, T.; Wang, S.; Jiang, J.; Blumenstein, M. Temporal self-attention network for medical concept embedding. In Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 8–11 November 2019; pp. 498–507. [Google Scholar]
- Rasmy, L.; Xiang, Y.; Xie, Z.; Tao, C.; Zhi, D. Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit. Med. 2021, 4, 86. [Google Scholar] [CrossRef]
- Finch, A.; Crowell, A.; Bhatia, M.; Parameshwarappa, P.; Chang, Y.C.; Martinez, J.; Horberg, M. Exploiting hierarchy in medical concept embedding. JAMIA Open 2021, 4, ooab022. [Google Scholar] [CrossRef] [PubMed]
- Ye, M.; Cui, S.; Wang, Y.; Luo, J.; Xiao, C.; Ma, F. Medretriever: Target-driven interpretable health risk prediction via retrieving unstructured medical text. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Virtual Event, 1–5 November 2021; pp. 2414–2423. [Google Scholar]
- Prakash, P.; Chilukuri, S.; Ranade, N.; Viswanathan, S. RareBERT: Transformer architecture for rare disease patient identification using administrative claims. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 2–9 February 2021; Volume 35, pp. 453–460. [Google Scholar]
- Xie, X.; Xiong, Y.; Yu, P.S.; Zhu, Y. EHR coding with multi-scale feature attention and structured knowledge graph propagation. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 649–658. [Google Scholar]
- Alsentzer, E.; Murphy, J.R.; Boag, W.; Weng, W.H.; Jin, D.; Naumann, T.; McDermott, M. Publicly available clinical BERT embeddings. arXiv 2019, arXiv:1904.03323. [Google Scholar]
- Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C.H.; Kang, J. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020, 36, 1234–1240. [Google Scholar] [CrossRef]
- Wang, L.; Wang, Q.; Bai, H.; Liu, C.; Liu, W.; Zhang, Y.; Jiang, L.; Xu, H.; Wang, K.; Zhou, Y. EHR2Vec: Representation learning of medical concepts from temporal patterns of clinical notes based on self-attention mechanism. Front. Genet. 2020, 11, 630. [Google Scholar] [CrossRef]
- Ji, S.; Cambria, E.; Marttinen, P. Dilated convolutional attention network for medical code assignment from clinical text. arXiv 2020, arXiv:2009.14578. [Google Scholar]
- Vu, T.; Nguyen, D.Q.; Nguyen, A. A label attention model for ICD coding from clinical text. arXiv 2020, arXiv:2007.06351. [Google Scholar]
- Si, Y.; Roberts, K. Patient representation transfer learning from clinical notes based on hierarchical attention network. AMIA Summits Transl. Sci. Proc. 2020, 2020, 597. [Google Scholar] [PubMed]
- Liu, N.; Hu, Q.; Xu, H.; Xu, X.; Chen, M. Med-BERT: A pretraining framework for medical records named entity recognition. IEEE Trans. Ind. Inform. 2021, 18, 5600–5608. [Google Scholar] [CrossRef]
- Zhang, N.; Jankowski, M. Hierarchical BERT for medical document understanding. arXiv 2022, arXiv:2204.09600. [Google Scholar]
- Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Costa, A.B.; Flores, M.G.; et al. A large language model for electronic health records. NPJ Digit. Med. 2022, 5, 194. [Google Scholar] [CrossRef] [PubMed]
- Fang, L.; Chen, Q.; Wei, C.H.; Lu, Z.; Wang, K. Bioformer: An efficient transformer language model for biomedical text mining. arXiv 2023, arXiv:2302.01588v1. [Google Scholar]
- Mao, C.; Xu, J.; Rasmussen, L.; Li, Y.; Adekkanattu, P.; Pacheco, J.; Bonakdarpour, B.; Vassar, R.; Shen, L.; Jiang, G.; et al. AD-BERT: Using pre-trained language model to predict the progression from mild cognitive impairment to Alzheimer’s disease. J. Biomed. Inform. 2023, 144, 104442. [Google Scholar] [CrossRef]
- Nguyen, P.; Tran, T.; Wickramasinghe, N.; Venkatesh, S. Deepr: A convolutional net for medical records. IEEE J. Biomed. Health Inform. 2016, 21, 22–30. [Google Scholar] [CrossRef] [PubMed]
- Song, H.; Rajan, D.; Thiagarajan, J.; Spanias, A. Attend and diagnose: Clinical time series analysis using attention models. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
- Ma, T.; Xiao, C.; Wang, F. Health-atm: A deep architecture for multifaceted patient health record representation and risk prediction. In Proceedings of the 2018 SIAM International Conference on Data Mining, SIAM, San Diego, CA, USA, 3–5 May 2018; pp. 261–269. [Google Scholar]
- Cheung, B.L.P.; Dahl, D. Deep learning from electronic medical records using attention-based cross-modal convolutional neural networks. In Proceedings of the 2018 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Las Vegas, NV, USA, 4–7 March 2018; pp. 222–225. [Google Scholar]
- Chen, D.; Qian, G.; Pan, Q. Breast cancer classification with electronic medical records using hierarchical attention bidirectional networks. In Proceedings of the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Madrid, Spain, 3–6 December 2018; pp. 983–988. [Google Scholar]
- Zhang, J.; Kowsari, K.; Harrison, J.H.; Lobo, J.M.; Barnes, L.E. Patient2vec: A personalized interpretable deep representation of the longitudinal electronic health record. IEEE Access 2018, 6, 65333–65346. [Google Scholar] [CrossRef]
- Zeng, X.; Feng, Y.; Moosavinasab, S.; Lin, D.; Lin, S.; Liu, C. Multilevel self-attention model and its use on medical risk prediction. In Proceedings of the Pacific Symposium On Biocomputing 2020, World Scientific, Kohala Coast, HI, USA, 3–7 January 2020; pp. 115–126. [Google Scholar]
- Zhang, Y. ATTAIN: Attention-based time-aware LSTM networks for disease progression modeling. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI-2019), Macao, China, 10–16 August 2019; pp. 4369–4375. [Google Scholar]
- Mugisha, C.; Paik, I. Pneumonia outcome prediction using structured and unstructured data from EHR. In Proceedings of the 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Virtual Event, 16–19 December 2020; pp. 2640–2646. [Google Scholar]
- Bagheri, A.; Groenhof, T.K.J.; Veldhuis, W.B.; de Jong, P.A.; Asselbergs, F.W.; Oberski, D.L. Multimodal learning for cardiovascular risk prediction using EHR data. arXiv 2020, arXiv:2008.11979. [Google Scholar]
- Meng, Y.; Speier, W.; Ong, M.; Arnold, C.W. HCET: Hierarchical clinical embedding with topic modeling on electronic health records for predicting future depression. IEEE J. Biomed. Health Inform. 2020, 25, 1265–1272. [Google Scholar] [CrossRef]
- Li, Y.; Rao, S.; Solares, J.R.A.; Hassaine, A.; Ramakrishnan, R.; Canoy, D.; Zhu, Y.; Rahimi, K.; Salimi-Khorshidi, G. BEHRT: Transformer for electronic health records. Sci. Rep. 2020, 10, 7155. [Google Scholar] [CrossRef] [PubMed]
- Cao, Y.; Peng, H.; Yu, P.S. Multi-information source HIN for medical concept embedding. In Proceedings of the Advances in Knowledge Discovery and Data Mining: 24th Pacific-Asia Conference, PAKDD 2020, Virtual Event, 11–14 May 2020; pp. 396–408. [Google Scholar]
- Hashir, M.; Sawhney, R. Towards unstructured mortality prediction with free-text clinical notes. J. Biomed. Inform. 2020, 108, 103489. [Google Scholar] [CrossRef]
- Qiao, Z.; Zhang, Z.; Wu, X.; Ge, S.; Fan, W. Mhm: Multi-modal clinical data based hierarchical multi-label diagnosis prediction. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 1841–1844. [Google Scholar]
- Meng, Y.; Speier, W.; Ong, M.K.; Arnold, C.W. Bidirectional representation learning from transformers using multimodal electronic health record data to predict depression. IEEE J. Biomed. Health Inform. 2021, 25, 3121–3129. [Google Scholar] [CrossRef] [PubMed]
- Pang, C.; Jiang, X.; Kalluri, K.S.; Spotnitz, M.; Chen, R.; Perotte, A.; Natarajan, K. CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks. In Proceedings of the Machine Learning for Health, PMLR, Virtual Event, 4 December 2021; pp. 239–260. [Google Scholar]
- Xu, Z.; So, D.R.; Dai, A.M. Mufasa: Multimodal fusion architecture search for electronic health records. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event, 2–9 February 2021; Volume 35, pp. 10532–10540. [Google Scholar]
- Yang, B.; Wu, L. How to leverage multimodal EHR data for better medical predictions? arXiv 2021, arXiv:2110.15763. [Google Scholar]
- Chen, Y.P.; Lo, Y.H.; Lai, F.; Huang, C.H. Disease concept-embedding based on the self-supervised method for medical information extraction from electronic health records and disease retrieval: Algorithm development and validation study. J. Med. Internet Res. 2021, 23, e25113. [Google Scholar] [CrossRef] [PubMed]
- Ferri, P.; Sáez, C.; Félix-De Castro, A.; Juan-Albarracín, J.; Blanes-Selva, V.; Sánchez-Cuesta, P.; García-Gómez, J.M. Deep ensemble multitask classification of emergency medical call incidents combining multimodal data improves emergency medical dispatch. Artif. Intell. Med. 2021, 117, 102088. [Google Scholar] [CrossRef] [PubMed]
- Xie, J.; Zhang, B.; Ma, J.; Zeng, D.; Lo-Ciganic, J. Readmission prediction for patients with heterogeneous medical history: A trajectory-based deep learning approach. ACM Trans. Manag. Inf. Syst. (TMIS) 2021, 13, 1–27. [Google Scholar] [CrossRef]
- Niu, S.; Yin, Q.; Song, Y.; Guo, Y.; Yang, X. Label dependent attention model for disease risk prediction using multimodal electronic health records. In Proceedings of the 2021 IEEE International Conference on Data Mining (ICDM), Virtual Event, 7–10 December 2021; pp. 449–458. [Google Scholar]
- Ahuja, Y.; Zou, Y.; Verma, A.; Buckeridge, D.; Li, Y. MixEHR-Guided: A guided multi-modal topic modeling approach for large-scale automatic phenotyping using the electronic health record. J. Biomed. Inform. 2022, 134, 104190. [Google Scholar] [CrossRef]
- Soenksen, L.R.; Ma, Y.; Zeng, C.; Boussioux, L.; Villalobos Carballo, K.; Na, L.; Wiberg, H.M.; Li, M.L.; Fuentes, I.; Bertsimas, D. Integrated multimodal artificial intelligence framework for healthcare applications. NPJ Digit. Med. 2022, 5, 149. [Google Scholar] [CrossRef]
- Li, Y.; Mamouei, M.; Salimi-Khorshidi, G.; Rao, S.; Hassaine, A.; Canoy, D.; Lukasiewicz, T.; Rahimi, K. Hi-BEHRT: Hierarchical Transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records. IEEE J. Biomed. Health Inform. 2022, 27, 1106–1117. [Google Scholar] [CrossRef]
- Lyu, W.; Dong, X.; Wong, R.; Zheng, S.; Abell-Hart, K.; Wang, F.; Chen, C. A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction. In Proceedings of the AMIA Annual Symposium Proceedings. American Medical Informatics Association, Washington, DC, USA, 5–9 November 2022; Volume 2022, p. 719. [Google Scholar]
- Liu, S.; Wang, X.; Hou, Y.; Li, G.; Wang, H.; Xu, H.; Xiang, Y.; Tang, B. Multimodal data matters: Language model pre-training over structured and unstructured electronic health records. IEEE J. Biomed. Health Inform. 2022, 27, 504–514. [Google Scholar] [CrossRef] [PubMed]
- Li, R.; Gao, J. Multi-modal contrastive learning for healthcare data analytics. In Proceedings of the 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI), Rochester, MN, USA, 11–14 June 2022; pp. 120–127. [Google Scholar]
- Miranda, O.; Fan, P.; Qi, X.; Yu, Z.; Ying, J.; Wang, H.; Brent, D.A.; Silverstein, J.C.; Chen, Y.; Wang, L. DeepBiomarker: Identifying important lab tests from electronic medical records for the prediction of suicide-related events among PTSD patients. J. Pers. Med. 2022, 12, 524. [Google Scholar] [CrossRef]
- He, Y.; Wang, C.; Zhang, S.; Li, N.; Li, Z.; Zeng, Z. KG-MTT-BERT: Knowledge graph enhanced BERT for multi-type medical text classification. arXiv 2022, arXiv:2210.03970. [Google Scholar]
- Haudenschild, C.; Vaickus, L.; Levy, J. Configuring a federated network of real-world patient health data for multimodal deep learning prediction of health outcomes. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, Virtual Event, 25–29 April 2022; pp. 627–635. [Google Scholar]
- Lentzen, M.; Linden, T.; Veeranki, S.; Madan, S.; Kramer, D.; Leodolter, W.; Fröhlich, H. A transformer-based model trained on large scale claims data for prediction of severe COVID-19 disease progression. IEEE J. Biomed. Health Inform. 2023, 27, 4548–4558. [Google Scholar] [CrossRef] [PubMed]
- Yang, Z.; Mitra, A.; Liu, W.; Berlowitz, D.; Yu, H. TransformEHR: Transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records. Nat. Commun. 2023, 14, 7857. [Google Scholar] [CrossRef] [PubMed]
- Mahbub, M.; Srinivasan, S.; Danciu, I.; Peluso, A.; Begoli, E.; Tamang, S.; Peterson, G.D. Unstructured clinical notes within the 24 hours since admission predict short, mid & long-term mortality in adult ICU patients. PLoS ONE 2022, 17, e0262182. [Google Scholar]
- Gupta, M.; Phan, T.L.T.; Bunnell, H.T.; Beheshti, R. Obesity Prediction with EHR Data: A deep learning approach with interpretable elements. ACM Trans. Comput. Healthc. (HEALTH) 2022, 3, 1–19. [Google Scholar] [CrossRef] [PubMed]
- Ren, H.; Wang, J.; Zhao, W.X.; Wu, N. Rapt: Pre-training of time-aware transformer for learning robust healthcare representation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual Event, 14–18 August 2021; pp. 3503–3511. [Google Scholar]
- Gangavarapu, T.; Krishnan, G.S.; Kamath, S.; Jeganathan, J. FarSight: Long-term disease prediction using unstructured clinical nursing notes. IEEE Trans. Emerg. Top. Comput. 2020, 9, 1151–1169. [Google Scholar] [CrossRef]
- Ive, J.; Viani, N.; Kam, J.; Yin, L.; Verma, S.; Puntis, S.; Cardinal, R.N.; Roberts, A.; Stewart, R.; Velupillai, S. Generation and evaluation of artificial mental health records for natural language processing. NPJ Digit. Med. 2020, 3, 69. [Google Scholar] [CrossRef] [PubMed]
- Bayramli, I.; Castro, V.; Barak-Corren, Y.; Madsen, E.M.; Nock, M.K.; Smoller, J.W.; Reis, B.Y. Predictive structured–unstructured interactions in EHR models: A case study of suicide prediction. NPJ Digit. Med. 2022, 5, 15. [Google Scholar] [CrossRef]
- Houssein, E.H.; Mohamed, R.E.; Ali, A.A. Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques. Sci. Rep. 2023, 13, 7173. [Google Scholar] [CrossRef] [PubMed]
- Lamproudis, A.; Henriksson, A.; Dalianis, H. Evaluating pretraining strategies for clinical BERT models. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2022; pp. 410–416. [Google Scholar]
- El Boukkouri, H.; Ferret, O.; Lavergne, T.; Zweigenbaum, P. Re-train or train from scratch? Comparing pre-training strategies of BERT in the medical domain. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2022; pp. 2626–2633. [Google Scholar]
- Alrowili, S.; Vijay-Shanker, K. BioM-transformers: Building large biomedical language models with BERT, ALBERT and ELECTRA. In Proceedings of the 20th Workshop on Biomedical Language Processing, Virtual Event, 11 June 2021; pp. 221–227. [Google Scholar]
- Pawar, Y.; Henriksson, A.; Hedberg, P.; Naucler, P. Leveraging clinical bert in multimodal mortality prediction models for COVID-19. In Proceedings of the 2022 IEEE 35th International Symposium on Computer-Based Medical Systems (CBMS), Virtual Event, 21–23 July 2022; pp. 199–204. [Google Scholar]
- Xu, Y.; Ying, H.; Qian, S.; Zhuang, F.; Zhang, X.; Wang, D.; Wu, J.; Xiong, H. Time-aware context-gated graph attention network for clinical risk prediction. IEEE Trans. Knowl. Data Eng. 2022, 35, 7557–7568. [Google Scholar] [CrossRef]
- Tipirneni, S.; Reddy, C.K. Self-supervised transformer for sparse and irregularly sampled multivariate clinical time-series. ACM Trans. Knowl. Discov. Data (TKDD) 2022, 16, 1–17. [Google Scholar] [CrossRef]
- Ji, S.; Hölttä, M.; Marttinen, P. Does the magic of BERT apply to medical code assignment? A quantitative study. Comput. Biol. Med. 2021, 139, 104998. [Google Scholar] [CrossRef]
- Keles, F.D.; Wijewardena, P.M.; Hegde, C. On the computational complexity of self-attention. In Proceedings of the International Conference on Algorithmic Learning Theory, PMLR, Singapore, 20–23 February 2023; pp. 597–619. [Google Scholar]
- Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The long-document transformer. arXiv 2020, arXiv:2004.05150. [Google Scholar]
- Shukla, S.N.; Marlin, B.M. Integrating physiological time series and clinical notes with deep learning for improved ICU mortality prediction. arXiv 2020, arXiv:2003.11059. [Google Scholar]
- Ljubic, B.; Roychoudhury, S.; Cao, X.H.; Pavlovski, M.; Obradovic, S.; Nair, R.; Glass, L.; Obradovic, Z. Influence of medical domain knowledge on deep learning for Alzheimer’s disease prediction. Comput. Methods Programs Biomed. 2020, 197, 105765. [Google Scholar] [CrossRef]
- Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Adv. Neural Inf. Process. Syst. 2022, 35, 507–520. [Google Scholar]
- Shwartz-Ziv, R.; Armon, A. Tabular data: Deep learning is not all you need. Inf. Fusion 2022, 81, 84–90. [Google Scholar] [CrossRef]
- Lee, Y.; Jun, E.; Choi, J.; Suk, H.I. Multi-view integrative attention-based deep representation learning for irregular clinical time-series data. IEEE J. Biomed. Health Inform. 2022, 26, 4270–4280. [Google Scholar] [CrossRef] [PubMed]
- Ho, J.; Kalchbrenner, N.; Weissenborn, D.; Salimans, T. Axial attention in multidimensional transformers. arXiv 2019, arXiv:1912.12180. [Google Scholar]
- Somepalli, G.; Goldblum, M.; Schwarzschild, A.; Bruss, C.B.; Goldstein, T. Saint: Improved neural networks for tabular data via row attention and contrastive pre-training. arXiv 2021, arXiv:2106.01342. [Google Scholar]
- He, Y.; Zhu, Z.; Zhang, Y.; Chen, Q.; Caverlee, J. Infusing disease knowledge into BERT for health question answering, medical inference and disease name recognition. arXiv 2020, arXiv:2010.03746. [Google Scholar]
- Zheng, S.; Zhu, Z.; Liu, Z.; Guo, Z.; Liu, Y.; Yang, Y.; Zhao, Y. Multi-modal graph learning for disease prediction. IEEE Trans. Med. Imaging 2022, 41, 2207–2216. [Google Scholar] [CrossRef] [PubMed]
- Jiang, X.; Xu, C. Deep learning and machine learning with grid search to predict later occurrence of breast Cancer metastasis using clinical data. J. Clin. Med. 2022, 11, 5772. [Google Scholar] [CrossRef]
Subject | References |
---|---|
Reviews | [1,2,3,4,5,6,7,9,10,11,12,13,14,18,19,20,21,22,23,24,25] |
Taxonomy | [26,27,28,29,30,31] |
Uni-modal Models | [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49] |
Multi-modal Models | [50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83] |
Model | Ref. | Year | Training Dataset | Evaluation Task |
---|---|---|---|---|
CONTENT | [32] | 2018 | CHF | Readmission |
TeSAN | [33] | 2019 | MIMIC-III, CMS | Phenotyping |
Med-BERT1 | [34] | 2021 | CHF, Truven Health MarketScan | Phenotyping |
SG-Co | [35] | 2021 | KPMAS | Phenotyping, Mortality, Readmission |
Medretriever | [36] | 2021 | Real-world health insurance claim data | Phenotyping |
RareBERT | [37] | 2021 | Symphony Health’s IDV | Phenotyping |
Model | Ref. | Year | Training Dataset | Evaluation Task |
---|---|---|---|---|
MSATT-KG | [38] | 2019 | MIMIC-III | ICD coding |
Clinical BERT | [39] | 2019 | MIMIC-III | NER, Inferencing |
BioBERT | [40] | 2020 | PubMed | NER, RE, QA |
EHR2Vec | [41] | 2020 | Private EHR | Phenotyping |
DCAN | [42] | 2020 | MIMIC-III | ICD coding |
LAAT | [43] | 2020 | MIMIC-III | ICD coding |
HAN | [44] | 2020 | MIMIC-III | Mortality |
Med-BERT2 | [45] | 2021 | CMeEE, CMR | NER |
MDBERT | [46] | 2022 | MIMIC-III | ICD coding |
GatorTron | [47] | 2022 | Private EHR, PubMed, Wikipedia | NER, RE, QA, Inferencing |
Bioformer | [48] | 2023 | PubMed | NER, RE, QA, DC |
AD-BERT | [49] | 2023 | Private EHR | Phenotyping |
Model | Ref. | Year | Fusion | Modalities | Dataset | Evaluation Task |
---|---|---|---|---|---|---|
Deepr | [50] | 2016 | Encoding | Structured, Clinical Notes | Private EHR | Mortality, LOS, Phenotyping, Readmission |
SAnD | [51] | 2018 | Encoding | Time Series | MIMIC-III | Mortality, LOS, Phenotyping |
Health-atm | [52] | 2018 | Encoding | Structured | Private EHR, EMRbots | Phenotyping |
AXCNN | [53] | 2018 | RL | Structured, Time Series | Private EHR | Readmission |
HA-BiRNN | [54] | 2018 | RL | Structured, Diagnosis reports | Private EHR | Phenotyping |
Patient2vec | [55] | 2018 | Encoding | Structured | Private EHR | Readmission |
MSAM | [56] | 2019 | RL | Structured | MIMIC-III, Private EHR | Phenotyping |
ATTAIN | [57] | 2019 | Encoding | Structured, Time Series | Private EHR | Phenotyping |
HCET | [60] | 2020 | RL | Structured, Clinical Notes | Private EHR | Phenotyping |
BEHRT | [61] | 2020 | Encoding | Structured | CPRD | Phenotyping |
HIN | [62] | 2020 | RL | Structured, Clinical Notes | MIMIC-III | Phenotyping, Symptoms Classification |
MM-HCR | [63] | 2020 | Decision | Clinical Notes, Time Series | MIMIC-III | Mortality |
MHM | [64] | 2020 | RL | Structured, Time Series | MIMIC-III | Phenotyping |
BRLTM | [65] | 2021 | Encoding | Structured, Clinical Notes | Private EHR | Phenotyping |
CEHR-BERT | [66] | 2021 | RL | Structured | CUIMC-NYP | Phenotyping |
MUFASA | [67] | 2021 | Any | Any | MIMIC-III | Phenotyping |
Clinical MAG | [68] | 2021 | RL | Structured, Clinical Notes | MIMIC-III | Phenotyping |
EDisease | [69] | 2021 | RL | Structured, Clinical Notes | Private EHR, NHAMCS | Phenotyping |
DeepEMC2 | [70] | 2021 | RL | Structured, Clinical Notes | Private EHR | Emergency Risk Classification |
TADEL | [71] | 2021 | Encoding | Structured | MCD | Readmission |
LDAM | [72] | 2021 | RL | Clinical Notes, Time Series | MIMIC-III | Phenotyping |
MixEHR-Guided | [73] | 2022 | RL | Structured, Clinical Notes | MIMIC-III, PopHR | Phenotyping |
HAIM | [74] | 2022 | Decision | Any | MIMIC-III | Mortality, LOS, Phenotyping |
Hi-BEHRT | [75] | 2022 | RL | Structured | CPRD | Phenotyping |
MBERT | [76] | 2022 | RL | Structured, Clinical Notes | MIMIC-III | Mortality |
MedM-PLM | [77] | 2022 | RL | Structured, Clinical Notes | MIMIC-III | Medication Recommender, Readmission, ICD Coding |
MCDP | [78] | 2022 | RL | Structured, Time Series | MIMIC-III, MIMIC-IV | Phenotyping, Mortality |
DeepBiomarker | [79] | 2022 | Encoding | Structured | UPMC | Phenotyping |
KG-MTT-BERT | [80] | 2022 | RL | Structured, Unstructured | Private EHR | Phenotyping |
ExMed-BERT | [82] | 2023 | Encoding | Structured | IBM Explorys Therapeutic dataset | Phenotyping |
TransformEHR | [83] | 2023 | Encoding | Structured | Private EHR | Phenotyping |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ben-Miled, Z.; Shebesh, J.A.; Su, J.; Dexter, P.R.; Grout, R.W.; Boustani, M.A. Multi-Modal Fusion of Routine Care Electronic Health Records (EHR): A Scoping Review. Information 2025, 16, 54. https://doi.org/10.3390/info16010054