Topic Editors

National Institute of Nursing Research (NINR), Bethesda, MD 20892, USA
Department of Epidemiology and Biostatistics, School of Public Health, University of Maryland, College Park, MD 20742, USA

The Use of Big Data in Public Health Research and Practice

Abstract submission deadline
31 October 2025
Manuscript submission deadline
31 December 2025
Viewed by
3126

Topic Information

Dear Colleagues,

We are organizing a Topic on the use of big data to inform health research and practice. To enable decision-making, it is important to obtain timely data on the determinants of health and well-being. Big data can often be operational or “organic” data generated for non-research purposes, including social media, news feeds, Google Street View images, online reviews, blogs, electronic health records, pharmacy records, and billing records. This Topic is focused on innovative ways that big data are leveraged for health research and practice. Some possible submission ideas are listed below; however, submissions addressing other related topics are also welcomed:

  • Use of electronic health records, billing data, and pharmacy data to understand individualized risk factors and treatment success; 
  • Characterization of built environments with big data derived from various sources (e.g., Street View images and remote sensing imagery data) as well as their impact on health; 
  • Using various user-generated content (e.g., GPS data, accelerometer data, users’ review data, social media data, and web search data) to study individual behaviors and social/cultural environments as well as their impacts on people’s health; 
  • Development of new methods or tools (e.g., natural language processing, machine learning, database management, high-performance computing, data mining, cloud computing, computer vision, visualization, geographic information systems, and spatial analysis) for big-data-based health research; 
  • Use of big data in COVID-19-related research; 
  • Application or development of causal inference methods for big data;
  • Investigating and addressing data quality and uncertainty issues;
  • Blending and integration of big data from different sources.

Dr. Quynh C. Nguyen
Dr. Thu T. Nguyen
Topic Editors

Keywords

  • big data
  • artificial intelligence
  • machine learning
  • deep learning
  • data science
  • natural language processing
  • computer vision
  • ChatGPT

Participating Journals

| Journal Name | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC |
|---|---|---|---|---|---|
| Cancers | 4.5 | 8.0 | 2009 | 17.4 Days | CHF 2900 |
| International Journal of Environmental Research and Public Health | - | 7.3 | 2004 | 25.8 Days | CHF 2500 |
| ISPRS International Journal of Geo-Information | 2.8 | 6.9 | 2012 | 35.8 Days | CHF 1900 |
| Machine Learning and Knowledge Extraction | 4.0 | 6.3 | 2019 | 20.8 Days | CHF 1800 |
| Smart Cities | 7.0 | 11.2 | 2018 | 28.4 Days | CHF 2000 |

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to post a preprint at Preprints.org prior to publication in order to:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (2 papers)

14 pages, 2597 KiB  
Article
Potential and Observed Supply–Demand Characteristics of Medical Services: A Case Study of Nighttime Visits in Shenzhen
by Xiaojie Wu, Zhengdong Huang and Xi Yu
ISPRS Int. J. Geo-Inf. 2024, 13(11), 382; https://doi.org/10.3390/ijgi13110382 - 30 Oct 2024
Viewed by 713
Abstract
Hospital selection patterns are essential for evaluating medical accessibility and optimizing resource management. In the absence of medical records, early studies primarily used accessibility functions to estimate potential selection probabilities (PSPs). With the advent of travel data, data-driven functions have enabled the calculation of observed selection probabilities (OSPs). Comparing PSP and OSP helps to leverage travel data to understand hospital selection preferences and improve medical service evaluation models. This study proposes a selection probability-based accessibility model for calculating PSP and OSP accessibility. A case study in Shenzhen employed nighttime navigation data to reduce interference from different travel modes. The distance decay function was validated, with exponential and Gaussian functions performing best. For hospitals, the PSP distribution closely aligned with OSP, except in areas with high hospital density. This discrepancy may result from the PSP function overestimating the selection probability for nearby hospitals, a limitation that could be addressed by fitting the distance decay function to actual data. PSP-based accessibility and Gini coefficients differ from those of OSP. However, when parameters are fitted to actual data, the PSP- and OSP-based functions produce nearly identical results. Fitting to actual data can notably improve the accuracy of PSP and the corresponding accessibility outcomes. These findings may provide valuable references for medical service evaluation methodologies and offer insights for planning and management.

16 pages, 2887 KiB  
Article
Global and Local Interpretable Machine Learning Allow Early Prediction of Unscheduled Hospital Readmission
by Rafael Ruiz de San Martín, Catalina Morales-Hernández, Carmen Barberá, Carlos Martínez-Cortés, Antonio Jesús Banegas-Luna, Francisco José Segura-Méndez, Horacio Pérez-Sánchez, Isabel Morales-Moreno and Juan José Hernández-Morante
Mach. Learn. Knowl. Extr. 2024, 6(3), 1653-1666; https://doi.org/10.3390/make6030080 - 17 Jul 2024
Viewed by 1242
Abstract
Nowadays, most of the health expenditure is due to chronic patients who are readmitted several times for their pathologies. Personalized prevention strategies could be developed to improve the management of these patients. The aim of the present work was to develop local predictive models using interpretable machine learning techniques to early identify individual unscheduled hospital readmissions. To do this, a retrospective, case-control study, based on information regarding patient readmission in 2018–2019, was conducted. After curation of the initial dataset (n = 76,210), the final number of participants was n = 29,026. A machine learning analysis was performed following several algorithms using unscheduled hospital readmissions as the dependent variable. Local model-agnostic interpretability methods were also performed. We observed a 13% rate of unscheduled hospital readmission cases. There were statistically significant differences regarding age and days of stay (p < 0.001 in both cases). A logistic regression model revealed chronic therapy (odds ratio: 3.75), diabetes mellitus history (odds ratio: 1.14), and days of stay (odds ratio: 1.02) as relevant factors. Machine learning algorithms yielded better results regarding sensitivity and other metrics. Following this procedure, days of stay and age were the most important factors to predict unscheduled hospital readmissions. Interestingly, other variables like allergies and adverse drug reaction antecedents were relevant. Individualized prediction models also revealed a high sensitivity. In conclusion, our study identified significant factors influencing unscheduled hospital readmissions, emphasizing the impact of age and length of stay. We introduced a personalized risk model for predicting hospital readmissions with notable accuracy. Future research should include more clinical variables to refine this model further.