Spatially Dependent Bayesian Modeling of Geostatistics Data and Its Application for Tuberculosis (TB) in China
Abstract
1. Introduction
2. Model Description
2.1. Background
2.2. Model Construction
3. Proof of GMRF Structure
4. Simulation Study
4.1. Data Generation
4.2. Simulation Result
4.3. Fitting Effect Evaluation
5. Empirical Analysis of Tuberculosis Incidence Data
5.1. Data Sources and Pre-Processing
5.2. Model Building
5.3. Implementation of INLA-SPDE
5.4. Model Selection
5.5. Model Fitting Results
6. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
n | r | |||
---|---|---|---|---|
30 | 3.2183 (2.9396, 3.4864) | 5.0233 (4.8207, 5.2440) | 0.4674 (0.2174, 0.7750) | 1.5645 (0.5607, 2.8764) |
300 | 3.0095 (2.9851, 3.0326) | 5.0157 (4.9936, 5.0373) | 0.2430 (0.1812, 0.3116) | 0.7739 (0.4879, 1.1050) |
n | r | |||
---|---|---|---|---|
30 | −0.9 | −0.8972 (−1.1348, −0.6604) | 0.5180 (0.2153, 0.9279) | 1.6117 (0.5110, 3.1908) |
30 | −0.5 | −0.4866 (−0.7331, −0.2472) | 0.5275 (0.1578, 1.0433) | 1.5465 (0.4604, 3.0258) |
30 | −0.1 | −0.0898 (−0.3238, 0.1326) | 0.4904 (0.2254, 0.8168) | 1.5305 (0.5246, 2.8692) |
30 | 0.1 | 0.1041 (−0.1156, 0.3052) | 0.5035 (0.2324, 0.8561) | 1.5775 (0.5622, 2.9340) |
30 | 0.5 | 0.5695 (0.2080, 1.0405) | 0.5524 (0.1977, 0.9976) | 1.5628 (0.3258, 3.4612) |
30 | 0.9 | 0.9009 (0.8906, 0.9116) | 0.5692 (0.2339, 0.9661) | 1.6231 (0.5154, 3.1612) |
300 | −0.9 | −0.9012 (−0.9320, −0.8700) | 0.2412 (0.1699, 0.3288) | 0.7366 (0.4445, 1.1038) |
300 | −0.5 | −0.5013 (−0.5372, −0.4661) | 0.2200 (0.1745, 0.2687) | 0.6624 (0.4725, 0.8729) |
300 | −0.1 | −0.1011 (−0.1386, −0.0641) | 0.2509 (0.1872, 0.3231) | 0.7952 (0.4979, 1.1445) |
300 | 0.1 | 0.1004 (0.0638, 0.1369) | 0.2508 (0.1868, 0.3230) | 0.7987 (0.4999, 1.1492) |
300 | 0.5 | 0.5087 (0.4788, 0.5372) | 0.2486 (0.1820, 0.3232) | 0.7869 (0.4697, 1.1598) |
300 | 0.9 | 0.9021 (0.8975, 0.9065) | 0.2573 (0.1938, 0.3281) | 0.8233 (0.5124, 1.1841) |
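As a reading aid for the table above, the following minimal sketch checks whether each interval in the third column contains the value listed in the second column, and compares average interval widths across the two sample sizes. Because the column headers did not survive, it assumes that the second column holds the parameter value used in the simulation and that the third column holds its posterior mean with a 95% credible interval. The names `rows`, `covers`, and `mean_width` are illustrative only.

```python
from collections import defaultdict

# (n, assumed true value, interval lower bound, interval upper bound),
# copied from the second and third columns of the table above.
rows = [
    (30, -0.9, -1.1348, -0.6604),
    (30, -0.5, -0.7331, -0.2472),
    (30, -0.1, -0.3238, 0.1326),
    (30, 0.1, -0.1156, 0.3052),
    (30, 0.5, 0.2080, 1.0405),
    (30, 0.9, 0.8906, 0.9116),
    (300, -0.9, -0.9320, -0.8700),
    (300, -0.5, -0.5372, -0.4661),
    (300, -0.1, -0.1386, -0.0641),
    (300, 0.1, 0.0638, 0.1369),
    (300, 0.5, 0.4788, 0.5372),
    (300, 0.9, 0.8975, 0.9065),
]


def covers(true_value, lower, upper):
    """Return True when the interval [lower, upper] contains true_value."""
    return lower <= true_value <= upper


# Settings whose interval fails to contain the assumed true value.
misses = [(n, value) for n, value, lower, upper in rows
          if not covers(value, lower, upper)]

# Average interval width for each sample size.
widths = defaultdict(list)
for n, _, lower, upper in rows:
    widths[n].append(upper - lower)
mean_width = {n: sum(w) / len(w) for n, w in widths.items()}

print("settings not covered:", misses)
print("mean interval width by n:", mean_width)
```

Under these assumptions, an empty `misses` list and a narrower average width at the larger sample size would be consistent with the estimates concentrating around the simulated values as n grows.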
n | |||
---|---|---|---|
30 | −0.9 | 3.1819 (2.9021, 3.4626) | 5.0792 (4.8317, 5.3333) |
30 | −0.5 | 3.1970 (2.9155, 3.4783) | 5.0607 (4.8181, 5.3127) |
30 | −0.1 | 3.2031 (2.9185, 3.4852) | 5.0454 (4.8073, 5.2903) |
30 | 0.1 | 3.1963 (2.8958, 3.4909) | 5.0657 (4.8071, 5.3361) |
30 | 0.5 | 3.1805 (2.8550, 3.4989) | 5.1152 (4.8340, 5.4198) |
30 | 0.9 | 3.1595 (2.8432, 3.4702) | 5.1337 (4.8548, 5.4265) |
300 | −0.9 | 3.0031 (2.9762, 3.0296) | 5.0096 (4.9759, 5.0436) |
300 | −0.5 | 3.0049 (2.9784, 3.0307) | 5.0109 (4.9791, 5.0427) |
300 | −0.1 | 3.0057 (2.9786, 3.0319) | 5.0128 (4.9799, 5.0458) |
300 | 0.1 | 3.0068 (2.9800, 3.0325) | 5.0147 (4.9829, 5.0465) |
300 | 0.5 | 3.0084 (2.9819, 3.0339) | 5.0205 (4.9919, 5.0486) |
300 | 0.9 | 3.0071 (2.9810, 3.0317) | 5.0176 (4.9936, 5.0410) |
Model | DIC | WAIC | CPOm | |||||||
---|---|---|---|---|---|---|---|---|---|---|
Model (11) | ∖ | ∖ | −461.15 | −521.87 | −11.67 | |||||
Model (12) | −0.9 | −3348.64 | −3440.69 | 1450.59 | ||||||
−0.5 | ||||||||||
−0.1 | ||||||||||
0.1 | ||||||||||
0.5 | ||||||||||
0.9 |
Variables | Indicators | Description |
---|---|---|
Economy | Per capita disposable income by region | Refers to the sum of final consumption expenditure, other expenditures, and savings available per capita |
Economy | Gross regional product | Refers to the final results of productive activities of all resident units in the area over a certain period of time |
Economy | Per capita consumption expenditure by region | Refers to all expenditures to meet the household’s daily consumption |
Medical care | Number of healthcare institutions by region | Refers to the number of legally established health institutions engaged in disease diagnosis and treatment activities in each region |
Medical care | Number of beds in healthcare institutions by region | Refers to the number of beds in legally established health institutions in each region |
Medical care | Number of consultations by region | Refers to the total number of persons treated in healthcare institutions in each region |
Transportation | Public transportation passenger volume | Refers to passenger traffic on all modes of transport that are open to the public and provide transportation services |
Transportation | Passenger turnover | Refers to the number of passengers actually transported in a given period of time |
Transportation | Railroad mileage | Refers to the total length of the railroad line for passenger and freight transportation within a certain period of time |
Transportation | Highway mileage | Refers to the length of highway that, within a certain period of time, actually meets the highway grades stipulated in the “Highway Engineering Technical Standards JTJ01-88” |
Modernization | Percentage of urban population | Refers to the population living permanently within the city limits and closely associated with urban activities |
Modernization | Urban area | Refers to a dense combination of people and housing covering a certain area in an easily accessible environment |
Modernization | International tourism income | Refers to tourism expenditures incurred by inbound foreigners, overseas Chinese, and Hong Kong, Macao and Taiwan people in the course of their travels in mainland China |
Model | Covariates | Mean | SD | 0.025 Quantile | 0.975 Quantile |
---|---|---|---|---|---|
Model 14 | Precipitation | 0.0320 | 0.0020 | 0.0280 | 0.0360 |
Model 14 | Temperature | −0.0820 | 0.0020 | −0.0860 | −0.0780 |
Model 14 | Economy | 0.0840 | 0.0130 | 0.0600 | 0.1090 |
Model 14 | Medical care | −0.0810 | 0.0160 | −0.1130 | −0.0500 |
Model 14 | Transportation | −0.1340 | 0.0090 | −0.1510 | −0.1180 |
Model 14 | Modernization | 0.0420 | 0.0010 | 0.0400 | 0.0430 |
Model 14 | Particulate Matter | 0.0120 | 0.0030 | 0.0060 | 0.0180 |
Model 15 | Precipitation | 0.0778 | 0.0217 | 0.0351 | 0.1202 |
Model 15 | Temperature | −0.0137 | 0.0019 | −0.0180 | −0.0088 |
Model 15 | Economy | −0.0191 | 0.0871 | −0.1881 | 0.1542 |
Model 15 | Medical care | −0.1319 | 0.0539 | −0.2398 | −0.0273 |
Model 15 | Transportation | −0.0366 | 0.0386 | −0.1134 | 0.0383 |
Model 15 | Modernization | −0.0595 | 0.0375 | −0.1330 | 0.0142 |
Model 15 | Particulate Matter | 0.0396 | 0.0113 | 0.0174 | 0.0618 |
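One common way to read a table of posterior summaries like the one above is to flag the covariates whose 95% credible interval (0.025 to 0.975 quantile) excludes zero. The sketch below does this for the Model 15 rows, with the interval bounds copied from the table. The names `model_15`, `excludes_zero`, and `flagged` are illustrative, and this check is offered only as a reading aid, not as the authors' own selection procedure.

```python
# Model 15 interval bounds copied from the table above:
# covariate -> (0.025 quantile, 0.975 quantile)
model_15 = {
    "Precipitation": (0.0351, 0.1202),
    "Temperature": (-0.0180, -0.0088),
    "Economy": (-0.1881, 0.1542),
    "Medical care": (-0.2398, -0.0273),
    "Transportation": (-0.1134, 0.0383),
    "Modernization": (-0.1330, 0.0142),
    "Particulate Matter": (0.0174, 0.0618),
}


def excludes_zero(lower, upper):
    """Return True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Covariates whose 95% interval does not contain zero, in table order.
flagged = [name for name, (lower, upper) in model_15.items()
           if excludes_zero(lower, upper)]

for name in flagged:
    lower, upper = model_15[name]
    direction = "positive" if lower > 0 else "negative"
    print(f"{name}: {direction} association, interval ({lower}, {upper})")
```

Running the same check on the Model 14 rows only requires swapping in that model's quantiles.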
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xia, Z.; Tang, B.; Qin, L.; Zhang, H.; Hu, X. Spatially Dependent Bayesian Modeling of Geostatistics Data and Its Application for Tuberculosis (TB) in China. Mathematics 2023, 11, 4193. https://doi.org/10.3390/math11194193