Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico

Akuno, Albert Orwa; Ramírez-Ramírez, L. Leticia; Espinoza, Jesús F.

doi:10.3390/e25070968

Open AccessArticle

Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico

by

Albert Orwa Akuno

^1,*

,

L. Leticia Ramírez-Ramírez

¹

and

Jesús F. Espinoza

²

¹

Departamento de Probabilidad y Estadística, Centro de Investigación en Matemáticas A.C., Jalisco s/n, Colonia Valenciana, Guanajuato C.P. 36023, Gto, Mexico

²

Departamento de Matemáticas, Universidad de Sonora, Rosales y Boulevard Luis Encinas, Hermosillo C.P. 83000, Sonora, Mexico

^*

Author to whom correspondence should be addressed.

Entropy 2023, 25(7), 968; https://doi.org/10.3390/e25070968

Submission received: 8 May 2023 / Revised: 2 June 2023 / Accepted: 14 June 2023 / Published: 22 June 2023

(This article belongs to the Special Issue Statistical Physics of Opinion Formation and Social Phenomena)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Most studies modeling population mobility and the spread of infectious diseases, particularly those using meta-population multi-patch models, tend to focus on the theoretical properties and numerical simulation of such models. As such, there is relatively scant literature focused on numerical fit, inference, and uncertainty quantification of epidemic models with population mobility. In this research, we use three estimation techniques to solve an inverse problem and quantify its uncertainty for a human-mobility-based multi-patch epidemic model using mobile phone sensing data and confirmed COVID-19-positive cases in Hermosillo, Mexico. First, we utilize a Brownian bridge model using mobile phone GPS data to estimate the residence and mobility parameters of the epidemic model. In the second step, we estimate the optimal model epidemiological parameters by deterministically inverting the model using a Darwinian-inspired evolutionary algorithm (EA)—that is, a genetic algorithm (GA). The third part of the analysis involves performing inference and uncertainty quantification in the epidemic model using two Bayesian Monte Carlo sampling methods: t-walk and Hamiltonian Monte Carlo (HMC). The results demonstrate that the estimated model parameters and incidence adequately fit the observed daily COVID-19 incidence in Hermosillo. Moreover, the estimated parameters from the HMC method yield large credible intervals, improving their coverage for the observed and predicted daily incidences. Furthermore, we observe that the use of a multi-patch model with mobility yields improved predictions when compared to a single-patch model.

Keywords:

epidemics; human mobility; inference; deterministic inversion; Bayesian inference

1. Introduction

Research on the mathematical modeling of infectious diseases has witnessed continuous growth in recent decades, primarily motivated by the substantial impacts that contagious diseases have had in various crucial domains, including social, health, and economic development [1,2,3,4,5]. These domains are intricately linked to interactions within the human population, which serve as a key driving factor of disease.

Indeed, works such as [6,7] have raised awareness of the fact that the emergence and re-emergence of infectious diseases have been historically neglected as a significant threat to public health. Therefore, understanding the propagation dynamics of infectious diseases has become a central issue for epidemiologists and theoreticians [8,9,10,11,12]. Such knowledge subsequently enables the formulation of effective infection mitigation policies which, in turn, contribute to the maintenance of a healthy human population, thereby significantly bolstering the economic development of their respective nations.

Since its advent and declaration as a pandemic in March 2020 by the WHO, COVID-19 has undeniably wreaked havoc and drastically impacted the lives and livelihoods of the global human population. In terms of significance, COVID-19 has been compared to the influenza pandemic of 1918–1920 [8]; however, these pandemics have not yet been equated in terms of severity. The 1918 influenza pandemic had an estimated death toll of 40 million lives and infected one-third of the world’s population. In comparison, to date, COVID-19 has had a death toll of 6.6 million and 633 million confirmed cases (accounting for less than 10% of the total human population) [13]. However, the urgent need to conduct further research on COVID-19 and other infectious diseases is underscored by the warnings issued by scientists in recent years regarding the imminent threat of another pandemic. These cautionary warnings emphasize the pressing need to proactively study and understand these diseases to be better prepared for any potential future outbreaks [14]. These studies seek to understand the dynamics of infection propagation, control, and mitigation strategies. They encompass various aspects, including (but not limited to) model selection, statistical inference, uncertainty quantification, and prediction using empirical epidemiological data. Such studies provide epidemiologists and policymakers with valuable information regarding the effectiveness of control and eradication measures. This knowledge facilitates the formulation of proactive response strategies in anticipation of a potential new pandemic crisis.

Human behavior and interactions are at the core of the transmission dynamics of infectious and communicable diseases such as COVID-19. Consequently, researchers and epidemiologists involved in the study of infectious diseases must accurately incorporate human behavior as a critical component of the models used for modeling infectious diseases [15]. One significant human behavior that has consistently influenced the person-to-person transmission of infectious diseases during outbreaks is mobility [10,16,17,18,19,20,21]. Human mobility and travel patterns promote the creation of temporal social connections and networks that—whether local or global—play a crucial role in the spread of diseases. Human mobility, an undeniably crucial component of human existence and survival, not only plays a critical role in the spread and propagation of epidemics [18,22,23,24], but also crucially determines the speed of the spread of infectious diseases [23,25,26]. Therefore, it cannot be denied that a deep understanding of the role of human mobility in the spread of infectious diseases is central in the quest to have a clear view, for instance, of the impact of control strategies such as mobility restrictions [27,28,29,30,31].

Developments in epidemiological research have confirmed that traditional compartmental epidemic models are deficient in modeling strong heterogeneous disease spread, as they are formulated with the assumption of interactions within a well-mixed homogeneous population [15,32,33]. It has been widely recognized that humans do not exhibit homogeneous mobility patterns [15], highlighting the importance of incorporating the behavioral responses of a heterogeneous population into epidemiological models, as emphasized by [34]. The scientific literature offers a wide array of proposed models that aim to capture the heterogeneous mobility and dispersal patterns observed among humans. Many of these models consist of human groups of spatially separated populations that interact at some level (representing a geo-meta-population multi-patch framework), as evidenced by numerous studies in the literature (see, e.g., [26,27,34,35,36,37,38,39,40,41,42]). Although modeling the geospatiotemporal spread of infectious diseases is a complex task [43], models such as those in the mentioned references have crucially attempted to accurately capture human mobility in different settings. Nevertheless, there is still a need to incorporate further realistic mobility patterns that reflect more complex mobility scenarios [17], which is an inevitable reality when considering human behavior. In certain scenarios, it may be highly desirable to explore the relationship between the spread of infectious diseases and the proportion of time individuals spend in different locations during their travels [44]. This approach considers the impact of human movement on disease transmission and how it relates to the amount of time spent in different locations, with each having specific characteristics, such as the level of risk of infection. These locations, which are homogeneous within themselves and potentially differ between themselves, are usually referred to as patches in the literature. Then, a single-patch model can be reduced to a model with a well-mixed homogeneous population, where every susceptible individual can become infected with the same probability. The concept of multiple patches allows locations with different characteristics to be considered, helping to capture the spatial dynamics and interactions produced by human mobility that influence the spread of infectious diseases.

Once an appropriate epidemic model has been formulated, a myriad of analyses relevant to epidemiologists, policymakers, and public health officials can be conducted. These include (but are not limited to) theoretical analysis and numerical simulations (see, e.g., [39,40,45,46,47]), inference, uncertainty quantification, and predictions using empirical data (see, e.g., [14,48,49,50,51,52,53]). Most studies modeling population mobility and the spread of infectious diseases, particularly those using meta-population multi-patch models, have tended to focus on the theoretical and global properties and numerical simulations of such models. Relatively few studies have focused on inference and uncertainty quantification in epidemic models including population mobility. To the best of our knowledge, to date, the only study that has come close to performing inference, uncertainty quantification, and predictions on a multi-patch epidemic model is [51], in which a projected Stein Variational Gradient Descent (pSVGD) algorithm was used to model COVID-19 in the states of New Jersey and Texas in the U.S.; however, their model did not incorporate population mobility, as they modeled the dynamics and severity of COVID-19 inside and outside long-term care (LTC) facilities, where the interaction was introduced through a contact matrix. Other works that have addressed the problem of statistical inference for epidemics on networks include [54,55]. These models consider families of networks of contact that remain constant over time. Filling the gap left by the scarce literature on uncertainty quantification and Bayesian inverse problems on meta-population multi-patch epidemic models is the central motivation of the present study. To achieve this objective, we conduct inference and uncertainty quantification using confirmed COVID-19 cases in Hermosillo, Mexico, and employ a multi-patch epidemic model incorporating mobility, residency, and demography aspects. This model was first proposed and theoretically analyzed in [42].

Solving an inverse problem entails the utilization of both observed data and a model to deduce the parameter values that characterize the dynamics producing the observations [56,57,58]. However, employing models that adequately capture the heterogeneity and complexity of human behavior, such as mobility, in modeling the spread of infectious diseases introduces numerous parameters [51,59]. This, in turn, presents a multitude of challenges when addressing inverse problems (fit) and conducting inference (uncertainty quantification). To confront the problems originating from the complex system and its high-dimensional parameter space, we divide the problem into two parts: the first is related to the mobility parameters that describe the proportion of individuals who move and the time spent in each of the sub-regions/patches, while the second is associated with the numerical fit or statistical inference of the initial conditions and parameters for the infectious agent, as well as the individual evolution of infected individuals.

The remainder of this article is organized as follows: In Section 2, we present and describe the epidemic model on which the present analysis is centered. Section 3 discusses how the COVID-19 data used in this work were obtained. In Section 4, a discussion on the estimation of the mobility parameters and residence times of the epidemic model using mobile phone sensing data and the Brownian bridge model is presented. We elucidate the formulation of the deterministic inversion model and the corresponding results in Section 5. Section 6 details the Bayesian model formulation, inference, and the obtained results. Finally, we provide concluding remarks and possible extensions of the present work in Section 7.

2. The Multi-Patch Model with Mobility, Residency, and Demography

The emergence and re-emergence of infectious diseases over the past several decades has led to the formulation of numerous mathematical models incorporating varied and distinct disease propagation dynamics by researchers and epidemiologists. The central purpose of such models is to provide fundamental quantitative information and utilitarian guidelines for disease outbreak management and mitigation policy formulations. The provision of such useful information by these models is based on their ability to incorporate and capture realistic situations and disease transmission parameters. One such parameter which plays a crucial role in the spread of infectious diseases is human population mobility [16,17,18,21,39,40,60].

Several aspects of human history, such as social and economic factors, have been significantly impacted by the effects of the spread of infectious disease viruses among mobile sectors of human civilization [21]. Undoubtedly, the effects of the current COVID-19 pandemic is a demonstration of the devastating consequences of the interchange between human population mobility and infectious disease transmission. For instance, in December of 2019, Wuhan, China was the only understood geographical boundary of COVID-19 [61,62,63,64,65]. However, it can be postulated that human mobility within and between cities in China and around the world aided the spread of the pathogen to a significant portion of the planet. In a span of three months, the pathogen had posed monumental health, social, and economic stresses, to the point of being declared a pandemic by the WHO on 11 March 2020. Even though the exact nature of the contagion remained unclear, the undeniable interplay between the swift spread of the virus and human mobility through universally known modes of travel was recognized [16,66,67,68,69]. Indeed, besides aiding the spread of COVID-19, the close relationship between population mobility and the spread of infectious diseases has the potential to shape the emergence of future epidemics. Consequently, in order to formulate robust guidelines to manage both the current ongoing COVID-19 pandemic and possible future pandemics, there is dire need to construct appropriate epidemic models that accurately capture population mobility behaviors.

Human mobility, being a geospatiotemporal phenomenon [17,43,70,71], can be better modeled using meta-population multi-patch models [4,34,39,40,42,44], as traditional homogeneous compartmental models are incapable of capturing such a strong heterogeneous human behavior [15,17,34,51]. Once such models have been constructed, a wide spectrum of quantitative analyses leading to a deeper understanding of the relationship between mobility and the spread of infectious diseases can be conducted. For instance, researchers and epidemiologists may focus on theoretical properties and numerical simulations of the model or parameter estimation, uncertainty quantification, and predictions. In this research, we focus on the latter, using confirmed COVID-19 cases in Hermosillo, Mexico and a multi-patch SEIRS compartmental epidemic model with residency and demography characteristics. The model was proposed and its theoretical properties were analyzed in [42]. It is crucial to emphasize that, in this model, human mobility encompasses more than just travel between two locations; in particular, it involves the multiple temporal interactions that individuals experience while traveling to different destinations.

The following ordinary differential Equations (1) constitute the full form of the considered multi-patch non-linear model. This model extends the traditional compartmental SEIR model, which divides the population into different compartments based on their disease status: Susceptible (S), Exposed (E), Infected (I), and Recovered (R). Most individuals start in the susceptible compartment (S) but, when one of them comes into contact with an infectious individual, they move from the susceptible compartment into the exposed compartment (E), meaning that they have been infected but are not yet infectious. This individual leaves the E compartment to enter the I compartment once they can transmit the disease to susceptible individuals. Over time, infected individuals either recover from the disease or succumb to it. Once individuals recover, they move to the recovered compartment (R) and gain immunity to the disease. In the proposed model, the population dynamic also considers births (into the S compartment) and deaths (not produced by the disease).

The parameters of model (1) are described in Table 1. We refer the reader to [42] for a full description of the arguments leading to the formulation of this model.

\{\begin{matrix} {\dot{S}}_{i} & = Λ_{i} - β_{i} (1 - α_{i}) S_{i} \frac{(1 - α_{i}) I_{i} + \sum_{k = 1}^{n} {\tilde{p}}_{k i} I_{k}}{(1 - α_{i}) N_{i} + \sum_{k = 1}^{n} {\tilde{p}}_{k i} N_{k}} - \sum_{j = 1}^{n} (β_{j} {\tilde{p}}_{i j} S_{i} \frac{(1 - α_{j}) I_{j} + \sum_{k = 1}^{n} {\tilde{p}}_{k j} I_{k}}{(1 - α_{j}) N_{j} + \sum_{k = 1}^{n} {\tilde{p}}_{k j} N_{k}}) \\ - μ_{i} S_{i} + τ_{i} R_{i} \\ {\dot{E}}_{i} & = β_{i} (1 - α_{i}) S_{i} \frac{(1 - α_{i}) I_{i} + \sum_{k = 1}^{n} {\tilde{p}}_{k i} I_{k}}{(1 - α_{i}) N_{i} + \sum_{k = 1}^{n} {\tilde{p}}_{k i} N_{k}} + \sum_{j = 1}^{n} (β_{j} {\tilde{p}}_{i j} S_{i} \frac{(1 - α_{j}) I_{j} + \sum_{k = 1}^{n} {\tilde{p}}_{k j} I_{k}}{(1 - α_{j}) N_{j} + \sum_{k = 1}^{n} {\tilde{p}}_{k j} N_{k}}) \\ - (κ_{i} + μ_{i}) E_{i} \\ {\dot{I}}_{i} & = κ_{i} E_{i} - (γ_{i} + ψ_{i} + μ_{i}) I_{i} \\ {\dot{R}}_{i} & = γ_{i} I_{i} - (τ_{i} + μ_{i}) R_{i}, \end{matrix}

(1)

where

{\tilde{p}}_{k j} = α_{k} p_{k j}

,

i = 1, 2, \dots, n

, and n is the number of patches.

Model (1) is limited to infectious diseases with single-strain mutation; that is, in the formulation of the model [42], it is assumed that the recovered individuals in each patch are conferred with partial immunity, which they lose at a rate of

τ_{i}

. These individuals then re-enter the susceptible compartment, but it only in relation to the same strain of the pathogen. Notably, many infectious diseases—including COVID-19, which is the focus of this research—have several pathogenic strains whose dynamic properties should be studied for complete sensible management of emerging and re-emerging outbreaks with multiple variants. To this end, model (1) can be improved to include multiple COVID-19 disease strains.

Works such as [72,73,74,75,76] have considered models that incorporate several disease strains and mutations, with some of them focusing on this property with respect to the COVID-19 disease. Even though these works are mostly single-patch based, they provide valuable insights and serve as a useful starting point to extend model (1) in this important direction.

3. Data

Based on an institutional agreement between the University of Sonora (UNISON) and the Secretaría de Salud del Estado de Sonora (SSA), a collaborative effort was undertaken to provide accurate and reliable information to the population of Sonora during the pandemic emergency. As part of this initiative, researchers from the Mathematics Department at UNISON were granted access to georeferenced data pertaining to COVID-19 cases in Hermosillo, Mexico. The data set encompasses the period from 1 January 2020 to 6 September 2020. It is important to emphasize that the usage of these data are strictly limited to academic activities in accordance with UNISON’s agreements with public institutions, and utmost care has been taken to safeguard sensitive and private information.

Furthermore, on 14 September 2020, UNISON entered into a specific agreement (referenced as 12615-5700000-000412) to provide statistical consulting services to LUMEX CONSULTORES, S.C. Esteemed research professors from the Mathematics Department (including the co-author J.F.E.) provided these consulting services. The data set provided by LUMEX CONSULTORES, S.C. comprises a comprehensive collection of approximately 80 million records, each containing the GPS position and time stamp of nearly 300,000 devices. These devices were geographically located within Hermosillo city between 21 September 2020 and 15 November 2020. The usage of these data were explicitly authorized for non-profit and academic activities, with appropriate credits duly acknowledged. Notably, user privacy was carefully protected through the meticulous anonymization of the data, employing distinct alphanumeric IDs for each individual device.

In a separate study, Ramirez et al. (2022) utilized mobile phone GPS data from 21 September 2020 to 15 November 2020 to estimate mobility parameters and residence times in the 582 urban AGEBs (Basic Census Geographical Units) of Hermosillo [4]. In alignment with our research objectives of modeling the number of COVID-19 cases in four key zones within Hermosillo, considered as aggregate AGEBs as illustrated in Figure 1, we employed the geo-referenced and GPS information to obtain the number of COVID-19 cases and estimate mobility and residence patterns within and between these zones.

The two geo-referenced databases utilized in this study did not exhibit a temporal overlap, thereby limiting the feasibility of employing the complex model (1), which relies on the estimates of numerous epidemiological parameters. To address this problem, we elected to use the weighted global COVID-19 data in terms of zonal proportions.

The global confirmed positive COVID-19 cases in Hermosillo (Figure 2) can be obtained from [77], a website managed by the Consejo Nacional de Ciencias y Technología (CONACYT) since 26 February 2020. We used the weighted global cases by considering the zonal proportions of the global COVID-19 data, a selection justified in Figure 3, from which we can observe that the proportions of the available zonal COVID-19 data were approximately stable from June 2020. Therefore, we can consider that the data present the same behavior within the period for which the mobility parameters and mobility residence times were estimated. The global (Figure 2) and derived zonal COVID-19 cases, however, exhibited high variability. Therefore, we smoothed the data using 7-day moving averages, using the smoothed version for the rest of the analysis.

Most models that have been proposed to analyze and predict the incidence of COVID-19 since the emergence of SARS-CoV-2 assume that all infected individuals are observed, which is an inexact assumption due to under-reporting of disease incidence statistics. Therefore, this assumption undermines the accuracy of the models and their predictions. Emerging diseases which present a large fraction of asymptomatic pathogen carriers, such as COVID-19, Typhoid fever, Hepatitis B, Epstein–Barr virus, and Zika, are often characterized by incidence under-reporting during disease surveillance [78,79,80,81]. Coupled with asymptomatic and sub-clinical carriers—a phase that is prevalent with COVID-19 disease [82,83,84,85]—another source of under-reporting of disease incidence is a lack of systematic testing. Therefore, it is necessary to account for under-reporting when modeling and fitting COVID-19 disease incidence, as failure to do so will lead to underestimation of the epidemiological characteristics of the disease, especially the transmission parameter [86].

In modeling the observed COVID-19 incidence for eight American countries, ref. [86] noted an under-reporting of COVID-19 cases in Mexico by a factor of 15. This acute case identification problem corroborated the observation in [87], where Mexico had one of the lowest numbers of COVID-19 tests performed per reported cases. We believe that this under-reporting has cascaded down to the local states and cities of Mexico, including Hermosillo. In order to account for this under-reporting, we inflated the smoothed observed COVID-19 incidence (obtained from the weighting procedure previously explained) by a factor of 15 to obtain the daily incidence data. We subsequently considered the inflated data (see Figure 4) as the true observed daily COVID-19 incidence in Hermosillo, Mexico, and used them for inference in this research.

In addition to using real count data for model fitting and uncertainty quantification, we also utilized simulated incidence data generated from the model. Our goal was to assess the identifiability of the parameters and initial conditions for the inverse problem, as well as the statistical inference performance of the model. To this end, we used epidemiological parameters of COVID-19 gathered from the literature, as presented in Table 2, as well as those simulated using the considered model (1), to obtain daily observed incidences for each zone in Hermosillo. For this simulation, we used epidemiological parameter values close to those used by [88], who modeled the COVID-19 lockdown relaxation period in Hermosillo. The parameters and initial condition values used in the simulation can be seen in Table 3; the latter were based on the total population of each zone. After the simulation of the zonal incidence using the model, we added a normal noise

e_{t} \sim N (I_{i} (t), σ = 100)

to introduce variability into the model data. The final simulated incidence data for each zone in Hermosillo are presented in Figure 5.

4. Estimation of the Mobility and Residence Time Parameters

For the present study, we sought to solve an inverse problem and perform uncertainty quantification based on system (1), using confirmed COVID-19 cases for 2020 in Hermosillo, Mexico. Such data may not be sufficient for estimating all of the mobility and infection parameters, as well the initial conditions of the system. Moreover, simultaneous estimation of several parameters often leads to parameter non-identifiability problems [109,110,111,112,113,114,115]. As such, we used the mobile phone sensing data detailed in [4] for the purpose of estimating the mobility vector

α_{i}

,

i = 1, 2, \dots, n

and the mobility residence time matrices

P = (p_{i j})

,

i = 1, 2, \dots, n

,

j = 1, 2, \dots, n

, which we incorporated into system (1). Using mobile consecutive pings, ref. [4] estimated the mobility parameter vectors and mobility residence time matrices for 582 urban AGEBs in Hermosillo, Mexico for three periods, each divided into two parts. We used the same GPS pings during the Third Period, First Part (i.e., 21 September 2020 to 11 October 2020). Using this information, we estimated the mobility and residence time matrices for the four zones in Hermosillo. As we are interested in evaluating the forecasting (after November 2020) ability of the model, we selected this period as its start date was closer to the last period of the considered COVID-19 cases. We note that we could also use mobility data from other periods and areas, employing a similar procedure to that described below, in order to estimate mobility and residence times for the four zones with respect to the period and area of interest.

Solving the inverse problem and performing uncertainty quantification on system (1) for a set total number of AGEBs (patches) posed a logical set-up challenge. Thus, using well-defined geographical demarcations of Hermosillo, we grouped the 582 AGEBs into four main zones. These zones, together with their corresponding total population, are shown in Figure 1. The mobility parameter vector

α_{i}

and residence times

p_{i j}

,

i = 1, 2, 3, 4

,

j = 1, 2, 3, 4

were estimated by searching for individuals who activated at least 11 pings and left their AGEBs within and between the zones. For the individual AGEBs, ref. [4] estimated the probability density functions at position z for any individual originating from AGEB/patch r as

{\hat{h}}_{r} (z) = \frac{1}{n_{r}} \sum_{j = 1}^{n_{r}} h_{r j} (z; {\hat{σ}}_{r j}, δ),

(2)

where

n_{r}

is the patch population size,

h_{r j}

is the density of the expected time of the j-th individual originating from the r-th patch,

{\hat{σ}}_{r j}

is the maximum likelihood estimate of the individual standard deviation of the mobility of the j-th resident from the r-th patch, and

δ^{2}

is the variance of the geolocation error. After grouping and renaming the composing AGEBs of each of the four zones, we have

n_{_{1}}^{[l]} {\hat{h}}_{_{1}}^{[l]} (z) = \sum_{j = 1}^{n_{_{1}}} h_{1 j}^{[l]} (z; {\hat{σ}}_{1 j}^{[l]}, δ), \dots, n_{_{n_{l}}}^{[l]} {\hat{h}}_{_{n_{_{l}}}}^{[l]} (z) = \sum_{j = 1}^{n_{_{1}}} h_{n_{_{l}} j}^{[l]} (z; {\hat{σ}}_{n_{_{l}} j}^{[l]}, δ),

(3)

where the subscript index

[l]

indicates that the said quantity is in zone ℓ, and

n_{l}

represents the total number of AGEBs in the

l^{th}

zone.

From (3), we can obtain the estimated probability density function at location z for any individual originating from patch r in zone ℓ as

{\hat{h}}_{_{l}}^{*} (z) = \frac{1}{N^{[l]}} \sum_{r = 1}^{n_{_{l}}} \sum_{j = 1}^{n_{r}} h_{r j}^{[l]} (z; {\hat{σ}}_{r j}^{[l]}, δ),

(4)

where

N^{[l]} = \sum_{j = 1}^{n_{r}} n_{j}^{[l]}

. Then, the expected occupation time in A of a resident of zone ℓ can thus be obtained, from (4), as

{\hat{P}}_{_{l}} (A) = \int_{A} {\hat{h}}_{_{l}}^{*} (z) d z = \frac{1}{N^{[l]}} \int_{A} \sum_{r = 1}^{n_{_{l}}} \sum_{j = 1}^{n_{r}} h_{r j}^{[l]} (z; {\hat{σ}}_{r j}^{[l]}, δ) d z .

(5)

As mentioned above, for this research, we considered the mobility parameter and the mobility residence matrix for the Third Period, First Part (21 September 2020 to 11 October 2020), which we computed (using the above procedure) to obtain

α = {(\begin{matrix} 0.9668, & 0.9265, & 0.9692, & 0.9680 \end{matrix})}^{⊤} and P = (\begin{matrix} 0.8164 & 0.1289 & 0.0372 & 0.0175 \\ 0.1222 & 0.8119 & 0.0215 & 0.0444 \\ 0.0722 & 0.0504 & 0.7293 & 0.1481 \\ 0.0313 & 0.1166 & 0.1278 & 0.7243 \end{matrix}) .

5. Deterministic Inversion

Due to their exponential growth in time, the state variables of the dynamical system (1) are extremely sensitive to the input parameters. This sensitivity introduces uncertainty into the pre-inferred parameter values which, in turn, may result in divergence or significant uncertainty in the model outputs. For a robust and desirable model analysis, it is imperative to avoid such model divergences by first determining the optimal model parameters that fit the measured data. Moreover, prior knowledge of point parameter estimates is necessary for meaningful uncertainty quantification. To this end, we began by deterministically inverting the system (1), as described in this section, and then subsequently performed uncertainty quantification on the inferred point parameters, described in the following section. An outline of the formulation of the inversion problem and the results is provided below.

5.1. Formulation of the Model to Minimize

Let

x = (S, E, I, R) \in {(L^{2} (0, T))}^{4 n}

where

T > 0

,

S = {(S_{1}, \dots, S_{n})}^{⊤}

,

E = {(E_{1}, \dots, E_{n})}^{⊤}

,

I = {(I_{1}, \dots, I_{n})}^{⊤}

, and

R = {(R_{1}, \dots, R_{n})}^{⊤}

denote the state variables in system (1) for n patches. Let

m_{i}

be the dimension of the parameters to be estimated for patch i. Then, we have the total number of parameters to be estimated for patch i, denoted as

θ_{i} = {θ_{i 1}, \dots, θ_{i m_{i}}}

. Consequently, the total number of parameters,

m = \sum_{i = 1}^{n} m_{i}

, for the global system is

θ_{} = ⋃_{i = 1}^{n} θ_{i} \in R_{+}^{m}

. System (1) can be viewed as a mapping

Ψ (θ) = x

, where

Ψ : R_{+}^{m} ⟶ {(L^{2} (0, T))}^{4 n}

(i.e.,

Ψ

maps the parameters to be estimated to the state variables) defining an initial value problem of the form

\dot{x} = f (x, θ), x (t_{0}) = x_{0} \in R^{n}, t_{0} \in R .

(6)

The function f in the forward problem (6) is continuous and satisfies the Lipschitz condition with respect to

x

(see [42]). Thus, given

θ

, the forward problem (6) has a unique solution

x

.

Besides the numerically simulated states

x

, the inverse problem requires directly measured state variables at a discrete set of points

t_{1}, t_{2}, \dots, t_{k}

as an input. In most circumstances, not all of the state variables of the system can be directly observed. For instance, in epidemiology, data on new confirmed cases of infected individuals is readily available, compared to data on other epidemiological statuses of the population. For this reason, we use the new cases reported over a certain period of time (e.g., days, weeks) from the model to formulate the inverse problem.

Let

y_{obs} = {y_{1, t}, \dots, y_{n, t}}_{t = 1}^{M}

,

M \in N

denote the observed new infected cases in each unit/period of time

t \in 1, 2, \dots, M

for the n patches. Let

Z_{i} (t, θ)

be the incidence in Patch i during

(t - 1, t]

, derived from the numerical solution of the forward problem (6) with parameter vector

θ

. The objective function of the inverse problem is thus defined as

F (θ) = \sum_{i = 1}^{n} \sum_{t = 1}^{M} \frac{{\{Z_{i} (t, θ) - y_{i, t}\}}^{2}}{Z_{i} (t, θ)} .

(7)

Consequently, the inverse problem becomes: compute

\hat{θ} \in Θ

such that

\hat{θ} = \underset{θ \in Θ}{argmin} F (θ) s . t Ψ (θ) = x,

(8)

where

Θ

is the feasible region of the parameter

θ

.

Problem (8) can be optimized using methods such as Landweber [116,117,118,119], faster methods such as Levenberg–Marquardt, or gradient-based numerical non-linear least squares minimization algorithms such as the Sequential Least SQuares programming algorithm (SLSQP) (see, e.g., [51,120]). Due to the complexity of the forward problem (6), the inverse problem (8) is complex, high-dimensional, and computationally intensive. As a consequence, when considering problem (8), it is difficult for gradient-based algorithms and/or other optimization techniques to determine the global minima, thereby easily failing to reach the optimal solution. As an alternative, we chose to solve problem (8) using a Genetic Algorithm (GA), a type of the diverse machine learning Evolutionary Algorithm (EA) that utilizes heuristic search and optimization methods. Besides their efficacy in the optimization of problems involving numerous parameters with large feasible regions (i.e., problems with increased dimensionality) and multiple local optima, GAs do not require gradient computations, which may be computationally challenging or altogether unavailable [121,122].

5.2. Results

As we considered four zones, we minimized Equation (7) as in (8) taking

n = 4

. Besides estimating the epidemiological parameters

⋃_{i = 1}^{4} {β_{i}, κ_{i}, γ_{i}}

, we also estimated the initial conditions

⋃_{i = 1}^{4} {E_{i, 0}, I_{i, 0}}

. Thus, from the deterministic inversion, we estimated a total of 20 parameters:

θ = ⋃_{i = 1}^{4} {β_{i}, κ_{i}, γ_{i}, E_{i, 0}, I_{i, 0}}

.

Table 4 and Table 5 provide the estimated parameters, while Figure 6 and Figure 7 present the fitted incidence for the simulated and observed data, respectively. For the simulated incidence, the fit was visibly good, even though there was variation between the estimated and observed parameters (see Table 3) and a more important difference in the initial conditions. These disparities may be a result of the noise that was added to the simulated data, representing an important modification to the original simulated incidence (especially in the early stages of the outbreak). As in complex models, the situation where there exist different sets of parameters that would give an equally good fit to the zonal model incidences may have had an influence on the results.

As the estimated parameters from the GA had a good fit to the model incidence, we used them as a guide in specifying the initial sampling points for Stan and t-walk when performing Bayesian inference and uncertainty quantification (as detailed in the following section). The values were particularly important for t-walk to begin sampling as, without them, the algorithm does not converge and completely fails. In addition, as there are no studies providing the spaces (upper and lower bounds) for the initial conditions of the incubation and prevalence states parameters for Hermosillo, we used the estimates of the said parameters from the GA as a guide to set their boundaries in the Bayesian procedure.

6. Bayesian Inference

Deterministic inversion of epidemic models has gained prominence in terms of learning the trajectories of epidemics through the use of measured epidemiological data. However, deterministic inversion ignores inherent uncertainties due to imperfect data, stochasticity, model structural discrepancies, and even parameter uncertainties, which typically affect the epidemic model. In order to make robust, valid, and reliable decisions regarding the management, prediction, and forecasting of current and future epidemics, good treatment and quantification of such uncertainties is crucial. In this section, we use Bayesian techniques to quantify the uncertainties that characterize the epidemic model (1), its parameters, and the observed Hermosillo COVID-19 infection data, as well as providing credible intervals for the model output.

Over the past few decades, Bayesian inference has become an important approach, as it provides an optimal probability platform for learning unknown parameters of a system given observed data. A Bayesian framework learns the trajectories of the unknown parameters by updating their prior distributions to the posterior distribution. Such prior updates and exploitation of the parameter space of the posterior distribution can be achieved using standard Monte Carlo Markov Chain (MCMC) methods. Moreover, MCMC methods provide approximate estimates of intractable posterior distributions, which often characterize the modeling of real-world phenomena. The estimation of such intractable posterior distributions has been a long-standing challenge in the realm of Bayesian inference [123]. For this research, we use t-walk and Stan, which are MCMC-based algorithms, to simulate the posterior distribution. A detailed account of how the Bayesian framework is treated is presented in this section.

6.1. Building the Posterior Distribution

In Bayesian theory, the observed data

y_{obs}

are fixed, while the state variable

x

and the parameter

θ

are considered as random variables. In this setting, Bayes’ theorem defines the posterior distribution as

π (θ | y_{obs}) \propto p (y_{obs} | θ) π (θ),

(9)

where

π (θ)

(called the prior distribution) codifies our belief about the unknown parameter

θ

before the data were observed and

p (y_{obs} | θ)

(called the likelihood distribution) codifies all the information available regarding how the observed data

y_{obs}

were obtained. From the Bayesian perspective, (9) constitutes an inverse problem which may be solved by using the Maximum a posteriori (MAP), an argument that maximizes the posterior (

\arg \max_{θ \in Θ} π (θ | y_{obs})

), the posterior mean (

E [π (θ | y_{obs})]

), or posterior median as the optimal value

\hat{θ}

of

θ

.

As we consider count data in this study—which, by their very nature, are events that occur within a given period of time—a Poisson process would be a natural and meaningful starting point for modeling and performing inference on the observed cases. However, count data, especially epidemiological observations, usually exhibit more variation than is implied by the Poisson distribution, pointing to inherent over-dispersion in the data. Such variations may be due to sampling, aggregation, environmental variability, or a combination of these factors, causing count data to have inherent over-dispersion. Therefore, it is not possible that count epidemiological data present an equal mean-variance relationship, as is assumed by the Poisson distribution. In fact, ref. [124] posited that most commonly used approaches assume a quadratic mean-variance relationship. The mean-variance relationship in modeling count data, such as epidemiological data, can be adequately described by the negative-binomial distribution, given that it includes an additional parameter that permits the variance to be larger than the mean [125,126,127]. Moreover, the negative binomial distribution is a mixture of Poisson and Gamma distributions [127,128], indicating that the Poisson distribution is still involved in the modeling process, even when the negative binomial distribution is used.

We assume that the observed data follow the negative binomial distribution with one dispersion parameter

ν

and write

y \sim N B (ν, p)

, where p is the probability of success. This distribution has the form

p (y | μ, θ) = (\begin{matrix} y + ν - 1 \\ ν \end{matrix}) {(\frac{μ}{μ + ν})}^{ν} {(\frac{ν}{μ + ν})}^{y},

(10)

where

μ = E (Y) = \frac{ν p}{1 - p} and Var (Y) + \frac{μ^{2}}{ν} = μ + ρ μ^{2} with ρ = \frac{1}{ν} .

It is worth noting that

ρ μ^{2} > 0

is the additional variance allowed by the Negative Binomial distribution with respect to the Poisson distribution, and that the parameter

ρ

can be viewed as controlling the dispersion of the observed data.

We acknowledge that there are other forms of the Negative binomial distribution; for example, ref. [49] achieved success in modeling COVID-19 cases and hospital demand in Mexico using a Negative binomial distribution with two over-dispersion parameters. This distribution was proposed by [124]. However, in a model with numerous parameters to estimate—as is the case for the model considered here—it is our view that additional parameters would further increase the dimensionality of the already complex optimization problem.

We now detail the construction of the posterior distribution for the present multi-patch model study. Using the observed data and the variable

θ

described above, we assume that the observed cases in each patch follow a negative-binomial distribution with over-dispersion parameter (i.e., the number of failures before first success)

ν_{i}

and success probabilities

p_{i} = \frac{μ_{i}}{μ_{i} + ν_{i}},

where

μ_{i}

is the theoretical mean of the negative binomial distribution, which can be taken as the incidence in patch i

μ_{i} (j, θ) = Z_{i} (j, θ), i = 1, 2, \dots, n j = 1, 2, \dots, M .

(11)

With this setting in mind, we have that, if

y_{i, j}

represents the incidence in Patch i observed in day j with theoretical mean

μ_{i} (j, θ)

, as in (11), then we have

y_{i, j} | θ \sim N B (ν_{i}, \frac{μ_{i} (j, θ)}{μ_{i} (j, θ) + ν_{i}}) .

Then, the likelihood from Patch i,

i = 1, 2, \dots, n

, corresponds to

\begin{matrix} p_{i} (y_{i} = (y_{i, 1}, \dots, y_{i, M}) | θ) & = \prod_{j = 1}^{M} \frac{Γ (y_{i, j} + ν_{i})}{y_{i j}! Γ (ν_{i})} {(\frac{μ_{i} (j, θ)}{μ_{i} (j, θ) + ν_{i}})}^{ν_{i}} {(\frac{ν_{i}}{μ_{i} (j, θ) + ν_{i}})}^{y_{i, j}} \\ \propto (\prod_{j = 1}^{M} \frac{Γ (y_{i, j} + ν_{i})}{Γ (ν_{i})}) {(\frac{μ_{i} (j, θ)}{μ_{i} (j, θ) + ν_{i}})}^{M ν_{i}} {(\frac{ν_{i}}{μ_{i} (j, θ) + ν_{i}})}^{\sum_{j = 1}^{M} y_{i, j}} . \end{matrix}

Hence, the likelihood of the combined data from the n patches is given as

p (y_{obs} | θ) = \prod_{i = 1}^{n} p_{i} (y_{i} | θ)

.

To establish the prior distribution, we define the joint prior distribution as the product of the marginal; that is, if we denote

θ = \cup_{i = 1, \dots, n, l = 1, \dots, m_{i}} θ_{i l}

, where

θ_{i l}

is the ℓ parameter associated with Patch i, then

π (θ) = \prod_{l = 1}^{m_{1}} g (θ_{1 l}) \prod_{l = 1}^{m_{2}} g (θ_{2 l}) \dots \prod_{l = 1}^{m_{n}} g (θ_{n l}) = \prod_{i = 1}^{n} [\prod_{l = 1}^{m_{i}} g (θ_{i l})],

(12)

where

m_{i}

is the number of parameters for Patch i.

Thus, the posterior distribution of the parameters from all n patches can be obtained as the product of (9) and (12), as (9).

6.2. Results

From the deterministic inversion section, we can note that we are interested in estimating the transmission, incubation, and recovery rates

β_{i}

,

κ_{i}

and

γ_{i}

for each zone i (

i = 1, 2, 3, 4

), as well as the exposed and infected incidence initial conditions

E_{i, 0}

and

I_{i, 0}

. For the Bayesian inference (also called probabilistic inversion), four over-dispersion parameters

ν_{i}

are included, related to each zone’s observed incidence data. The dispersion parameters can be loosely defined as the number of trials before we encounter the first successful infection in each zone. These parameters come with the negative binomial distribution, which we use as the distribution of the observed zonal incidences. Thus, we have a total of 24 parameters for the Bayesian inference in this section, which we must estimate:

θ = ⋃_{i = 1}^{4} {β_{i}, κ_{i}, γ_{i}, E_{i, 0}, I_{i, 0}, ν_{i}} .

(13)

An important problem that has been the topic of recent research is the selection of adequate prior distributions, their corresponding hyper-parameters, and the delimitations of the parametric spaces of the parameters to be estimated [129,130,131]. In this research, for the epidemiological parameters, we chose prior distributions with 95% of their body falling within the parametric spaces (upper and lower bounds) determined from the COVID-19 literature, as detailed in Table 2.

In this way, we considered the following prior distributions for each of the parameters:

\begin{matrix} β_{i} & \sim LogNorm (μ_{i}, σ_{i}) = g (β_{i}), \\ κ_{i} & \sim Gamma (4.5264, 19.1006) = g (κ_{i}), \\ γ_{i} & \sim Gamma (1.9826, 3.6943) = g (γ_{i}), \\ E_{i, 0} & \sim Unif (a_{i}, b_{i}) = g (E_{i, 0}), \\ I_{i, 0} & \sim Unif (0, 10) = g (I_{i, 0}), \\ ν_{i} & \sim Gamma (0.7561, 0.2714) = g (ν_{i}) . \end{matrix}

Furthermore, the hyper-parameters for the log-Normal prior distributions of the contact parameters were

μ_{1} = - 0.0529

,

μ_{2} = - 0.0975

,

μ_{3} = - 0.8562

,

μ_{4} = - 0.2583

, and

σ_{1} = σ_{2} = σ_{3} = σ_{4} = 0.2803

; while those for the initial conditions of the exposed state were

a_{1} = 50

,

b_{1} = 100

,

a_{2} = 70

,

b_{2} = 100

,

a_{3} = 15

,

b_{3} = 50

, and

a_{4} = 0

,

b_{4} = 20

. These prior probability distributions, which we selected using the above-mentioned criteria, are versatile and allow for the assumption that some parameter values could occur with lower, equal, or higher probability densities.

To avoid generalizations, it is imperative to perform a diagnostic analysis regarding the adequacy and correctness of the selected prior distributions before sampling. These prior distributions should be coherent with our expectations and correspond to the domain knowledge of each parameter obtained from the deterministic inversion in the previous section. We desire priors that—in accordance with our deterministic inversion domain experience—permit every conceivable configuration of the data while excluding blatantly ludicrous scenarios. Figure 8 shows a graph of the selected prior distributions for the infection, incubation, and recovery parameters, vis-à-vis the corresponding posterior distributions. We can observe that, for most parameters, we allowed for low informative priors and the posterior distributions were within their respective prior probability support. This conforms with the suggestion of [132], who stated that the model should not be overly confined by the priors, which should instead be too wide to encompass a wide variety of situations and data that are incredibly improbable. We thus conclude that our selected prior distributions were adequate and capable of regularizing our estimates to avoid non-identifiability. We acknowledge that [131] also proposed an interesting criterion for selecting prior distributions together with their corresponding hyper-parameters, based on parametric intervals. Applying the assumption of independence of the parameters, the joint prior distribution is

\tilde{π} (θ) = \prod_{i = 1}^{4} g (β_{i}) g (κ_{i}) g (γ_{i}) g (E_{i, 0}) g (I_{i, 0}) g (ν_{i})

.

The likelihood for the four zonal incidences is calculated as

p (y_{obs} | θ) \propto \prod_{i = 1}^{4} [(\prod_{j = 1}^{M_{i}} \frac{Γ (y_{i, j} + ν_{i})}{Γ (ν_{i})}) {(\frac{μ_{i} (j, θ)}{ν_{i} + μ_{i} (j, θ)})}^{M_{i} ν_{i}} {(\frac{ν_{i}}{ν_{i} + μ_{i} (j, θ)})}^{\sum_{j = 1}^{M_{i}} y_{i, j}}],

where

M_{i}

is the total number of days that the incidence is observed for each zone

i = 1, 2, 3, 4

,

y_{i, j}

is the

j^{th}

observed incidence for zone i, and

μ_{i} (j, θ)

is the incidence provided by model (1). Thus, the posterior distribution is given as

p (θ | y_{obs}) = p (y_{obs} | θ) \tilde{π} (θ)

.

The incidence values output from the epidemic model (1) were very sensitive and exhibited a lot of uncertainty in response to slight changes, especially in terms of the contact, incubation, and recovery parameters. As a consequence, it was challenging to provide a wide range of values as support for the posterior distribution in Stan and t-walk. We addressed this challenge by beginning with deterministic inversion of the epidemic model using GA, as discussed in the previous section. The results of this inversion are given in Table 5. Consequently, we used the point parameter estimates obtained from the GA as the initial sampling points for t-walk and Stan. Moreover, as there is no literature reporting the parametric spaces for the initial conditions of the incubation and prevalence states, we delimited their support in the posterior distribution to within certain neighborhoods of their point parameter estimates obtained from the GA.

After performing the prior predictive checks, using the t-walk package [133], we obtained posterior samples for each of the 24 parameters (13). MCMC methods, by definition, produce samples with high correlation, harboring some level of redundant information. One way of measuring the level of information redundancy in the posterior sample is through use of the effective sample size (ESS). Theoretically, the ESS is the sample size we would obtain if we independently sampled the posterior distribution. The ESS and the proportion of ESS (pESS) for the simulated samples are given in the last two columns of Table A1. Going by how low (high) the ESS (pESS) values are, we can tell that the simulated chains were highly correlated and, thus, the utilized MCMC method required a very high number of iterations. In other words, in order to obtain samples with insignificant auto-correlations, thinning at large lags should be conducted, which can remarkably reduce the chain size. To obtain chains with reasonable sample sizes for analysis, after running the MCMC method for 1,000,000 iterations, we discarded 360,000 burn-in samples and thinned the chains at lag 20.

The summary statistics of the generated samples are provided in Table A1 (in the Appendix A). The presented summary statistics include the mean, variance, 95% credible interval (CI), median, ESS, and pESS. From this table, we can conclude that, in general, all the samples exhibited low variability and that their means were close to the estimates obtained by the GA in the previous section. Furthermore, we can conclude that the means (and medians) were all within the relevant 95% CIs. The former and the latter observations can be confirmed from Figure 9, which shows the 95% CIs of the samples, the means, and the estimates from the GA. Based on the scales of the values in the intervals, we can conclude that the credible intervals (CIs) were generally short—an indication of good estimates.

Figure 10 and Figure 11 present the posterior predictive checks. These figures graph the incidence simulations of model (1) from 26 February 2020 to 17 October 2020. This period includes all of the dates within which the incidence data were observed (26 February 2020 to 6 September 2020) and 41 prediction dates (7 September 2020 to 17 October 2020). Figure 10 shows 30,000 model incidences (gray curves) from the first 30,000 MCMC iterations, the observed incidence, and the maximum a posteriori (MAP) model incidence. Here, the MAP model incidence—which is apparently covered by the gray curves—is the model incidence output obtained from the parameter values that maximize the posterior distribution. We can observe, from this figure, that: (1) Our fitted model produced simulations that are consistent with the observed incidence data; and (2) the simulated trajectories did not present diverse or variable changes. Indeed, the latter observation is consistent with our previous observation that there was low variability in the simulated samples.

Figure 11 shows the 95% posterior credible intervals (CIs), mean, median, and MAP of the simulated model incidences. From this figure, we can see that most of the observed training incidences for the four zones fell outside the 95% CI of the simulated model incidence, with the zone 1 and zone 3 simulated model incidences providing the narrowest 95% CIs. In contrast, zones 2 and 4 had larger 95% prediction CIs, covering more of the observed incidences. The zone 1 model prediction interval provided the worst coverage of the observed prediction incidences. The proportions of the observed prediction incidences covered by the 95% CIs of the simulated model incidence from t-walk are presented in Table 6. As the CIs for the four zones were generally not too wide and covered a small proportion of the training observed incidence, and considering the prediction observed incidences for zones 3 and 4, we can conclude that the point estimates from t-walk seem to be adequate; however, the uncertainty associated with the estimated parameters tended to be underestimated.

We can observe, from both Figure 10 and Figure 11, that the model predicted that there will still be some infection levels beyond 6 September 2020—the date when we observed the last incidence. In fact, the figures reveal that, in the four zones, some individuals will still be infected up until 17 October 2020, with the median and MAP incidences being within the 95% CI.

After checking the adequacy of the prior distributions with respect to the posterior distribution and sampling, using the t-walk package, we also performed sampling of the posterior distribution using the Stan package. Stan is a Hamiltonian Monte Carlo (HMC)-based package designed for Bayesian statistical analysis; we refer the reader to [134] for more details about the software and [132,135] for insight into how the package can be used to perform inference and uncertainty quantification for compartmental epidemic models.

For Stan, we used the same support for the prior distribution as for t-walk, except for the incubation initial conditions. Using the same support for the incubation initial conditions in t-walk led to a very low overall proposal acceptance rate. As such, we used a different, shorter support range for all the incubation initial conditions prior distributions in t-walk, in order to obtain a reasonable proposal acceptance rate for the algorithm. However, in Stan, the support range for the same parameters were larger, as Stan is more robust. For the over-dispersion parameters, we used the transformed versions and sampled their inverses in Stan.

For the model at hand, we ran one chain with 1000 iterations for each of the 24 parameters and took the first 300 iterations as burn-in samples. The number of iterations was not the same as those obtained with t-walk, as Stan requires longer periods of time than t-walk to complete each iteration. However, from the analysis, we observed that the chains were importantly less correlated, and we thus obtained bigger effective sample sizes. Figure 12 and Figure 13, respectively, show the trace plots and the posterior densities for the samples of the estimated parameters. We obtained unit Rhat values from the Stan summary statistics output, an indication that all but two of the transitions converged. Additional Stan statistics—particularly, the diagnosis of the pairs plot of a subset of the estimated parameters (see Figure 14)—confirmed the convergence of all but two of the transitions of the chain during sampling.

Figure 15 shows the 95% CIs of the posterior distributions obtained from Stan. All the parameters, except for

β_{1}

,

γ_{1}

,

γ_{2}

, and

γ_{3}

, had short CIs. It is worth noting that (just as in t-walk) the samples of the inverse of the over-dispersion parameters

ν_{i}

from Stan had very narrow intervals.

The pair plots in Figure 14 show the posterior distributions of the infection and recovery parameters (diagonal figures) and scatter plots (off-diagonal figures) presenting the relationships between each of the samples. As there were many (24) parameters to estimate, it was not appealing to create a joint pairs plot for all of them. Inasmuch as this is the ideal procedure, doing this would lead to a figure with very tiny, non-visualizable figure entries. Thus, we chose to pair the infection and recovery rates (see Figure 14), as the scatter plots of these parameters exhibited interesting relationships. For instance, we can observe from the figure that there were strong linear positive posterior correlations between

β_{1}

and

γ_{1}

,

β_{2}

and

γ_{2}

, and

β_{4}

and

γ_{4}

. In contrast, the infection parameters for the four zones did not exhibit any important relationships among themselves. This was also the case for the recovery parameters.

The summary statistics of the samples obtained from Stan are shown in Table A2. These statistics were slightly different from those obtained with t-walk, as displayed in Table A1. The differences in the statistics of the prevalence initial conditions and the over-dispersion parameters were obvious, as we have already mentioned that, besides using different support ranges for the distributions of the prevalence initial conditions, we transformed and sampled the inverses of the over-dispersion parameters

ν_{i}

,

i = 1, 2, 3, 4

. Otherwise, the slight differences in the statistics of the other parameters could have been due to the difference in how the t-walk and Stan packages perform sampling. Notably, these differences would not lead to a different conclusion regarding the potency of COVID-19 infections in Hermosillo in 2020. The statistics in Table A2 reveal that the point estimates for the parameters were within the 95% CIs, as was the case for the estimated parameters in t-walk, as shown in Table A1.

The differences in the summary statistics of the samples obtained from both packages are further reflected in Figure 11 and Figure 16, which show the posterior predictive check figures for t-walk and Stan, respectively. We can observe that the mean, median, and observed incidences for each zone were all covered by the respective 95% CIs. In comparison, we can observe, from the t-walk output in Figure 11, that the 95% CI of the epidemic model incidences did not cover some observed daily incidences, while the same interval with Stan covered all of the training observed zonal incidences. We can thus conclude that the estimated parameters from Stan capture a more realistic uncertainty level, compared to those estimated by t-walk. Figure 11 and Figure 16 reveal that the COVID-19 infections in the four zones—and consequently, the whole of Hermosillo—peaked in mid-July of 2020, with a prediction of continued infections beyond 6 September 2020 (i.e., the last date of observed training daily incidences).

Table 2 provides the ranges for the parameters of the model, as obtained with reference to the COVID-19 literature. Our results from the GA, t-walk, and Stan packages (see Table 5, Table A1 and Table A2, respectively) indicated that the point estimates of the infection and incubation parameters (

β_{i}

and

κ_{i}

,

i = 1, 2, 3, 4

) that best fit the incidences for the four zones were within the ranges derived from the literature. However, this was not the case for the recovery parameters (

γ_{i}

), whose estimates were generally close to the one-day recovery period average for the four zones. This result is not surprising, as our incidence data for the four zones were inflated by a large under-reporting factor. This inflation, besides indicating a possible lack of systematic testing for COVID-19 in Hermosillo, also accounted for the existence of sub-clinical and asymptomatic COVID-19 patients in the four zones under study. Going by how well the models fit the data (see Figure 7, Figure 10, Figure 11 and Figure 16), we believe that accounting for incidence under-reporting by a factor of 15 (as reported by [86]) is realistic.

We further evaluated the prediction capability of the multi-patch model by checking the proportion of observed zonal prediction incidences that were within the 95% CIs of the zone model incidences. Table 6 shows these proportions obtained from Stan (see Figure 16) and t-walk (see Figure 11). From this table, we can see that Stan produced 95% prediction bands covering more than 50% of the observed prediction incidences for all of the zones. On the other hand, t-walk provided 95% CIs large enough to cover less than 50% of observed prediction incidences for all the four zones, with zone 1 providing the narrowest simulated model incidence CIs, covering only 22.22% of the observed prediction incidence. In effect, we can conclude that Stan outperforms t-walk, with regard to the percentage of covered observed daily prediction incidence for all of the considered zones.

6.3. Comparison to a Single-Patch Model

To assess the improvement obtained by using a more complex model, compared to a typical single population model, we fit a single-patch model (the global SEIRS model) and compared its predictive performance against the multi-patch model with mobility. The systems of differential equations forming the single-patch model are presented in Equation (14).

\{\begin{matrix} \dot{S} = Λ - β S I / N - μ S + τ R \\ \dot{E} = β S I / N - (κ + μ) E \\ \dot{I} = κ E - (γ + ϕ + μ) I \\ \dot{R} = γ I - (μ + τ) R . \end{matrix}

(14)

We run the single-patch model using similar birth, natural death, and disease-induced mortality rate values as in the multi-patch model. We estimated the infection (

β

), incubation (

κ

), and recovery (

γ

) rates, as well as the initial conditions

E_{0}

and

I_{0}

, for the single-patch model. The prior distributions of these parameters and the likelihood of the global observed incidence remained the same as in the multi-patch model. The initial population was taken as the sum of the population sizes for zones 1–4, as shown in Figure 1; that is,

N_{0} = \sum_{i = 1}^{4} N_{i} (0) = 858, 812

.

We used the observed global incidence in the period of 26 February 2020 to 6 September 2020 in order to train the global single-patch model, then used the incidence in the period of 7 September 2020 to 17 October 2020 to assess the prediction efficiency of the model. Table 7 provides the summary statistics of the estimated parameters. For comparison with the multi-patch model, we summed the predicted mean and median incidences for all of the zones and computed the efficiency measures. The efficiency measures used in this case were the Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE), defined as

RMSE = \sqrt{\frac{1}{n} \sum_{t = 1}^{n} {(Y_{t} - {\hat{Y}}_{t})}^{2}} and MAPE = \frac{100}{n} \sum_{t = 1}^{n} |\frac{Y_{t} - {\hat{Y}}_{t}}{Y_{t}}|,

where n is the total number of the predicted incidences,

Y_{t}

is the actual observed incidence, and

{\hat{Y}}_{t}

is the predicted incidence (which, in this case, is the mean and median incidences for both the multi-patch and global single-patch models).

Figure 17 shows the global single-patch predicted incidences together with the 95% credible incidence intervals, and Table 8 provides the efficiency measures of the predictions for both the multi- and single-patch models. From this table (and, indeed, by comparing Figure 16 and Figure 17), we can see that the multi-patch model outperformed the global single-patch model, as it obtained predicted incidences with lower RMSE and MAPE values. Moreover, Figure 16 reveals that a bigger percentage of the observed prediction and model-predicted incidences from the multi-patch model were within the 95% credible incidence interval. Thus, we can conclude that, in terms of modeling COVID-19 in Hermosillo, using the multi-patch model with mobility represents an important improvement, when compared to the single-patch model.

7. Conclusions

In this work, we utilized three techniques to estimate the parameters and perform inference and uncertainty quantification on a multi-patch epidemic model including mobility, residency, and demography factors. An analysis was conducted using mobile phone GPS data and confirmed COVID-19 cases in four zones of Hermosillo, Mexico. The model comprises two sets of parameters—20 mobility and residence time parameters and 20 epidemiological parameters (including the incubation and prevalence initial conditions)—which were all estimated.

The complexity of epidemic models makes them susceptible to the problem of parameter non-identifiability, a complication which we addressed by using two distinctive data sets and different estimation techniques to estimate the two sets of parameters separately. In the first step, we used a Brownian Bridge model and mobile phone GPS data to estimate the first set of 20 mobility and residence times for the four zones in Hermosillo. The next set of the model epidemiological parameters were estimated using confirmed positive COVID-19 cases and two estimation techniques: deterministic inversion using GA and probabilistic inversion, and the latter using two different MCMC methods implemented through the t-walk and Stan statistical packages.

The incidence outputs of the epidemic model were very sensitive to slight changes in its parameters, especially the infection, incubation, and recovery rates. This necessitated the use of the deterministic inversion technique as the first step of estimation, which provided us with an idea of the possible parametric values and model incidences that best fit the observed incidences. This first step of estimation thus allowed for the determination of the scope of the parametric space within which the support of the posterior distribution and the initial sampling values used in the probabilistic inversion stage (with t-walk and Stan) could be delimited.

We are not concerned with a detailed comparison of t-walk and Stan, as there could be many factors affecting their performance, which undoubtedly vary among different models. Some of these factors may include the chosen parametrization of the model, the prior distribution, and initial values from where the sampling starts. However, we should mention that some studies have established that it may be difficult to use MCMC-based inference tools to model systems of ODEs, as the relevant likelihood functions may contain several local minima that causes them to violate the standard regulatory conditions. In these cases, Stan offers high statistical efficiency and, in complex scenarios, may produce more accurate results when compared to t-walk. Indeed, Stan performed better than t-walk for the presented epidemic model, as the posterior distributions were characterized by highly correlated parameter spaces. We observed this fact in the present study based on the large ESS proportions obtained from Stan, compared to those obtained from t-walk.

The estimation results obtained in this study indicated that the infection and incubation rates for the four considered zones in Hermosillo were within the ranges postulated in various studies in the COVID-19 literature. However, the recovery parameters fell outside the estimated ranges reported in other COVID-19-based studies. This deviation was, however, justified as possibly resulting from the inflation of incidences in the four zones by a factor of 15, in order to account for under-reporting. Regarding the estimated parameters, the model presented a good fit to COVID-19 infections in the four zones and, consequently, the whole city of Hermosillo; furthermore, under the model, the cases also peaked in mid-July of 2020.

The epidemic model and the results obtained in this study can be useful to public health officials as they not only provide better predictions, but as mobility parameters that can be modified to assess the effectiveness of different levels of mobility restrictions were explicitly introduced. These restrictions can be global or applied to only some patches. We believe that the detailed model and methods can be developed into a user-friendly tool to guide public health policies for the management and mitigation of emerging and re-emerging infectious diseases.

The epidemic model detailed in this article was confined within the limits of modeling a single strain of the COVID-19 virus. However, like many infectious diseases, multiple strains of COVID-19 have emerged as a result of its genetic mutation. For example, four mutants of the pathogen were detected within the first one and a half years of the pandemic. Therefore, it would be of great relevance to extend the proposed epidemic model to include the dynamics of multiple COVID-19 pathogenic strains. Such an extension could lead to better understanding and management of future infectious outbreaks.

In addition, besides urban mobility, the epidemic model used in this research did not account for government intervention measures, such as vaccination and regional or national mobility. As the vaccination campaign in Mexico started in December 2020, this factor is absent; however, it would be desirable to introduce these (and other) mitigation strategies in future models, in order to model subsequent outbreaks of COVID-19 and other infectious diseases.

It is worth noting that the assumptions for the proposed model include those for the ODEs to describe the dynamics within each patch, while ODEs can be effective in approximating dynamics in many scenarios, there are cases where—even with homogeneous mixing—the mathematical system may fail to accurately represent the behavior of a small population.

One reason for this is the assumption of continuous variables and smooth transitions inherent to ODEs. In small populations, discrete effects and stochastic events can have a significant impact on the dynamics. These discrete effects, such as random fluctuations or individual interactions, introduce variability that is not captured by the ODEs, leading to the model inaccurately reflecting the true behavior of the system.

To address this, one option for small spatial scales and patch sizes is to leverage computational techniques such as agent-based modeling. This approach captures the important variability that arises from individual interactions [136,137,138,139,140,141]. Additionally, in order to reduce the variability of simulations, relevant contact patterns produced by street or building layouts can be incorporated. At a more detailed level, modeling individual interactions as a network of contacts still allows for fitting or performing statistical inference using computer-intensive methods [55,142,143,144]. For these reasons, scaling up such methods to slightly larger patches can pose serious challenges that require careful considerations.

Author Contributions

Conceptualization, A.O.A. and L.L.R.-R.; methodology, A.O.A. and L.L.R.-R.; software, A.O.A., L.L.R.-R., and J.F.E.; validation, A.O.A., L.L.R.-R., and J.F.E.; formal analysis, A.O.A. and L.L.R.-R.; investigation, A.O.A. and L.L.R.-R.; resources, A.O.A., L.L.R.-R., and J.F.E.; data curation, A.O.A., L.L.R.-R., and J.F.E.; writing—original draft preparation, A.O.A. and L.L.R.-R.; writing—review and editing, A.O.A. and L.L.R.-R.; visualization, A.O.A. and L.L.R.-R.; supervision, L.L.R.-R.; project administration, L.L.R.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was facilitated by the CONACyT scholarship No. 1013383.

Institutional Review Board Statement

Not applicable, as this study did not require any ethical approval. The use of data in this work complies with the agreement (referenced as 12615-5700000-000412) signed between University of Sonora (UNISON) and LUMEX CONSULTORES, S. C., that protects private information and restricts its use to non profit and academic activities.

Informed Consent Statement

Not applicable.

Data Availability Statement

The global Hermosillo COVID-19 data are publicly available from the CONACyT website https://datos.COVID-19.conacyt.mx (accesed on 1 November 2022). The GPS mobility and zonal COVID-19 data utilized in this study are available upon request from the corresponding author. These data sets are not publicly accessible, due to the inclusion of private and personal individual information. The geo-referenced COVID-19 data provided by the Secretaría de Salud del Estado de Sonora (SSA) are protected by confidentiality agreements and in accordance with the laws of personal data of Mexico, as they contain sensitive personal information such as medical records and residence locations. Detailed information regarding the processing, analysis, and visualization of this data can be found at https://www.mat.uson.mx/web/index.php/coronavirus-COVID-19/ (accesed on 1 November 2022), specifically on the official website of the Mathematics Department at UNISON. The codes used in this research are publicly available at https://github.com/AlbertAkuno/Inference (accesed on 1 November 2022).

Acknowledgments

We would like to thank the Supercomputers Laboratory at CIMAT for providing us with the necessary computer resources that we used to run the codes in this work, and LUMEX CONSULTORES, S.C. for contributing the mobile phone GPS data and approving its use for academic purposes.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Posterior Statistics from t-Walk and STAN

Appendix A.1. Posterior Statistics from t-Walk

Table A1. Statistics of model parameters estimated using probabilistic (t-walk) inversion.

	Probabilistic Inversion (t-Walk) Statistics
Parameter	Mean	SD	$q_{0.025}$	$q_{0.250}$	$q_{0.5}$	$q_{0.750}$	$q_{0.975}$	ESS	pESS
$β_{1}$	1.3552	0.1039	1.1166	1.2951	1.3724	1.4307	1.5033	25	0.0008
$β_{2}$	1.2258	0.1014	1.0668	1.1321	1.2366	1.3036	1.4168	22	0.0007
$β_{3}$	0.4223	0.0663	0.3046	0.3734	0.4225	0.4629	0.5734	28	0.0009
$β_{4}$	1.0839	0.1404	0.8218	0.9383	1.1300	1.1700	1.2970	18	0.0006
$κ_{1}$	0.4649	0.0368	0.3754	0.4510	0.4757	0.4888	0.4989	233	0.0073
$κ_{2}$	0.4730	0.0294	0.4013	0.4627	0.4811	0.4925	0.4994	112	0.0035
$κ_{3}$	0.4236	0.0575	0.2735	0.3940	0.4320	0.4683	0.4964	122	0.0038
$κ_{4}$	0.2614	0.0401	0.1991	0.2318	0.2526	0.2897	0.3436	21	0.0006
$γ_{1}$	1.3740	0.1082	1.1382	1.3000	1.4047	1.4649	1.4977	22	0.0007
$γ_{2}$	1.3461	0.1243	1.1416	1.2259	1.3920	1.4620	1.4945	20	0.0006
$γ_{3}$	1.3196	0.1551	0.9050	1.2604	1.3547	1.4367	1.4959	32	0.0010
$γ_{4}$	0.5198	0.0673	0.4061	0.4546	0.5358	0.5669	0.6372	18	0.0006
$E_{1, 0}$	61.3153	3.1875	53.1884	59.6405	62.2854	63.8354	64.8765	92	0.0029
$I_{1, 0}$	6.6851	2.6790	0.6896	4.9450	7.3324	8.9846	9.8718	175	0.0055
$E_{2, 0}$	80.5102	3.8104	71.4076	78.1892	81.6369	83.5947	84.8731	88	0.0027
$I_{2, 0}$	6.3754	2.8778	0.3521	4.4653	7.1644	8.8401	9.9016	75	0.0023
$E_{3, 0}$	27.8516	1.9766	22.5930	26.8572	28.4460	29.3748	29.9438	160	0.0050
$I_{3, 0}$	5.7194	2.8518	0.3452	3.4194	6.0118	8.2424	9.8920	288	0.0090
$E_{4, 0}$	7.9665	1.5100	4.2772	7.1346	8.2710	9.1433	9.9091	87	0.0027
$I_{4, 0}$	1.6821	1.5480	0.0551	0.4926	1.2054	2.3330	5.7315	70	0.0022
$ν_{1}$	9.6957	0.2948	8.8788	9.5804	9.7863	9.9084	9.9933	226	0.0071
$ν_{2}$	9.1320	0.7812	7.3177	8.7525	9.3166	9.7064	9.9798	141	0.0044
$ν_{3}$	9.7135	0.3543	8.9596	9.6075	9.8080	9.9251	9.9939	147	0.0046
$ν_{4}$	7.8458	1.3651	5.2517	6.8248	7.9591	9.0183	9.9349	35	0.0010

Appendix A.2. Posterior Statistics from Stan

Table A2. Statistics of model parameters estimated using probabilistic (Stan) inversion.

	Probabilistic Inversion (Stan) Statistics
Parameter	Mean	SD	$q_{0.025}$	$q_{0.250}$	$q_{0.5}$	$q_{0.750}$	$q_{0.975}$	ESS	pESS
$β_{1}$	1.2476	0.1080	1.0107	1.1807	1.2498	1.3193	1.4317	255	0.3643
$β_{2}$	1.1591	0.0546	1.0432	1.1294	1.1698	1.2000	1.2460	213	0.3043
$β_{3}$	0.3811	0.0380	0.3129	0.3543	0.3805	0.4057	0.4536	244	0.3486
$β_{4}$	0.7728	0.0245	0.7489	0.7548	0.7656	0.7855	0.8440	173	0.2471
$κ_{1}$	0.4409	0.0377	0.3606	0.4166	0.4427	0.4729	0.4976	188	0.2686
$κ_{2}$	0.4660	0.0282	0.3973	0.4501	0.4728	0.4881	0.4985	248	0.3543
$κ_{3}$	0.4019	0.0517	0.2978	0.3641	0.4053	0.4385	0.4930	244	0.3486
$κ_{4}$	0.3766	0.0280	0.3228	0.3578	0.3758	0.3933	0.4365	196	0.2800
$γ_{1}$	1.2275	0.1359	0.9530	1.1274	1.2277	1.3274	1.4783	241	0.3443
$γ_{2}$	1.4091	0.0795	1.2259	1.3692	1.4280	1.4729	1.4980	209	0.2986
$γ_{3}$	1.3743	0.0983	1.1104	1.3161	1.3922	1.4521	1.4958	204	0.2914
$γ_{4}$	0.3550	0.0113	0.3408	0.3472	0.3514	0.3612	0.3846	188	0.2686
$E_{1, 0}$	94.4414	5.1047	80.4055	92.2135	95.8150	98.3510	99.7569	234	0.3343
$I_{1, 0}$	6.5046	2.5618	1.0641	4.6339	7.0732	8.6031	9.8693	202	0.2886
$E_{2, 0}$	94.8808	4.5037	83.3513	92.6637	95.9344	98.4198	99.8949	143	0.2043
$I_{2, 0}$	5.8008	2.8349	0.3588	3.6777	6.1113	8.2312	9.8222	217	0.3100
$E_{3, 0}$	43.6993	5.1480	32.0696	40.9228	45.1198	47.7972	49.8012	224	0.3200
$I_{3, 0}$	5.2505	2.8159	0.1604	2.5964	5.4800	7.5498	9.6372	217	0.3100
$E_{4, 0}$	9.1961	1.9174	5.6108	7.9395	9.2555	10.4385	12.6814	262	0.3743
$I_{4, 0}$	0.8923	0.9075	0.0397	0.2712	0.6660	1.1865	3.4709	202	0.2886
$ν_{1}$	9.6200	0.3765	8.6530	9.4789	9.7231	9.8887	9.9887	167	0.2386
$ν_{2}$	8.8543	0.7276	7.3756	8.3520	8.9653	9.4043	9.8811	222	0.3171
$ν_{3}$	7.4251	1.1378	5.1812	6.6442	7.4181	8.2871	9.4924	146	0.2086
$ν_{4}$	7.1626	1.0035	5.2185	6.4972	7.1173	7.7591	9.2361	151	0.2158

References

Jašková, D.; Havierniková, K. The human resources as an important factor of regional development. Int. J. Bus. Soc. 2020, 21, 1464–1478. [Google Scholar] [CrossRef]
Krypa, N. Social economic development and the human resources management. Acad. J. Interdiscip. Stud. 2017, 6, 73. [Google Scholar] [CrossRef] [Green Version]
Peterson, E.W.F. The role of population in economic growth. Sage Open 2017, 7, 2158244017736094. [Google Scholar] [CrossRef] [Green Version]
Ramírez-Ramírez, L.L.; Montoya, J.A.; Espinoza, J.F.; Mehta, C.; Akuno, A.O.; Bui-Thanh, T. Use of mobile phone sensing data to estimate residence and mobility times in urban patches during the COVID-19 epidemic: The case of the 2020 outbreak in Hermosillo, Mexico. arXiv 2022, arXiv:2212.09963. [Google Scholar]
Vlad, C.A.; Ungureanu, G.; Militaru, M. Human resources contribution to economic growth. Rev. Econ. 2012, 850. [Google Scholar]
Oaks, S.C., Jr.; Shope, R.E.; Lederberg, J. Emerging Infections: Microbial Threats to Health in the United States; National Academies Press: Washington, DC, USA, 1992. [Google Scholar]
Brower, J.; Chalk, P. The Global Threat of New and Reemerging Infectious Diseases: Reconciling US National Security and Public Health Policy, Santa Monica: Rand, 2003. Available online: https://www.rand.org/pubs/monograph_reports/MR1602.html (accessed on 1 August 2022).
Adiga, A.; Dubhashi, D.; Lewis, B.; Marathe, M.; Venkatramanan, S.; Vullikanti, A. Mathematical models for covid-19 pandemic: A comparative analysis. J. Indian Inst. Sci. 2020, 100, 793–807. [Google Scholar] [CrossRef]
Schmidtchen, M.; Tse, O.; Wackerle, S. A multiscale approach for spatially inhomogeneous disease dynamics. arXiv 2016, arXiv:1602.05927. [Google Scholar] [CrossRef] [Green Version]
Castillo-Chavez, C.; Bichara, D.; Morin, B.R. Perspectives on the role of mobility, behavior, and time scales in the spread of diseases. Proc. Natl. Acad. Sci. USA 2016, 113, 14582–14588. [Google Scholar] [CrossRef] [Green Version]
Rvachev, L.A.; Longini, I.M., Jr. A mathematical model for the global spread of influenza. Math. Biosci. 1985, 75, 3–22. [Google Scholar] [CrossRef]
Baroyan, O.; Rvachev, L.; Basilevsky, U.; Ermakov, V.; Frank, K.; Rvachev, M.; Shashkov, V. Computer modelling of influenza epidemics for the whole country (USSR). Adv. Appl. Probab. 1971, 3, 224–226. [Google Scholar] [CrossRef]
Anifandis, G.; Tempest, H.G.; Oliva, R.; Swanson, G.M.; Simopoulou, M.; Easley, C.A.; Primig, M.; Messini, C.I.; Turek, P.J.; Sutovsky, P.; et al. COVID-19 and human reproduction: A pandemic that packs a serious punch. Syst. Biol. Reprod. Med. 2021, 67, 3–23. [Google Scholar] [CrossRef] [PubMed]
Prieto, K. Current forecast of COVID-19 in Mexico: A Bayesian and machine learning approaches. PloS ONE 2022, 17, e0259958. [Google Scholar] [CrossRef] [PubMed]
Weston, D.; Hauck, K.; Amlôt, R. Infection prevention behaviour and infectious disease modelling: A review of the literature and recommendations for the future. BMC Public Health 2018, 18, 336. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Luo, M.; Qin, S.; Tan, B.; Cai, M.; Yue, Y.; Xiong, Q. Population Mobility and the Transmission Risk of the COVID-19 in Wuhan, China. ISPRS Int. J. Geo-Inf. 2021, 10, 395. [Google Scholar] [CrossRef]
Changruenngam, S.; Bicout, D.J.; Modchang, C. How the individual human mobility spatio-temporally shapes the disease transmission dynamics. Sci. Rep. 2020, 10, 11325. [Google Scholar] [CrossRef]
Findlater, A.; Bogoch, I.I. Human mobility and the global spread of infectious diseases: A focus on air travel. Trends Parasitol. 2018, 34, 772–783. [Google Scholar] [CrossRef]
Danon, L.; Ford, A.P.; House, T.; Jewell, C.P.; Keeling, M.J.; Roberts, G.O.; Ross, J.V.; Vernon, M.C. Networks and the epidemiology of infectious disease. Interdiscip. Perspect. Infect. Dis. 2011, 2011, 284909. [Google Scholar] [CrossRef] [Green Version]
Apostolopoulos, Y.; Sönmez, S.F. Population Mobility and Infectious Disease; Springer: New York, NY, USA, 2007. [Google Scholar]
Gushulak, B.D.; MacPherson, D.W. Population mobility and infectious diseases: The diminishing impact of classical infectious diseases and new approaches for the 21st century. Clin. Infect. Dis. 2000, 31, 776–780. [Google Scholar] [CrossRef]
Dalziel, B.D.; Pourbohloul, B.; Ellner, S.P. Human mobility patterns predict divergent epidemic dynamics among cities. Proc. R. Soc. B Biol. Sci. 2013, 280, 20130763. [Google Scholar] [CrossRef]
Ni, S.; Weng, W. Impact of travel patterns on epidemic dynamics in heterogeneous spatial metapopulation networks. Phys. Rev. E 2009, 79, 016111. [Google Scholar] [CrossRef]
Tang, M.; Liu, Z.; Li, B. Epidemic spreading by objective traveling. EPL Europhys. Lett. 2009, 87, 18005. [Google Scholar]
Yang, Z.; Cui, A.X.; Zhou, T. Impact of heterogeneous human activities on epidemic spreading. Phys. A Stat. Mech. Its Appl. 2011, 390, 4543–4548. [Google Scholar]
Meloni, S.; Perra, N.; Arenas, A.; Gómez, S.; Moreno, Y.; Vespignani, A. Modeling human mobility responses to the large-scale spreading of infectious diseases. Sci. Rep. 2011, 1, 1–7. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Wang, B.; Cao, L.; Suzuki, H.; Aihara, K. Safety-information-driven human mobility patterns with metapopulation epidemic dynamics. Sci. Rep. 2012, 2, 887. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Epstein, J.M.; Goedecke, D.M.; Yu, F.; Morris, R.J.; Wagener, D.K.; Bobashev, G.V. Controlling pandemic flu: The value of international air travel restrictions. PLoS ONE 2007, 2, e401. [Google Scholar] [CrossRef]
Colizza, V.; Barrat, A.; Barthelemy, M.; Valleron, A.J.; Vespignani, A. Modeling the worldwide spread of pandemic influenza: Baseline case and containment interventions. PLoS Med. 2007, 4, e13. [Google Scholar] [CrossRef] [Green Version]
Bajardi, P.; Poletto, C.; Ramasco, J.J.; Tizzoni, M.; Colizza, V.; Vespignani, A. Human mobility networks, travel restrictions, and the global spread of 2009 H1N1 pandemic. PLoS ONE 2011, 6, e16591. [Google Scholar] [CrossRef] [Green Version]
Ferguson, N.M.; Cummings, D.A.; Fraser, C.; Cajka, J.C.; Cooley, P.C.; Burke, D.S. Strategies for mitigating an influenza pandemic. Nature 2006, 442, 448–452. [Google Scholar] [CrossRef]
Keeling, M.; Rohani, P. Modeling Infectious Diseases Humans and Animals; Princeton University Press: Princeton, NJ, USA, 2008. [Google Scholar]
Frıas-Martınez, E.; Williamson, G.; Frıas-Martınez, V. Agent-based modelling of epidemic spreading using social networks and human mobility patterns. In Proceedings of the 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and IEEE Third International Conference on Social Computing, Citeseer, Boston, MA, USA, 9–11 October 2011; pp. 57–64. [Google Scholar]
Cota, W.; Soriano-Paños, D.; Arenas, A.; Ferreira, S.C.; Gómez-Gardeñes, J. Infectious disease dynamics in metapopulations with heterogeneous transmission and recurrent mobility. New J. Phys. 2021, 23, 073019. [Google Scholar] [CrossRef]
Balcan, D.; Gonçalves, B.; Hu, H.; Ramasco, J.J.; Colizza, V.; Vespignani, A. Modeling the spatial spread of infectious diseases: The GLobal Epidemic and Mobility computational model. J. Comput. Sci. 2010, 1, 132–145. [Google Scholar] [CrossRef] [Green Version]
Driessche, P.v. Spatial structure: Patch models. In Mathematical Epidemiology; Springer-Verlag: Berlin/Heidelberg, Germany, 2008; pp. 179–189. [Google Scholar]
Herrera-Valdez, M.A.; Cruz-Aponte, M.; Castillo-Chavez, C. Multiple outbreaks for the same pandemic: Local transportation and social distancing explain the different" waves" of A-H1N1pdm cases observed in México during 2009. Math. Biosci. Eng. 2011, 8, 21. [Google Scholar]
Khan, K.; Arino, J.; Hu, W.; Raposo, P.; Sears, J.; Calderon, F.; Heidebrecht, C.; Macdonald, M.; Liauw, J.; Chan, A.; et al. Spread of a novel influenza A (H1N1) virus via global airline transportation. N. Engl. J. Med. 2009, 361, 212–214. [Google Scholar] [CrossRef] [PubMed]
Bichara, D.; Kang, Y.; Castillo-Chavez, C.; Horan, R.; Perrings, C. SIS and SIR Epidemic Models Under Virtual Dispersal. Bull. Math. Biol. 2015, 77, 2004–2034. [Google Scholar] [CrossRef] [Green Version]
Bichara, D.; Abderrahman, I. Multi-patch and multi-group epidemic models: A new framework. J. Math. Biol. 2018, 77, 107–134. [Google Scholar] [CrossRef] [Green Version]
Riley, S. Large-scale spatial-transmission models of infectious disease. Science 2007, 316, 1298–1301. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Akuno, A.O.; Ramirez-Ramirez, L.L.; Mehta, C.; Krishnanuni, C.; Bui-Thanh, T.; Montoya, J.A. Multi-Patch Epidemic Models with Partial Mobility, Residency, and Demography. Available online: https://ssrn.com/abstract=4414906 (accessed on 1 November 2022).
Arino, J.; van den Driessche, P. A multi-city epidemic model. Math. Popul. Stud. 2003, 10, 175–193. [Google Scholar] [CrossRef]
Brauer, F. Mathematical epidemiology: Past, present, and future. Infect. Dis. Model. 2017, 2, 113–127. [Google Scholar] [CrossRef] [PubMed]
Moualeu, D.; Bowong, S.; Tsanou, B.; Temgoua, A. A patchy model for the transmission dynamics of tuberculosis in sub-Saharan Africa. Int. J. Dyn. Control 2018, 6, 122–139. [Google Scholar] [CrossRef]
Salmani, M.; van den Driessche, P. A model for disease transmission in a patchy environment. Discret. Contin. Dyn. Syst.-B 2006, 6, 185. [Google Scholar] [CrossRef]
Phaijoo, G.R.; Gurung, D.B. Mathematical study of dengue disease transmission in multi-patch environment. Appl. Math. 2016, 7, 1521. [Google Scholar] [CrossRef] [Green Version]
Swallow, B.; Birrell, P.; Blake, J.; Burgman, M.; Challenor, P.; Coffeng, L.E.; Dawid, P.; De Angelis, D.; Goldstein, M.; Hemming, V.; et al. Challenges in estimation, uncertainty quantification and elicitation for pandemic modelling. Epidemics 2022, 38, 100547. [Google Scholar] [CrossRef] [PubMed]
Capistran, M.A.; Capella, A.; Christen, J.A. Forecasting hospital demand in metropolitan areas during the current COVID-19 pandemic and estimates of lockdown-induced 2nd waves. PLoS ONE 2021, 16, e0245669. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.; Wang, P.; Zhang, S.; Pan, H. Uncertainty Modeling of a Modified SEIR Epidemic Model for COVID-19. Biology 2022, 11, 1157. [Google Scholar] [CrossRef] [PubMed]
Chen, P.; Wu, K.; Ghattas, O. Bayesian inference of heterogeneous epidemic models: Application to COVID-19 spread accounting for long-term care facilities. Comput. Methods Appl. Mech. Eng. 2021, 385, 114020. [Google Scholar] [CrossRef]
Huppert, A.; Katriel, G. Mathematical modelling and prediction in infectious disease epidemiology. Clin. Microbiol. Infect. 2013, 19, 999–1005. [Google Scholar] [CrossRef] [Green Version]
Márquez Urbina, J.U.; González Farías, G.; Ramírez Ramírez, L.L.; Rodríguez González, D.I. A multi-source global-local model for epidemic management. PLos ONE 2022, 17, e0261650. [Google Scholar] [CrossRef]
Groendyke, C.; Welch, D.; Hunter, D.R. Bayesian inference for contact networks given epidemic data. Scand. J. Stat. 2011, 38, 600–616. [Google Scholar] [CrossRef]
Dutta, R.; Mira, A.; Onnela, J.P. Bayesian inference of spreading processes on networks. Proc. R. Soc. A Math. Phys. Eng. Sci. 2018, 474, 20180129. [Google Scholar] [CrossRef]
Engl, H.W.; Hanke, M.; Neubauer, A. Regularization of Inverse Problems; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2000; 375p. [Google Scholar]
Tarantola, A. Inverse Problem Theory and Methods for Model Parameter Estimation; SIAM: Philadelphia, PA, USA, 2005. [Google Scholar]
Vogel, C.R. Computational Methods for Inverse Problems (Frontiers in Applied Mathematics); SIAM: Philadelphia, PA, USA, 2002. [Google Scholar]
Alexanderian, A. Optimal experimental design for infinite-dimensional Bayesian inverse problems governed by PDEs: A review. Inverse Probl. 2021, 37, 043001. [Google Scholar] [CrossRef]
Wesolowski, A.; Buckee, C.O.; Engø-Monsen, K.; Metcalf, C.J.E. Connecting mobility to infectious diseases: The promise and limits of mobile phone data. J. Infect. Dis. 2016, 214, S414–S420. [Google Scholar] [CrossRef] [Green Version]
Wei, J.T.; Liu, Y.X.; Zhu, Y.C.; Qian, J.; Ye, R.Z.; Li, C.Y.; Ji, X.K.; Li, H.K.; Qi, C.; Wang, Y.; et al. Impacts of transportation and meteorological factors on the transmission of COVID-19. Int. J. Hyg. Environ. Health 2020, 230, 113610. [Google Scholar] [CrossRef] [PubMed]
Silva, P.C.; Batista, P.V.; Lima, H.S.; Alves, M.A.; Guimarães, F.G.; Silva, R.C. COVID-ABS: An agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions. Chaos Solitons Fractals 2020, 139, 110088. [Google Scholar] [CrossRef] [PubMed]
Cartenì, A.; Di Francesco, L.; Martino, M. How mobility habits influenced the spread of the COVID-19 pandemic: Results from the Italian case study. Sci. Total Environ. 2020, 741, 140489. [Google Scholar] [CrossRef] [PubMed]
Ogen, Y. Assessing nitrogen dioxide (NO2) levels as a contributing factor to coronavirus (COVID-19) fatality. Sci. Total Environ. 2020, 726, 138605. [Google Scholar] [CrossRef]
He, G.; Pan, Y.; Tanaka, T. The short-term impacts of COVID-19 lockdown on urban air pollution in China. Nat. Sustain. 2020, 3, 1005–1011. [Google Scholar] [CrossRef]
Iacus, S.M.; Santamaria, C.; Sermi, F.; Spyratos, S.; Tarchi, D.; Vespe, M. Human mobility and COVID-19 initial dynamics. Nonlinear Dyn. 2020, 101, 1901–1919. [Google Scholar] [CrossRef]
Alessandretti, L. What human mobility data tell us about COVID-19 spread. Nat. Rev. Phys. 2022, 4, 12–13. [Google Scholar] [CrossRef]
Rahman, M.M.; Paul, K.C.; Hossain, M.A.; Ali, G.M.N.; Rahman, M.S.; Thill, J.C. Machine learning on the COVID-19 pandemic, human mobility and air quality: A review. IEEE Access 2021, 9, 72420–72450. [Google Scholar] [CrossRef]
Hu, T.; Wang, S.; She, B.; Zhang, M.; Huang, X.; Cui, Y.; Khuri, J.; Hu, Y.; Fu, X.; Wang, X.; et al. Human mobility data in the COVID-19 pandemic: Characteristics, applications, and challenges. Int. J. Digit. Earth 2021, 14, 1126–1147. [Google Scholar] [CrossRef]
Lin, C.H.; Wen, T.H. How Spatial Epidemiology Helps Understand Infectious Human Disease Transmission. Trop. Med. Infect. Dis. 2022, 7, 164. [Google Scholar] [CrossRef]
Perez, L.; Dragicevic, S. An agent-based approach for modeling dynamics of contagious disease spread. Int. J. Health Geogr. 2009, 8, 1–17. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lazebnik, T.; Blumrosen, G. Advanced multi-mutation with intervention policies pandemic model. IEEE Access 2022, 10, 22769–22781. [Google Scholar] [CrossRef]
Fudolig, M.; Howard, R. The local stability of a modified multi-strain SIR model for emerging viral strains. PLoS ONE 2020, 15, e0243408. [Google Scholar] [CrossRef] [PubMed]
Yusuf, A.; Qureshi, S.; Inc, M.; Aliyu, A.I.; Baleanu, D.; Shaikh, A.A. Two-strain epidemic model involving fractional derivative with Mittag-Leffler kernel. Chaos Interdiscip. J. Nonlinear Sci. 2018, 28, 123121. [Google Scholar] [CrossRef] [PubMed]
Lazebnik, T.; Bunimovich-Mendrazitsky, S. Generic approach for mathematical model of multi-strain pandemics. PLoS ONE 2022, 17, e0260683. [Google Scholar] [CrossRef] [PubMed]
Khyar, O.; Allali, K. Global dynamics of a multi-strain SEIR epidemic model with general incidence rates: Application to COVID-19 pandemic. Nonlinear Dyn. 2020, 102, 489–509. [Google Scholar] [CrossRef]
Conacyt. Covid-19 México. 2020. Available online: https://datos.covid-19.conacyt.mx/ (accessed on 1 November 2022).
Del Valle, S.Y.; McMahon, B.H.; Asher, J.; Hatchett, R.; Lega, J.C.; Brown, H.E.; Leany, M.E.; Pantazis, Y.; Roberts, D.J.; Moore, S.; et al. Summary results of the 2014-2015 DARPA Chikungunya challenge. BMC Infect. Dis. 2018, 18, 245. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lai, C.C.; Liu, Y.H.; Wang, C.Y.; Wang, Y.H.; Hsueh, S.C.; Yen, M.Y.; Ko, W.C.; Hsueh, P.R. Asymptomatic carrier state, acute respiratory disease, and pneumonia due to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2): Facts and myths. J. Microbiol. Immunol. Infect. 2020, 53, 404–412. [Google Scholar] [CrossRef]
Kalajdzievska, D.; Li, M.Y. Modeling the effects of carriers on transmission dynamics of infectious diseases. Math. Biosci. Eng. 2011, 8, 711–722. [Google Scholar]
Duffy, M.R.; Chen, T.H.; Hancock, W.T.; Powers, A.M.; Kool, J.L.; Lanciotti, R.S.; Pretrick, M.; Marfel, M.; Holzbauer, S.; Dubray, C.; et al. Zika virus outbreak on Yap Island, federated states of Micronesia. N. Engl. J. Med. 2009, 360, 2536–2543. [Google Scholar] [CrossRef]
Doll, M.E.; Pryor, R.; Mackey, D.; Doern, C.D.; Bryson, A.; Bailey, P.; Cooper, K.; Godbout, E.; Stevens, M.P.; Bearman, G. Utility of retesting for diagnosis of SARS-CoV-2/COVID-19 in hospitalized patients: Impact of the interval between tests. Infect. Control. Hosp. Epidemiol. 2020, 41, 859–861. [Google Scholar] [CrossRef] [PubMed]
Esbin, M.N.; Whitney, O.N.; Chong, S.; Maurer, A.; Darzacq, X.; Tjian, R. Overcoming the bottleneck to widespread testing: A rapid review of nucleic acid testing approaches for COVID-19 detection. RNA 2020, 26, 771–783. [Google Scholar] [CrossRef] [PubMed]
Li, R.; Pei, S.; Chen, B.; Song, Y.; Zhang, T.; Yang, W.; Shaman, J. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 2020, 368, 489–493. [Google Scholar] [CrossRef] [Green Version]
Team, E. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19)—China, 2020. China CDC Wkly. 2020, 2, 113. [Google Scholar]
Trejo, I.; Hengartner, N.W. A modified Susceptible-Infected-Recovered model for observed under-reported incidence data. PLoS ONE 2022, 17, e0263047. [Google Scholar] [CrossRef] [PubMed]
Hasell, J.; Mathieu, E.; Beltekian, D.; Macdonald, B.; Giattino, C.; Ortiz-Ospina, E.; Roser, M.; Ritchie, H. A cross-country database of COVID-19 testing. Sci. Data 2020, 7, 345. [Google Scholar] [CrossRef] [PubMed]
Tocto-Erazo, M.R.; Espíndola-Zepeda, J.A.; Montoya-Laos, J.A.; Acuña-Zegarra, M.A.; Olmos-Liceaga, D.; Reyes-Castro, P.A.; Figueroa-Preciado, G. Lockdown, relaxation, and acme period in COVID-19: A study of disease dynamics in Hermosillo, Sonora, Mexico. PLoS ONE 2020, 15, e0242957. [Google Scholar] [CrossRef]
Lin, Q.; Zhao, S.; Gao, D.; Lou, Y.; Yang, S.; Musa, S.S.; Wang, M.H.; Cai, Y.; Wang, W.; Yang, L.; et al. A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action. Int. J. Infect. Dis. 2020, 93, 211–216. [Google Scholar] [CrossRef]
Liu, X.; Hewings, G.; Wang, S.; Qin, M.; Xiang, X.; Zheng, S.; Li, X. Modelling the situation of COVID-19 and effects of different containment strategies in China with dynamic differential equations and parameters estimation. medRxiv 2020. [Google Scholar] [CrossRef]
Chatterjee, K.; Chatterjee, K.; Kumar, A.; Shankar, S. Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model. Med. J. Armed Forces India 2020, 76, 147–155. [Google Scholar] [CrossRef]
WHO. Coronavirus Disease 2019 (COVID-19): Situation Report, 73; WHO: Geneva, Switzerland, 2020. [Google Scholar]
Ganyani, T.; Kremer, C.; Chen, D.; Torneri, A.; Faes, C.; Wallinga, J.; Hens, N. Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data, March 2020. Eurosurveillance 2020, 25, 2000257. [Google Scholar] [CrossRef] [PubMed]
Linton, N.M.; Kobayashi, T.; Yang, Y.; Hayashi, K.; Akhmetzhanov, A.R.; Jung, S.m.; Yuan, B.; Kinoshita, R.; Nishiura, H. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data. J. Clin. Med. 2020, 9, 538. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Wu, J.T.; Leung, K.; Leung, G.M. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study. Lancet 2020, 395, 689–697. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lauer, S.A.; Grantz, K.H.; Bi, Q.; Jones, F.K.; Zheng, Q.; Meredith, H.R.; Azman, A.S.; Reich, N.G.; Lessler, J. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Ann. Intern. Med. 2020, 172, 577–582. [Google Scholar] [CrossRef] [Green Version]
Knoema. World Data Atlas, Mexico, Sonora Mortality. 2021. Available online: https://knoema.com/atlas/Mexico/Sonora/Crude-Death-Rate (accessed on 1 September 2022).
Torres-Ibarra, L.; Basto-Abreu, A.; Carnalla, M.; Torres-Alvarez, R.; Reyes-Sanchez, F.; Hernández-Ávila, J.E.; Palacio-Mejia, L.S.; Alpuche-Aranda, C.; Shamah-Levy, T.; Rivera, J.A.; et al. SARS-CoV-2 infection fatality rate after the first epidemic wave in Mexico. Int. J. Epidemiol. 2022, 51, 429–439. [Google Scholar] [CrossRef]
WHO. COVID-19 Natural Immunity: Scientific Brief, 10 May 2021; Technical Report; World Health Organization: Geneva, Switzerland, 2021. [Google Scholar]
Pilz, S.; Chakeri, A.; Ioannidis, J.P.; Richter, L.; Theiler-Schwetz, V.; Trummer, C.; Krause, R.; Allerberger, F. SARS-CoV-2 re-infection risk in Austria. Eur. J. Clin. Investig. 2021, 51, e13520. [Google Scholar] [CrossRef]
Hansen, C.H.; Michlmayr, D.; Gubbels, S.M.; Mølbak, K.; Ethelberg, S. Assessment of protection against reinfection with SARS-CoV-2 among 4 million PCR-tested individuals in Denmark in 2020: A population-level observational study. Lancet 2021, 397, 1204–1212. [Google Scholar] [CrossRef]
Kim, P.; Gordon, S.M.; Sheehan, M.M.; Rothberg, M.B. Duration of SARS-CoV-2 natural immunity and protection against the Delta variant: A retrospective cohort study. Clin. Infect. Dis. Off. Publ. Infect. Dis. Soc. Am. 2022, 75, e185–e190. [Google Scholar] [CrossRef]
Spicer, K.B.; Glick, C.; Cavanaugh, A.M.; Thoroughman, D. Protective immunity after natural infection with severe acute respiratory syndrome Coronavirus-2 (SARS-CoV-2)–Kentucky, USA, 2020. Int. J. Infect. Dis. 2022, 114, 21–28. [Google Scholar] [CrossRef]
Sheehan, M.M.; Reddy, A.J.; Rothberg, M.B. Reinfection rates among patients who previously tested positive for coronavirus disease 2019: A retrospective cohort study. Clin. Infect. Dis. 2021, 73, 1882–1886. [Google Scholar] [CrossRef]
Hall, V.J.; Foulkes, S.; Charlett, A.; Atti, A.; Monk, E.J.M.; Simmons, R.; Wellington, E.; Cole, M.J.; Saei, A.; Oguti, B.; et al. SARS-CoV-2 infection rates of antibody-positive compared with antibody-negative health-care workers in England: A large, multicentre, prospective cohort study (SIREN). Lancet 2021, 397, 1459–1469. [Google Scholar] [CrossRef] [PubMed]
Abu-Raddad, L.J.; Chemaitelly, H.; Coyle, P.; Malek, J.A.; Ahmed, A.A.; Mohamoud, Y.A.; Younuskunju, S.; Ayoub, H.H.; Al Kanaani, Z.; Al Kuwari, E.; et al. SARS-CoV-2 antibody-positivity protects against reinfection for at least seven months with 95% efficacy. EClinicalMedicine 2021, 35, 100861. [Google Scholar] [CrossRef] [PubMed]
Harvey, R.A.; Rassen, J.A.; Kabelac, C.A.; Turenne, W.; Leonard, S.; Klesh, R.; Meyer, W.A.; Kaufman, H.W.; Anderson, S.; Cohen, O.; et al. Association of SARS-CoV-2 seropositive antibody test with risk of future infection. JAMA Intern. Med. 2021, 181, 672–679. [Google Scholar] [CrossRef] [PubMed]
Pilz, S.; Theiler-Schwetz, V.; Trummer, C.; Krause, R.; Ioannidis, J.P. SARS-CoV-2 reinfections: Overview of efficacy and duration of natural and hybrid immunity. Environ. Res. 2022, 209, 112911. [Google Scholar] [CrossRef] [PubMed]
Capaldi, A.; Behrend, S.; Berman, B.; Smith, J.; Wright, J.; Lloyd, A.L. Parameter estimation and uncertainty quantication for an epidemic model. Math. Biosci. Eng. 2012, 9, 553–576. [Google Scholar]
Anh, D.T.; Bonnet, M.P.; Vachaud, G.; Van Minh, C.; Prieur, N.; Duc, L.V. Biochemical modeling of the Nhue River (Hanoi, Vietnam): Practical identifiability analysis and parameters estimation. Ecol. Model. 2006, 193, 182–204. [Google Scholar]
Bellman, R.; Åström, K.J. On structural identifiability. Math. Biosci. 1970, 7, 329–339. [Google Scholar] [CrossRef]
Evans, N.D.; White, L.J.; Chapman, M.J.; Godfrey, K.R.; Chappell, M.J. The structural identifiability of the susceptible infected recovered model with seasonal forcing. Math. Biosci. 2005, 194, 175–197. [Google Scholar] [CrossRef]
Xia, X.; Moog, C.H. Identifiability of nonlinear systems with application to HIV/AIDS models. IEEE Trans. Autom. Control 2003, 48, 330–336. [Google Scholar] [CrossRef]
White, L.; Evans, N.; Lam, T.; Schukken, Y.; Medley, G.; Godfrey, K.; Chappell, M. The structural identifiability and parameter estimation of a multispecies model for the transmission of mastitis in dairy cows. Math. Biosci. 2001, 174, 77–90. [Google Scholar] [CrossRef]
Reid, J. Structural identifiability in linear time-invariant systems. IEEE Trans. Autom. Control 1977, 22, 242–246. [Google Scholar] [CrossRef]
Prieto, K.; Dorn, O. Sparsity and level set regularization for diffuse optical tomography using a transport model in 2D. Inverse Probl. 2016, 33, 014001. [Google Scholar] [CrossRef]
Prieto, K.; Ibarguen-Mondragon, E. Parameter estimation, sensitivity and control strategies analysis in the spread of influenza in Mexico. J. Phys. Conf. Ser. 2019, 1408, 12–20. [Google Scholar] [CrossRef]
Smirnova, A.; DeCamp, L.; Liu, H. Inverse problems and ebola virus disease using an age of infection model. In Mathematical and Statistical Modeling for Emerging and Re-Emerging Infectious Diseases; Springer: Cham, Switzerland, 2016; pp. 103–121. [Google Scholar]
Capistrán, M.A.; Moreles, M.A.; Lara, B. Parameter estimation of some epidemic models. The case of recurrent epidemics caused by respiratory syncytial virus. Bull. Math. Biol. 2009, 71, 1890–1901. [Google Scholar] [CrossRef] [PubMed]
Kraft, D. A software package for sequential quadratic programming. Forschungsbericht- Deutsche Forschungs- und Versuchsanstalt fur Luft- und Raumfahrt 1988, 28, 33. [Google Scholar]
Spataru, D. Using a Genetic Algorithm for Parameter Estimation in a Modified SEIR Model of COVID-19 Spread in Ontario. Ph.D. Thesis, University of Guelph, Guelph, ON, Canada, 2021. [Google Scholar]
Whitley, D. A genetic algorithm tutorial. Stat. Comput. 1994, 4, 65–85. [Google Scholar] [CrossRef]
Liu, Q.; Wang, D. Stein variational gradient descent: A general purpose bayesian inference algorithm. Adv. Neural Inf. Process. Syst. 2016, 29, 2378–2386. [Google Scholar]
Lindén, A.; Mäntyniemi, S. Using the negative binomial distribution to model overdispersion in ecological count data. Ecology 2011, 92, 1414–1421. [Google Scholar] [CrossRef]
Bliznashki, S. A Bayesian logistic growth model for the spread of COVID-19 in New York. medRxiv 2020. [Google Scholar] [CrossRef]
Arguedas, Y.N.; Santana-Cibrian, M.; Velasco-Hernández, J.X. Transmission dynamics of acute respiratory diseases in a population structured by age. Math. Biosci. Eng. 2019, 16, 7477–7493. [Google Scholar] [CrossRef]
Neyens, T.; Faes, C.; Molenberghs, G. A generalized Poisson-gamma model for spatially overdispersed data. Spat. Spatio-Temporal Epidemiol. 2012, 3, 185–194. [Google Scholar] [CrossRef] [PubMed]
Coly, S.; Yao, A.F.; Abrial, D.; Charras-Garrido, M. Distributions to model overdispersed count data. J. Soc. Fr. Stat. 2016, 157, 39–63. [Google Scholar]
Gutiérrez-Pulido, H.; Aguirre-Torres, V.; Christen, J.A. A practical method for obtaining prior distributions in reliability. IEEE Trans. Reliab. 2005, 54, 262–269. [Google Scholar] [CrossRef]
Gutiérrez-Pulido, H.; Aguirre-Torres, V.; Christen, J.A. A Bayesian approach for the determination of warranty length. J. Qual. Technol. 2006, 38, 180–189. [Google Scholar] [CrossRef]
Gutiérrez Pulido, H.; Gutiérrez González, P. Fundamentos y Aplicaciones de la estadística Bayesiana; Universidad de Guadalajara: Guadalajara, Mexico, 2013. [Google Scholar]
Grinsztajn, L.; Semenova, E.; Margossian, C.C.; Riou, J. Bayesian workflow for disease transmission modeling in Stan. Stat. Med. 2021, 40, 6209–6234. [Google Scholar] [CrossRef]
Christen, J.A.; Fox, C. A General Purpose Sampling Algorithm for Continuous Distributions (the t-walk). Bayesian Anal. 2010, 5, 263–282. [Google Scholar] [CrossRef]
Carpenter, B.; Gelman, A.; Hoffman, M.D.; Lee, D.; Goodrich, B.; Betancourt, M.; Brubaker, M.; Guo, J.; Li, P.; Riddell, A. Stan: A probabilistic programming language. J. Stat. Softw. 2017, 76, 1–32. [Google Scholar] [CrossRef] [Green Version]
Chatzilena, A.; van Leeuwen, E.; Ratmann, O.; Baguelin, M.; Demiris, N. Contemporary statistical inference for infectious disease models using Stan. Epidemics 2019, 29, 100367. [Google Scholar] [CrossRef]
Banks, D.L.; Hooten, M.B. Statistical challenges in agent-based modeling. Am. Stat. 2021, 75, 235–242. [Google Scholar] [CrossRef]
Ju, N.; Heng, J.; Jacob, P.E. Sequential Monte Carlo algorithms for agent-based models of disease transmission. arXiv 2021, arXiv:2101.12156. [Google Scholar]
Shiono, T. Estimation of agent-based models using Bayesian deep learning approach of BayesFlow. J. Econ. Dyn. Control 2021, 125, 104082. [Google Scholar] [CrossRef]
Manzo, G. Agent-based Models and Causal Inference; John Wiley & Sons: Hoboken, NJ, USA, 2022. [Google Scholar]
Dyer, J.; Cannon, P.; Farmer, J.D.; Schmon, S. Black-box Bayesian inference for economic agent-based models. arXiv 2022, arXiv:2202.00625. [Google Scholar]
Dyer, J.; Cannon, P.; Farmer, J.D.; Schmon, S.M. Calibrating agent-based models to microdata with graph neural networks. arXiv 2022, arXiv:2206.07570. [Google Scholar]
Britton, T. Epidemic models on social networks—With inference. Stat. Neerl. 2020, 74, 222–241. [Google Scholar] [CrossRef] [Green Version]
Bu, F.; Aiello, A.E.; Xu, J.; Volfovsky, A. Likelihood-based inference for partially observed epidemics on dynamic networks. J. Am. Stat. Assoc. 2022, 117, 510–526. [Google Scholar] [CrossRef]
Murphy, C.; Laurence, E.; Allard, A. Deep learning of contagion dynamics on complex networks. Nat. Commun. 2021, 12, 4720. [Google Scholar] [CrossRef]

Figure 1. Data for Hermosillo city and Urban Zones. The city location within Mexico is identified as the yellow center of the red circle.

Figure 2. 2020 COVID-19 daily cases in Hermosillo. Source: https://datos.COVID-19.conacyt.mx (accessed on 1 November 2022).

Figure 3. Proportions of zonal 2020 COVID-19 daily cases.

Figure 4. Observed 2020 COVID-19 daily cases in Hermosillo. The incidences (blue dots) are inflated by a factor of 15.

Figure 5. Simulated COVID-19 daily cases in Hermosillo. The daily model incidence (in blue) was simulated using the parameter values given in Table 3, together with the

μ

and

ψ

values provided in Table 2 and

τ = 1 / 180

.

Figure 5. Simulated COVID-19 daily cases in Hermosillo. The daily model incidence (in blue) was simulated using the parameter values given in Table 3, together with the

μ

and

ψ

values provided in Table 2 and

τ = 1 / 180

.

Figure 6. Fitted model incidence and simulated observed daily incidence values.

Figure 7. Fitted model incidence and observed 2020 daily incidence values.

Figure 8. Prior distributions of the infection, incubation, and recovery rate parameters and corresponding posterior distributions obtained from t-walk.

Figure 9. Credible intervals, means (from t-walk), and GA estimates of the incubation and recovery rates, exposed and prevalence initial conditions, and dispersion parameters: (a) Credible intervals, means, and GA estimates of the infection, incubation and recovery rates; and (b) credible intervals, means, and GA estimates of the exposed and prevalence initial conditions and the dispersion parameters.

Figure 10. Observed (red points) and model (gray) incidences. The model incidences were obtained by solving the model using parameters from the first 30,000 MCMC iterations after burn-in. The observed incidences are for the year 2020.

Figure 11. The mean (black), median (medium blue), MAP (medium violet red), and 95% CI (blue bands) of model incidence obtained from t-walk. The red points indicate observed daily incidences for the year 2020.

Figure 12. Posterior distributions of the infection, incubation, and recovery rates; exposed and prevalence initial conditions; and dispersion parameters obtained from Stan: (a) Infection, incubation, and recovery rate parameters; and (b) exposed and prevalence initial conditions and dispersion parameters.

Figure 13. Trace plots of the samples of infection, incubation, and recovery rates; exposed and prevalence initial conditions; and over-dispersion parameters obtained from Stan: (a) Infection, incubation, and recovery rate parameters; and (b) exposed and prevalence initial conditions and the dispersion parameters.

Figure 14. Pair plots for the infection samples and recovery rates obtained from Stan.

Figure 15. 95% Credible intervals of the estimated parameters obtained from Stan: (a) infection, incubation, and recovery rates parameters; and (b) exposed and prevalence initial conditions and the dispersion parameters.

Figure 16. Observed daily incidence for the year 2020, posterior incidence predictions, and 95% uncertainty intervals obtained from Stan.

Figure 17. Global single-patch observed and predicted mean and median incidences together with the 95% CI. The observed daily incidences are for the year 2020.

Table 1. Parameter descriptions.

Parameter	Description
$α_{i}$	The proportion of individuals that leave their residence patch i
$p_{i j}$	The proportion of time that an individual from patch i spends
	in patch j, given that the individual leaves their residence patch
$Λ_{i}$	Recruitment of Susceptible individuals in Patch i
$β_{j}$	Instantaneous risk of infection in Patch j
$μ_{i}$	Per capita natural death rate in Patch i
$γ_{i}$	Per capita recovery rate of individuals in Patch i
$τ_{i}$	Per capita loss of immunity rate in Patch i
$ψ_{i}$	Per capita disease-induced death rate in Patch i
$κ_{i}$	Per capita rate at which the exposed individuals in patch i become infectious

Table 2. Epidemiological parameter ranges/values and their associated references.

Parameters	Range/Value	Literature
$β$	0.0616–1.68	[88,89,90]
$1 / κ$	2–14	[88,91,92,93,94]
$1 / γ$	5–15.6	[49,88,90,95,96]
$μ$	$0.06 / (1000 \times 365)$	[97]
$ψ$	(0.0047 + 0.0030)/2	[98]
$1 / τ$	90–300	[99,100,101,102,103,104,105,106,107,108]

Table 3. Parameter values for Model (1) used for simulation considering the observed daily incidence values (see Figure 5).

Epidemiological		Initial Conditions
Parameter	Value	Parameter	Value
$β_{1}$	1.3000	$E_{1, 0}$	10
$β_{2}$	1.4000	$I_{1, 0}$	1
$β_{3}$	0.9500	$Y_{1, 0}$	1
$β_{4}$	0.8000	$E_{2, 0}$	20
$κ_{1}$	0.0833	$I_{2, 0}$	2
$κ_{2}$	0.0714	$Y_{2, 0}$	2
$κ_{3}$	0.0741	$E_{3, 0}$	5
$κ_{4}$	0.1000	$I_{3, 0}$	0
$γ_{1}$	0.1667	$Y_{3, 0}$	0
$γ_{2}$	0.1429	$E_{4, 0}$	5
$γ_{3}$	0.1818	$I_{4, 0}$	0
$γ_{4}$	0.2000	$Y_{4, 0}$	0

Table 4. Model (1) parameter values estimated from deterministic inversion using a Genetic Algorithm (GA). The values were used to obtain the fitted simulated model incidence shown in Figure 6.

Epidemiological		Initial Conditions
Parameter	GA	Parameter	GA
$β_{1}$	0.8322	$E_{1, 0}$	0.0000
$β_{2}$	0.7473	$I_{1, 0}$	0.0000
$β_{3}$	0.5259	$Y_{1, 0} = E_{1, 0} + I_{1, 0} + R_{1, 0}$	0.0000
$β_{4}$	1.4354	$E_{2, 0}$	200.0000
$κ_{1}$	0.0829	$I_{2, 0}$	126.4590
$κ_{2}$	0.5334	$Y_{2, 0} = E_{2, 0} + I_{2, 0} + R_{2, 0}$	326.4590
$κ_{3}$	0.1983	$E_{3, 0}$	200.0000
$κ_{4}$	0.2115	$I_{3, 0}$	0.0000
$γ_{1}$	0.0357	$Y_{3, 0} = E_{3, 0} + I_{3, 0} + R_{3, 0}$	200.0000
$γ_{2}$	1.0000	$E_{4, 0}$	200.0000
$γ_{3}$	0.4248	$I_{4, 0}$	8.4447
$γ_{4}$	1.0000	$Y_{4, 0} = E_{4, 0} + I_{4, 0} + R_{4, 0}$	208.4447

Table 5. Parameter values for Model (1) estimated from deterministic inversion using a Genetic Algorithm (GA). The values were used to obtain the fitted model and observed daily incidence shown in Figure 7.

Epidemiological		Initial Conditions
Parameter	GA	Parameter	GA
$β_{1}$	1.0950	$E_{1, 0}$	58.9925
$β_{2}$	1.0473	$I_{1, 0}$	0.0000
$β_{3}$	0.4904	$Y_{1, 0} = E_{1, 0} + I_{1, 0} + R_{1, 0}$	58.9925
$β_{4}$	1.4951	$E_{2, 0}$	78.1313
$κ_{1}$	0.3000	$I_{2, 0}$	0.0000
$κ_{2}$	0.3000	$Y_{2, 0} = E_{2, 0} + I_{2, 0} + R_{2, 0}$	78.1313
$κ_{3}$	0.3000	$E_{3, 0}$	22.9404
$κ_{4}$	0.3000	$I_{3, 0}$	0.0000
$γ_{1}$	1.0000	$Y_{3, 0} = E_{3, 0} + I_{3, 0} + R_{3, 0}$	22.9404
$γ_{2}$	1.0000	$E_{4, 0}$	4.4192
$γ_{3}$	1.0000	$I_{4, 0}$	0.0000
$γ_{4}$	0.7853	$Y_{4, 0} = E_{4, 0} + I_{4, 0} + R_{4, 0}$	4.4192

Table 6. Proportion of the observed prediction incidence for considered zones covered by the model incidence CIs obtained from t-walk and Stan.

	% of Observed Prediction Incidence Covered by CIs
Zone	Stan	t-walk
Zone 1	55.56	22.22
Zone 2	55.56	44.44
Zone 3	58.33	30.56
Zone 4	63.89	38.89

Table 7. Statistics of the single-patch model parameters estimated using probabilistic (Stan) inversion.

	Probabilistic Inversion (Stan) Statistics
Parameter	Mean	SD	$q_{0.025}$	$q_{0.250}$	$q_{0.5}$	$q_{0.750}$	$q_{0.975}$	ESS	pESS
$β$	1.0139	0.0972	0.8168	0.9503	1.0191	1.0843	1.1701	244	0.0122
$κ$	0.8690	0.0847	0.6911	0.8140	0.8798	0.9436	0.9917	301	0.0151
$γ$	0.8569	0.0862	0.6797	0.8015	0.8640	0.9191	0.9936	244	0.0122
$E_{0}$	1.2060	0.8259	0.0497	0.5090	1.0952	1.7977	2.8544	296	0.0148
$I_{0}$	1.6101	0.7787	0.1173	1.0340	1.6587	2.2338	2.9343	237	0.0119
$ν$	4.0792	0.5313	3.0715	3.7108	4.0322	4.4569	5.1152	264	0.0132

Table 8. Efficiency measures for the predicted mean and median incidences for the multi- and single-patch epidemic models.

	Multi-Patch		Single-Patch
Efficiency Measure	Mean	Median	Mean	Median
RMSE	395.61	418.61	773.98	796.06
MAPE	34.40	37.06	73.86	76.10

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Akuno, A.O.; Ramírez-Ramírez, L.L.; Espinoza, J.F. Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico. Entropy 2023, 25, 968. https://doi.org/10.3390/e25070968

AMA Style

Akuno AO, Ramírez-Ramírez LL, Espinoza JF. Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico. Entropy. 2023; 25(7):968. https://doi.org/10.3390/e25070968

Chicago/Turabian Style

Akuno, Albert Orwa, L. Leticia Ramírez-Ramírez, and Jesús F. Espinoza. 2023. "Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico" Entropy 25, no. 7: 968. https://doi.org/10.3390/e25070968

APA Style

Akuno, A. O., Ramírez-Ramírez, L. L., & Espinoza, J. F. (2023). Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico. Entropy, 25(7), 968. https://doi.org/10.3390/e25070968

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Inference on a Multi-Patch Epidemic Model with Partial Mobility, Residency, and Demography: Case of the 2020 COVID-19 Outbreak in Hermosillo, Mexico

Abstract

1. Introduction

2. The Multi-Patch Model with Mobility, Residency, and Demography

3. Data

4. Estimation of the Mobility and Residence Time Parameters

5. Deterministic Inversion

5.1. Formulation of the Model to Minimize

5.2. Results

6. Bayesian Inference

6.1. Building the Posterior Distribution

6.2. Results

6.3. Comparison to a Single-Patch Model

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A. Posterior Statistics from t-Walk and STAN

Appendix A.1. Posterior Statistics from t-Walk

Appendix A.2. Posterior Statistics from Stan

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI