1. Introduction
On 31 December 2019, the Country Office in the People’s Republic of China of the World Health Organization (WHO) acquired media reports by Wuhan Municipal Health Commission on cases of
viral pneumonia. On 10 January 2020, the first draft genome of the novel coronavirus was made public, this virus was denominated SARS-CoV-2, and it is the cause of Coronavirus disease 19 (COVID-19) [
1]. In March 2020, due to the spread and severity of the infectious disease, COVID-19 was declared a pandemic [
1]. Different stages have been observed throughout the pandemic, including developing and launching vaccines with high effectiveness, emerging new strains of concern, massive vaccination in developed countries, and significant lag in others. According to [
2], up to 14 October 2022, there have been over 13.5 million reported cases and over 114 thousand deaths in Spain.
Compartmental models, such as the well-known SIR model, are commonly used in epidemic simulations to model the dynamics of Susceptible, Infected, and Recovered compartments. In previous studies, the SIR model with vaccination and immunization was proposed. The vaccination models include those with partial and waning immunity, pulse vaccination, and vaccination in newborns, which are modeled using two rates related to vaccination [
3,
4,
5,
6]. General vaccination models for homogeneous groups in the population were proposed in [
7].
For more detailed vaccination policies and vaccine-related parameters, ref. [
8] studied vaccine coverage, the average duration of immunity acquired by a vaccine, and vaccine leakiness. Additionally, ref. [
9] proposed two models with delay, one with and without vaccination, to establish comparisons of the vaccination effort effects.
Overall, these models provide insight into the effects of vaccination on the spread and control of infectious diseases and can be used to inform public health policy.
The incidence rate is a critical characteristic of compartmental models that drives infections. The most frequently used incidence rates are the bilinear incidence rate and the standard incidence rate. Various incidence rates have been applied in different models and for different applications, such as the Holling-type incidence function [
10], the saturated incidence rate [
11,
12], and other types, including those originally implemented in host–parasite population models, such as the saturated incident rate types in [
13]. Some examples of applications for the different incident rate types are: [
14] where bilinear incidence rate is used in a model considering vaccination, a Holling-type incidence function for a
model in [
10], a standard incidence rate in [
15] where there are different coefficients considered for each defined infected sub-population (infected, asymptomatic, and dead infectious corpses), a saturated incidence rate in a
model used in [
11]. Recent studies have applied compartmental models to simulate the COVID-19 outbreak in Spain and Italy, using nonlinear data-fitting approaches [
16] and Runge–Kutta methods [
17,
18]. Many recent references dealing with mathematical models and numerical simulation of COVID-19 can be mentioned, such as [
19,
20,
21,
22]. In [
16], a study on a modified SEIR model applied to the COVID-19 outbreak in Spain and Italy, solving the ODE system using a nonlinear data-fitting approach minimizing a least-squares error function.
Artificial Neural Networks (ANNs) have been used as efficient tools for estimating suitable values of parameters in epidemiological models. The first reference found with an application relating Artificial Neural Networks to epidemiological models is [
23] and dates back to 1995. Even though other more recent applications [
24,
25,
26] have been published, their use in this task is not widespread, therein lies the opportunity to propose their application as a parameter estimation technique. Several studies have applied ANNs to estimate the rate of infection and spread of an epidemic, including cooperative and supportive neural networks [
24]. Some recent studies have used Long short-term memory (LSTM) ANNs to forecast time-varying parameters for a SIRD model for COVID-19 [
26,
27].
In this work, we modified a model that is a variant of the classical SIR model, including vaccination, called the SVIR model. The model’s formulation, equilibrium points, and reproduction number were studied, and the contact rate () was estimated using LSTM ANNs. The numerical simulations were conducted using the fourth-order Runge–Kutta scheme. A PDE model with diffusion was also included and studied, and numerical simulations were performed.
The results of the numerical simulations were discussed and compared with actual data for the ODE models, and insights from the PDE model implementation were presented. The study demonstrated the effectiveness of LSTM ANNs in estimating the contact rate and the suitability of compartmental models for simulating the COVID-19 pandemic.
In conclusion, this study modified a compartmental model, including vaccination, and applied ANNs to estimate the contact rate. The study also demonstrated the effectiveness of compartmental models and ANNs in simulating the COVID-19 pandemic. The results provide insights into the dynamics of the pandemic and suggest the importance of vaccination efforts in controlling the spread of the virus. This study’s proposed modifications to the SVIR model, including time-varying parameters, may offer additional insights into the dynamics of the pandemic.
The results of the numerical simulations for the PDE model with diffusion suggest that the diffusion term may play a role in reducing the spread of the virus, particularly at the early stages of the outbreak. This finding may have implications for policy decisions, such as the timing of lockdowns and social distancing measures, and highlights the importance of including spatial effects in epidemiological models.
Overall, this study’s use of ANNs and numerical simulations provides a valuable tool for understanding the COVID-19 pandemic’s spread and control. The application of these tools can help policymakers make informed decisions about vaccination and other measures to slow the virus’s spread, ultimately reducing the number of cases and deaths worldwide.
2. ODE SVIR Compartmental Model
In this section, we present the ODE SVIR compartmental model and its modifications. We begin with the model formulation and then discuss its positivity, equilibrium, and reproduction number. Next, we describe how the contact rate is estimated using LSTM, followed by a discussion on numerical simulations.
2.1. ODE Model Formulation
SVIR Compartmental models incorporate vaccinated individuals into the classical SIR model; this is relevant since vaccination greatly influences the evolution of an epidemic or pandemic. In the SVIR compartmental model,
S is for individuals susceptible to the infection,
V is for vaccinated,
I is for infected, and
R is for retired individuals (which in this case includes recovered and deceased). A simplified version of the SVIR compartmental model is considered a baseline for this work’s proposed techniques; this version is one of the cases in [
7] where four compartments describe the dynamics of the epidemic in time, vital dynamics are not included, and reinfection is not considered. Even though the model shown in Equation (
1) is a simplification with multiple assumptions, it is helpful in broadly studying the changes in population dynamics of epidemics involving vaccination.
In system (
1),
is the vaccination rate,
is the transmission rate and
is the recovery rate. The model divides the total population (
N) into these four compartments (
S,
V,
I,
R) such that
, and the values for these compartments are always positive (
,
,
,
).
It is important to state that several assumptions are considered in this model:
The contact rate (), vaccination rate (), and recovery rate () are positive.
The recruitment rate of new susceptible individuals (such as newborns or migration into the established population) is equal to the death rate or migration outside of the established population and, therefore, not included in the model.
Once an individual is vaccinated with two dosages (these dosages are given to an individual with a delay, but the delay is not included in the model), the individual cannot be infected.
As mentioned in the introduction,
Section 1, different versions of the incidence rate have been implemented in compartmental models. Modifications of the standard incidence rate are particularly relevant for the specific case of COVID-19 due to the impact of changes in restrictions and human behavior based on their perceived risk by implementing more personal cautions, which affect the rate of infections. Without explicitly defining parameters representing governmental restrictions and changes in human behavior, an alternative is to modify the incidence rate via parameters that regulate the model based on prior infection trends.
This work proposes modifications to the base
model shown in system (
1); the modified model is as follows:
This system can be simplified by expressing the incidence rate as a function of
S, denoted
:
This modified model combines a saturated incidence rate (such as one mentioned in [
12]) with a modified standard incidence rate to generate a mechanism that can decrease or increase the infection rate based on two parameters acting as logical gates:
and
. In an epidemiological context, these parameters can be considered as reflecting varying restrictions or changes in the spread of the disease due to the appearance of new variants that vary in contagiousness. Both scenarios have been observed at different points throughout the COVID-19 pandemic. The modification introduces two additional parameters,
and
, for regulating the incidence rate:
is used to decrease the incidence rate, while
has the opposite effect.
The parameter
is used to decrease the incidence rate and is defined as follows:
The threshold at which the decrease in the incidence rate takes effect is defined by Equation (
4) and is time-dependent. This threshold is related to outlier detection, which involves considering the median and standard deviation of the estimated
parameter. These parameters display highly different behaviors at different points of the epidemic, particularly at the onset of the epidemic and when new waves of infection are triggered.
Equation (
5) introduces the time-dependent threshold
, which determines the point at which the decrease in the incidence rate takes effect. This threshold is related to the detection of outliers in the estimated
parameter. Specifically, we calculate the median and standard deviation of the absolute daily changes in
, denoted by
. The constant
a can be customized for each Autonomous Community, and we provide the determined values in
Table A1 in
Appendix A.
Similarly,
serves as a mechanism to increase the incidence rate, and it is regulated by the recent changes in the infection trend
. Its definition is as follows:
Equation (
5) sets a time-dependent threshold for the decrease in incidence rate, with
representing the absolute value of the daily changes for the estimated
parameter. The median (
) and standard deviation (
) of this series are calculated, with the constant parameter
a specified for each Autonomous Community.
Table A1 in
Appendix A includes the determined values.
Similarly,
is regulated by the most recent changes in the infection trend
, with its definition provided by Equation (
6). This mechanism is designed to increase the incidence rate and depends on the
i time steps preceding the current one, where
i is an integer. The rate of infected individuals at time
is denoted by
, and the average at each particular time step is calculated by division with
i. The parameter
sets a threshold of action for the
parameter.
Appendix B in
Table A2 includes the specific values per Autonomous Community determined in this study.
In Equation (
2) for the specific case where
for any
t, the dynamic system is simplified as:
Equation (
7) provides a simplified version that only includes a mechanism for reducing the incidence rate, which is often the most relevant scenario. Numerical simulations in
Section 2.6 support this assumption by showing that this version is typically sufficient.
To analyze the model’s behavior at the equilibrium, the model is extended to include a recruitment rate for the Susceptible compartment,
, and a natural death rate
for all compartments.
where
, and
and
are positive constants.
2.2. Positivity of the ODE Model
Theorem 1. All solutions , , , for the model in Equation (8) with initial conditions , , , , are non-negative for all . Proof. To demonstrate the positivity of the ODE model we follow the strategy put forward in [
28]. The model is given by the expression
where
is given by Equation (3).
We first multiply Equation (
9) by
to get
so we have
Integrating Equation (
11) in
Similarly it can be established that , , and , for all . □
2.3. ODE Model Disease-Free Equilibrium and Reproduction Number
For Equation (
8), at the disease-free equilibrium (DFE), the rate of change of each compartment is zero, that is,
,
,
,
. Equations (
4) and (
6) imply that
and
are both 0 at the DFE. At the DFE,
. For S, from
, we obtain
. For V, from
, we obtain
, and
. Therefore, the DFE is
.
To calculate the Basic Reproduction Number
for the proposed SVIR model in Equation (
8), we use the next-generation matrix method to derive an expression for
based on the
I compartment. We linearize the differential equation for
I as
, where
, and calculate the matrices
and
.
These expressions are differentiated with respect to
I and evaluated with the values at the DFE (
). Therefore:
The inverse of
V is needed:
An expression for the Reproduction Number is obtained with
:
This equation provides a useful measure of the potential for disease spread in the proposed SVIR model, taking into account key parameters such as the contact rate, recovery rate, and vaccination rate.
2.4. ODE Model Local Stability
In the system (
8), it is observed that the equation for
is independent of the rest. Taking this into consideration, for the purposes of the local stability analysis, the system is reduced to:
Theorem 2. The disease-free equilibrium of the system defined in Equation (20) is locally unstable when . Proof. The Jacobian matrix is calculated for the system in Equation (
20) and evaluated for at the DFE (
), with the following outcome:
The characteristic equation is given by
, where
I is the identity matrix, and its solution is a cubic equation of the form
, where
The terms
,
,
, in Equation (
22), will always be negative when
, there are no conditions in which the Routh–Hurwitz criterion is met. Therefore when
, the disease-free equilibrium of the system in Equation (
20) is locally unstable. □
Theorem 3. The system defined in Equation (20) is a saddle point at the endemic equilibrium point . Proof. The Jacobian evaluated at
is as follows:
where,
The first eigenvalue from the Jacobian matrix at
is
, which is negative. In order to further analyze this case, we computed the trace of the Jacobian matrix at
and assessed the possible scenarios for
and
. To obtain a negative value in the trace, a condition relating
I and
S was determined:
Subsequently, the determinant of the Jacobian matrix evaluated at
, and there are two cases depending on the possible scenarios of
and
, as can be seen in the following expressions:
Given the negative sign associated with the expressions for the determinant, we cannot ensure that the determinant will be positive, therefore it is established that the system defined in Equation (
20), at
is a saddle point. □
2.5. ODE Model Contact Rate () Estimation with LSTM Neural Networks
The data for the SVIR compartmental model were collected from reliable sources, including the National Center of Epidemiology of Spain (CNE, Centro Nacional de Epidemiología), the Ministry of Health of Spain (Ministerio de Sanidad), the Ministry of Transport, Mobility and Urban Agenda (Ministerio de Transportes, Movilidad y Agenda Urbana), and the National Institute of Statistics of Spain (INE, Instituto Nacional de Estadística), as specified in
Table 1. To address shortcomings in data availability, assumptions were made to simplify the model, such as the required granularity or a complete lack of information. This work is focused on modeling the SARS-CoV-2 epidemic for the Autonomous Communities of Spain, which have undergone a specific set of restrictions.
To handle irregularities in the data and improve the training process for artificial neural networks, data processing techniques were implemented. These techniques are commonly used in data-driven applications. Automated data collection (where applicable), data processing, artificial neural networks, and numerical simulations were coded in Python.
To calculate the system dynamics, we assumed that the epidemics would follow the established system from Equation (
1), resulting in a system with time-varying parameters instead of constant values. The model dynamics were then calculated with available data, and these dynamics were used to train an LSTM neural network for estimating the contact rate (
). LSTM networks were chosen for their ability to handle sequential data, including time series, and their capacity to preserve trends in such data, which is desirable when estimating parameters for dynamical systems that describe the evolution of an epidemic in time. Additionally, LSTM networks can handle multi-dimensional input, allowing for consideration of multiple variables, such as mobility and other underlying contributing factors. The input for the LSTM network in this work is a multi-dimensional array with terms limited to the dynamics established by the system and the contributing factors. We implemented the LSTM neural network in Python using the torch library and set the following hyperparameters: learning rate of 0.1, 1 layer, 1000 training epochs, a sequence length of 28 data points, 80% of the data used for training, and 20% for testing. We trained a separate network for each Autonomous Community to estimate both the contact (
) and recovery (
) rates. The input matrix for the LSTM networks consists of 28 time observations for the variables
I,
, mobility percentage,
, and
V, with the output variable being
. This output is subsequently used to calculate the parameters
and
. An example of the input and output of the LSTM network for Madrid is displayed in
Figure 1 and
Figure 2, respectively.
Figure 2 shows the values of
used to train and test the LSTM network with data for the Community of Madrid and the output value of the network after training.
To train the LSTM network for each Autonomous Community, we used the data specific to that region, which resulted in different errors due to variations in the input data. For each community, we calculated three different errors: one for the training dataset, one for the test dataset, and a consolidated error combining both datasets. The root mean square error (RMSE) for all of the Autonomous Communities is presented in the heatmap shown in
Figure 3. We observed that the errors were highest in the training dataset and lowest in the test dataset, which indicates that the LSTM is capable of generalizing well to unseen data. Specific error values for each community are included in
Table A1 in
Appendix A.
The trained LSTM networks output the time series for the parameter
in the system described by Equation (
1). To obtain this parameter, the value of
from the LSTM is substituted into the system. However, since the LSTM output has an associated estimation error, as illustrated in
Figure 3 and detailed in
Table A1 in
Appendix A, the calculated parameters are also affected by these errors.
2.6. ODE Model Numerical Simulations
This study employed numerical simulations using a fourth-order Runge–Kutta method with LSTM-estimated parameters to solve initial value problems (IVPs) for ordinary differential equation (ODE) systems. IVPs involve finding the solution of a differential equation at a specific initial time, given the initial conditions. The fourth-order Runge–Kutta method is widely used in solving IVPs of the form given in Equation (
28), which is a system with time-varying parameters.
The fourth-order Runge–Kutta method is an explicit multi-stage one-step method that requires only information from the previous time-step and is efficient for ODE systems. Here,
is the vector of unknowns,
is the right-hand side of Equation (
2), and
is the initial condition. Predictor-corrector methods are also efficient solvers for initial value problems but were not used in this study.
The fourth-order Runge–Kutta method is expressed as a set of equations, which include four intermediate values (
,
,
,
) and the new state vector,
, at the next time step. The method is shown below:
Here, is the time for which the solution is computed, and is the previous time. is the vector of unknowns for time , is the vector of unknowns for time , and is the size of the time step considered.
The study conducted different simulations using the ODE models described in
Section 2.1. Simulation A followed the dynamic from Equation (
2), with specific values of
and
that fit best to each Autonomous Community. Simulation B followed the same dynamic, but with a different set of parameters for each Autonomous Community, as in Simulation A. Simulation C followed the same dynamic from Equation (
2), but with the same set of parameters for all Autonomous Communities. Simulation D maintained the dynamic from Equation (
1) but involved outlier handling for the LSTM-estimated parameters, removing them and imputing values through linear interpolation. Simulation E established the baseline, following the dynamic from Equation (
1).
To compare the simulation results, the Root Mean Square Error (RMSE) was used as the error measure, computed as the difference between the actual data and the simulation results. The RMSE values for the different simulations and Autonomous Communities are displayed in
Figure 4, and detailed results can be found in
Table A3 in
Appendix C. The largest differences in RMSE values were observed between Simulation E and the other simulations, except for the Autonomous City of Ceuta and the Autonomous Community of Canarias. Simulation C was largely ineffective, with gaps in the results, mainly in three Autonomous Communities (Andalucía, Aragón, and Extremadura), and in some cases, also for Simulations A and B. These results are discussed further in
Section 4.
4. Results and Discussion
In this section, we discuss the performance of the proposed model for simulating the spread of COVID-19 in Spain. The model was trained using LSTM neural networks to estimate the parameters of the SIR epidemiological model, and it was validated against real data. We start by presenting the results of the baseline simulation and then discuss the effectiveness of the proposed outlier handling techniques.
To improve the performance of the baseline simulation, which is referred to as simulation E and is based on the base model from Equation (
1), we utilized linear interpolation as a technique for outlier handling. This approach aimed to correct the under and overestimations introduced by the LSTM-estimated parameters. While this process improved the RSME values, significant miss-estimations remained, as observed in
Figure 7 for the case of Madrid. Specifically, this figure shows that the infected ratio was underestimated in the first waves and vastly overestimated in the last wave included in the study.
Figure 4 displays the RMSE values for all Autonomous Communities for simulation E, highlighting the significant deviation of the simulation from the actual values. Thus, a pattern of mis-estimations emerges, and the need for further improvements is evident.
The neural network estimated parameters can yield simulations where the evolution of infected individuals is over or underestimated, depending on the input data and other outside factors not included in the input data used to train the neural network. To correct these potential behaviors observed with estimated parameters, a modified version of the SVIR compartmental model is introduced. This model is defined by Equation (
2), which includes explicit terms controlled by established thresholds for the different Autonomous Communities of Spain.
Using the same parameters for the model described by Equation (
2), the simulation is ineffective for nine communities, and poor results are obtained for four of the communities where the simulation can be carried out. These results indicate significant differences in behavior between the different communities, pointing to the need to determine specific values more adequate to the particular series/community. When utilizing specific parameter values for each Autonomous Community, the results obtained by the simulation are much closer to actual infected values, with lower RSME values, for the majority of Autonomous Communities. However, there were a few exceptions involving three communities for which Simulations A, B, and C were ineffective. The improvement is visually observable in the case of Madrid, shown in
Figure 7, particularly in the simulated period’s last wave.
Assessing the specific values of RSME and visual analysis of the simulations, the following can be stated: without modifying the SVIR system from Equation (
1) or the LSTM estimated parameters, there are significant errors in the numerical simulation compared to actual data. As a general solution, post-processing LSTM estimated parameters with well-established techniques can improve the simulation’s result, as observed by comparing Simulation D in
Figure 4 vs. Simulation E. However, outlier handling is still insufficient, leading to the proposed modified SVIR model, for which not only the RSME improved but the deviation of the behavior in the waves from the simulation to actual data is reduced, highlighting the relevance of the proposed modified SVIR model.
We compared the relative
error for 16 Autonomous communities (or autonomous cities in the case of Ceuta and Melilla) with a previous study by Bousquet et al. [
27], who used Deep Learning to forecast values of the
parameter. Their study reported relative
errors ranging from
to
. In contrast, our study achieved an average relative
error of
, with a range from a minimum of
to a maximum of
.
Table A4 lists the computed errors. Our findings suggest that our predictions for the
parameter in COVID-19 are more accurate and precise. Although the two studies used different compartmental models, the
parameter was the contact parameter in both. We did not compare the results for other parameters reported by Bousquet et al. as our study focused solely on the
parameter.
The PDE system from Equations (
30) and (
31), based on the modified SVIR Compartmental Model, can be used to study scenarios including Diffusion with the bi-parameter (
,
) control system. To validate the simulation mechanism, the Manufactured Solutions methodology was implemented, obtaining accurate numerical results compared to the exact solution. Although this is not a completely realistic system and does not have specified-known diffusion coefficients, reasonable constants were selected and applied to carry out the Manufactured Solutions methodology. These coefficients should be further studied and modified if the intention is to explore more realistic situations. In this study, the system was proposed as an extension of the ODE system to increase the potential to resemble a more realistic situation. So far, the study has validated the model and its corresponding order. Further studies could focus on comparing the model’s simulation when changing the currently used Diffusion coefficients.
5. Conclusions
This study presents a detailed simulation of COVID-19 in the Autonomous Communities of Spain using LSTM-estimated parameters and the modified SVIR model. The model includes regulating mechanisms based on the behavior of previous time steps. We have established the positivity of the ODE version of the model and explored its local stability. However, further analysis is needed to confirm the consistency and global stability of the modified model.
The proposed mechanisms show significant improvements in simulation accuracy, as evidenced by lower RSME and closer waves to actual data. Although LSTM proved to be efficient for estimating parameters for the compartmental model. The proposed modified SVIR model and LSTM neural networks approach is a promising tool for forecasting the spread of COVID-19, improving the accuracy of predictions for key parameters such as , with a relatively low relative error, as shown by the results of the comparison with a previous study.
The proposed modified SVIR model has demonstrated its value as a tool for studying epidemic behavior in a short period, and its potential can be extended to other countries or more granular levels in Spain. However, the SVIR and proposed modified SVIR models have limitations, particularly with regard to their assumptions, such as granting permanent immunity to vaccinated and recovered individuals, which is not the case. To improve the accuracy and applicability of these models, future studies should explore strategies to incorporate waning immunity and recurrent vaccination, which could be crucial in accurately simulating the long-term impact of the COVID-19 pandemic.
This work presents an extension of the SVIR compartmental model in the form of a one-dimensional PDE system with diffusion. Although the stability analysis of the PDE system has not been carried out, the numerical scheme for the system is validated using the manufactured solutions methodology. The PDE system is solved using a fifth-order WENO reconstruction in space and an RK3-TVD scheme for time integration. While it is not possible to compare the results of the PDE system with actual data, this extension of the SVIR model has the potential to provide additional insights into the behavior of epidemics. Further studies are needed to fully explore the potential of this extension of the SVIR model.
In summary, this work highlights the relevance of developing and improving models to provide more accurate and comprehensive insights into epidemic behavior, and demonstrates the value of incorporating advanced numerical techniques such as LSTM and PDEs with diffusion into these models. Further studies can build upon the proposed methodology to enhance our understanding of epidemics and support decision-making in the public health sector.