1. Introduction
The coronavirus disease 2019 (COVID-19) is an infectious disease that emerged in Wuhan, China in December 2019 and has rapidly spread in China and worldwide. On 30 January 2020, the World Health Organization (WHO) declared the outbreak as a public health emergency of international concern. Furthermore, the WHO announced the COVID-19 outbreak as a “pandemic” on 11 March 2020 [
1,
2]. The prevention of infection is the responsibility of every human being.
A few mathematical models have been developed to describe the dynamics of the evolution of COVID-19 [
3,
4,
5,
6,
7]. Among them, the susceptible-infected-recovered (SIR) model is treated as a basic mathematical model for the spread of epidemic diseases. This model was proposed in 1927 by Kermack and McKendrick [
8,
9], and has been used as a basic model for epidemics [
10,
11,
12,
13,
14].
The SIR model is not difficult from the viewpoint of mathematics, and can be understood even by high-school students. We present a numerical solution and an exact analytical solution [
15] of the SIR model herein. The analytical solution is emphasized because it is not well known. We provide a more straightforward derivation of the analytical solution compared with that in the original paper; it will serve as a good opportunity for students to learn differentiation and integration. Furthermore, we treat the generalization of the SIR model with vital dynamics including birth and natural death processes. We demonstrate that the complex SIR model can be reduced to an Abel-type differential equation.
2. Exponential Growth
The spread of an epidemic disease depends on both the amount of contact between individuals and the possibility of an infected person transmitting the disease to another person. If each infected person is in contact with two other people before they recover, the disease will soon begin to spread rapidly. Assuming that recovery requires one day, the number of sick people will double each day. This situation is illustrated schematically in
Figure 1. A numerical series for which the ratio of each two consecutive terms is a constant
r (in this case,
) is called a geometric series. The
n-th term is expressed as follows:
where the initial value is set to be
. If
r is greater than 1, then the series increases and approaches positive infinity rapidly (we assume
); if
r is smaller than 1, the series decreases and approaches zero. At the border of
, the series remains constant. As an example, when one borrows money, the interest is typically a compound interest, i.e., a geometric series. When the annual interest is 30%, the return in five years will increase rapidly as follows:
Extending
n to a real number, we introduce the exponential function
The differentiation of the exponential function can be calculated as follows:
The special constant
r that satisfies the limiting equation
is called Napier’s constant
e. With this definition of
e, the differentiation of the exponential function becomes
Hereafter, we treat the form of the exponential function as
The exponential function grows for
, and decays for
. When
, the function becomes constant. We have assumed that
.
The inverse function of
x,
, is a logarithmic function written as
where
r is the base. The logarithmic function with the base
e is called a natural logarithmic function and is denoted as follows:
The exponential function always explodes for
. To verify whether it grows rapidly or slowly, we may consider the doubling time. From the condition
we obtain that the doubling time is a time difference that satisfies
It is instructive to use graph plots to observe the growth behavior. Both the linear and semi-log plots are shown in
Figure 2. A semi-log plot has one axis (
t) on a linear scale, and the other on a logarithmic scale. When the function grows exponentially, a linear increase is observed in a semi-log plot.
Next, we explain the differential equation. Let us consider the case in which the increase in a variable
is proportional to
, i.e.,
The difference in the left-hand side becomes larger when the time difference
becomes longer; hence,
is multiplied to the right-hand side. If we divide both sides by
and take the limit of
, the following differentiation definition is obtained:
It becomes a
equation, which is known as a differential equation. Because Equation (
7) satisfies the differential equation, it is regarded as a general solution to the differential equation shown in Equation (
13). Here,
C is an arbitrary constant and is known as an integral constant. Next, we discuss the mathematical model of the spread of epidemic diseases.
3. Mathematical Model of Epidemics
If exponential growth continues, the number of the infected individuals increases. We may consider that the infected individuals recover and develop immunity. Subsequently, the number of recovered individuals increases, but we hope that the spread of the disease will subside. We may consider a closed society of
N individuals and classify individuals into three types: susceptible (S), infected (I), and recovered (R). The SIR model [
8,
9] is a mathematical model for the spread of epidemic diseases that describes the changes in the numbers of the three types of individuals.
We denote the number of susceptible (S) individuals by
, the number of infected (I) individuals by
, and the number of recovered or removed (R) individuals by
. The time
t is measured in days,
t (day). We may consider that each person is in contact with
m persons per day on average. Because the proportion of infected individuals is
, the number of contacts between the infected and susceptible persons becomes
per day. If we set the probability of infection for each contact as
p, the number of the newly infected individuals within
days becomes
in total. If we set
, the number of non-infected individuals (S) from the
t-th day through the (
)-th day changes as
For the limit of
, the derivative can be obtained as follows:
which is the same procedure as that derived for Equation (
13).
Meanwhile, the infected individuals recover at a removal rate of
per day. Subsequently, the increase in the number of the recovered persons becomes
The inverse of
is the expected duration of infection. Here, the number of the recovered persons includes the number of the deceased persons because it refers to the number of individuals who cannot possibly infect others.
Because the total number of individuals is set to be constant such that
we express
as follows:
The change in the number of the infected individuals can be written as
Gathering the equations, we have
These are simultaneous equations, where
is the rate of infection and
the rate of recovery (the rate of quarantine).
If we introduce the transformations of variables, such as
, and
, then Equations (
21)–(23) can be rewritten as
The equations are in non-dimensional forms. The use of
, and
implies that we only need to consider the fraction of the number of individuals. A single parameter representing the ratio of
and
appears as follows:
The number
is known as the basic reproduction number [
16,
17,
18], and the number of infected individuals increases when
(precisely, in the limit of
), whereas it decreases when
.
4. Numerical Solution of SIR Model
The SIR equations (Equations (
21)–(23)) were derived; subsequently, the differential equations can be solved numerically. The simplest method to solve the differential equation is as follows: It is noteworthy that
Therefore, we may calculate
with a time step size of
. This first-order numerical procedure is known as the Euler method. However, it has a convergence problem; therefore, the errors will be larger. Although many methods have been proposed to reduce errors, we employ the second-order Runge–Kutta method.
The python program is given in
Appendix A. As initial conditions, we use
with
, and
. The calculated results using parameters
and
are shown in
Figure 3a. As shown, the number of infected individuals (I) increases with time, reaches a peak, and gradually decreases. Meanwhile, the number of the recovered individuals (R) increases gradually.
Thus far, we have discussed the number of the infected individuals,
. The number of newly infected individuals is reported daily. How can we express the number of the newly infected individuals in the framework of the SIR model? The newly infected individuals are added to
, but the recovered individuals
are subtracted from
as well. Therefore, the number of the newly infected individuals,
, is expressed as
with
. Because
, we can express
Based on Equation (
21), we obtain
Because
is proportional to
, the behavior of the temporal variation of
is similar to that of
. In
Figure 3a,
is plotted as a dotted red curve.
In
Figure 3b, the temporal variations of the numbers of individuals for the three types (S, I, R) are compared for
= 0.5, 0.45, 0.4, 0.35, and 0.3. Here,
is fixed as 0.2. The delay of the epidemic and the reduction in the peak height (number of the infected individuals) were shown as
decreases. This trend is typically associated with the effect of social distancing.
The basic reproduction number
is frequently used as the “R naught” or “R zero”. Sometimes, it is treated as a time-varying variable and used as a measure for easing lockdowns. In the SIR model,
is a constant and defined as shown in Equation (
27). If we formally define a time-varying number as
in the framework of the SIR model, it judges whether
increases (
) or decreases (
) in Equation (22). In
Figure 3a, this time-varying
is also plotted as a dotted black curve. The initial value
was set to be 2.5; as shown,
gradually decreases, becomes 1 at the point where
reaches a peak, and further decreases below 1. The terminology “effective reproduction number” is sometimes used, but it has several meanings.
Finally, we remark on the cumulative number of the infected individuals, which is reported daily. In this model, the cumulative number is expressed as
because it includes the number of the recovered individuals. In the initial stage of the outbreak of an epidemic, this number increases exponentially, and it is appropriate to use the semi-log plot as shown in
Figure 2b. The initial increase in the cumulative number of the infected individuals,
, is plotted in a semi-log plot for several
in
Figure 4, where
is fixed. As shown, the initial increase is linear, and the slope decreases as
decreases.
5. Analytical Solution of SIR Model
We have demonstrated the numerical solution of the SIR model. Although the SIR model was proposed in 1927 [
8,
9], the analytical solution of the differential equations was only presented in 2014 [
15]. In the analytical solution, only exponential and logarithmic functions are involved. Therefore, high-school students will be able to understand the analytical solution of the SIR model.
Next, we present the main stream of the derivation of the analytical solution. Some detailed calculations are given as questions, and the solutions can be found in
Appendix B. Equations (
21)–(23) are simultaneous differential equations; we can rewrite them as the differential equation of a single function. The procedure is the same as that used to solve a linear simultaneous equation.
We employ a more direct method of deriving the analytical solution compared with that in the original paper [
15]. First, we rewrite Equation (
21) as
where
. If we differentiate both sides, we obtain
Inserting Equation (
37) into Equation (
21) yields
Comparing Equation (
38) and Equation (
39), we have
Therefore, we have obtained a single differential equation of
.
Next we introduce the following function:
Therefore, we have the following equation for
:
An ordinary differential equation of this form is known as the Bernoulli differential equation. The general solution is obtained as follows:
where
is an integral constant.
Problem 1. Show that the general solution of the Bernoulli differential Equation (42) is given by Equation (43). From the relation of the inverse function of Equation (
41), we have
Using Equation (
21), we obtain
Moreover, from Equations (
21) and (23), we have
Subsequently, the relation between
and
becomes
where
is an integral constant. If
and
for
, then
becomes
. Therefore, we have
From the relation
, we obtain the integral constant
as
If we substitute
into Equation (
43), we have
By integrating this equation,
t is obtained as a function of
x as follows:
Problem 2. Confirm that Equation (51) satisfies differential Equations (21)–(23). For the convenience of calculation, we change a variable such that
. Subsequently, we rewrite it as
It is noteworthy that
appears here instead of
.
Because we have obtained the form of the integral of one variable, we then calculate
. We can perform a numerical integration by summing up an integrand with a sufficiently small step size, as follows:
It is noteworthy that the direction of the integral is negative because
in Equation (
52), and the integrand has a negative value. If
is obtained, then
y and
z can be calculated using Equations (
45) and (
48), respectively. The calculated results of
, and
are shown in
Figure 5, where the same conditions are used as in
Figure 3a. As expected, the analytical solution and the numerical solution obtained using the Runge–Kutta method are the same.
Next, we analyze the property of the differential equation. We can investigate the behavior of
. The variable
in the limit of
,
, is obtained when the denominator of the integrand of Equation (
52) becomes zero. If we write the solution to the equation
as
, we have
,
, and
. To solve an equation such as Equation (
54) numerically, an iterative method such as Newton’s method is used. However, for the case of a single variable, it is sufficient to plot the left-hand side of Equation (
54) and find an intersection with a zero.
Figure 3b shows a comparison of the behaviors of the temporal evolutions of
,
, and
for
= 0.5, 0.45, 0.4, 0.35, and 0.3, with
(
).
Figure 6 shows the solutions of Equation (
54) under the same conditions. The solutions reproduce the limit values of
for
in
Figure 3b. Using
defined in Equation (
27), we can rewrite Equation (
54) as
in the limit of
. This relation is known as the solution of the final size equation [
19].
The final number of the infected individuals, (
), is a variable of interest. The
dependence of this number is tabulated in
Table 1. As shown, the final number of the infected individuals decreases with the infection rate
. Furthermore, the table presents the peak values of the infected individuals.
6. Analytical Approach of SIR Model with Vital Dynamics
The standard SIR model has been extended to include some other factors. One example is a model that considers birth and natural death processes [
20,
21]. The nonlinear equations are as follows:
where
represents the natural death rate, which is assumed to be equal to the birth rate. Harko et al. [
15] studied this model and demonstrated that these simultaneous equations can be reduced to an Abel-type equation.
Next, we show a more straightforward derivation than that presented in the original paper [
15]. Similar to the procedure used for the standard SIR model, we first eliminate a variable
from Equations (
56) and (57);
where
.
Next we introduce a function
Subsequently, we obtain an equation for
as follows:
By introducing a function
we obtain the equation for
v as follows:
An ordinary differential equation of this form is known as the Abel-type first-order differential equation of the first kind. The mathematical properties of the Abel-type equations have been investigated intensively. One may use an iterative solution, as demonstrated by Harko et al. [
15]. If we set
, this model becomes the standard SIR model; Equation (
63) is reduced to Equation (
A4) in the case of
. Furthermore, some recent developments regarding the solution of the Abel equation of the first kind have been reported [
22,
23].
Next, we briefly analyze the properties of the SIR model with vital dynamics. We may solve Equations (
56) and (57) simultaneously or solve an equation comprising one variable, i.e., Equation (
63). For the analytical approach, it is beneficial to introduce the following function:
Subsequently, we can rewrite Equation (
63) as
or
This type of equation is known as the Abel equation of the second kind.
We plot
,
, and
of the SIR model with vital dynamics by solving Equations (
56) and (57) simultaneously or solving Equation (
65), in
Figure 7. Both approaches yielded the same results, as expected. The parameters
varied as 0.0, 0.02, 0.04, 0.06, and 0.08, whereas
and
were fixed as 0.5 and 0.2, respectively. In the plot, the data of
are shown by thick dashed curves. As shown, a minimum occurred in
, and
remained non-zero for
. This represents a disease in the endemic steady state. New births can provide more susceptible individuals, thereby resulting in a sustained epidemic.
7. Discussion and Conclusions
We explained the mathematical model of the spread of epidemic diseases. It is noteworthy that in the SIR model, a closed society was assumed, and all the individuals have the same contact possibility. Hence, the obtained results are valid only within such a model.
Next, we consider the meanings of the infected (I) and the recovered (R). In this model, recovered (R) refers to individuals who do not infect others. Hence, the individuals who have been quarantined are regarded as recovered individuals. Therefore,
is called the rate of recovery or the rate of quarantine. To reduce
, we may reduce
because
is expressed by
. This can be realized by social distancing (physical distancing). Because the inverse of the denominator,
, is the average time of infection, shortening the time of confirming the infection by testing provides the same effect as social distancing. In Equations (
21)–(23),
is multiplied by
, whereas
is only multiplied by
.
Next, we discuss the reproduction number. The Robert Koch Institute (RKI) of Germany publicizes the reproduction number R daily. The point estimate of R for a specific day is estimated as the quotient of the number of incident cases on that day divided by the number of incident cases four days earlier. The incident cases are not directly extracted from the notification system, but they are estimated using statistical analysis to predict the true onset of illness. To reduce random effects, a moving 4-day or 7-day average is used. The reproduction number R of the RKI is clearly defined and reflects the measure of disease spread; i.e., R is either greater or smaller than 1. However, it is noteworthy that the absolute value of such a time-varying reproduction number depends on the definition.
As for the mathematical models of epidemics, some extended models have been developed, such as the susceptible-infectious-recovered-deceased (SIRD) model [
24,
25,
26]. This model differentiates between recovered (i.e., individuals who have survived the disease and are now immune) and deceased individuals. Another example is the susceptible-exposed-infectious-recovered (SEIR) model [
27,
28,
29,
30], where an “exposed” category is added. It considers the situation involving a significant incubation period, during which individuals have been infected but are not yet infectious.
We learned not only the numerical solution of the SIR model, but also the exact analytical solution of the model. Furthermore, we treated the generalized SIR model with vital dynamics. Students can learn differentiation and integration using this tutorial. Furthermore, it will be beneficial if they indicate interest in social problems.