Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models

Calatayud Gregori, Julia; Chen-Charpentier, Benito M.; Cortés López, Juan Carlos; Jornet Sanz, Marc

doi:10.3390/sym11010043

Open AccessArticle

Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models

by

Julia Calatayud Gregori

¹

,

Benito M. Chen-Charpentier

²

,

Juan Carlos Cortés López

^1,*

and

Marc Jornet Sanz

¹

Instituto Universitario de Matemática Multidisciplinar, Universitat Politècnica de València, 46022 Valencia, Spain

²

Department of Mathematics, University of Texas at Arlington, Arlington, TX 76019-0408, USA

^*

Author to whom correspondence should be addressed.

Symmetry 2019, 11(1), 43; https://doi.org/10.3390/sym11010043

Submission received: 15 November 2018 / Revised: 28 December 2018 / Accepted: 30 December 2018 / Published: 3 January 2019

(This article belongs to the Special Issue Mathematical Epidemiology in Medicine & Social Sciences)

Download

Browse Figures

Versions Notes

Abstract

:

In this paper, we deal with computational uncertainty quantification for stochastic models with one random input parameter. The goal of the paper is twofold: First, to approximate the set of probability density functions of the solution stochastic process, and second, to show the capability of our theoretical findings to deal with some important epidemiological models. The approximations are constructed in terms of a polynomial evaluated at the random input parameter, by means of generalized polynomial chaos expansions and the stochastic Galerkin projection technique. The probability density function of the aforementioned univariate polynomial is computed via the random variable transformation method, by taking into account the domains where the polynomial is strictly monotone. The algebraic/exponential convergence of the Galerkin projections gives rapid convergence of these density functions. The examples are based on fundamental epidemiological models formulated via linear and nonlinear differential and difference equations, where one of the input parameters is assumed to be a random variable.

Keywords:

uncertainty quantification; epidemiological stochastic model; probability density function; generalized polynomial chaos; random variable transformation technique

1. Introduction

Mathematical models based on random differential and difference equations consider input parameters as random variables or stochastic processes [1,2,3]. The aim of this kind of models is to capture the uncertainty often met in the analysis of complex phenomena. This randomness comes from errors in measurements or estimations, a lack of information, or ignorance of the phenomenon under analysis, etc. In particular, these are typical features in dealing with the study of epidemiological models, where input data, such as the initial proportion of infected individuals, the rates of contagious and recovery caused by an infectious disease, etc., are not deterministically known. In such a situation, it is more realistic to model all this initial information via random variables or stochastic processes rather than by using deterministic constants or functions, respectively. These facts have motivated the development of powerful mathematical techniques capable to model and quantify uncertainty in dealing with epidemiological problems to better understand and describe the dynamics of a disease.

Given the response stochastic process, the primary goal of Uncertainty Quantification consists in understanding the main probabilistic features of the model outputs, by computing their statistical moments (mainly the mean and the variance), confidence intervals, etc. [4]. The most well-known method to approach this problem is Monte Carlo simulation [5]. It is easy to implement and effective, although computationally expensive. An alternative is based on generalized polynomial chaos (gPC) expansions and the stochastic Galerkin projection technique [6,7,8,9,10,11,12,13,14,15,16], which usually converges at an algebraic/exponential rate (exponential convergence) [6,17,18].

Apart from computing the mean and the variance, a major goal in dealing with stochastic models is to calculate the set of (first) probability density functions of the solution process at each time instant [1] (p. 35). By means of the probability density function, all the statistical moments and confidence intervals corresponding to the solution process can be determined [1] (Chapters 2–3). When dealing with stochastic problems with a closed-form solution, the Random Variable Transformation (RVT) technique has been widely used to calculate the density function of the solution, see [19,20,21] in the setting of random differential equations, [21,22,23,24,25,26] in the context of random partial differential equations, and [27] for random difference equations. When no explicit form of the solution is available or it is given via infinite analytic expressions (such as random power series, random eigenfunctions expansions, etc.), one usually constructs a sequence of approximating processes, whose exact density functions (computed via the RVT technique) form a sequence of approximating density functions that converge rapidly. This methodology has been carried out via numerical methods [28], Karhunen-Loève expansions [29,30], or random power series [29,31], and it has supposed an improvement compared with Monte Carlo simulations and Kernel density estimations in terms of rate of convergence and accuracy.

In the recent contribution [9], RVT and gPC techniques have been successfully combined to study random differential equations. That study focused on the computation of the mean and the variance of the solution stochastic process for random differential equations. In the present paper, we combine both techniques to go further and to compute reliable approximations to the probability density function of stochastic models. Furthermore, we point out that the proposed method possesses rapid convergence and is able to reach its goal even in the case that no closed-form solution to the stochastic problem is available. This is a distinctive feature of our approach since, as it has been previously indicated, the application of the RVT technique to determine the probability density function of stochastic models usually requires knowing a closed-form expression of the solution, see for instance [19,20,21,22,23,24,25,26]. All these features of our approach will be illustrated through several epidemiological models. At this point it is important to underscore that in the context of epidemiological modeling, knowledge of the probability density function is crucial, since from it, apart from computing predictions (via the mean) and confidence intervals (via the mean and the standard deviation), one can also compute the probability that the proportion of infected individuals lies in a certain interval of interest for health authorities [19,32,33] (prevention and control of plagues, identify seasonal patterns of infection, etc.).

Hereinafter, we will work with stochastic models with only one random input parameter. The reason for this restriction is that there is a version of the RVT technique for non-injective transformation mappings on

R

with many applications in general, but not on

R^{s}

,

s > 1

. The RVT method will allow us to compute the exact density function of the Galerkin projections (which are polynomials, usually non-injective), and by taking limits we will approximate the density function of the response process. Since the Galerkin projections converge at algebraic/exponential rate to the solution process, it is expected to have rapid convergence of the sequence of approximating density functions.

From a modeling standpoint, the variability of each parameter (input) produces a variability in the results (output) of the model. However, the effects of the variabilities of the parameters are of different magnitude and can be estimated by using a sensitivity analysis method such as gPC-based Sobol’s indices [12,34,35,36]. These indices explain the influence of each random parameter on the response process, by performing a variance-based sensitivity analysis. The variance of the output is a sum of contributions of each input variable and combinations of them. Indeed, the variance of the output is the squared-sum of the Galerkin coefficients, so the contribution of the main effect of a parameter is the normalized squared-sum of the coefficients corresponding to basis polynomial functions evaluated only at the aforesaid parameter. By computing Sobol’s indices, it is assumed that the parameter that leads to a larger variance of the output has the largest effect on the model prediction. If one is dealing with epidemic models, the parameter with highest gPC-based Sobol’s index is the most important to address and control the epidemic, and the other parameters may be taken to be deterministic. Therefore the fact of introducing randomness into only one input parameter may be justified.

The structure of the paper is the following. In Section 2, we will specify the type of stochastic problems under study, we will show the necessary basics of gPC expansions and the RVT technique, and we will combine both methods to approximate the probability density function of the solution stochastic process. In Section 3, we will illustrate the theoretical development with several particular examples of continuous and discrete stochastic models in an epidemiological and social setting. Finally, Section 4 will draw conclusions.

2. Method

We consider an initial value differential equation problem with one degree of uncertainty:

\{\begin{matrix} y^{'} (t) = F (y (t), t), t \in I, \\ y (t_{0}) = ζ, \end{matrix}

(1)

or

\{\begin{matrix} y^{'} (t) = F (y (t), t, ζ), t \in I, \\ y (t_{0}) = y_{0}, \end{matrix}

(2)

where

I \subseteq R

is an interval,

t_{0} \in I

, F is a deterministic function, and

ζ

is the random input parameter, which can be the initial condition (case (1)), or can appear in the differential equation expression (case (2), with the initial condition

y_{0}

being deterministic). The random variable

ζ

is assumed to be absolutely continuous (we denote its density function by

f_{ζ} (ζ)

) with centered absolute moments of any order. At this point, it is important to point out that the largest part of random variables that are used in practice, specially in epidemiological modeling, such as Beta, Gaussian, Gamma, Uniform, Triangular, etc., satisfy these hypotheses. In this setting, we are assuming an underlying complete probability space

(Ω, F, P)

, where

Ω

is the sample space consisting of outcomes

ω \in Ω

,

F

is the

σ

-algebra of events and

P

is the probability measure.

The solution

y (t)

to (1) or (2) is a stochastic process. We will assume that such a solution exists and is unique, in some stochastic sense (sample path sense, mean square sense, etc., see [1,2]). Our goal is to compute the probability density function of

y (t)

,

f_{y (t)} (y)

, at each time instant

t \in I

. The density function allows performing uncertainty quantification for

y (t)

, since all the statistical moments and confidence intervals can be derived from it [1] (Chapters 2–3).

Other alternative stochastic systems to (1) and (2) are possible as well: random partial differential equations, random difference equations, etc., with one degree of uncertainty. For such stochastic systems, the techniques exposed here work analogously.

2.1. gPC Expansions

Given the random input parameter

ζ

, consider its associated canonical basis

C^{p} = {1, ζ, ζ^{2}, \dots, ζ^{p}}

, for each degree

p \geq 1

. By using a Gram-Schmidt procedure, we orthonormalize this basis in the Hilbert space of random variables with well-defined variance,

(L^{2} (Ω), 〈, 〉)

, where the inner product is defined as

〈 X, Y 〉 = E [X Y]

for

X, Y \in L^{2} (Ω)

, being

E

the expectation operator with respect to the probability measure

P

. Let

Ξ^{p} = {ϕ_{0} (ζ), ϕ_{1} (ζ), \dots, ϕ_{p} (ζ)}

be the orthonormal basis of

C^{p}

,

p \geq 1

, where

ϕ_{0} = 1

. For example, if

ζ \sim Normal (0, 1)

, then

Ξ^{p}

is the set of Hermite polynomials; if

ζ \sim Uniform (- 1, 1)

, then

Ξ^{p}

is formed by Legendre polynomials, etc. [6,7]. Notice that, with this approach, we are not restricted to distributions from the Askey scheme [6] (Chapter 3), as already remarked in [8,9]. This point has great interest in practice, since most of the random variables constructed to model real phenomena do not fit the standard distributions, and therefore we are enlarging the applicability of the classical gPC method.

If Z is a random variable in

L^{2} (Ω)

written as a function of

ζ

,

Z = r (ζ)

, then the gPC expansion of Z is

Z = \sum_{i = 0}^{\infty} {\hat{Z}}_{i} ϕ_{i} (ζ)

, where the sum is considered in

L^{2} (Ω)

and the Fourier coefficients

{\hat{Z}}_{i}

are given by

{\hat{Z}}_{i} = E [r (ζ) ϕ_{i} (ζ)]

. Actually, to ensure that

Z = \sum_{i = 0}^{\infty} {\hat{Z}}_{i} ϕ_{i} (ζ)

, we need that the moment problem is uniquely solvable for

ζ

, see [37] (Th. 3.3). This condition is not so restrictive, as any random variable with finite moment-generating function in a neighborhood of the origin has a uniquely solvable moment problem [37] (Th. 3.4).

If we take the solution stochastic process

y (t)

, then we may formally write

y (t) = \sum_{i = 0}^{\infty} {\hat{y}}_{i} (t) ϕ_{i} (ζ)

in

L^{2} (Ω)

, where

{\hat{y}}_{i} (t) = E [y (t) ϕ_{i} (ζ)]

. The Fourier coefficients

{\hat{y}}_{i} (t)

cannot be explicitly computed, as

y (t)

is unknown. We use the stochastic Galerkin projection technique [6,7,8,9]: We seek a solution of the form

{\tilde{y}}^{p} (t) = \sum_{i = 0}^{p} {\tilde{y}}_{i}^{p} (t) ϕ_{i} (ζ)

, i.e., we impose

\sum_{i = 0}^{p} \frac{d}{d t} {\tilde{y}}_{i}^{p} (t) ϕ_{i} (ζ) = F (\sum_{i = 0}^{p} {\tilde{y}}_{i}^{p} (t) ϕ_{i} (ζ), t) .

The deterministic functions

{\tilde{y}}_{i}^{p} (t)

are calculated in two steps: first, we multiply the previous equation by

ϕ_{k} (ζ)

, and we take the expectation operator and use the orthonormality of

Ξ^{p}

:

\frac{d}{d t} {\tilde{y}}_{k}^{p} (t) = 〈F (\sum_{i = 0}^{p} {\tilde{y}}_{i}^{p} (t) ϕ_{i} (ζ), t), ϕ_{k} (ζ)〉, k = 0, \dots, p;

and secondly, we solve a deterministic system of differential equations via standard numerical techniques. Formally,

{lim}_{p \to \infty} {\tilde{y}}^{p} (t) = y (t)

in

L^{2} (Ω)

. Some general results justify this assertion, see [17,18]. Moreover, algebraic/exponential convergence usually holds (spectral convergence).

The stochastic Galerkin projection technique and gPC expansions have been profusely used in the literature for uncertainty quantification, see for example [10,11,12,13,14,15,16].

We will show how to compute exactly the probability density function of

{\tilde{y}}^{p} (t)

, which will be used as an approximation for

f_{y (t)}

.

2.2. RVT Technique

The RVT technique has been successfully applied to study significant random ordinary differential equations [19,20,21,28], random partial differential equations [21,22,23,24,25,26,30] and random recursive equations [27], that appear in different areas such as Epidemiology, Physics, Engineering, etc. In this paper, we will use the following version of the RVT technique:

Theorem 1

([38] (p. 115)). Let X be an absolutely continuous random variable with density

f_{X}

and with support

D_{X}

contained in an open set

D \subseteq R

. Let

g : D \to R

be such that

D = {\dot{\cup}}_{1 \leq i \leq n} D_{i}

and

g_{i} {= g |}_{D_{i}}

is injective and

C^{1} (D_{i})

with non-vanishing derivative. Then the random variable

Y = g (X)

is absolutely continuous, with density function given by

f_{Y} (y) = \{\begin{matrix} \sum_{i : y \in g (D_{i})} f_{X} (g_{i}^{- 1} (y)) |\frac{d g_{i}^{- 1} (y)}{d y}|, & y \in g (D), \\ 0, & y \notin g (D) . \end{matrix}

(3)

Although there is a version of this theorem for

R^{s}

,

s > 1

, its applicability in our context will be reduced to

R

. For example, when working with transformation mappings g given by polynomials (this will be the case of the Galerkin projections), in dimension one we can find the regions

D_{1}, \dots, D_{n}

by equating the derivative

g^{'}

to 0; however, polynomials on

R^{s}

,

s > 1

, is a different matter, as the regions

D_{1}, \dots, D_{n}

may be impossible to find or even not exist. This fact motivates that in this first step of our research we consider the case

s = 1

, which has specific interest.

2.3. Combining gPC and RVT Technique to Approximate the Density Function

Fixed

t \in I

and

p \geq 1

, the Galerkin projection

{\tilde{y}}^{p} (t) = \sum_{i = 0}^{p} {\tilde{y}}_{i}^{p} (t) ϕ_{i} (ζ)

is an explicit transformation of the random variable

ζ

, let us call it

g (ζ)

. Since

g (ζ)

is a polynomial, by solving numerically

g^{'} (ζ) = 0

we can determine the sets

D_{1}, \dots, D_{n}

where g is strictly monotone and find

g (D_{i})

. At each

ξ \in g (D_{i})

, by finding numerically the unique root

ζ \in D_{i}

such that

ξ = g (ζ)

, we compute the inverse of

g_{i} {= g |}_{D_{i}}

,

h_{i} (ξ) : = g_{i}^{- 1} (ξ)

. By the RVT technique,

f_{{\tilde{y}}^{p} (t)} (ξ) = f_{g (ζ)} (ξ) = \sum_{\begin{matrix} i : 1 \leq i \leq n \\ ζ \in g (D_{i}) \end{matrix}} f_{ζ} (h_{i} (ξ)) | h_{i}^{'} (ξ) | .

Since

h_{i} (ξ)

has been computed numerically for each

ξ

, we do not have

h_{i}^{'}

explicitly. We compute

h_{i}^{'}

as

h_{i}^{'} (ξ) = \frac{1}{g^{'} (h_{i} (ξ))}, ξ \in g (D_{i}),

instead. Thus,

f_{{\tilde{y}}^{p} (t)} (ξ) = f_{g (ζ)} (ξ) = \sum_{\begin{matrix} i : 1 \leq i \leq n \\ ζ \in g (D_{i}) \end{matrix}} f_{ζ} (h_{i} (ξ)) \frac{1}{| g^{'} (h_{i} (ξ)) |} .

Notice that the RVT technique has permitted us to obtain the exact expression for the probability density function of the Galerkin projection

{\tilde{y}}^{p} (t)

. This exact expression is a distinctive feature of the RVT technique comparing with other stochastic methods (for example, gPC expansions plus Monte Carlo simulations for the Galerkin projections).

Since

{lim}_{p \to \infty} {\tilde{y}}^{p} (t) = y (t)

in

L^{2} (Ω)

at algebraic/exponential rate, we expect to have

lim_{p \to \infty} f_{{\tilde{y}}^{p} (t)} (ξ) = f_{y (t)} (ξ)

(4)

with rapid convergence. This procedure works as a computational method to approximate

f_{y (t)} (ξ)

. To the best of our knowledge, there are no available theoretical results that guarantee that (4) holds, but only mean square convergence of the Galerkin projections

{\tilde{y}}^{p} (t)

, instead.

In principle, we know nothing about the type of convergence of the sequence

{f_{{\tilde{y}}^{p} (t)}}_{p = 1}^{\infty}

to the true density

f_{y (t)}

. The numerical experiments that we will carry out in the forthcoming section, see Example 1 for instance, show that uniform convergence may not be expected, as

f_{y (t)}

might be discontinuous but

f_{{\tilde{y}}^{p} (t)}

continuous. Hence, in general, one may expect convergence of

{f_{{\tilde{y}}^{p} (t)}}_{p = 1}^{\infty}

in some Lebesgue space.

Assume that

{f_{{\tilde{y}}^{p} (t)}}_{p = 1}^{\infty}

is Cauchy in

L^{1} (R)

. By the completeness of the Lebesgue spaces, there is a function

f_{t}^{*} \in L^{1} (R)

such that

{lim}_{p \to \infty} f_{{\tilde{y}}^{p} (t)} = f_{t}^{*}

in

L^{1} (R)

, for each

t \in I

. Then the limit function

f_{t}^{*}

is a density function for

y (t)

. Indeed, notice that

f_{t}^{*}

is a density function, because

f_{t}^{*} \geq 0

almost everywhere and

\int_{R} f_{t}^{*} (y) d y = {lim}_{p \to \infty} \int_{R} f_{{\tilde{y}}^{p} (t)} (y) d y = 1

. Let

w (t)

be a stochastic process such that

w (t) \sim f_{t}^{*}

. We have that

{\tilde{y}}^{p} (t) \to w (t)

in law as

p \to \infty

, since

{lim}_{p \to \infty} P ({\tilde{y}}^{p} (t) \leq ξ) = {lim}_{p \to \infty} \int_{- \infty}^{ξ} f_{{\tilde{y}}^{p} (t)} (y) d y = \int_{- \infty}^{ξ} f_{t}^{*} (y) d y = P (w (t) \leq ξ)

. On the other hand, since

{\tilde{y}}^{p} (t) \to y (t)

in

L^{2} (Ω)

, in particular in law, as

p \to \infty

, we conclude that

y (t)

and

w (t)

are equal in distribution, therefore

y (t)

is absolutely continuous and

f_{y (t)} = f_{t}^{*}

. Thus, if the sequence

{f_{{\tilde{y}}^{p} (t)}}_{p = 1}^{\infty}

is Cauchy, then the limit function must be

f_{y (t)}

.

In practice, we have to be careful with the order of truncation p chosen, since if p is too large, numerical errors may arise, see [17,18,39,40] (Section 4). This is a disadvantage of gPC expansions. There may be an order of truncation

p_{0}

such that, for any

p \geq p_{0}

, the results obtained make no sense, therefore the density function may not be computed as accurately as wanted. See the forthcoming Example 3.

3. Numerical Experiments

In this section, we deal with particular continuous and discrete stochastic models that play an important role in Epidemiology. By using the previous ideas, we will approximate the probability density function of the solution stochastic process to some epidemiological models at different time instants. In the field of Epidemiology, the computation of the probability density function is a primary goal, as the expected number or confidence intervals for infective (predictions), or any other information of interest for health authorities, can be derived from it.

The computations have been carried out in the software Mathematica^®, version 11.2 (Wolfram Research, Inc.: Champaign, IL, USA, 2017) [41], installed on an Intel^® Core^TM i7 CPU 3.1 GHz. To compute the roots of

g^{'} (ζ) = 0

and

ξ = g (ζ)

, the built-in function NSolve has been utilized. The solution to deterministic systems of differential equations has been solved by means of the standard NDSolve routine, with automatic method, step size, etc.

When dealing with real data, we will compute deterministic estimates for the model parameters with a least squares fitting procedure (FindFit built-in function). Then we will introduce a small perturbation into one of the parameters (with mean value being the deterministic estimate calculated) and we will apply the methodology previously exposed. Introducing only a small amount of randomness into the model from deterministic estimates allows a more faithful representation of the time evolution of the population, see for example [8] (Example 5), [12,13,14,42]. In this article, as we are interested in testing our methodology on approximating the density function of the model output, but not in estimating probability distributions for the input parameters, we do not deal with inverse parameter estimation, see, for example, references [6] (pp. 95–99), [43,44], related with Bayesian inference and gPC expansions.

In the examples, we will see that it suffices to take a small order p of basis to get accurate approximations of the probability density function. The optimal p is that from which the subsequent probability density functions constructed for larger orders than p are virtually the same. In previous contributions dealing with approximations of the mean and variance statistics in the setting of continuous and discrete epidemic models (SIS, SIR, etc. models), see [10,14], it was shown numerically that, although for

p = 1

the approximations are not good, for

p = 2

and

p = 3

very similar results are produced.

In Example 1, the approximations of the probability density functions constructed by combining gPC expansions and the RVT technique will be compared with the exact density function. For Examples 2–4, we will compare our approximations against a Kernel density estimation from Monte Carlo simulations sampled from the stochastic problem directly, with the built-in function SmoothKernelDistribution of Mathematica^®. We will check that Kernel density estimations do not approximate well when the sought density function is not smooth (see the documentation on the built-in function SmoothKernelDistribution, or the theoretical reference [45]), while our method substantiated on the RVT technique is certainly able to identify discontinuities and peaks, i.e., the exact shape of the target density function. Moreover, gPC-based methods converge much faster than Monte Carlo simulations, even when there is only one random input coefficient, see the discussions from previous contributions [6,7,8]. Indeed, Monte Carlo simulations converge relatively slow (for example, the mean value converges at rate

1 / \sqrt{m}

, where m is the number of realizations), whereas gPC expansions converge at spectral rate.

When no explicit form of the solution is available, representations of it via infinite expansions (gPC expansions, random power series, Karhunen-Loève expansions, etc.) or discretizations from numerical methods are usually required. By truncating these infinite expansions or from the discretizations, one constructs a sequence of approximating processes, whose exact density functions (computed via the RVT technique) form a sequence of approximating density functions that converge rapidly [28,29,30,31]. In terms of rapidity and accuracy, this sort of approximations are preferable to Monte Carlo simulations and Kernel density estimations.

In Remark 1 from Example 4, we will show how gPC-based Sobol’s indices (see Section 1) may be used to determine the parameter whose variability has the most important effects and therefore justify the fact of introducing uncertainty into one input parameter only, in the context of a social model for addictive behavior.

Example 1.

In this example, we consider the simplest continuous epidemiological model to describe the growth of a population of bacteria with no limitation of resources and no antibiotics (predators). The model has demonstrated to be still useful to describe this population growth in presence of competitors during the initial stage. Proposed by Thomas Malthus in 1798 in his essay [46], this model is considered as the first law of population dynamics and is usually referred to as exponential or Malthusian model [47] (Chapter 1), [48]. With the aim of illustrating the main ideas exhibited in the previous section, next we deal with the following particular case of the Malthusian model which has been addressed in [6] (p. 69–70), [7] (p. 10):

\{\begin{matrix} y^{'} (t) = ζ y (t), \\ y (0) = 3, \end{matrix}

where

ζ \sim Beta (2, 1)

. This initial value problem may correspond to a bacteria with initial population 3 million of individuals and whose relative instantaneous growth rate fluctuates randomly in the unit interval. For this stochastic model, the solution stochastic process is well-known:

y (t) = 3 e^{ζ t}

. By means of the RVT technique, the exact probability density function of

y (t)

can be calculated:

f_{y (t)} (y) = f_{ζ} (\frac{1}{t} log (\frac{y}{3})) \frac{1}{y | t |}, y > 0, t \neq 0 .

Figure 1 plots

f_{y (t)}

for

t = 0.5

and

t = 1

.

Let us test our methodology. We use our computational method to approximate the probability density function of

y (t)

at the times

t = 0.5

and

t = 1

. In Figure 2 we show the corresponding results. We observe that, as the order of basis p increases, the sequence of density functions

f_{{\tilde{y}}^{p} (t)} (ξ)

of the Galerkin projections

{\tilde{y}}^{p} (t)

tends to

f_{y (t)} (ξ)

. Based on observation of Figure 2, this convergence seems to hold in

L^{q} (R)

,

1 \leq q < \infty

, but not uniformly on

R

, since

f_{{\tilde{y}}^{p} (t)} (ξ)

seems continuous for

p > 1

. Table 1 shows the

L^{1} (R)

error for each order p:

∥ f_{y (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)} = \int_{R} |f_{y (t)} (ξ) - f_{{\tilde{y}}^{p} (t)} (ξ)| d ξ .

This integral has been calculated in Mathematica^® with the standardNIntegrateroutine. Observe that the error decreases to 0 very fast (exponentially), although convergence deteriorates when time t increases.

Example 2.

In this example, we deal with a stochastic model applied to real laboratory data of bacteria growth [42]. We consider a species of bacteria from a laboratory experiment: Rhodobacter capsulatus (R. capsulatus). Direct cell counts were made every two or three days until a stationary phase was achieved. For more information regarding the experiment, we refer the reader to [42].

Table 2 shows the population sizes of R. capsulatus under infrared lightning conditions in different mediums. The number of cells/mL has been rescaled by dividing by

10^{6}

. Figure 3 plots the cell counts from Table 2.

In [42], the authors used the Malthusian model to describe the bacteria growth at the first 9 days (before the stationary phase), while a logistic equation model was applied up to day 14. In this example, we suppose that the growth rate

y^{'} (t)

is not proportional to the total number of individuals

y (t)

, but to the total number of interactions,

y {(t)}^{2}

, via a parameter r. Moreover, we take into account the competition for the limited resources with a carrying capacity

K > 0

(limit population under no death assumption), and also the beginning of the bacteria decline phase via a death parameter

δ > 0

. The parameters r, K, and δ are assumed to be deterministic, while the initial condition ζ is taken as a random variable:

\{\begin{matrix} y^{'} (t) = r y {(t)}^{2} (1 - \frac{y (t)}{K}) - δ y (t), \\ y (0) = ζ . \end{matrix}

(5)

This Equation (5) corresponds to the simplified Fitzhugh-Nagumo model [49] (p. 2).

A least squares fitting gives the deterministic estimates

\hat{r} = 0.439310

,

\hat{ζ} = 0.557972

,

\hat{K} = 5.60822

, and

\hat{δ} = 0.126042

, with a very small residual squared error (smaller than the exponential growth model and the logistic equation presented in [42]). To randomize the initial condition ζ, we consider a distribution centered at

\hat{ζ}

:

ζ \sim Uniform (0.497972, 0.617972)

. The introduction of randomness into the initial condition requires the search of suitable techniques to perform uncertainty quantification. Via gPC expansions and the stochastic Galerkin projection technique with order of basis

p = 4

(this is enough to achieve convergence), pointwise estimates via

E [y (t)]

, and a confidence interval with the rule

E [y (t)] \pm \sqrt{V [y (t)]}

, can be constructed, see Figure 4. Observe that the actual data belong to the confidence interval and that the solid line reflects properly the growth of the bacteria population. Notice also that the confidence interval is wider in the middle of the growth phase, while smaller uncertainty seems to occur near the initial condition and the carrying capacity.

By using our previous exposition on the combination of the RVT method and gPC expansions, we are even able to determine the probability density function of

y (t)

. In Figure 5, we plot the approximations for

p = 1, 2, 3, 4

. The similarity of the approximating density functions shows that convergence has been achieved. Our results have been compared with a Kernel density estimation with the built-in functionSmoothKernelDistribution. The densities constructed via gPC expansions show that, possibly, the limit density function

f_{y (t)} (y)

has two jump discontinuities, therefore the Kernel density estimation does not approximate well those discontinuities and draws tails.

Table 3 shows consecutive errors for orders of truncation p and

p + 1

,

p = 1, 2, 3

:

∥ f_{{\tilde{y}}^{p + 1} (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)} = \int_{R} |f_{{\tilde{y}}^{p + 1} (t)} (ξ) - f_{{\tilde{y}}^{p} (t)} (ξ)| d ξ .

Notice that the error decreases to 0 very fast.

Example 3.

We deal with uncertainty quantification for the following discrete stochastic system [18] (Section 4):

\{\begin{matrix} y (m + 1) = ζ y (m) (100 - y (m)), \\ y (0) = 3, \end{matrix}

where

ζ \sim Triangular (0.02, 0.03)

. In the epidemiological context, if

z (m)

and

y (m)

denote, respectively, the number of susceptible and infected individuals in a population of one hundred individuals at the period

m = 0, 1, 2, \dots

(e.g., months), the above discrete system describes the dynamics for the number of infected individuals over the time (in months). The parameter ξ can be interpreted as the contagious rate at which a susceptible becomes infected. More precisely, by considering that the product defining the right-hand side of the difference equation gives the number of encounters between susceptible individuals (

z (m) = 100 - y (m)

) and infected individuals (

y (m)

), then the parameter ξ can be interpreted as the probability that such encounters be successful (i.e., the disease spreading). Since the value of this probability depends upon complex factors such as immunity of each individual of the population, weather, etc., whose nature is unpredictable, it is more realistic to treat ξ as random variable instead of being a deterministic number. As previously indicated and for illustrative purposes only, we will assume that ξ lies in the interval

[0.02, 0.03]

according to a triangular distribution. As a consequence, at each time step m, the solution

y (m)

is a random variable. The Galerkin projection takes the form

{\tilde{y}}^{p} (m) = \sum_{i = 0}^{p} {\tilde{y}}_{i}^{p} (m) ϕ_{i} (ζ)

, where the deterministic coefficients

{\tilde{y}}_{i}^{p} (m)

satisfy a difference equation.

The stochastic Galerkin projection technique has not been applied to discrete models with the same emphasis as with continuous models. In our recent contributions [14,18], the applications of gPC to discrete dynamical systems have been analyzed. For a large time m, the explicit expression of

y (m)

cannot be obtained or is too complex, so the application of the transformation method is inviable. Thus, one needs appropriate methods to perform uncertainty quantification for

y (m)

. We think that the combination of the RVT technique and gPC expansions is a good approach to try uncertainty quantification for

y (m)

. The computation of the probability density function of

y (m)

is a step beyond the numerical experiment performed in ([18], Section 4), as [18] only considered the approximation of the expectation and variance statistics.

We approximate the probability density function

f_{y (m)} (y)

at

m = 30

. In Figure 6, we show the results. From

p \geq 5

, catastrophic numerical errors invalidate the results. Thus, we are restricted to

p \leq 4

, and we cannot approximate the density function as accurately as desired. This illustrates the main drawback of our computational method. Our results have been compared with a Kernel density estimation with the built-in functionSmoothKernelDistribution. Notice that, since the sought density function

f_{y (m)}

seems not smooth (there may be three peaks because of the triangular form), the Kernel density estimation might not approximate well

f_{y (m)}

near the peaks.

Example 4.

In this example, we deal with a social model for the addictive behavior of smoking. We use the data from our recent contribution [14] (Table 1), with original source the National Spanish Health Survey [50]. In Table 4, we show the percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014, denoted by

S_{j}

,

j \in {0, 2, 6, 8, 11, 16, 19}

.

As in [14], we use a discrete SIS model:

\begin{matrix} S (m + 1) = & S (m) - b S (m) I (m) + a I (m), \\ I (m + 1) = & I (m) + b S (m) I (m) - a I (m) . \end{matrix}

SIS-type models are useful to describe the spread of diseases (smoking) whose infection does not confer immunity [32,48,51]. In the model formulation, a is the proportion of infective (smokers) that leave this compartment (people giving up tobacco), and b is the proportion of contacts between susceptible and infective (non-smokers and smokers) that give rise to a new infected individual (smoker). We are assuming homogeneous mixing, i.e., all individuals are equally likely to contact any other individual. We are also supposing a constant total population

N = S (m) + I (m)

. Since our data reflects percentages, we take

N = 1

.

From

I (m) = N - S (m)

, we can get rid of the second equation and work with the first one:

S (m + 1) = a N + (1 - b N - a) S (m) + b S {(m)}^{2} .

In our usual notation,

y (m) = S (m)

. By using a least squares fitting procedure, the best deterministic estimates for a and b are

\hat{a} = 0.0378749

and

\hat{b} = 0.0225118

. We introduce randomness into a, with distribution

Normal (0.0378749, 0.000025)

. The infection parameter b is constant:

b = 0.0225118

.

By using a gPC approach with

p = 5

, we approximate the expectation and variance of susceptible individuals

y (m)

. Figure 7 plots the results. We observe that the proposed SIS model captures the uncertainty of the smoking habit.

By combining the RVT technique and gPC expansions, we can go a step further and compute the probability density function of

y (m)

at times

m = 15

and

m = 18

, see Figure 8. With orders of basis

p = 5

, the results agree with a Kernel density estimation with the built-in functionSmoothKernelDistribution. In this case, since the density function of

y (m)

seems smooth (with no peaks), the Kernel density estimation is correct.

Remark 1.

As we commented, by using a least squares fitting procedure, the best deterministic estimates for a and b are

\hat{a} = 0.0378749

and

\hat{b} = 0.0225118

. In order to take into account the inherent uncertainty associated to the modeling, we introduce randomness into both parameters: We set

a \sim Normal (0.0378749, 0.000025)

and

b \sim Uniform (0.0175118, 0.0275118)

. We perform a stochastic Galerkin projection technique with order

p = 5

for both canonical bases. We plot the output of the model in Figure 9.

Notice that the outputs from Figure 7 and Figure 9 are very similar, indicating that parameter a may have the main effect on the model predictions. Indeed, let us check this fact via the gPC-based Sobol’s indices. Figure 10 depicts the Sobol’s indices for a and b. Observe that the main effect on the output variance is due to parameter a. Thus, the methodology of considering

b = \hat{b}

constant and putting randomness into a only, as done in Example 4 to compute probability density functions via the RVT technique, is justified.

From an epidemic point of view, people giving up tobacco (parameter a) is the main reason of the augment of nonsmokers from years 1995 to 2014. Hence, the best policy to address the smoking epidemic is to accomplish strategies that help smokers to give up this addictive behavior.

4. Conclusions

In this paper we have approximated the probability density function for differential and difference equations with one random input parameter, by combining the transformation method and generalized polynomial chaos expansions. The method consists in computing the exact density function of the stochastic Galerkin projections, via the RVT technique, for a sufficiently large order of truncation. The optimal order of truncation is that from which the subsequent probability density functions of the Galerkin projections are virtually the same. The sequence of these approximating density functions converges rapidly due to the spectral mean square convergence of the Galerkin projections. Several numerical examples, dealing with both continuous and discrete random dynamical systems in the setting of epidemiological modeling, have illustrated the theoretical ideas. Our approach is restricted to one random input coefficient, since the transformation method for non-injective maps only works on dimension one. Moreover, from a modeling standpoint, one may include uncertainty only into the parameter with largest gPC-based Sobol’s index. If we are dealing with an epidemic model, this parameter is the most important to address, control and prevent the epidemic.

To the best of our knowledge, this paper is the first one in which the RVT technique has been combined with gPC expansions to determine the probability density function of stochastic models and go beyond the computation of the expectation and variance, which has been the usual goal in previous contributions dealing with gPC expansions. Moreover, our novelty also relies on applying our methodology to epidemiological models based on continuous and discrete stochastic systems. Being able to quantify the uncertainty associated to epidemics is essential to find the optimal prevention policies.

Part of future research may be devoted to a deeper mathematical analysis of conditions under which the approximating density functions indeed converge pointwise or in some Lebesgue space, so that our computational method would be justified mathematically.

Author Contributions

Investigation, J.C.G. and M.J.S.; Methodology, J.C.G. and M.J.S.; Software, J.C.G. and M.J.S.; Supervision, B.M.C.-C. and J.C.C.L.; Validation, B.M.C.-C. and J.C.C.L.; Visualization, J.C.G., B.M.C.-C., J.C.C.L. and M.J.S.; Writing—original draft, J.C.G. and M.J.S.; Writing—review & editing, J.C.G., B.M.C.-C., J.C.C.L. and M.J.S.

Funding

This work has been supported by the Spanish Ministerio de Economía y Competitividad grant MTM2017–89664–P. The author Marc Jornet acknowledges the doctorate scholarship granted by Programa de Ayudas de Investigación y Desarrollo (PAID), Universitat Politècnica de València.

Acknowledgments

The authors express their deepest appreciation and respect to the editor and reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflict of interest.

References

Soong, T.T. Random Differential Equations in Science and Engineering; Academic Press: New York, NY, USA, 1973. [Google Scholar]
Strand, J.L. Random ordinary differential equations. J. Differ. Eq. 1970, 7, 538–553. [Google Scholar] [CrossRef] [Green Version]
Bharucha-Reid, A.T. On the theory of random equations. Proc. Sympos. Appl. Math. 1964, 16, 40–69. [Google Scholar]
Smith, R.C. Uncertainty Quantification. Theory, Implementation, and Application; SIAM Computational Science & Engineering: Philadelphia, PA, USA, 2014. [Google Scholar]
Fishman, G. Monte Carlo: Concepts, Algorithms, and Applications; Springer Science & Business Media: Berlin, Germany, 2013. [Google Scholar]
Xiu, D. Numerical Methods for Stochastic Computations. A Spectral Method Approach; Princeton University Press: Princeton, NJ, USA, 2010. [Google Scholar]
Xiu, D.; Karniadakis, G.E. The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comp. 2002, 24, 619–644. [Google Scholar] [CrossRef]
Chen-Charpentier, B.M.; Cortés, J.C.; Licea, J.A.; Romero, J.V.; Roselló, M.D.; Santonja, F.J.; Villanueva, R.J. Constructing adaptive generalized polynomial chaos method to measure the uncertainty in continuous models: A computational approach. Math. Comput. Simul. 2015, 109, 113–129. [Google Scholar] [CrossRef] [Green Version]
Cortés, J.C.; Romero, J.V.; Roselló, M.D.; Villanueva, R.J. Improving adaptive generalized polynomial chaos method to solve nonlinear random differential equations by the random variable transformation technique. Commun. Nonlinear Sci. Numer. Simul. 2017, 50, 1–15. [Google Scholar] [CrossRef]
Chen-Charpentier, B.M.; Stanescu, D. Epidemic models with random coefficients. Math. Comput. Model. 2010, 52, 1004–1010. [Google Scholar] [CrossRef]
Lucor, D.; Su, C.H.; Karniadakis, G.E. Generalized polynomial chaos and random oscillators. Int. J. Numer. Meth. Eng. 2004, 60, 571–596. [Google Scholar] [CrossRef]
Santonja, F.; Chen-Charpentier, B.M. Uncertainty quantification in simulations of epidemics using polynomial chaos. Comput. Math. Method Med. 2012, 2012, 742086. [Google Scholar] [CrossRef]
Stanescu, D.; Chen-Charpentier, B.M. Random coefficient differential equation models for bacterial growth. Math. Comp. Model. 2009, 50, 885–895. [Google Scholar] [CrossRef]
Calatayud, J.; Cortés, J.C.; Jornet, M.; Villanueva, R.J. Computational uncertainty quantification for random time-discrete epidemiological models using adaptive gPC. Math. Method Appl. Sci. 2018, 1–10. [Google Scholar] [CrossRef]
Villegas, M.; Augustin, F.; Gilg, A.; Hmaidi, A.; Wever, U. Application of the polynomial chaos expansion to the simulation of chemical reactors with uncertainties. Math. Comput. Simul. 2012, 82, 805–817. [Google Scholar] [CrossRef]
Xiu, D.; Karniadakis, G.E. Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos. Comput. Method. Appl. Med. 2002, 191, 4927–4948. [Google Scholar] [CrossRef] [Green Version]
Shi, W.; Zhang, C. Error analysis of generalized polynomial chaos for nonlinear random ordinary differential equations. Appl. Numer. Math. 2012, 62, 1954–1964. [Google Scholar] [CrossRef]
Calatayud, J.; Cortés, J.C.; Jornet, M. On the convergence of adaptive gPC for non-linear random difference equations: Theoretical analysis and some practical recommendations. J. Nonlinear Sci. Appl. 2018, 11, 1077–1084. [Google Scholar] [CrossRef]
Casabán, M.C.; Cortés, J.C.; Romero, J.V.; Roselló, M.D. Probabilistic solution of random SI-type epidemiological models using the Random Variable Transformation technique. Commun. Nonlinear Sci. Numer. Simul. 2015, 24, 86–97. [Google Scholar] [CrossRef] [Green Version]
Dorini, F.A.; Cecconello, M.S.; Dorini, M.S. On the logistic equation subject to uncertainties in the environmental carrying capacity and initial population density. Commun. Nonlinear Sci. Numer. Simul. 2016, 33, 160–173. [Google Scholar] [CrossRef]
Calatayud, J.; Cortés, J.C.; Jornet, M.; Navarro-Quiles, A. Solving Random Ordinary and Partial Differential Equations through the Probability Density Function. Theory and Computing with Applications. In Modern Mathematics and Mechanics. Fundamentals, Problems and Challenges; Sadovnichiy, V.A., Zgurovsky, M., Eds.; Springer International Publishing AG (Springer Nature): New York, NY, USA, 2019; pp. 261–282. [Google Scholar]
Dorini, F.A.; Cunha, M.C.C. Statistical moments of the random linear transport equation. J. Comput. Phys. 2008, 227, 8541–8550. [Google Scholar] [CrossRef]
Hussein, A.; Selim, M.M. Solution of the stochastic radiative transfer equation with Rayleigh scattering using RVT technique. Appl. Math. Comput. 2012, 218, 7193–7203. [Google Scholar] [CrossRef]
Hussein, A.; Selim, M.M. Solution of the stochastic generalized shallow-water wave equation using RVT technique. Eur. Phys. J. Plus 2015, 130, 249. [Google Scholar] [CrossRef]
Hussein, A.; Selim, M.M. A general analytical solution for the stochastic Milne problem using Karhunen–Loeve (K–L) expansion. J. Quant. Spectrosc. Radiat. Transf. 2013, 125, 84–92. [Google Scholar] [CrossRef]
Xu, Z.; Tipireddy, R.; Lin, G. Analytical approximation and numerical studies of one-dimensional elliptic equation with random coefficients. Appl. Math. Model. 2016, 40, 5542–5559. [Google Scholar] [CrossRef] [Green Version]
Cortés, J.C.; Navarro-Quiles, A.; Romero, J.V.; Roselló, M.D. Full solution of random autonomous first-order linear systems of difference equations. Application to construct random phase portrait for planar systems. Appl. Math. Lett. 2017, 68, 150–156. [Google Scholar] [CrossRef]
El-Tawil, M.A. The approximate solutions of some stochastic differential equations using transformations. Appl. Math. Comput. 2005, 164, 167–178. [Google Scholar] [CrossRef]
Calatayud, J.; Cortés, J.C.; Jornet, M. The damped pendulum random differential equation: A comprehensive stochastic analysis via the computation of the probability density function. Phys. A 2018, 512, 261–279. [Google Scholar] [CrossRef]
Calatayud, J.; Cortés, J.C.; Jornet, M. Uncertainty quantification for random parabolic equations with non-homogeneous boundary conditions on a bounded domain via the approximation of the probability density function. Math. Method Appl. Sci. 2018. [Google Scholar] [CrossRef]
Cortés, J.C.; Navarro-Quiles, A.; Romero, J.V.; Roselló, M.D. Solving second-order linear differential equations with random analytic coefficients about ordinary points: A full probabilistic solution by the first probability density function. Appl. Math. Comput. 2018, 331, 33–45. [Google Scholar] [CrossRef]
Casabán, M.C.; Cortés, J.C.; Romero, J.V.; Roselló, M.D. A comprehensive probabilistic solution of random SIS-type epidemiological models using the random variable transformation technique. Commun. Nonlinear Sci. Numer. Simul. 2016, 32, 199–210. [Google Scholar] [CrossRef] [Green Version]
Kegan, B.; West, R.W. Modeling the simple epidemic with deterministic differential equations and random initial conditions. Math. Biosci. 2005, 194, 217–231. [Google Scholar] [CrossRef] [PubMed]
Crestaux, T.; Le Maître, O.; Martínez, J.M. Polynomial chaos expansion for sensitivity analysis. Reliab. Eng. Syst. Saf. 2009, 94, 1161–1172. [Google Scholar] [CrossRef]
Sudret, B. Global sensitivity analysis using polynomial chaos expansions. Reliab. Eng. Syst. Saf. 2008, 93, 964–979. [Google Scholar] [CrossRef]
Chen-Charpentier, B.M.; Cortés, J.C.; Romero, J.V.; Roselló, M.D. Some recommendations for applying gPC (generalized polynomial chaos) to modeling: An analysis through the Airy random differential equation. Appl. Math. Comput. 2013, 219, 4208–4218. [Google Scholar] [CrossRef] [Green Version]
Ernst, O.G.; Mugler, A.; Starkloff, H.J.; Ullmann, E. On the convergence of generalized polynomial chaos expansions. ESAIM Math. Model. Numer. Anal. 2012, 46, 317–339. [Google Scholar] [CrossRef]
Kobayashi, H.; Mark, B.L.; Turin, W. Probability, Random Processes, and Statistical Analysis; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
Giraud, L.; Langou, J.; Rozloznik, M. The loss of orthogonality in the Gram-Schmidt orthogonalization process. Comput. Math. Appl. 2005, 50, 1069–1075. [Google Scholar] [CrossRef] [Green Version]
Calatayud, J.; Cortés, J.C.; Jornet, M. Improving the approximation of the first and second order statistics of the response process to the random Legendre differential equation. arXiv, 2018; arXiv:1807.03141. [Google Scholar]
Wolfram Research, Inc. Mathematica, version 11.2; Wolfram Research, Inc.: Champaign, IL, USA, 2017. [Google Scholar]
Stanescu, D.; Chen-Charpentier, B.M.; Jensen, B.J.; Colberg, P.J. Random coefficient differential models of growth of anaerobic photosynthetic bacteria. Electron. Trans. Numer. Anal. 2009, 34, 44–58. [Google Scholar]
Marzouk, Y.M.; Najm, H.N.; Rahn, L.A. Stochastic spectral methods for efficient Bayesian solution of inverse problems. J. Comput. Phys. 2007, 224, 560–586. [Google Scholar] [CrossRef]
Marzouk, Y.; Xiu, D. A stochastic collocation approach to Bayesian inference in inverse problems. J. Commun. Comput. Phys. 2009, 6, 826–847. [Google Scholar] [CrossRef]
Scott, D. On optimal and data-based histograms. Biometrika 1979, 66, 605–610. [Google Scholar] [CrossRef]
Malthus, T.R. An Essay on the Principal of Population; Oxford World’s Classics Paperbacks; Oxford University Press: Oxford, UK, 1999. [Google Scholar]
Brauer, F.; Castillo-Chávez, C. Mathematical Models in Population Biology and Epidemiology; Springer Science + Business Media: Berlin, Germany, 2012. [Google Scholar]
Murray, J.D. Mathematical Biology; Springer: Berlin, Germany, 2002. [Google Scholar]
Cherniha, R.; Davydovych, V. Nonlinear Reaction-Diffusion Systems: Conditional Symmetry, Exact Solutions and Their Applications in Biology; Springer International Publishing: Basel, Switzerland, 2017. [Google Scholar]
National Spanish Health Survey (Encuesta Nacional de Salud de España, ENSE). Available online: http://pestadistico.inteligenciadegestion.msssi.es/publicoSNS/comun/ArbolNodos.aspx (accessed on 28 May 2018).
Hadeler, K. Topics in Mathematical Biology; Springer International Publishing: Basel, Switzerland, 2017. [Google Scholar]

Figure 1. Exact graph of

f_{y (t)} (y)

for

t = 0.5

(left) and

t = 1

(right) in Example 1.

Figure 1. Exact graph of

f_{y (t)} (y)

for

t = 0.5

(left) and

t = 1

(right) in Example 1.

Figure 2. Approximation of

f_{y (t)} (y)

via generalized polynomial chaos (gPC) and random variable transformation (RVT) for

t = 0.5

(up) and

t = 1

(down) in Example 1. Observe the rapid convergence to the exact density function

f_{y (t)} (y)

as p grows.

Figure 2. Approximation of

f_{y (t)} (y)

via generalized polynomial chaos (gPC) and random variable transformation (RVT) for

t = 0.5

(up) and

t = 1

(down) in Example 1. Observe the rapid convergence to the exact density function

f_{y (t)} (y)

as p grows.

Figure 3. Population size of Rhodobacter capsulatus in Example 2.

Figure 4. Model for the R. capsulatus population with

p = 4

in Example 2. The circles represent the actual population size, the solid line shows the estimates via

E [y (t)]

, and the dashed lines reflect the confidence interval.

Figure 4. Model for the R. capsulatus population with

p = 4

in Example 2. The circles represent the actual population size, the solid line shows the estimates via

E [y (t)]

, and the dashed lines reflect the confidence interval.

Figure 5. Approximation of

f_{y (t)} (y)

via gPC and RVT for

t = 4

in Example 2. Comparison with a Kernel density estimation.

Figure 5. Approximation of

f_{y (t)} (y)

via gPC and RVT for

t = 4

in Example 2. Comparison with a Kernel density estimation.

Figure 6. Approximation of

f_{y (m)} (y)

via gPC and RVT for

m = 30

in Example 3. Comparison with a Kernel density estimation. From

p \geq 5

, catastrophic numerical errors invalidate the results, so we are restricted to

p \leq 4

.

Figure 6. Approximation of

f_{y (m)} (y)

via gPC and RVT for

m = 30

in Example 3. Comparison with a Kernel density estimation. From

p \geq 5

, catastrophic numerical errors invalidate the results, so we are restricted to

p \leq 4

.

Figure 7. Model for the percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014 in Example 4. The circles represent the actual data, the solid line shows the estimates via

E [y (m)]

, and the dashed lines reflect the confidence interval.

Figure 7. Model for the percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014 in Example 4. The circles represent the actual data, the solid line shows the estimates via

E [y (m)]

, and the dashed lines reflect the confidence interval.

Figure 8. Approximation of

f_{y (m)} (y)

via gPC and RVT for

m = 15

(up) and

m = 18

(down) in Example 4. Observe the rapid convergence to the exact density function

f_{y (m)} (y)

. The results agree with a Kernel density estimation.

Figure 8. Approximation of

f_{y (m)} (y)

via gPC and RVT for

m = 15

(up) and

m = 18

(down) in Example 4. Observe the rapid convergence to the exact density function

f_{y (m)} (y)

. The results agree with a Kernel density estimation.

Figure 9. Model for the percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014 in Remark 1 of Example 4, with two random input parameters. The circles represent the actual data, the solid line shows the estimates via

E [y (m)]

, and the dashed lines reflect the confidence interval.

Figure 9. Model for the percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014 in Remark 1 of Example 4, with two random input parameters. The circles represent the actual data, the solid line shows the estimates via

E [y (m)]

, and the dashed lines reflect the confidence interval.

Figure 10. Influence of the parameters a and b in the nonsmokers model prediction, Remark 1 from Example 4.

Table 1. Error

∥ f_{y (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)}

for

t = 0.5

and

t = 1

, and

p = 1, 2, 3, 4, 5, 6

, in Example 1. Note the rapid convergence (exponential) to the exact density function

f_{y (t)}

as p increases.

Table 1. Error

∥ f_{y (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)}

for

t = 0.5

and

t = 1

, and

p = 1, 2, 3, 4, 5, 6

, in Example 1. Note the rapid convergence (exponential) to the exact density function

f_{y (t)}

as p increases.

	t = 0.5	t = 1
$p = 1$	$0.0776146$	$1.81093$
$p = 2$	$0.00599041$	$0.0304913$
$p = 3$	$0.000230145$	$0.00181233$
$p = 4$	$7.06091 \times 10^{- 6}$	$0.000109031$
$p = 5$	$1.71501 \times 10^{- 7}$	$5.29540 \times 10^{- 6}$
$p = 6$	$1.00343 \times 10^{- 8}$	$2.15623 \times 10^{- 7}$

Table 2. Bacteria population sizes of Rhodobacter capsulatus [42], Example 2.

Time (days)	Population (Cells/mL, Scale $10^{6}$ )
0	$0.583$
2	$0.635$
4	$1.08$
7	$3.20$
9	$5.23$
11	$5.28$
14	$5.30$

Table 3. Consecutive error

∥ f_{{\tilde{y}}^{p + 1} (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)}

for

t = 4

, and

p = 1, 2, 3

, in Example 2. Notice the rapid convergence of the consecutive errors to 0 as p grows.

Table 3. Consecutive error

∥ f_{{\tilde{y}}^{p + 1} (t)} - f_{{\tilde{y}}^{p} (t)} ∥_{L^{1} (R)}

for

t = 4

, and

p = 1, 2, 3

, in Example 2. Notice the rapid convergence of the consecutive errors to 0 as p grows.

	t = 4
$p = 1$ − $p = 2$	$0.202513$
$p = 2$ − $p = 3$	$0.0105528$
$p = 3$ − $p = 4$	$3.52486 \times 10^{- 7}$

Table 4. Percentage of nonsmokers Spanish men aged over 16 years old during the period 1995–2014. Source [50].

Year	1995	1997	2001	2003	2006	2011	2014
j	0	2	6	8	11	16	19
$S_{j}$	$0.5298$	$0.5514$	$0.5783$	$0.6244$	$0.6467$	$0.6863$	$0.6957$

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Calatayud Gregori, J.; Chen-Charpentier, B.M.; Cortés López, J.C.; Jornet Sanz, M. Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models. Symmetry 2019, 11, 43. https://doi.org/10.3390/sym11010043

AMA Style

Calatayud Gregori J, Chen-Charpentier BM, Cortés López JC, Jornet Sanz M. Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models. Symmetry. 2019; 11(1):43. https://doi.org/10.3390/sym11010043

Chicago/Turabian Style

Calatayud Gregori, Julia, Benito M. Chen-Charpentier, Juan Carlos Cortés López, and Marc Jornet Sanz. 2019. "Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models" Symmetry 11, no. 1: 43. https://doi.org/10.3390/sym11010043

APA Style

Calatayud Gregori, J., Chen-Charpentier, B. M., Cortés López, J. C., & Jornet Sanz, M. (2019). Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models. Symmetry, 11(1), 43. https://doi.org/10.3390/sym11010043

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Combining Polynomial Chaos Expansions and the Random Variable Transformation Technique to Approximate the Density Function of Stochastic Problems, Including Some Epidemiological Models

Abstract

1. Introduction

2. Method

2.1. gPC Expansions

2.2. RVT Technique

2.3. Combining gPC and RVT Technique to Approximate the Density Function

3. Numerical Experiments

4. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI