1. Introduction
Currently, fractional differential equations are used to simulate non-standard diffusive processes when the diffusion processes can’t be desribed by classical mathematical models. We mention applications of chemical and contaminant transport in heterogeneous aquifer [
1], recent models in physics [
2], biomedicine [
3], and optics and image applications [
4]. More examples are given in [
5,
6].
It is well-known that the fractional power of an elliptic operator
can be defined in a non-unique way. We restrict to the spectral definition, and detail definitions are presented in
Section 2. It is interesting to stress, that the classical spectral algorithm and some modifications can be used as computational tools to solve parabolic type problems with nonlocal operator
, making the complexity of the nonlocal algorithms the same as for spectral methods targeted to solve parabolic problems with standard elliptic operators. However, clearly, this strategy is computationally efficient only if the differential problem is solved in a rectangular domain, if the complete set of eigenfunctions of operator
are known in advance, and if the FFT technique can be applied.
Another general approach is based on the Lanczos method. The main benefit of this algorithm is that the required matrix-function-vector products are computed without ever forming the dense matrix
. A preconditioned Lanczos method can be used to accelerate the convergence of this method (see [
7] and references given therein).
Thus, alternative approaches should be used in the case of non-uniform space meshes and/or elliptic operators with non-constant coefficients. The main idea of such techniques is to transform non-local problems to some local classical differential problems. A very good review on these methods is given in [
8].
At the end of this short review we note one more important approach. There exist very interesting analytical methods that can solve the same equations in the form of a series on the root vectors of the right-hand [
9,
10]. They look very promising for the development of not only theoretical results on the existence and uniqueness of solutions, but also for the numerical solution of real world applied problems. A practical comparison of these approaches for well-selected benchmark problems is an important task.
The rest of the paper is organized in the following way. In
Section 2, the problem is formulated. The non-stationary parabolic equation with a fractional power elliptic operator is formulated in the 3D parallelepiped, and the initial and boundary conditions are specified. As mentioned above, we use the spectral definition of the fractional power of elliptic operators.
Implicit approximations of parabolic problems with fractional powers of discrete elliptic operators are defined in
Section 3. General stability results are presented in the case of a linear source term. Two approaches are proposed on how to construct efficient discrete algorithms. The AAA algorithm and the splitting type scheme are used to define discrete solvers for the non-local time dependent problem. At the end of this section, parallel versions of all algorithms are described and the motivation of the selected approach is discussed.
In
Section 4, results of numerical experiments are presented. Some final conclusions are done in
Section 5.
2. Problem Formulation
Let
be some open and bounded domain
,
. Define a self-adjoint elliptic diffusion operator:
with
symmetric and uniformly positive definite. Operator
is supplemented with either a homogeneous Dirichlet boundary condition on
or a homogeneous Neumann one.
For (
1), we define the bilinear form
where
denotes the
-inner product in
and the norm is defined as
.
Let
denote a finite-dimensional space of functions that satisfy the homogeneous Dirichlet boundary conditions and we assume that this space is spanned by piecewise linear basis
. Then, the discrete elliptic operator
is defined as
As was mentioned above the spectral definition of fractional power of the operator
is used in our paper [
8,
11]. Under standard assumptions on the diffusion coefficients
and the boundary
, discrete operator
admits a system of eigenfunctions
with corresponding eigenvalues
such that
Then, a fractional power of
is defined as
It is easy to show that the nonlocal operator
is also self-adjoint and positive definite:
We solve a Cauchy problem for the evolutionary first-order equation:
where
F represents an associated reaction term.
3. Implicit Approximations of Parabolic Problems with
Fractional Powers of Discrete Elliptic Operators
In order to construct a discrete approximation of the differential problem, we define a non-uniform time grid
Let
be a numerical approximation to the exact solution
of problem (
5). We define the averaging operator
All parallel algorithms are defined for a general case of .
Let us consider the implicit scheme
It is clear that for
the symmetrical scheme has the second order of approximation. First, in this paper we always assume that the given problem has a sufficiently smooth solution
. Then, applying Taylor series we can compute the approximation (truncation) error
Second, the discrete fractional power operator is bounded for sufficiently smooth functions and .
We note that the investigation of the approximation accuracy in space is more complicated and results can depend essentially on the fractional order parameter
. These questions are investigated in detail in many papers dealing with discrete approximations of the stationary problems with a fractional power of elliptic operators. We can recommend [
12], where
-adaptive methods are applied (see also references given therein).
For the analysis of time dependent errors we demand a sufficient regularity (in time) of the right-hand side along with compatibility conditions for the initial condition datum. The source functions should satisfy the standard regularity (in time) conditions similar to classical parabolic problem cases. The regularity of
F in space is a more complicated question and the appropriate fractional Sobolev spaces are specified in papers dealing with non-stationary problems (see [
13]).
We also present results on the stability of the difference scheme (
6) in the case of a linear source term
(the proof is similar to one presented in [
6]).
Lemma 1. If the weight parameter , then the difference scheme (6) is unconditionally stable We repeat our note that, if the direct application of the spectral formula (
3) is possible and the FFT algorithm is applicable, then the nonlocal parabolic problem (
6) can be solved efficiently at each time layer (see [
5,
14]). Still, this approach is valid only in very special cases.
In general case, the implementation of the discrete scheme (
6) requires one to use some approximations of the operators with fractional powers of discrete elliptic operators
. Let us write the discrete scheme in the following form
One possibility to linearize the nonlinear problem (
8) is to use the following fixed point iteration method
where
. If the iteration number is restricted to
, then we get the predictor-corrector method, which is sufficient to guarantee the second order of convergence for the symmetrical scheme with
.
In this paper we consider 3D elliptic operators. It is well known that for the solution of parabolic problems with standard multidimensional elliptic operators, the most efficient approaches are based on splitting algorithms. In the case of fractional powers of elliptic operators such algorithms are still not constructed and therefore this topic can be formulated as a new important open problem in numerical mathematics.
Here we restrict to general but less efficient algorithms. At each iteration, the solution
is obtained by solving a non-local problem
The first approach to construct an efficient discrete algorithm is based on the idea to approximate the non-local operator
by some linear local operator
Then, we compute an approximate solution as
In the next lemma, simple sufficient stability conditions of scheme (
11) are formulated in the case of the linear source function
(the proof is obtained by modifying the analysis given in [
6] for one dimensional operators).
Lemma 2. Then, for the difference scheme (11) is unconditionally stable In the second approach, we separately resolve the non-local diffusion and non-linear source terms:
The main goal of this modification is to approximate more accurately the impact of the non-linear source function, since efficient and accurate solvers are developed for approximation of the fractional powers of stationary diffusion operators. One drawback of this approach is that two subproblems (
13) and (
14) for different non-local operators should be solved at each iteration. Therefore, it is interesting to investigate if the improvement in the accuracy of approximation compensates for the increase of computational costs of this modified algorithm. It is clear that the standard algorithm of symmetrical splitting with respect to different physical processes can also be used
This algorithm has two important properties. First, only one subproblem with non-local operators is solved for each time step. Second, the dynamics of the non-linear interaction can be resolved by using specialized solvers targeted for specific non-linear functions, in many cases this part of the problem can even be solved exactly. Therefore this splitting algorithm is a competitive alternative to the other proposed algorithms.
Next we consider an efficient approach for how to construct operators
to solve the implicit non-local systems (
10) and (
13) and (
14).
3.1. The AAA Algorithm
The construction of
is based on the so-called AAA algorithm proposed in [
15]. We will apply this general algorithm for implementation of the discrete scheme (
10). Let us consider a function
Then, a rational approximation of
is given in partial fraction decomposition form [
15]
The approximate solution of the scheme (
10) is computed as
In order to apply the AAA algorithm for the implementation of the discrete scheme (
13) and (
14), let us consider a function
and compute a rational approximation of
given in partial fraction decomposition form
The approximate solution of the scheme (
13) and (
14) is computed as
Stability Analysis
The approximation of the AAA algorithm is based on a rational approximation of function
. Let us consider the solution of Equation (
21) for
, and represent it as a linear combination of the eigenvectors of operator
Substituting this expression into (
21) we get the equations for coefficients
:
We assume that
. Then the stability of scheme (
19) is guaranteed if the following condition is satisfied
As in [
6], we investigate this condition experimentally. The results of numerical experiments confirmed that for fractional powers
,
,
, time steps
,
,
and
,
defined by the minimal and maximal eigenvalues of the operator
, for
, the stability condition (
23) was satisfied unconditionally. Such stability tests are recommended before starting large scale computations for simulations of real world applications.
3.2. Symmetrical Splitting Scheme
By using the same AAA algorithm, we get an implementation of the symmetrical splitting scheme
The non-linear problems (
24) and (
26) are solved by applying an ODE solver optimal for a given type of non-linearities.
3.3. Additive Splitting Scheme
This scheme is constructed using techniques from [
16] where BURA type discrete approximations are constructed for stationary problems with fractional powers of elliptic operators. We note that similar splitting schemes were used also in [
17].
First we rewrite the linearized Equation (
9) in the following form
where
. Next, the AAA method is used to construct a rational approximation of fractional power operator
Let us write the obtained discrete scheme as
where
Here we use the same notation for a new discrete solution.
Since all discrete operators commute, the solution
is computed by using the classical splitting technique in
steps:
The splitting technique introduces additional approximation errors and therefore this algorithm can require to use smaller time steps. This scheme is unconditionally stable but again the transfer operator is different than in the Crank–Nicolson method.
As was done for the AAA algorithm, we consider a modification of this scheme when the influence of the non-homogeneous source term is computed separately by solving Equation (
21):
3.4. Parallel Algorithms
All parallel solvers are constructed by using the same approach, when parallelization is done at the level of solvers for solution of the obtained systems of linear equations. Thus we can use efficient parallel solvers developed for such problems.
Our aim is to investigate how the efficiency of these solvers depend on specific properties of discrete elliptic operators
and parameters
. The values of these parameters depend on the selected non-local operators and on the number
m in approximations (
19) and (
20).
We note one important difference among the developed parallel algorithms. At each time step, the AAA algorithms (
19), (
21), and (
22), and symmetrical splitting schemes (
24)–(
26) define
m independent systems of linear equation, and these problems can be solved in parallel. Thus, an additional parallelization level is obtained and the distribution of tasks can be optimized by using general parallelization templates (see [
18]).
While for the additive splitting scheme (
27) all auxiliary problems should be solved sequentially, one after another. For this method, parallelization can be done only on the level of a solution of linear systems.
It is important to note that the total number of non-local problems solved at one time step is also different for specific schemes. Two problems are solved for modified AAA and additive splitting schemes, while one problem is sufficient for the remaining schemes. In the case of non-linear source functions, the predictor-corrector iterations are done and the number of non-local problems solved at one time step is increased twice. The only exception is the symmetrical splitting scheme, where again it is sufficient to solve only one non-local problem per time step.
As our tool for development parallel versions of the constructed finite difference schemes, we use AGMG multigrid solver [
19]. The main reason to base the construction of parallel solvers on some state of the art tool is that general mathematical libraries are adjusted and optimized to new computer architectures by library developers. At the same time most tools have a sufficiently large set of parameters to fit these libraries to specific needs arising when solving new classes of problems. We also note that in [
20], an extended analysis of parallel solvers is done for the solution of stationary fractional power elliptic problems. The hyper AMG solver BoomerAMG was used in [
20] to construct parallel versions of the finite difference schemes for a 7-point 3D Laplace problem.
For comparison of different approaches we have implemented and investigated the scalability of symmetrical splitting scheme when the spectral solver based on FFT is used. The parallelization is done by using the well-known library FFTW [
21].
4. Numerical Experiments
In this section we solve three test problems. The first one defines a nonlinear problem and is used to analyze the accuracy of different schemes. The second problem has a non-smooth solution and the scalability analysis of the parallel algorithms is done on its basis. The last system of coupled 3D nonlinear parabolic equations enables to analyze the influence of the fractional order parameter on complicated pattern formation. In addition, this problem is used to investigate the parallel efficiency of the symmetric splitting scheme when the non-local operators are resolved by using an FFT algorithm.
4.1. Convergence Tests
In order to investigate the convergence accuracy of the proposed discrete schemes and integration methods, we consider a linear fractional heat equation. Its implementation still requires to apply the predictor-corrector iterative algorithm.
Example 1. In this example we solve a linear fractional heat equation with a linear source term in three-dimensional space (for a similar two-dimensional example, see [5]):where,,,,,,,,,,,,,,,,
,,,,,,,,. Then this problem has the exact solution
We define uniform space meshes in each direction
:
and approximate the differential diffusion operator by using the standard central differences
The predictor-corrector method takes into account the dependence of the source term on the solution. The discrete solution is computed using the spectral method, since only eight spectral modes should be resolved. The obtained results give reference solutions for the estimation of the accuracy of more general integration methods.
We have investigated the space and time discretization errors separately. First, we have fixed
,
and solved the discrete problem till
with different sizes of time step
. The computed maximum errors
confirm the second order accuracy of the approximation:
Next, we have fixed
and solved the same problem for different space meshes
. The results of computational experiments again confirmed the second order accuracy of the approximation:
Note, that in [
5] the accuracy of the 2D solver is tested only for fractional order parameters
. It is obvious that the smoothness of the solution changes for smaller values of the fractional order and for
the time derivative of the solution is singular at the initial time moment
.
In order to test the accuracy of the finite difference scheme we have fixed
,
and solved the discrete problem till
with different sizes of time step
:
The computed maximum errors show a reduced convergence rate of the discrete solution. Therefore, some adaptive (non-uniform) time step should be used to resolve this singularity.
It is clear that such asymptotical dynamics of the error depends mainly on the singularity of the exact solution, but not on the fractional power parameter
. In order to illustrate this statement we solved the parabolic problem (
30) when
, but the source function
F is selected such that the exact solution is defined as
The errors of the discrete solution in the maximum norm are the following
It follows from the computational experiments that the second order convergence rate is restored for the discrete scheme also for .
Next we present results of computational experiments when different algorithms are used to approximate non-local fractional power discrete elliptic operators. In all experiments the space mesh size is fixed to and the test problem is solved till with the time step .
The AAA Algorithm
First we consider the AAA algorithm (
19). In order to apply the AAA approximation method. We sample function
with
50,000 points over the exact interval of eigenvalues of
operator
. The resulting errors for three values of fractional order parameters
and various orders
m of the rational functions
are presented in
Table 1. For comparison we also present the error of the discrete solution of the symmetric Crank–Nicolson finite difference scheme, when at each time step the non-local problem is solved by using FFT method.
It follows from the presented results, that a small number is sufficient to reach the optimal error levels. These values of m do not depend on the fractional parameter .
Next we consider the modified AAA algorithms (
21) and (
22). The resulting errors for
and various
m of the rational functions
are presented in
Table 2.
It follows from the presented results, that a small number is again sufficient to reach the optimal error levels. The accuracy of the modified AAA algorithm is a little better for most cases of parameters.
Next we consider the modified additive splitting algorithm (
28) and (
29). The resulting errors for
and various orders
m of the rational functions
are presented in
Table 3.
The accuracy of the modified additive splitting scheme (
28) and (
29) is better than for the modified AAA algorithm for fractional order parameters
and
.
4.2. Scalability of the Parallel Algorithms
Example 2. In this example we solve the following 3D linear problemwhere the right-hand-side F is the well-known checkerboard function: We note that for stationary problems, similar test problems (mostly in 2D domains) with checkerboard right-hand-side functions are used to demonstrate the convergence of the selected numerical methods in many papers (see [
11,
16,
20] and references given therein). This type of source functions, but for a 1D case, was also used in [
6] for testing the accuracy of different integration methods for parabolic problems with fractional power elliptic operators.
We note that all constructed finite difference algorithms have the same structure, when at each time step a set of systems of linear equations is solved, where each system approximates classical diffusion operators perturbed by adding scaled identity operators.
As explained above, the AAA algorithms (
19), (
21), and (
22), and the symmetrical splitting scheme (
24)–(
26) define sets of independent systems of linear equations, and these problems can be solved in parallel. While for the additive splitting scheme (
27) all auxiliary problems should be solved sequentially, one after another.
In this paper we restrict ourselves to the scalability analysis of parallel algorithms when all processes are used to solve one system of linear equations. This strategy has two advantages: first, it is possible to solve larger systems of equations (i.e., to get more accurate approximations); second, data communication costs are reduced essentially.
In all computational experiments we restrict to the strong scalability analysis when the size of the discrete problem is fixed to the grid.
Parallel numerical tests in this work were performed on the computer cluster “HPC Vanagas” at the High Performance Computing Laboratory of Vilnius Gediminas technical university. We have used up to 16 cores of Intel® Xeon® processor E5-2630 with 20 cores ( GHz) and 32 GB DDR4 of RAM. The parallel algorithms are implemented using the library.
In order to estimate the scalability of parallel algorithms for the solution of parabolic problems with fractional order elliptic operators as a benchmark, we use results obtained for a classical Crank–Nicolson finite difference scheme. The parabolic Poisson problem with a 7-point 3D discrete Laplace operator on a grid is solved and the AGMG multigrid solver is used to solve systems of linear equations with a banded matrix.
The discrete problem is solved on the grid and the time step is fixed to . Five time steps are computed in all the experiments of this example.
The results of the computational experiments are presented in
Table 4. Here
is the speedup of the parallel algorithm,
is the efficiency number, and
is the parallel run-time on
p cores.
The presented results define the reference estimates of the AGMG solver on this parallel computer. They will be used to analyze the strong scalability of parallel solvers constructed to solve problems with fractional power elliptic operators. A small reduction in the efficiency of the parallel solver for processes can be explained as a possible switching of the system from two processes running on two physical cores to two virtual processes running on one physical core (so-called multi-threading).
Strong Scalability Analysis of the Parallel AAA Algorithm
Computational experiments are done for two fractional order parameters
and different numbers of cores
. The order of AAA algorithm is fixed to
. Results of the experiments are presented in
Table 5.
The obtained results show that the run-time
is decreased in comparison with results in
Table 4. The improvement is due to the faster convergence of the AGMG solver when a positive operator
is added to the elliptic operator
(all coefficients
are negative for the constructed AAA approximation of the function
). The efficiency numbers
of the parallel AAA discrete scheme are slightly improved, this fact confirms the robustness of the AGMG solver.
Next we estimated the influence of the order of AAA algorithm
m, since for different values of
m we get different sets of diagonalization coefficients
. In the following experiments
m is fixed to
and fractional order parameter
. The results of the computational experiments are presented in
Table 6.
As expected, the run-time is reduced and even more than twice, but the efficiency of the parallel algorithm is changed only slightly.
Next we investigated the influence of the time step
size. The order of AAA algorithm is taken
, the fractional parameter
, and the time step is increased to
. Five time steps are computed, as in previous experiments. The results of the computational experiments are presented in
Table 7.
We see that the run-time is increased when the time step size is increased, since the AGMG solver converges slower. However, the efficiency of the parallel algorithm changes only very slightly.
It follows from the presented results of computational experiments that the efficiency of the parallel AAA algorithm is robust with respect to fractional power parameter, the order of AAA algorithm parameter m and discrete time step variations.
Example 3. In this example, we consider the 3D space-fractional Gray–Scott model (see [14] for 2D model). It is described by a system of coupled 3D diffusion-reaction equationswhere the diffusion rates,, the dimensionless feed rate, and the dimensionless decay rate constant of the second reaction. The homogeneous Direchlet boundary conditions are defined on the boundary
, therefore a new function
is used in the mathematical model (
34) (compare it with [
14]). The initial conditions are given by
where
.
It is well-known that the solution of the Gray-Scott model exhibits a complicated set of patterns and the process of pattern formation is far from trivial. First, we would like to note that the specific trend of patterns depends essentially on the discrete size of the model (i.e., the space mesh size), therefore, we here fix the mesh to
. Our main goal is to show the dependents of the solution on the fractional power parameter
(i.e., to evaluate the effects of the super-diffusion). A second aim is to investigate the efficiency of the parallel symmetrical splitting algorithm for solving this complicated system of three dimensional non-linear and non-local parabolic equations.
Figure 1 presents the evolution of the numerical solution
v for three fractional orders
. The contours of
v are shown at times
from top to bottom.
We see from the presented results that the way of the replication in pattern formation depends essentially on parameter .
We also investigated the scalability of the parallel symmetrical splitting algorithm (
15)–(
15). The non-local fractional power operator is approximated by using the spectral method. The FFTW library [
21] is used to solve the obtained discrete problems. The results of the computational experiments are presented in
Table 8. The fractional order parameter
and the four time steps are computed.
It follows from the presented results that for the given parallel computer, the efficiency of the spectral FFTW solver is better than the efficiency of the GAMG solver. Still, the latter solver can be used to implement the constructed general finite difference schemes, when the spectral method is not applicable. We note that computations of images presented in
Figure 1 required up to 3 h of run-time on 20 cores system in order to solve one case of the problem.