1. Introduction
To understand a physical phenomenon the mathematical model that mimics it needs often to be discretized since its analytical solution is not available. Therefore, the numerical techniques used to obtain an approximate solution of mathematical models have become an indispensable part of modern science.
When modelling the behaviour of an unsteady flow, the Partial Differential Equations (PDEs) that describe it often lead to a stiff problem, since the physical phenomenon is characterised by a wide range of spatial and temporal scales, e.g., turbulent flows. In this case, implicit methods are potentially a better choice with respect to explicit ones, due to their better stability properties. Nevertheless, although very promising implicit methods have been developed by mathematicians in the past and are well known in the literature, see for example [
1,
2,
3,
4] just to name a few, their use within Computational Fluid Dynamics (CFD) codes is somewhat limited. This is probably due to the greater computational complexity of implicit methods with respect to explicit ones and to the greater CPU time and memory requirements needed by each time-step to advance the solution in time, which is a really important aspect when one has to deal with problems that involve a large number of degrees of freedom (DOFs). Nevertheless, starting from implicit “classical” methods, like the Backward Differentiation Formulae (BDF) [
5] or the second-order accurate Crank–Nicolson scheme (CN2) [
6], many other variants have been developed in the past with the aim to improve their performance. Some examples of methods that have been developed to improve the stability region of the BDF schemes are the MEBDF [
2,
7,
8] or the TIAS [
3,
9] schemes, that are in fact multi-stage methods A-stable up to order 4 and 6, respectively, or the second derivative method proposed by Enright [
1,
10] that has obtained a single-stage method A-stable up to order 4 by using a higher derivative of the solution. Another way to improve the stability properties of the BDF schemes was found with the Composite-Backward Differentiation Formulae (C-BDF) [
4,
11,
12], which are schemes that inherit the L-stability property of the second-order accurate BDF scheme and that, to the knowledge of the author, have been mainly used until now for electromagnetic transient simulations [
13] and for thermal radiative diffusion problems [
14], and only recently has their potential in solving structural mechanics [
15] and fluid dynamics [
16] problems been investigated. Furthermore, note that in the above cited references, the C-BDF schemes have been used without the Richardson Extrapolation (RE) technique.
In view of this, the purpose of this article is to contribute to minimise the gap between the mathematical formulation and the study of theoretical aspects of the C-BDF schemes and their practical use to approximate unsteady compressible inviscid and viscous flows, even when they are equipped with the RE technique. Please note that, to the best of the author’s knowledge, the coupling of the C-BDF schemes with the RE technique is investigated here for the first time. Among all the possible C-BDF schemes, in this work we consider the implementation in a CFD code, based on a discontinuous Galerkin (dG) method for the spatial approximation of the Backward Euler-Backward Differentiation Formula (BE-BDF2) [
11,
17] (note that in these references, this scheme was named with the more general acronym C-BDF2). The dG method is a high-order accurate method [
18,
19,
20], which is very well suited to cope with problems requiring a high accuracy solution and, thanks to its compact form, for parallel computations. Furthermore, thanks to its favourable dissipation and dispersion properties, the dG method has been proved to be very effective to solve turbulent flows with both the Direct Numerical Simulation (DNS) [
21,
22] and the Large Eddy Simulation (LES) [
23,
24]. The BE-BDF2 scheme is very similar to the TR-BDF2 scheme [
4] and can be considered a variant of it. According to the Second Dahlquist barrier [
25], this scheme is only second-order accurate, but, as highlighted in this work, this relatively low order of accuracy can be improved thanks to a very simple and efficient numerical technique, known in the literature as Richardson Extrapolation (RE) [
26,
27]. The RE technique, named by Richardson himself as a “deferred approach to the limit”, is able to dramatically improve the accuracy and raise the order of convergence of any numerical method with a very simple idea. This simple idea is to compute two solutions with the same numerical method, one on a coarse grid and one on a fine grid, and to use a simple linear combination of these two approximations with the goal of eliminating the lowest-order error term(s) of their asymptotic error expansion. The final result of the RE technique is a better approximation of the final solution and an increasing of the order of accuracy of the numerical method used.
To clearly determine and evaluate the effectiveness of the BE-BDF2 scheme, in its standard form and equipped with the RE technique, in this work we present the numerical results of the following test cases: (i) the inviscid isentropic vortex convection; (ii) the laminar vortex shedding behind a circular cylinder. For the first test case, the performance of the numerical schemes presented in this work will be compared with the ones obtained with classical BDF2 and CN2 methods, highlighting that, for a fixed time-step size, the use of these schemes allows us to achieve more accurate solutions, especially when equipped with the RE technique. For the second test case, that is a stiff-problem, the BE-BDF2 and BE-BDF2 & RE schemes will be compared with the explicit SSP-RK3, showing that the force coefficients predicted by all the schemes are very similar, despite the very large time-step used for the implicit schemes, which is about 220 times larger than the one employed by the explicit scheme for stability reason.
In the next Section are reported the Navier–Stokes governing equations.
Section 3 briefly describes the main building blocks of the dG spatial discretization.
Section 4 is devoted to the description of the BE-BDF2 and RE methods and to the main computational aspects related to their implementation. The numerical results in
Section 5 reveal the performance of the BE-BDF2 and BE-BDF2 & RE schemes. Findings, main conclusions and possible future works are summarised in the last Section.
2. The Governing Equations
The equations governing the behaviour of viscous compressible flows, i.e., the Navier–Stokes equations, can be written as:
where
is the fluid density,
p is the pressure,
E is the total energy per unit mass,
is the
ith component velocity and
d is the number of geometrical dimensions. With
and
are denoted the viscous stress tensor and the heat flux components, respectively, given by:
where
is the viscosity coefficient,
is the Prandtl number, here equal to
, and
. Finally, for a perfect gas,
where
is the ratio of gas specific heats, here equal to
.
In compact form, the system (
1) can be written as:
where
is the vector of the conservative variables, with
and
and
is the sum of inviscid and viscous flux functions.
3. The Spatial Discretization: Discontinuous Galerkin Method
The discontinuous Galerkin (dG) discretization is here only briefly presented for the sake of conciseness. For a more exhaustive overview of dG methods, the interested reader is referred to [
19,
20].
To perform the dG discretization of Equation (
3), we consider an approximation
of the domain
, such as
, where
is a set of non-overlapping elements
K of arbitrary shape. Moreover, we denote with
the set of faces that belong to the boundary of the discrete approximation and with
the set of internal edges of the mesh and
. Now, we define the solution and test function space as:
where
is the space of polynomial functions of degree at most
k with no global continuity requirement, i.e., they are continuous functions only inside each element
K but not on
.
The dG formulation of Equation (
3) consists in seeking
that, for an arbitrary test function
, satisfies the following equation:
In this equation,
and
are the global and local lifting operators, respectively, and
is a stability parameter [
18]. The symbol
is a jump that takes into account that the functions are discontinuous on a mesh face
, and it is defined, for a generic vector
, as:
where the ± superscripts indicate the traces of the solution over a face shared by two adjacent elements
and
and
and
are the unit outward normals pointing to
and
, respectively. In the boundary integral of Equation (
5),
is the sum of the inviscid and viscous numerical flux functions defined at element interfaces and boundary faces.
The inviscid numerical flux has been computed in this work using the exact Riemann solver of Gottlieb and Groth [
28]. The discretization of the viscous terms is based on the BR2 scheme presented in [
29] and theoretically analyzed in [
18,
30].
4. The Temporal Discretization
This section describes the BE-BDF2 and the RE methods and the computational aspects related to their implementation.
4.1. The BE-BDF2 Time Integration Method
Following the Method Of Lines (MOL), after that Equation (
5) is numerically integrated by using the Gauss quadrature rules, it results in the following system of Ordinary Differential Equations (ODEs):
where
is the global block mass matrix,
is the global vector of unknown degrees of freedom and
is the vector of residuals resulting from the integrals of the dG discretized space differential operators in Equation (
5).
In this work, Equation (
7) is advanced in time by using the second-order accurate BE-BDF2 scheme [
11,
17]. This method can be considered a variant of the one-step, two-stages TR-BDF2 scheme developed by Bank et al. [
4]. In the TR-BDF2 scheme, a first solution is computed over a portion of the time-step using the Trapezoid Rule and, in the second stage, the solution at time level
is computed with the BDF2 method. The BE-BDF2 method, like the TR-BDF2 scheme, is a one-step, two-stages scheme, but in the first stage the TR rule is replaced with the simpler BE method [
17].
More precisely, The BE-BDF2 scheme consists in advancing Equation (
7) in time from a generic time level
to the time level
, successively solving the following two stages:
In these equations, is a known solution at the time level , is an approximated solution computed at the time level , that, for , is a generic time level between and and is an approximated solution computed at the time level .
The value of
determines the stability function of the BE-BDF2 scheme. In particular, in this work we have used
, a value that minimises the local truncation error of the approximation and that makes the BE-BDF2 scheme both A-stable and L-stable, i.e., unconditionally stable and stiffly accurate for stiff problems. The stability function and local truncation error of the BE-BDF2 scheme are reported in [
11,
17].
4.2. The Richardson Extrapolation Method
The Richardson Extrapolation (RE) is a very simple and powerful technique applicable to any numerical method [
26,
27,
31]. More precisely, once a numerical method has been chosen for solving a system of ODEs, the application of RE results in a new numerical method with new properties, i.e., a new order of accuracy, a new local truncation error and a new stability function.
To understand the idea behind this technique it is described in its simplest form, which is the one used in this work. It can be resumed in three steps:
Using a general time integration method of order p and a time-step size , an approximation of the exact solution at time level n is computed and denoted as ;
Using the same time integration method but a halved time-step size, i.e., , a better approximation of the exact solution at the same time level n is computed and denoted as ;
The final approximation at time
n is given by:
Equation (
9) is just a linear interpolation of the two approximations computed at steps 1 and 2. For the BE-BDF2 scheme, since
, the RE coefficients that multiply the solutions obtained with the smaller and the larger time-step size are
and
, respectively.
In what follows, the “origin” of Equation (
9) is briefly described, but, for a more comprehensive survey of RE, its application to several explicit and implicit methods and for the derivation of the equations reported in this subsection, please refer to [
27]. Assuming that
is the exact solution at time
, once that
and
have been computed with a time integration method of order
p, it is possible to write the two following equations:
Note that in these equations
depends only on the time integration method chosen and the problem to be solved, which are both the same in steps 1 and 2. The RE consists in eliminating
from the above two equations, thus obtaining, with very simple algebraic manipulations:
Examining this equation, we can immediately notice that the second term on the left-hand side is exactly the approximation calculated at step 3 with Equation (
9), from which it immediately follows that this approximation is of order
. We can generalize this result to iteratively use RE. In the previous example, in fact, only two numerical solutions were involved, one with the time-step
and the other with the time-step
, but, if we compute other approximations with more refined grids, and we apply the same principle, an arbitrary high order accurate approximation can be found.
In the literature the RE technique is often used in two different ways: global or local extrapolation. With global extrapolation, also called passive extrapolation, the values obtained with the RE process are not used to advance the solution in time. With local extrapolation, also called active extrapolation, the values obtained with the RE process are used to calculate the next time-step. In this work, the local/active Richardson extrapolation is employed.
An important contribution to the extrapolation technique was given by Gragg [
32] by considering a symmetric method, like the midpoint or the trapezoidal rules, as the “base method” for extrapolation. In this case in fact, the asymptotic error expansion of the base method contains only even powers of
and the extrapolation process is particularly convenient since it eliminates two powers of
at each iteration. For example, by using a symmetric method with
and the extrapolation technique with only two grids, the final method is of order
.
The last aspect of the extrapolation technique is that it is particularly suitable for parallel implementations, since the preliminary approximations necessary to evaluate the final solution can be computed independently of each other. This last aspect is particularly attractive for large-scale problems solved by GPUs, for which an extreme parallelism in space is already employed, e.g., numerical weather prediction.
4.3. Implementation Details
In this work, since we use conservative variables and orthonormal shape functions obtained with the Gram–Schmidt process [
33], the mass matrix
, reported in Equation (
8), reduces to the identity matrix
and therefore has been omitted in what follows. Reformulating Equation (
8) in terms of a nonlinear residual function
and a constant vector
, the nonlinear system of equations of a generic stage can be written in a more compact and general form as:
with:
where
is a generic known solution calculated at time
and
are generic stage coefficients.
Applying Newton’s scheme to Equation (
12), the following linear system of equations is obtained:
where
i is the index of the Newton iteration and
and
are equal to:
Denoting with
a preconditioning matrix and by applying left preconditioning to Equation (
15), we solve:
In particular, at each Newton iteration, we solve Equation (
18) with the restarted Generalized Minimum RESidual (GMRES) method, using a preconditioner that is based on the ILU(0) decomposition of the analytically computed
matrix and on the classical Block–Jacobi (BJ) approach for parallel computations. For GMRES, ILU(0) and BJ methods we rely on the algorithms available in the PETSc library [
34].
To reduce the CPU time required for updating the Jacobian
and for ILU(0) preconditioning, we compute the matrix
only once per time-step. This last approximation greatly contributes to speed up the solution process, since the dG solver is typically characterized by a large system matrix that is expensive to compute and factorize. However, to avoid a too slow convergence of Newton’s algorithm, the Jacobian is recomputed when
. The maximum number of Newton iterations is
and the criterion of convergence of the Newton procedure is
, a value chosen to minimize the temporal discretization error for any time-step size used to perform the simulations [
7,
35]. The GMRES parameters used are a maximum number of Krylov-subspace vectors equal to 240 with a number of restarts equal to 1 and a linear solver tolerance
, a value that minimizes the CPU time required by each time-step advancement [
7].
Algorithms 1 and 2 provide the most important details of the BE-BDF2 and the BE-BDF2 & RE implementation, respectively. Note that, for the sake of conciseness, Algorithm 1 describes the solution process of a generic stage, since the numerical strategies adopted for both the BE-BDF2 stages are the same.
Algorithm 1 BE-BDF2 |
- 1:
procedure BE-BDF2(, , ) - 2:
▹ Necessary value for - 3:
for itmax do ▹ is the maximum number of Newton iterations - 4:
Solve Equation (18) to compute - 5:
Compute and store - 6:
if then ▹ is the Newton’s method convergence criteria - 7:
Accept the solution and exit - 8:
end if - 9:
Compute - 10:
if then - 11:
▹ It avoids a too slow convergence of Newton’s method - 12:
end if - 13:
end for - 14:
end procedure
|
Algorithm 2 BE-BDF2 & RE |
- 1:
procedure BE-BDF2 and RE(, ) ▹ Compute and and then apply RE - 2:
▹ Initial guess of Newton’s method - 3:
▹ Compute the system matrix - 4:
procedure BE-BDF2 ▹ Compute the coarse solution - 5:
- 6:
▹ Initial guess of Newton’s method - 7:
procedure BE-BDF2 ▹ Compute an intermediate solution - 8:
▹ Initial guess of Newton’s method - 9:
procedure BE-BDF2 ▹ Compute the fine solution - 10:
- 11:
▹ RE, see Equation ( 9) - 12:
end procedure
|
5. Numerical Results
The aim of this section is to show the behaviour and the performance of the BE-BDF2 scheme, with and without the RE technique, when employed to solve unsteady compressible inviscid and viscous flows. To this end, the following test cases have been used:
The inviscid isentropic convecting vortex in
Section 5.1;
The laminar vortex shedding behind a circular cylinder in
Section 5.2.
The isentropic vortex [
9,
36,
37] is a well-known test case that has an exact analytical solution, here used to evaluate the accuracy and to assess the order of convergence of the investigated time integration methods. The computed order of convergence is evaluated by means of the
error defined as:
where ∘ and
are the numerical and the "reference" solutions, respectively, with the reference solution that is set equal to the
–projection of the initial solution on the dG polynomial space.
The laminar vortex shedding has been computed at a Reynolds number
and a Mach number
. For this test case, since an exact solution is not available, the accuracy of the time integration methods will be evaluated by comparing the results with experimental data [
38] and other numerical results [
39,
40,
41,
42,
43].
In order to have a fair comparison, the same Newton and GMRES parameters were used for all the implicit schemes (please see
Section 4.3), and, for the explicit time integration method here used as reference, the time-step employed is the largest one allowed by the stability condition of the scheme.
5.1. The Isentropic Vortex Convection Problem
This test case consists of a uniform flow to which a perturbation is added. The initial flow field is defined by:
where
T is the temperature,
are the “free-stream” non dimensional velocity components and
r is the distance of a generic grid point, using coordinates
, with respect to the vortex center, which is initially placed in the middle of the computational domain. The “free-stream” Mach number, denoted as
, is equal to
, and
and
are set equal to
and
, respectively. The computational domain is a square with side length
that has been discretized with two uniform Cartesian grids with
(coarse) and
(fine) elements. Boundary conditions are periodic and the simulations are performed up to a non-dimensional final time
, corresponding to one period of vortex revolution using the
dG approximation. In what follows, the non-dimensional time-step size is defined as
, where ncyc is the total number of time-steps necessary to reach the end time of the simulation.
Figure 1 compares the density contours computed by several time integration schemes with the largest time-step here used, equal to
. In the plots the greater diffusion and dispersion errors given by BDF2 and CN2 with respect to BE-BDF2 are evident; as it is evident that, when this scheme is equipped with the RE technique, the accuracy of the solution is greatly improved.
The better accuracy of BE-BDF2 & RE is evident even when using a smaller time-step size, as highlighted in
Figure 2a for
, while in
Figure 2b, which shows the results obtained with
, diffusion and dispersion errors are evident only for BDF2, which is in fact the least accurate between all the schemes here reported.
A more quantitative analysis is reported in
Figure 3 and
Table 1 and
Table 2, which summarize the results of a time refinement study performed using the coarse and the fine grid. The plots of
Figure 3 report the
errors of all conservative variables. By looking at these plots, it is evident that BE-BDF2 outperforms BDF2 and CN2. Furthermore, when this time integration scheme is equipped with the RE technique, the higher accuracy obtained with a fixed time-step, with respect to the BE-BDF2 scheme alone, is relevant, and becomes noticeable for smaller and smaller time-step sizes (for
the
error of BE-BDF2 & RE is about two orders of magnitude smaller than the one shown by BE-BDF2). This behaviour is obviously due to the different orders of convergence of the time integration methods here investigated, whose theoretical values of 2 and 3 have been reported in the plots. Finally, note that for relatively small time-step sizes, when the spatial error overwhelms the temporal one, the
error on each variable reaches a plateau regardless of the time scheme used, e.g., the lower bound of
is around
for the coarse grid and
for the fine grid.
For a more precise analysis of the order of convergence shown by the simulations,
Table 1 and
Table 2 report the results of the time refinement study performed with the coarse and the fine grid, respectively. In these tables have been listed the values of the
errors and the related order of convergence only for the density variable, since the analysis for the other variables came to the same conclusions. By looking at these values we can notice that, while the schemes that are not equipped with RE reach the designed order of convergence of 2 with a monotonic increasing slope, the BE-BDF2 & RE tends towards the designed order of convergence of 3 with a monotonic decreasing slope.
5.2. The Laminar Vortex Shedding behind a Circular Cylinder
In this section are shown the results of the numerical investigation performed for the laminar flow around a two-dimensional circular cylinder with and .
The computational grid used, shown in
Figure 4, is a quadratic mesh obtained with the Gmsh software [
44]. It is composed of 3690 quadrilateral elements, with 72 elements lying on the cylinder surface and a wall distance of the first grid nodes around the cylinder that is approximately
of the cylinder radius. The boundary conditions imposed are a zero-heat flux no-slip condition at the wall and the far-field boundary conditions, based on the characteristic variables, for the other boundaries. Furthermore, we have verified that the position of the far-field boundary does not significantly influence the accuracy of the numerical solution. This has been achieved by performing a numerical simulation with a computational domain that has been extended to about 700 diameters in radial direction and comparing the data obtained using this last grid with the ones shown in this section.
Starting with an already periodic initial flow field, obtained with
dG approximation and the explicit third order accurate SSP-RK3 scheme [
45], the flow was then simulated for about 20 vortex-shedding periods using different time integration methods. In particular, the results of BE-BDF2 have been compared with respect to the ones obtained with SSP-RK3, since, due to the low Mach number of the flow, this case can be considered a mildly stiff problem for which the explicit schemes are well known to be very inefficient due to their stringent time-step restrictions. For the simulations performed with SSP-RK3, the maximum time-step allowed by stability condition has been used, i.e., about
being
the vortex shedding period, while for implicit schemes the time-step size has been set as equal to
. From these data it is evident the considerable saving of iterations that can be obtained by using the implicit schemes. Note, in fact, that the time-step size used for the BE-BDF2 and BE-BDF2 & RE schemes is about 220 times larger than the one employed for the explicit scheme.
In
Figure 5 are reported the velocity magnitude contours obtained at the final simulation time. The comparison of these plots highlights that there are no significant differences between the wake flow patterns computed with the different time integration methods.
Figure 6 shows the time histories of lift and drag coefficients with a close-up of the last computed vortex-shedding period on the right plots, which demonstrate that, after about 20 vortex-shedding cycles, implicit time histories present no significant differences in phase and amplitude with respect to the corresponding explicit data, with the BE-BDF2 & RE data that are slightly closer to the explicit ones with respect to the data obtained with BE-BDF2.
A quantitative comparison is reported in
Table 3 in which are listed the mean values of lift and drag coefficients and the Strouhal numbers obtained for the last 10 vortex-shedding cycles. We note that the force coefficients and the Strouhal numbers given by explicit and implicit schemes are almost the same and are in agreement with the ones obtained by other flow solvers [
39,
40,
41,
42,
43]. Furthermore, the computed Strouhal values are also close to the experimental values, since, as shown in [
38], the experimental measurements can be approximated with the equation
. What can be concluded is that the vortex-shedding phenomenon is well-predicted using both BE-BDF2 and BE-BDF2 & RE time integration methods despite the very large time-step size here employed.
6. Conclusions
In this paper we have investigated the potential of the BE-BDF2 scheme for solving unsteady compressible flows, while also considering it in combination with the Richardson Extrapolation (RE) technique. For the isentropic vortex test case, the numerical investigation shows that BE-BDF2 outperforms in accuracy the more classical BDF2 and CN2 methods, and that, when equipped with RE, the gain in accuracy is remarkable. For the cylinder test case, considered here to evaluate the performance of BE-BDF2 and BE-BDF2 & RE for stiff-problems, the numerical investigation has shown that the vortex-shedding phenomenon and the force coefficients on the cylinder are well-predicted despite the very large time-step used, which was about 220 times larger than the one employed by the explicit SSP-RK3 scheme. These results, together with the theoretical aspects related to the BE-BDF2 formulation, have led to the following conclusions about this scheme:
It is well suited for stiff problems thanks to its stability properties;
Even if it is composed of two stages, the coefficient that multiplies the two Jacobian matrices is the same, i.e.,
, therefore, in particular when it is coupled with a Matrix-Free approach and a Frozen Preconditioner strategy [
8], the CPU time and memory required to advance the solution in time can be greatly reduced;
Extrapolated values can be easily computed, exploiting the previous known solutions, and can be used as very good initial guesses for both the stages, thus accelerating the convergence of Newton’s method.
The main drawback of BE-BDF2 is that it is only second-order accurate, but this aspect can be overcome if it is used in combination with RE.
The positive aspects of BE-BDF2 & RE are:
It is third-order accurate;
It dramatically improves the accuracy of the solution with respect to BE-BDF2, especially for high precision requirements;
It allows the incorporation of a strategy to adapt the time-step size, since it immediately provides error estimators evaluated on several approximation levels;
Even if it requires the computation of two solutions, these solutions can be computed in parallel.
Encouraged by the preliminary results obtained in the present study and by the promising directions listed above, future works could contribute to the advancement of numerical simulations of unsteady flows by exploring the potential of the BE-BDF2 & RE scheme for less regular solutions and geometries, e.g., turbulent flows and real-world applications.