To construct sparse grids, one uses a tensor product construction to obtain a multi-dimensional basis on the d-dimensional unit cube $\bar{\Omega} = [0,1]^d$ from the one-dimensional hierarchical basis. We use the multi-index $\mathbf{l} = (l_1, \dots, l_d) \in \mathbb{N}^d$ to denote the level. We then consider the set of d-dimensional rectangular grids $\Omega_{\mathbf{l}}$ with mesh size $h_{\mathbf{l}} := (h_{l_1}, \dots, h_{l_d}) = 2^{-\mathbf{l}}$. With each individual grid point $x_{\mathbf{l},\mathbf{i}}$, where the multi-index $\mathbf{i} = (i_1, \dots, i_d)$ indicates its spatial position, we associate a piecewise d-linear nodal basis function
$$\phi_{\mathbf{l},\mathbf{i}}(x) := \prod_{j=1}^{d} \phi_{l_j,i_j}(x_j),$$
which is the product of the one-dimensional basis functions and has a support of size $2 h_{\mathbf{l}}$. The one-dimensional $\phi_{l_j,i_j}$ are the well-known hat functions. Using these basis functions, we can define the d-dimensional nodal function spaces
$$V_{\mathbf{l}} := \operatorname{span}\{\phi_{\mathbf{l},\mathbf{i}} : 1 \le i_j \le 2^{l_j} - 1,\ j = 1, \dots, d\},$$
which are zero on the boundary $\partial\Omega$ and consist of piecewise d-linear functions, and the d-dimensional hierarchical increment spaces
$$W_{\mathbf{l}} := \operatorname{span}\{\phi_{\mathbf{l},\mathbf{i}} : 1 \le i_j \le 2^{l_j} - 1,\ i_j \text{ odd},\ j = 1, \dots, d\}.$$
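For illustration, a minimal Python sketch of these basis functions is given below; the function names hat and phi are our own, and the one-dimensional hat functions are written in their standard form on the unit interval.

```python
import math

def hat(l, i, x):
    """One-dimensional hat function of level l and index i on [0, 1]."""
    return max(1.0 - abs(2.0 ** l * x - i), 0.0)

def phi(levels, indices, x):
    """Piecewise d-linear basis function: product of one-dimensional hat functions."""
    return math.prod(hat(l, i, xj) for l, i, xj in zip(levels, indices, x))

# The basis function of level (2, 1) and index (1, 1) equals 1 at its own grid
# point (0.25, 0.5) and is supported on a box of size (0.5, 1.0).
print(phi((2, 1), (1, 1), (0.25, 0.5)))   # 1.0
```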
Now consider the piecewise d-linear interpolation of a function f by a $u \in V_{\mathbf{l}}$, i.e., a representation as in (9). For illustration, we look at the linear interpolation in one dimension; for the hierarchical coefficients $\alpha_{l,i}$, $1 \le i \le 2^{l} - 1$, i odd, it holds
$$\alpha_{l,i} = f(x_{l,i}) - \frac{f(x_{l,i} - h_l) + f(x_{l,i} + h_l)}{2}.$$
This illustrates why the $\alpha_{l,i}$ are also called hierarchical surpluses: they specify what has to be added to the hierarchical representation of level $l-1$ to obtain the one of level l. We can rewrite this in the following operator form,
$$\alpha_{l,i} = \left[\, -\tfrac{1}{2} \quad 1 \quad -\tfrac{1}{2} \,\right]_{l,i} f,$$
and with that we generalize to the d-dimensional hierarchization operator as follows,
$$\alpha_{\mathbf{l},\mathbf{i}} = \left( \prod_{j=1}^{d} \left[\, -\tfrac{1}{2} \quad 1 \quad -\tfrac{1}{2} \,\right]_{l_j,i_j} \right) f. \tag{10}$$
We denote with $H$ the hierarchization operator that performs the transformation (10) from the nodal function values of f to obtain all the arising hierarchical values $\alpha_{\mathbf{l},\mathbf{i}}$. The inverse operator $H^{-1}$ is called the dehierarchization operator and computes from the hierarchical values $\alpha_{\mathbf{l},\mathbf{i}}$ the corresponding (nodal) function values of f on all the grid points.
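To illustrate the hierarchization formula and its inverse in one dimension, the following Python sketch computes the surpluses of a function vanishing on the boundary and evaluates the resulting hierarchical interpolant; the function names are ours, and the naive loops stand in for the efficient in-place transforms used in sparse grid libraries.

```python
def hierarchize_1d(f, n):
    """Surpluses alpha_{l,i} = f(x_{l,i}) - (f(x_{l,i}-h_l) + f(x_{l,i}+h_l))/2 for all
    interior points of the level-n grid on [0, 1]; f is assumed to vanish on the
    boundary, so the neighbours at 0 and 1 contribute nothing."""
    alpha = {}
    for l in range(1, n + 1):
        h = 2.0 ** (-l)
        for i in range(1, 2 ** l, 2):          # only odd indices belong to level l
            x = i * h
            alpha[(l, i)] = f(x) - 0.5 * (f(x - h) + f(x + h))
    return alpha

def dehierarchize_1d(alpha, x):
    """Evaluate the hierarchical representation at x (recovers nodal values)."""
    return sum(a * max(1.0 - abs(2.0 ** l * x - i), 0.0)
               for (l, i), a in alpha.items())

# The interpolant reproduces f at every grid point, e.g., for f(x) = x * (1 - x):
alpha = hierarchize_1d(lambda x: x * (1.0 - x), 3)
print(abs(dehierarchize_1d(alpha, 0.375) - 0.375 * (1.0 - 0.375)) < 1e-12)   # True
```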
The idea is now to use, instead of the full grid spaces, sparse grid spaces of level n, defined by
$$V_n^s := \bigoplus_{|\mathbf{l}|_1 \le n + d - 1} W_{\mathbf{l}},$$
where instead of the maximum of the level indices their sum is used. Here, hierarchical basis functions with a small support, and therefore a small contribution to the function representation [30,31], are no longer included in the discrete space $V_n^s$ of level n. Note that the change from n to $n + d - 1$ has to do with the boundary treatment; the underlying aspects are not relevant in the scope of this work and can be found, e.g., in [31]. A function $u \in V_n^s$ is represented in the hierarchical basis, analogously to (9), as
$$u(x) = \sum_{(\mathbf{l},\mathbf{i}) \in \mathcal{G}_n} \alpha_{\mathbf{l},\mathbf{i}}\, \phi_{\mathbf{l},\mathbf{i}}(x). \tag{11}$$
We define the set $\mathcal{G}_n$ of all indices of functions in $V_n^s$ by $\mathcal{G}_n := \{(\mathbf{l},\mathbf{i}) : |\mathbf{l}|_1 \le n + d - 1,\ 1 \le i_j \le 2^{l_j} - 1,\ i_j \text{ odd}\}$.
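The index set can be enumerated with a few lines of Python; the sketch below assumes the level convention used above (levels starting at 1, interior points only), and the function name is illustrative.

```python
from itertools import product

def sparse_grid_indices(n, d):
    """All (level, position) index pairs (l, i) of the regular sparse grid of
    level n in d dimensions with zero boundary values."""
    indices = []
    for l in product(range(1, n + 1), repeat=d):
        if sum(l) <= n + d - 1:                  # sum instead of maximum of the levels
            odd_positions = [range(1, 2 ** lj, 2) for lj in l]
            indices.extend((l, i) for i in product(*odd_positions))
    return indices

# In two dimensions, level 3 yields the familiar 17 interior sparse grid points.
print(len(sparse_grid_indices(3, 2)))   # 17
```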
For a simplified exposition, we have so far only considered functions that are zero on the boundary of the domain. To allow non-zero boundary values, one introduces additional nodes on the boundary. This can be achieved by adding two boundary points on the coarsest level of the construction. Proceeding in this way, we obtain a modified set of subspaces via the construction explained before.
3.1. Finite Difference Schemes on Sparse Grids
Finite differences on sparse grids were introduced and studied in [
5], where consistency proofs and convergence results are given; see also [
6,
7,
33]. The construction of finite difference operators is based on a dimensional splitting combined with the nodal to hierarchical basis transformation (
10) and its respective back-transform.
Let us first describe the original sparse grid finite difference approach, in which each difference operator is built as a composition of three partial operators [5].
The finite difference operators use, per dimension, the neighboring grid points of the respective grid point, i.e., the closest grid points in that dimension. For the approximation of the derivative on regular sparse grids, one uses appropriate equidistant difference stencils. If adaptive refinement, as explained later in
Section 3.2, is used, the grid points in the different dimensions are no longer equidistant; that is, the distances can no longer be derived from the grid refinement level alone, but the stencil is still chosen such that the closest neighbors in the respective dimensions are used.
We consider an alternative approach [
9], which employs additional points and is simpler to implement since it only involves function evaluations. Instead of using the function values on the sparse grid points, one interpolates on nodes that we will refer to as
ghost nodes. This way we do not have to use basis transformations such as (
10) and one could simply take any sparse grid library, such as SG++ [
32], to implement this approach.
To describe the approach from [
9], we first define the above-mentioned ghost nodes. Afterwards, we describe interpolation operators working on these points. Finally, we introduce the finite difference operators based on these interpolation operators.
To define ghost nodes, we start by defining the ghost node step size.
Definition 1 (Ghost node step size). We define the ghost node step size in dimension j, for a grid point $x_{\mathbf{l},\mathbf{i}}$, by $h_j^g := 2^{-l_j^{\max}}$, where $l_j^{\max}$ denotes the maximal level used in dimension j.
Note that this is half of the size of the smallest support of the basis functions in this dimension. This makes sense since, for this step size, the local behavior of the approximation is still captured. For adaptive sparse grids, one could also take a larger distance at some grid points, but due to the linearity of the approximation there, this does not change the result.
Note that for the different sparse grid operators, we need to interpolate on different points. For the forward difference, we have to add the ghost node step size in the respective dimension and for the backward difference we have to subtract it in the respective dimension. We refer to them as
forward difference ghost nodes and
backward difference ghost nodes. Examples of forward difference ghost nodes are given in
Figure 1. Notice that for the second derivative finite difference, we can, as usual, apply the first derivative operators twice. Other difference operators are possible, but we restrict ourselves to these for a simpler presentation.
Definition 2 (Ghost node). For a grid point $x_{\mathbf{l},\mathbf{i}}$ at which we aim to compute the finite differences in dimension j, we define the corresponding forward difference ghost node by
$$x_{\mathbf{l},\mathbf{i}}^{+j} := x_{\mathbf{l},\mathbf{i}} + h_j^g\, e_j,$$
and similarly, for the backward difference, we define the corresponding backward difference ghost node by
$$x_{\mathbf{l},\mathbf{i}}^{-j} := x_{\mathbf{l},\mathbf{i}} - h_j^g\, e_j,$$
where $e_j$ denotes the j-th unit vector.

The main idea behind the construction of the finite difference operators is simply to collect the terms that are constant under partial differentiation. When one computes partial derivatives, the other variables are held constant by definition. Therefore, if a function is multiplicatively separable in the other variables, one can compute the derivative with respect to the considered variable as in one dimension. Since the d-dimensional basis functions are built as products of one-dimensional hierarchical functions, they are separable in exactly this sense.
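In code, Definitions 1 and 2 amount to the following sketch; the names ghost_step and ghost_node are ours, points are represented as tuples, and the step size $2^{-l_j^{\max}}$ reflects our reading of the "half of the smallest support" remark above.

```python
def ghost_step(max_level_j):
    """Ghost node step size in dimension j: half the width 2 * 2**(-l_max) of the
    smallest basis-function support in that dimension."""
    return 2.0 ** (-max_level_j)

def ghost_node(x, j, h_g, forward=True):
    """Forward (or backward) difference ghost node of the grid point x, obtained
    by shifting x by +h_g (or -h_g) in dimension j only."""
    y = list(x)
    y[j] += h_g if forward else -h_g
    return tuple(y)

# Forward ghost node of (0.5, 0.25) in dimension 0, with maximal level 3 there.
print(ghost_node((0.5, 0.25), 0, ghost_step(3)))   # (0.625, 0.25)
```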
Without loss of generality, we describe the finite difference operators in two dimensions, with the derivative taken in the first dimension. The generalization to higher dimensions is straightforward, using j for the dimension in which the derivative is taken and a multi-index for all other dimensions besides j. Using the fact that $\phi_{l_2,i_2}(x_2)$ in the hierarchical representation (11) does not depend on $x_1$, and the linearity of the differential operator, we obtain
$$\frac{\partial u}{\partial x_1}(x_1,x_2) = \frac{\partial}{\partial x_1} \sum_{(\mathbf{l},\mathbf{i})} \alpha_{\mathbf{l},\mathbf{i}}\, \phi_{l_1,i_1}(x_1)\, \phi_{l_2,i_2}(x_2) = \sum_{(\mathbf{l},\mathbf{i})} \phi_{l_2,i_2}(x_2)\, \frac{\partial}{\partial x_1} \Big[ \alpha_{\mathbf{l},\mathbf{i}}\, \phi_{l_1,i_1}(x_1) \Big].$$
Now, the expression in the square brackets to be differentiated is a one-dimensional function of $x_1$, to which any finite difference scheme can be applied to approximate the derivative.
Using, e.g., the forward difference,
$$\frac{\partial u}{\partial x_1}(x_1,x_2) \approx \frac{u(x_1 + h_1^g,\, x_2) - u(x_1,\, x_2)}{h_1^g}, \tag{12}$$
we can now define a finite difference operator on sparse grids that is based on interpolation at the sparse grid points and at the corresponding ghost nodes. For a given grid and the desired difference operator, we can simply consider all respective ghost nodes arising from the grid points and interpolate on these.
Definition 3 (Interpolation-based sparse grid finite difference operator). Let us formally denote the interpolation operator, evaluated at a point x on the sparse grid, by
$$I[u](x) := \sum_{(\mathbf{l},\mathbf{i}) \in \mathcal{G}_n} \alpha_{\mathbf{l},\mathbf{i}}\, \phi_{\mathbf{l},\mathbf{i}}(x),$$
which is essentially (11) evaluated at x. We define the corresponding interpolation operators for the sparse grid shifted by $h_j^g$, that is, the grids of ghost nodes needed for the forward difference and the backward difference, respectively, in dimension j, by
$$I_j^{+}[u](x_{\mathbf{l},\mathbf{i}}) := I[u](x_{\mathbf{l},\mathbf{i}} + h_j^g\, e_j) \qquad \text{and} \qquad I_j^{-}[u](x_{\mathbf{l},\mathbf{i}}) := I[u](x_{\mathbf{l},\mathbf{i}} - h_j^g\, e_j),$$
respectively. We define the sparse grid forward difference operator, reflecting (12), by
$$D_j^{+} u(x_{\mathbf{l},\mathbf{i}}) := \frac{I_j^{+}[u](x_{\mathbf{l},\mathbf{i}}) - u(x_{\mathbf{l},\mathbf{i}})}{h_j^g}.$$
The corresponding sparse grid backward difference operator is given by
$$D_j^{-} u(x_{\mathbf{l},\mathbf{i}}) := \frac{u(x_{\mathbf{l},\mathbf{i}}) - I_j^{-}[u](x_{\mathbf{l},\mathbf{i}})}{h_j^g}.$$

We point out that one does not need to use interpolation for the boundary points in the respective dimension (see the right picture in
Figure 1 with red points on top of sparse grid points) since the function values for these grid points are already known.
Notice that the interpolation operators work on hierarchical values. Note further that we can similarly define other finite difference operators by interpolating on the required points. Let us turn to the boundary handling. Since the forward difference is not defined on the upper boundary, we use the backward difference and thus also the backward difference ghost nodes here. Similarly, we use the forward difference on the lower boundary since the backward difference is not defined here.
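A compact sketch of Definition 3 together with the boundary switching just described might look as follows; here, interpolate(v, y) is an assumed callback that evaluates the sparse grid interpolant of the hierarchical coefficients v at an arbitrary point y (e.g., provided by a library such as SG++), and the function name and signature are ours.

```python
def sg_difference(interpolate, v, x, j, h_g, forward=True):
    """Forward/backward difference at the sparse grid point x in dimension j,
    using sparse grid interpolation at the corresponding ghost node.  Near the
    upper (lower) boundary, the forward (backward) difference is replaced by
    its counterpart, since the ghost node would leave the unit cube."""
    if forward and x[j] + h_g > 1.0:
        forward = False                  # upper boundary: switch to backward difference
    elif not forward and x[j] - h_g < 0.0:
        forward = True                   # lower boundary: switch to forward difference
    y = list(x)
    if forward:
        y[j] += h_g
        return (interpolate(v, tuple(y)) - interpolate(v, x)) / h_g
    y[j] -= h_g
    return (interpolate(v, x) - interpolate(v, tuple(y))) / h_g
```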
For the second derivative difference operator, we can also use the above approach by interpolating at the respective points. If we need both the first and the second derivative, there are two approaches to avoid recomputation. First, one can reuse the interpolated values employed for the first derivative finite differences also for the second derivative finite differences. Second, as pointed out in [5], one can use the computed first derivative operators to obtain the second derivative operator via the relationship between the first and the second derivative operators on sparse grids given by
$$D_j^{(2)} = D_j^{+} D_j^{-},$$
which is a well-known identity for the full grid operators. This is due to the observation that locally the operator works on one-dimensional full grids [5].
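For full grid operators with mesh width h, this identity is a one-line calculation (written here in one dimension for clarity):
$$D^{+} D^{-} u(x) = \frac{1}{h}\left( \frac{u(x+h) - u(x)}{h} - \frac{u(x) - u(x-h)}{h} \right) = \frac{u(x+h) - 2u(x) + u(x-h)}{h^2},$$
i.e., composing the first-order operators reproduces the standard central second difference.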
The operators are linear and can thus be represented by matrices; therefore, a comparison of the approaches can be easily undertaken using the corresponding matrices. Observe that the version presented in [
5] works on nodal values, whereas the interpolation-based version works on hierarchical basis coefficients. To compare the arising discretization matrices of both sparse grid finite difference approaches, one thus applies the nodal to hierarchical basis transformation as the first operation in the interpolation-based version. For regular sparse grids up to a certain level, it is confirmed in [33] that both approaches yield the same finite difference operator. The formal relationship between the two approaches needs further investigation.
3.2. Adaptive Sparse Grids
One can further reduce the computational complexity by using an adaptive sparse grid. This pays off, for example, if the function has localized changes in its steepness, i.e., large second derivatives in parts of the domain. The idea is to add new points to the sparse grid if it is likely that they increase the obtained accuracy. This is called
adaptive refinement. As a criterion for adaptation, one typically uses a local error estimation based on the hierarchical surplus (coefficient), and then adds child nodes (in the hierarchical structure) to those points with a large estimate. Vice versa, grid points whose corresponding basis functions do not contribute much can be removed in a coarsening step. For a description of similar algorithms for refinement and coarsening in the optimal control setting, see [
34,
35], and a general view can be found in [
32].
We use different types of adaptivity criteria that are based on the hierarchical surpluses as an error indicator. The overall algorithm combines adaptive refinement and adaptive coarsening. Given an index set, a refinement parameter, a coarsening parameter, and the approximated function v on the current grid, we can refine and coarsen the grid and approximate the function on the new grid. For our experiments, we use a typical choice for the coarsening parameter [34,35]. An additional component can be self-adaptivity, as also used in [5], where the refinement threshold and the coarsening parameter are automatically decreased when no new points are added by adapting with the current parameters.
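A schematic sketch of one refinement/coarsening sweep is given below; the grid interface (points, is_leaf, add_children, remove) is hypothetical and stands in for the corresponding operations of a sparse grid library, and eps_refine and eps_coarsen denote the refinement and coarsening parameters.

```python
def adapt(grid, alpha, eps_refine, eps_coarsen):
    """One surplus-driven adaptation sweep: refine points with large hierarchical
    surpluses and coarsen leaf points whose surpluses are negligible."""
    to_refine = [p for p in grid.points if abs(alpha[p]) > eps_refine]
    to_coarsen = [p for p in grid.points
                  if abs(alpha[p]) < eps_coarsen and grid.is_leaf(p)]
    for p in to_refine:
        grid.add_children(p)       # insert the hierarchical children of p
    for p in to_coarsen:
        grid.remove(p)             # drop points that contribute little
    return grid
```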
3.3. (Non-)Convergence of Sparse Grid Finite Difference Schemes for Solving the HJB Equation
The requirements one needs to fulfill to obtain a convergent approximation scheme for HJB equations by means of Barles and Souganidis [11] are stability, consistency, and monotonicity of the scheme. While it is trivial to show that one needs monotone interpolation, interpolation on sparse grids is not monotone in general, as already pointed out in [36]; see also [32]. Since we are approximating value functions arising from economic models, it is of interest whether one can achieve monotonicity by restricting ourselves to concave, monotonically increasing functions, as they arise for the employed models. Ref. [9] notes that the introduced upwind finite difference scheme converges even though the scheme is not monotone. However, Ref. [9] does not give examples of, or a justification for, this lack of monotonicity.
In just one dimension, the scheme is monotone since it uses standard linear interpolation. Observe that we only need one-dimensional monotonicity per dimension, that is, with respect to the used ghost points.
Let us give a definition of monotone interpolation.
Definition 4 (Monotone interpolation in one dimension). Let $(x_k, f(x_k))$, $k = 0, \dots, m$, be data points with $x_0 < x_1 < \dots < x_m$. A function f is called monotone if it holds that $f(x_k) \le f(x_{k+1})$ for all k or $f(x_k) \ge f(x_{k+1})$ for all k. In case of strict inequalities, f is strictly monotone. The interpolant $\tilde{f}$ of f is monotone if, for every pair of points $y_1 < y_2$, it holds that
$$\tilde{f}(y_1) \le \tilde{f}(y_2)$$
or
$$\tilde{f}(y_1) \ge \tilde{f}(y_2),$$
in accordance with the monotonicity of the data, with strict inequalities for strictly monotone interpolation. Note that we are interested in higher dimensions and aim for one-dimensional monotone interpolation of the restrictions to the different dimensions.
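For later reference, checking this property along one dimension for a finite sample of interpolated values reduces to a simple comparison; the following helper is purely illustrative.

```python
def is_monotone_1d(values, strict=False):
    """Check the (weak or strict) monotonicity of interpolated values sampled
    along one dimension at increasing coordinates, as in Definition 4."""
    pairs = list(zip(values, values[1:]))
    if strict:
        return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)
```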
Non-Monotone Sparse Grid Interpolation for Concave Monotonically Increasing Functions
We give a counterexample to show that sparse grid interpolation of monotonically increasing concave functions is not monotone in general. To achieve this, we give concave, monotonically increasing functions for which negative hierarchical coefficients arise. Let us consider the interpolation of a function that is similar to the value functions arising for some models. Computing the eigenvalues of its Hessian shows that the Hessian is negative semidefinite on the domain and thus the function is concave, although not strictly concave. Unfortunately, we see in the function plots in
Figure 2 and the contour plots in
Figure 3 that the interpolation is not monotone. Increasing the factors in front of x and y increases this effect. Here, we use a fixed sparse grid level, but corresponding counterexamples can be constructed for other levels.
Note further that the above interpolation is in particular not monotone with respect to the points used for our sparse grid finite differences. In
Figure 4, one can see that the value shown in red is higher than the one in green. Thus, even if we restrict ourselves to the ghost points, we do not have the monotonicity we hoped for.
Strictly concave functions can also yield non-monotone sparse grid interpolation; for such examples, one can check the concavity by using the leading principal minors criterion. Finally, we point out that one cannot simply set the negative hierarchical coefficients to zero to obtain a monotone approximation.
3.4. Overcoming the Non-Monotonicity of Sparse Grid Interpolation
There are several possible approaches to obtain monotone interpolation on sparse grids. The most straightforward one is to go to a higher sparse grid level; this is visualized in
Figure 5, where the interpolation of the function from above is presented for higher sparse grid levels. The approach is simple, but one cannot go to arbitrarily high levels in higher dimensions; moreover, for higher levels, corresponding counterexamples can be constructed as well, and therefore a guarantee of monotonicity cannot be expected.
Alternatively, one can identify the areas where non-monotonicities arise and insert points only in these areas. For that, one can adapt the sparse grid using the hierarchical coefficients as error indicators, as is performed in standard adaptive approaches. Another approach is to go to a “full” grid in the critical area. Here, one determines the maximal one-dimensional level used in the critical area and then simply uses the full grid level. Thereby non-monotonicity cannot arise in this area. For investigations on these approaches for monotone functions see [
33]. Since these approaches are of a heuristic nature, we do not expect that monotonicity can be guaranteed. Specific to our situation, the most promising alternative refinement criterion is to use the computed derivatives. That is, one can check whether the approximations of the derivative are negative, since, for a monotonically increasing function, a negative value indicates a non-monotone interpolation. By marking such points for adaptation, one can iterate until all derivative approximations are non-negative, or negative only within a certain error threshold. With this approach, a guarantee of monotonicity might be achievable.
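The derivative-based criterion could be sketched as follows; sg_derivative(p, j) is an assumed callback returning the sparse grid finite difference approximation of the j-th partial derivative at the point p, and the function name and tolerance handling are illustrative.

```python
def mark_non_monotone(points, dim, sg_derivative, tol=0.0):
    """Mark grid points at which some approximated partial derivative is negative
    (beyond a tolerance), which indicates non-monotone interpolation for a
    function that should be monotonically increasing in every dimension."""
    return [p for p in points
            if any(sg_derivative(p, j) < -tol for j in range(dim))]

# The marked points would then be refined, and the procedure repeated until the
# list is empty or the remaining violations are within the tolerance.
```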
In our numerical experiments, we do not investigate these strategies and consider enforcing monotone interpolation on adaptive sparse grids a topic for future research.