1. Introduction
Consider solving the heterogeneous Helmholtz problem: for a bounded domain $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$), we wish to find $u$ such that
$$-\Delta u - k^2 u = f \quad \text{in } \Omega, \qquad (1)$$
where (1) incorporates some appropriate boundary conditions. For the heterogeneous problem, we suppose the wave number $k$ is a function of space, defined by the ratio $k = \omega / c$ of the angular frequency $\omega$ and the wave speed c. In this work, we investigate the use of an overlapping Schwarz preconditioner with a suitably chosen coarse space based on solving local eigenvalue problems on each subdomain.
The ability to compute solutions to the Helmholtz problem (1) is important across many disciplines of science and engineering. As the prototypical model for frequency-domain wave propagation, it features within the fields of optics, acoustics, and seismology, amongst others. Further applications can be found in imaging science, such as through medical imaging techniques and geophysical studies of the Earth’s subsurface within, for instance, the oil industry. Nonetheless, it is challenging to develop efficient computational methods to solve (1), particularly when the wave number k becomes large.
Discretisation of (1) by standard approaches, such as the Lagrange finite elements we shall use here, results in large linear systems which are indefinite, non-self-adjoint, and ill-conditioned; see also [1]. These systems present various difficulties to solve, especially in the presence of complex heterogeneities, at high frequencies (large k), or when solutions include many wavelengths in the domain. Classical methods for solving such large systems typically fail for several reasons, as detailed in [2,3], and specialist approaches must be employed for a robust solver. While much progress has been made for symmetric positive definite problems, such techniques cannot be applied out-of-the-box, and extensions to tackle indefinite and non-self-adjoint problems may not be clear. This has led to a number of approaches being developed in recent years aiming to bridge this gap. For the Helmholtz problem, these include parallel direct solvers, such as [4,5], and preconditioned iterative methods that utilise multigrid, such as [6,7], or the so-called "shifted Laplace" approach [8,9]. This latter (complex) shifted Laplace preconditioner has seen much interest in its practical use [10] and further developments through deflation techniques, most recently in [11,12].
Another broad class of solvers is given by domain decomposition methods, which provide a natural balance between using direct and iterative solvers. Specialist methods are again required for the Helmholtz problem, a popular set of which fall under the heading of "sweeping" methods [13,14]. These are multiplicative domain decomposition methods linked to a variety of other approaches, including optimised Schwarz methods, as detailed in the recent survey [3]. While sweeping is a conceptually serial approach, much work has been done to incorporate parallelism. Of particular note is the "L-Sweeps" method [15], stated to be the first parallel scalable preconditioner for high-frequency Helmholtz problems; a review of developments in this area is provided in the introductions of [15,16]. Other popular approaches are FETI methods; for the Helmholtz problem, these include FETI-H [17] and FETI-DPH [18].
Within the domain decomposition community, there has also been renewed work on additive Schwarz methods, which offer a naturally parallel approach. Following on from the seminal work [19], which utilised Robin (or impedance) transmission conditions to provide a convergent Schwarz method for the Helmholtz problem, a wealth of non-overlapping Schwarz methods have been devised; see the introduction of [20] for a recent overview. In these methods, one has to be careful to either avoid or treat cross points (where three or more subdomains meet), as is done in the robust treatment of [20]. Many optimised approaches rely on deriving higher-order transmission conditions, such as through second-order impedance operators [21], absorbing boundary conditions (ABCs) [22], or non-local operators [23]. Ideally, one would use the Dirichlet-to-Neumann (DtN) map (a Poincaré–Steklov operator) to provide transparent transmission conditions, but this is prohibitively expensive in practice, and so these optimised Schwarz methods in essence aim to approximate this operator.
Overlapping Schwarz methods for the Helmholtz problem (see, for example, [24,25,26]) have also received renewed attention in recent years, and it is this type of method we shall consider. A successful approach is to design additive Schwarz methods based on including absorption (a complex shift $k^2 \mapsto k^2 + \mathrm{i}\varepsilon$), with absorption parameter $\varepsilon > 0$; see [27,28]. Theoretical work to understand the effectiveness of this approach can be found in [29,30] and, for the heterogeneous problem, in [31]; see also [32]. To be scalable, such additive Schwarz methods require a second level, known as a coarse space (see [33] for a novel analysis for the absorptive problem). For the two-level methods in [27,29], this is provided through a coarse grid, an approach that is also effective for the time-harmonic Maxwell problem [34].
As well as providing scalability, coarse spaces have been devised to give robustness to heterogeneity. This is exemplified by the "Generalised Eigenproblems in the Overlap" (GenEO) approach for symmetric positive definite (SPD) problems [35]. This approach provides a spectral coarse space, where appropriate local eigenvalue problems are solved to furnish a two-level method. Another spectral coarse space is the DtN coarse space [36], which has been extended and investigated for the Helmholtz problem in [37,38]. While the standard GenEO theory applies only in the SPD case (see [39]), in this work we develop and explore a GenEO-type method for the Helmholtz problem and show numerically that, for a 2D model problem of a wave guide, it provides a scalable approach that is robust to heterogeneity and increasing wave number in terms of the iteration count of preconditioned GMRES. Companion results for large benchmark problems on coarsely resolved meshes that typically arise in applications, along with comparisons to other methods, can be found in [40]. Theoretical results on variants of DtN and GenEO for Helmholtz problems are currently out of reach, but promising numerical results, based on heuristics, were obtained in [37,38], showing the potential of these methods in practice.
The primary aim of this work is to explore the utility of a GenEO-type method for the heterogeneous Helmholtz problem (1); we call this approach H-GenEO. In particular, we highlight the following contributions:
We present a range of numerical tests, on pollution-free meshes, comparing our proposed H-GenEO approach with another spectral coarse space applicable to the Helmholtz problem, namely the DtN method.
We investigate the use of appropriate thresholding for the required generalised eigenproblems in both the DtN and H-GenEO coarse spaces.
We consider robustness to non-uniform decomposition, heterogeneity, and increasing wave number as well as the scalability of the methods. We find that only the H-GenEO approach is scalable and robust to all of these factors for a 2D model problem.
We provide both weak and strong scalability tests for H-GenEO applied to high wave number problems.
The remainder of this work is structured as follows. We begin by considering a finite element discretisation of the Helmholtz problem in Section 2.1, before outlining the underlying domain decomposition methodology we use in Section 2.2. The main topic of interest, that of suitable spectral coarse spaces for the heterogeneous Helmholtz problem, is then detailed in Section 2.3. Extensive numerical results on a 2D model problem are provided in Section 3, along with a discussion of our findings. Finally, we draw our conclusions in Section 4.
3. Results and Discussion
In this section, we present and discuss the numerical results computed using FreeFEM [52], in particular through the functionality of ffddm, which handles the underlying domain decomposition data structures. As a model problem, we consider the case of a wave guide in 2D, defined on the unit square $\Omega = (0,1)^2$. We impose homogeneous Dirichlet conditions, namely (2b), on two opposite sides and Robin conditions, namely (2c), on the two remaining sides. A point source located at the centre of the domain, $(0.5, 0.5)$, provides the forcing function f. A schematic of this model problem is found in Figure 2.
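For concreteness, the boundary value problem being solved then takes the following standard form; this is our transcription of (1) together with (2b) and (2c), where the labelling of the sides and the sign convention in the Robin condition are assumptions, with n the outward unit normal:

```latex
\begin{aligned}
  -\Delta u - k^2 u &= f && \text{in } \Omega = (0,1)^2,\\
  u &= 0 && \text{on two opposite sides (cf. (2b))},\\
  \frac{\partial u}{\partial n} - \mathrm{i}\,k\,u &= 0 && \text{on the two remaining sides (cf. (2c))}.
\end{aligned}
```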
To discretise the problem, we triangulate $\Omega$ using a Cartesian grid with spacing h and alternating diagonals to form a simplicial mesh (see Figure 2). The discrete problem (5) is then built using a P1 finite element approximation on this mesh. In order to avoid the pollution effect, we choose k and h simultaneously so that $k^3 h^2$ is fixed. The large sparse linear system (5) is solved using right-preconditioned GMRES, with the preconditioner given by the two-level ORAS method (9) and the choice of coarse space as stated. We terminate the GMRES iteration once the prescribed relative residual tolerance is reached. Unless otherwise stated, within the domain decomposition preconditioner we use minimal overlap: that is, one layer of adjoining mesh elements is added to the non-overlapping subdomains via the extension in (6). To solve the local eigenvalue problems in the two-level methods, we make use of ARPACK [53], while both the subdomain solves and coarse space operator solves are performed by MUMPS [54].
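To illustrate the structure of this solver, the following is a minimal serial sketch of a two-level preconditioner of ORAS type, applied within GMRES. It assumes local data (overlapping index sets, partition-of-unity weights, and factorised local matrices with impedance interface conditions) and a coarse basis Z are already built; all names are ours, the coarse correction is shown in simple additive form (the precise form (9) may differ), and in practice ffddm performs these steps in parallel.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def two_level_oras(A, local_data, Z):
    """Apply M^{-1} r = sum_j R_j^T D_j A_j^{-1} R_j r + Z (Z^H A Z)^{-1} Z^H r,
    i.e., one-level ORAS plus a simple additive coarse correction.

    local_data: list of (dofs_j, D_j, lu_j) with dofs_j the overlapping
    subdomain indices, D_j the partition-of-unity weights, and lu_j a
    factorisation of the local matrix with impedance (Robin) conditions
    on the subdomain interfaces.  Z: columns span the coarse space."""
    E = sp.csc_matrix(Z.conj().T @ (A @ Z))   # small coarse operator Z^H A Z
    E_lu = spla.splu(E)

    def apply(r):
        z = np.zeros(A.shape[0], dtype=complex)
        for dofs, D, lu in local_data:        # local solves (parallel in practice)
            z[dofs] += D * lu.solve(r[dofs])
        z += Z @ E_lu.solve(Z.conj().T @ r)   # coarse correction
        return z

    return spla.LinearOperator(A.shape, matvec=apply, dtype=complex)

# usage (note SciPy's gmres preconditions from the left, whereas the paper
# uses right preconditioning, as provided by ffddm/PETSc):
# u, info = spla.gmres(A, b, M=two_level_oras(A, local_data, Z), rtol=1e-6)
```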
We first compare coarse spaces in the simplest case of a homogeneous problem, investigating the choice of eigenvalue threshold used. We then show results on how the methods perform in a variety of settings; for instance, with non-uniform subdomains, heterogeneity, and additional overlap. Finally, both weak and strong scalability tests are performed for H-GenEO applied to high wave number problems with timings reported.
3.1. A Comparison of Methods for the Homogeneous Problem with Uniform Partitioning
In Table 1, we give some benchmark results for the simplest problem of a homogeneous wave guide using a uniform decomposition into 25 square subdomains. We see that the one-level ORAS method (8) performs relatively poorly as the wave number k increases. The standard DtN coarse space (14) is able to reduce iteration counts with a relatively small coarse space size; however, there is still a clear increase in iterations as k increases. The Δ-GenEO method (18) performs poorly here, often worse than the one-level method despite a larger coarse space than the DtN approach; this may be because the impedance conditions from the wave guide problem are not included in the definition of the Δ-GenEO coarse space, and so the eigenfunctions are not appropriate here. Finally, the standard H-GenEO method (19) performs well: it significantly reduces the iteration counts, by a factor of 10 for the largest wave number, and provides robustness to increasing wave number k (in fact, iteration counts tend to decrease with k). We note that the size of the H-GenEO coarse space is larger than that of the DtN coarse space, and we now explore this further.
In Table 2, we provide results for both the DtN and H-GenEO methods with differing eigenvalue thresholds. For DtN, we use the standard threshold of k, the increased threshold suggested in [37], and a larger threshold still. For H-GenEO, we use the standard threshold as well as two weaker thresholds, the weaker of which gives coarse space sizes more comparable to the standard DtN approach. To differentiate between these methods, we append the threshold used in parentheses, as in DtN(k) for the standard DtN threshold. We notice in Table 2 that increasing the DtN threshold to the value suggested in [37] significantly improves the iteration counts for this problem, so that they are almost independent of k, albeit very slightly growing. Increasing the threshold further only marginally improves the iteration counts, which still grow slightly with k, but at the expense of a coarse space almost twice the size. On the other hand, if we relax the H-GenEO threshold, we start to see higher iteration counts and lose some robustness, but generally iteration counts do not increase with k as they do for the standard DtN method. We note that, for this homogeneous problem, roughly comparable coarse space sizes give approximately similar iteration counts, and so it is primarily the thresholds used in DtN and H-GenEO that dictate the different growth behaviour we observe.
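The thresholding itself amounts to solving each local generalised eigenproblem and retaining only eigenvectors whose eigenvalues pass the chosen criterion. Below is a minimal sketch: the precise eigenproblems are those of Section 2.3 (cf. (14), (18), (19)), while A_loc, B_loc, accept, and the illustrative threshold values are our own placeholders, not those of the paper.

```python
import scipy.sparse.linalg as spla

def coarse_vectors(A_loc, B_loc, accept, nev=30):
    """Solve the local generalised eigenproblem A_loc v = lambda B_loc v
    (shift-invert about zero, so the nev eigenvalues closest to zero are
    computed, cf. ARPACK) and keep the eigenvectors passing the threshold."""
    vals, vecs = spla.eigs(A_loc, k=nev, M=B_loc, sigma=0.0, which="LM")
    keep = [j for j, lam in enumerate(vals) if accept(lam)]
    return vecs[:, keep]   # columns are later weighted and extended by zero to build Z

# illustrative threshold predicates (placeholder values):
# DtN-style:   accept = lambda lam: lam.real < eta    # eta a function of k
# GenEO-style: accept = lambda lam: abs(lam) < tau    # tau the GenEO threshold
```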
To explore the growth in the size of the coarse space further, in Figure 3a we plot coarse space size against the wave number k. From this, we can see that, for our model problem, the coarse space sizes of DtN with the increased threshold and H-GenEO with the standard threshold each grow approximately in proportion to a power of k. When the thresholds are relaxed, the precise relationship becomes less clear, but we note that the coarse space size grows more slowly with a weaker threshold, especially for the DtN approach. The faster growth seen with the stronger thresholds may help to accommodate the stronger robustness to k observed in the iteration counts of Table 2. These two approaches appear to provide the best trade-off for obtaining a well-behaved method, and so we focus primarily on them, but first we consider the question of scalability.
3.2. Scalability of DtN and H-GenEO for the Homogeneous Problem with Uniform Partitioning
We now investigate the scalability of the DtN and H-GenEO methods. This will depend on the threshold used, and so we compare results for DtN(k) paired with H-GenEO using a weakened threshold, as well as DtN with the increased threshold paired with H-GenEO using its standard threshold, each pair of approaches giving broadly similar iteration counts. Results for the homogeneous problem with a fixed wave number are given in Table 3 for an increasing number of uniform square subdomains N. We see that the DtN(k) approach does not exhibit scalability here, with iteration counts that noticeably increase with N. Similarly, with the weaker threshold, H-GenEO also fails to be scalable. On the other hand, both DtN with the increased threshold and H-GenEO with the standard threshold are scalable here, with low iteration counts that vary little with N.
Comparing the sizes of the coarse spaces employed, we see that the DtN coarse space with the increased threshold grows faster with N and becomes larger than the corresponding H-GenEO coarse space, which may account for its particularly strong robustness to N here. From Figure 3b, we see that for both DtN approaches the coarse space size grows approximately in proportion to a power of N, while for the H-GenEO method the growth is slower. This suggests that H-GenEO may be advantageous when N becomes large due to the smaller coarse space required. For each method, the average number of eigenvectors taken per subdomain decreases as N increases; this means that, as well as smaller subdomain solves, we also benefit from requiring fewer eigenvectors to be computed per subdomain, even if the global coarse space size increases. We note that the size of the DtN(k) or weakened-threshold H-GenEO coarse space sometimes shrinks as we increase N, and in these cases the iteration counts tend to be particularly poor, suggesting that these weaker thresholds do not do a suitable job in capturing the eigenfunctions required for scalability. As such, we now narrow our focus to the DtN and H-GenEO approaches with their stronger thresholds; hereafter, we drop the threshold from the notation and refer to the methods simply as DtN and H-GenEO.
3.3. Robustness of DtN and H-GenEO for the Homogeneous Problem with METIS Decomposition
We now consider utilising non-uniform subdomains, as provided through the software METIS [55]. In Table 4, we compare DtN and H-GenEO in this situation, again for the homogeneous problem. We observe that both methods retain their robustness, meaning that iteration counts depend only mildly on the wave number k and the number of subdomains N. For H-GenEO, the scalability becomes more favourable for larger k, and we can see that for the smallest wave numbers with large N, the average number of eigenvectors per subdomain becomes very small; in fact, on many subdomains, only a single eigenvector is taken, and so the achieved tolerance on the eigenvalue may be somewhat weaker than the threshold, which may explain the slightly poorer performance. One way this could be overcome is by always taking a minimum number of eigenvectors per subdomain; for instance, using at least 5 eigenvectors per subdomain for the smallest wave number gives iteration counts bounded by 19. Since we are primarily interested in approaches that remain effective for increasingly large wave numbers, where this issue diminishes, we do not concern ourselves further with this and assert that, for problems of interest, H-GenEO provides good scalability. For DtN, on the other hand, a mild increase in iteration count is seen as k increases here, with scalability observed for all wave numbers.
We note that the coarse space sizes for each method tend to be slightly larger with the more general decompositions used by METIS, but otherwise the same trends are seen. As such, we conclude that non-uniform decompositions can be well handled by the spectral coarse spaces employed here.
3.4. The Effect of Heterogeneity
We now turn our attention to the key property of robustness to heterogeneities. For this, we consider layered media within the wave guide. Three configurations, each having 10 layers, are used and are detailed in Figure 4. The heterogeneity is introduced in the wave speed c, which in each case takes values from 1 up to a maximum set by a contrast parameter determining the strength of the heterogeneity. The wave number is then given by $k = \omega / c$, where $\omega$ is the angular frequency; we vary both the frequency and the contrast in our tests. Note that for the DtN method, the eigenvalue threshold now depends on the wave number, which may be different in different subdomains; to avoid notational clutter, we continue to refer to the method simply as DtN.
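To make the setup concrete, the following sketch defines a layered wave speed of this kind; it is our own illustrative coding, assuming horizontal layers with linearly graded values, whereas the exact layer values and geometries (increasing, alternating, diagonal) are those of Figure 4.

```python
import numpy as np

def wave_speed(y, contrast, n_layers=10):
    """Piecewise-constant wave speed on the unit square: n_layers horizontal
    layers whose values grade from 1 up to the contrast parameter."""
    layer = np.minimum((np.asarray(y) * n_layers).astype(int), n_layers - 1)
    return 1.0 + layer * (contrast - 1.0) / (n_layers - 1)

# heterogeneous wave number at angular frequency omega: k(x, y) = omega / wave_speed(y, contrast)
```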
Results for DtN and H-GenEO for the increasing layers problem (Figure 4a) are provided in Table 5. Unfortunately, we see that DtN loses robustness with the heterogeneity present in this problem. In particular, we lose any robustness to the wave number k, and for the largest wave number used, we also see that changes in the contrast can begin to have a sizeable impact on the iteration counts, despite these otherwise being relatively stable to changes in the contrast. To a lesser extent, we also lose scalability with DtN, as the iteration counts now slowly increase with N. On the other hand, H-GenEO has strong robustness throughout: with respect to the wave number k, with respect to the contrast in the heterogeneity, and in scalability as N increases. As such, we see a clear preference for H-GenEO as a stable and reliable method for heterogeneous problems.
We also consider the case of a diagonal layers problem (Figure 4c) in Table 6. Here, the issues with DtN are reduced, but there is still some increase in iteration counts, especially for higher wave numbers k and a larger number of subdomains N. We note, in results not shown here, that DtN with the other thresholds suffers from the same lack of robustness. For H-GenEO, however, we retain good robustness to the parameters of the problem.
We further consider the heterogeneous problem when making use of METIS for more general non-uniform subdomain decompositions, in order to ensure that H-GenEO is able to handle both difficulties together. For this, we consider the alternating layers problem (Figure 4b) and provide results in Table 7. We find that H-GenEO performs very well and continues to provide a rather robust method, even in the presence of heterogeneity on non-uniform subdomains. This further evidences the strength of the H-GenEO approach, and we study it more closely below. First, however, we provide a brief examination of the use of higher order discretisation.
3.5. Higher Order Finite Elements
We briefly investigate the use of higher order finite elements, in particular P2 elements as opposed to P1 elements. To give a direct comparison, we utilise the same meshes, and in Table 8 we give results for the heterogeneous diagonal layers problem; equivalent results for P1 elements are given in Table 6. We observe that both iteration counts and coarse space sizes remain rather similar to the case of P1 elements. This suggests that, by itself, the order of the underlying finite elements does not strongly affect performance. However, the typical meshes employed for higher order elements may be coarser, and further studies would be required to observe how this affects the utility of the spectral coarse spaces presented here. Further investigation into which method and order of finite element approximation provides the most efficient choice is beyond the scope of the present study.
3.6. The Effect of Boundary Conditions within the H-GenEO Eigenproblem
We now consider the choice of boundary conditions within H-GenEO, in light of the fact that, for wave propagation problems, impedance conditions can often prove more practical within overlapping Schwarz methods. To this end, we consider the H-GenEO eigenproblem where the Neumann boundary condition is replaced by an impedance condition (i.e., the Robin condition in (7b)). Results for this impedance–H-GenEO method are given in Table 9 for the homogeneous problem with uniform square subdomains. We see that, while the use of this eigenproblem retains the good behaviour of H-GenEO as k increases, it lacks scalability as we increase the number of subdomains N. We note that the size of the coarse space is very similar to that when the standard Neumann condition is used (a direct comparison can be made with the results in Table 3), and so this is not simply an artefact of a smaller coarse space. This shows that the Neumann condition within the eigenproblem is an important aspect of H-GenEO.
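In matrix terms, this variant only changes the left-hand side of the local eigenproblem by an impedance boundary term. A minimal sketch follows, assuming A_neu is the local Neumann Helmholtz matrix, M_gamma a boundary mass matrix on the local interface, and k_loc the local wave number (all names and the sign convention are our assumptions, modelled on (7b)):

```python
def impedance_lhs(A_neu, M_gamma, k_loc):
    """Left-hand side for the impedance variant of the H-GenEO eigenproblem:
    the Neumann matrix plus the complex impedance boundary term; the
    right-hand side of the eigenproblem is unchanged."""
    return A_neu - 1j * k_loc * M_gamma
```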
3.7. The Effect of More Overlap When Using H-GenEO
So far, all our results use minimal overlap. Here, we consider increasing the overlap between subdomains; this is done by adding layers of adjoining elements to each subdomain in a symmetric way, so that minimal overlap corresponds to an overlap parameter of 2, that is, an overlapping region of width 2 elements. In Table 10, we report results for increasing overlap when using H-GenEO for the homogeneous problem on a uniform decomposition. We see that adding a small amount of overlap can slightly decrease the iteration counts, but increasing the overlap width further can give much poorer results, especially when using a large number of subdomains N. One possible explanation could be an increase in the "colouring constant" (see, e.g., [45] (Definition 5.5)). We note that the size of the coarse space decreases somewhat as the overlap is increased; however, the extra computational effort required to deal with the larger subdomains, along with the increased iteration counts, will hamper any gains from this. From these results, we determine that the H-GenEO coarse space is best suited to the case of minimal overlap, as we have used elsewhere throughout this work.
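For completeness, the layer-growing procedure described above can be sketched through the mesh connectivity: each pass adds every element sharing a vertex with the current subdomain. This is our own serial illustration (ffddm performs the extension internally); `elems` holds the vertex indices of each triangle and `subdomain` the initial element set.

```python
def add_overlap(elems, subdomain, layers=1):
    """Grow a set of mesh elements by the given number of overlap layers,
    adding one ring of vertex-adjacent elements per pass."""
    sub = set(subdomain)
    for _ in range(layers):
        verts = {v for e in sub for v in elems[e]}   # vertices of current subdomain
        sub |= {e for e in range(len(elems))
                if any(v in verts for v in elems[e])}  # elements touching them
    return sorted(sub)
```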
3.8. Weak Scalability and Timing Results for H-GenEO
In this section, we consider the weak scalability of the H-GenEO method. To approach this, we consider a growing wave guide domain to which fixed-size subdomains are added, each with the same number of degrees of freedom (dofs). This is done by repeatedly adding a unit square, split into 25 non-overlapping square subdomains, to the right of the existing domain, giving $\Omega = (0, L) \times (0, 1)$ after L repetitions. Along the long edges of the global domain, we prescribe homogeneous Dirichlet boundary conditions, while Robin conditions are used at each end of the wave guide; a schematic for this weak scaling test is given in Figure 5. Heterogeneity is given by the alternating layers problem (see Figure 4b), with the 10 layers extending across the length of the wave guide.
To deal with the large problem sizes and provide appropriate timing results, we assign one core per subdomain and solve using the ARCHIE-WeSt supercomputing facility on up to 400 cores (the machine uses Intel Xeon Gold 6138 processors running at 2.0 GHz).
In Table 11, we give results for lengths of domain L up to 16, that is, from 25 to 400 subdomains, with the wave number, heterogeneity contrast, and mesh resolution fixed on each unit square. We observe that the iteration counts remain almost constant, increasing only very mildly with N. On the other hand, the coarse space size grows linearly with N, as expected given that subdomains have a fixed size, and we see that the time taken to solve the eigenproblems stays constant. However, due to the increasing size of the coarse space, the run times slowly increase, meaning that the efficiency degrades as we solve increasingly large problems. Further, the setup of the decomposition and partitioning is performed sequentially, and so this setup time increases with N. Nonetheless, even when removing this initial setup, the efficiency still slowly reduces as we increase the problem size, as can be seen in the final row of Table 11. To overcome this, the factorisation of the coarse space operator and the associated solves must be made scalable. While a multi-level approach is therefore an attractive option, it is not yet clear how to formulate such a strategy for Helmholtz problems. Additionally, the coarse problem is assigned to only one process here, and so spreading it over more processes as N increases may stop the coarse solve from becoming a bottleneck.
3.9. High Wave Number Strong Scalability and Timing Results for H-GenEO
To conclude our numerical results, we consider the use of H-GenEO within a high wave number example and explore timings and the overall strong scalability of the approach. To this end, we consider the original homogeneous wave guide, as outlined in Figure 2, with a large wave number and a correspondingly fine mesh width, giving a large total problem size. To solve, we use METIS to give a non-uniform decomposition into subdomains and again utilise the ARCHIE-WeSt supercomputing facility to assign one core per subdomain.
Results are tabulated in Table 12, in which we detail the run time (in seconds) and the percentage of time spent in the eigensolver and the coarse factorisation. The run times are further used to determine the parallel efficiency, based on the smallest run. Timing results are also displayed graphically in Figure 6, where we show that the bulk of the computation is spent in the eigensolves and the factorisation of the coarse problem; as such, while the METIS decomposition is sequential, we do not separate out the setup time here, given that it is comparatively small. From Table 12, we see that the iteration counts show good scalability, increasing only very mildly as we increase the number of subdomains fivefold. This is also seen in the run times, which decrease as we use more cores, and hence more subdomains, to solve the problem. In particular, we see that the parallel efficiency remains over 100%, showing the strong scalability of the approach; this is in part because, as N increases, we have to solve smaller eigenproblems, and so the percentage of time spent in the eigensolver drops significantly as we increase N. Overall, these results provide promising evidence that the H-GenEO method can be effective for the solution of high wave number problems in 2D.
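For reference, the efficiencies reported in these two sections are computed in the usual way; the formulas below are the standard definitions (names ours), with T the run time, N the core count, and the subscript ref denoting the reference (smallest) run.

```python
def strong_efficiency(T_ref, N_ref, T, N):
    """Strong scaling (fixed global problem): E = (T_ref * N_ref) / (T * N)."""
    return (T_ref * N_ref) / (T * N)

def weak_efficiency(T_ref, T):
    """Weak scaling (fixed work per core): E = T_ref / T."""
    return T_ref / T
```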
4. Conclusions
In this work, we have developed and explored a GenEO-type coarse space for additive Schwarz methods that is appropriate for the heterogeneous Helmholtz problem. We have conducted extensive numerical tests to show how this approach behaves on a 2D model test problem of a wave guide discretised using finite elements on a pollution-free mesh, comparing our method with the DtN coarse space. We find that only our H-GenEO approach is robust to heterogeneity and increasing wave number k and further provides good scaling such that iteration counts of right-preconditioned GMRES are only mildly dependent on the number of subdomains. This dependence is strongest for lower wave numbers on many subdomains and reduces as k grows. Furthermore, convergence does not deteriorate with increasing wave number, albeit at the cost of a coarse space that grows as k increases. This is achieved consistently for non-uniform partitioning into subdomains and in the presence of strong heterogeneity. These results show promise that H-GenEO can be used as an effective coarse space for challenging heterogeneous problems.
Finally, we discuss how these findings differ from those in the companion paper [40], where none of the approaches (including the spectral coarse spaces explored in detail here) are seen to be clearly favourable over a wide range of problem settings. A key difference is that well-resolved, pollution-free meshes are used in the present work, while the more realistic benchmark problems examined in [40] use under-resolved meshes with a fixed number of points per wavelength, as is typical in engineering practice. Aside from the more complex test cases considered, another contributing factor to the difference between the studies is that in [40] a fixed number of eigenvectors was taken per subdomain (limiting the size of the coarse space), whereas here eigenvalue thresholding was employed to provide a more robust approach, at the cost of a potentially larger coarse space. As such, the present study can be thought of as investigating the extent of what can be achieved in terms of a robust method in the ideal case, while [40] presents a viewpoint within the regime of more challenging practical problems. Identifying coarse spaces that can bridge this gap and provide a consistently robust approach, even for the most demanding real-world applications, remains an open area of research.