Article

A Flexible Extended Krylov Subspace Method for Approximating Markov Functions of Matrices

School of Mathematical and Statistical Sciences, Clemson University, O-110 Martin Hall, Box 340975, Clemson, SC 29634, USA
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2023, 11(20), 4341; https://doi.org/10.3390/math11204341
Submission received: 21 September 2023 / Revised: 15 October 2023 / Accepted: 17 October 2023 / Published: 19 October 2023
(This article belongs to the Special Issue Advances in Numerical Linear Algebra and Its Applications)

Abstract

A flexible extended Krylov subspace method (F-EKSM) is considered for the numerical approximation of the action of a matrix function $f(A)$ on a vector $b$, where the function $f$ is of Markov type. F-EKSM has the same framework as the extended Krylov subspace method (EKSM), replacing the zero pole in EKSM with a properly chosen fixed nonzero pole $s$. For symmetric positive definite matrices, the optimal fixed pole is derived for F-EKSM to achieve the lowest possible upper bound on the asymptotic convergence factor, which is lower than that of EKSM. The analysis is based on properties of Faber polynomials of $A$ and $(I-A/s)^{-1}$. For large and sparse matrices that can be handled efficiently by LU factorizations, numerical experiments show that F-EKSM and a variant of RKSM based on a small number of fixed poles outperform EKSM in both storage and runtime, and usually have advantages over adaptive RKSM in runtime.

1. Introduction

Consider a large square matrix $A\in\mathbb{R}^{n\times n}$ and a function $f$ such that the matrix function $f(A)\in\mathbb{R}^{n\times n}$ is well-defined [1,2]. The numerical approximation of
$$q = f(A)b,$$
where $b\in\mathbb{R}^{n}$ is a vector, is a common problem in scientific computing. It arises in numerical solutions of differential equations [3,4,5,6], matrix functional integrators [7,8], model order reduction [9,10], and optimization problems [11,12]. Note that approximating the action of $f(A)$ on a vector $b$ and approximating the matrix $f(A)$ are different problems: for a large sparse matrix $A$, $f(A)$ is usually fully dense and infeasible to form explicitly.
Numerical methods for approximating the action of $f(A)$ on a vector have been extensively studied in recent years, especially for large-scale problems; see, e.g., [13,14,15] and references therein. Existing algorithms often construct certain polynomial or rational approximations to $f$ over the spectrum of $A$ and apply such approximations directly to the vector $b$ without forming any dense matrices of the same size as $A$. A common class of projection methods is based on Krylov subspaces $K_m(A,b)$; however, for many large matrices this may require approximation spaces of very large dimension. Rational Krylov subspace methods have been investigated to decrease the size of the approximation subspaces; see, e.g., [16,17,18,19,20]. Two well-known examples are the extended Krylov subspace method (EKSM) [14,21,22] and the adaptive rational Krylov subspace method (adaptive RKSM) [23,24].
In this paper, we explore a generalization of EKSM that uses one fixed nonzero pole alternately with the infinite pole for approximating the action of Markov-type (Cauchy–Stieltjes) functions [25]. Markov-type functions can be written as
$$f(z)=\int_{-\infty}^{0}\frac{d\mu(\zeta)}{z-\zeta},\qquad z\in\mathbb{C}\setminus(-\infty,0],$$
where $\mu$ is a measure that ensures that the integral converges absolutely. Note that this definition can be generalized to integrals defined on $[\alpha,\beta]$, where $\alpha<\beta<\infty$. We consider the interval $(-\infty,0]$ here, as this is sufficient for the most widely studied Markov-type functions $f(z)=z^{\gamma}$ with $-1<\gamma<0$ and $f(z)=\frac{e^{-\theta\sqrt{z}}-1}{z}$, and for their simple modifications, such as $z^{\ell}f(z)$ with $\ell\in\mathbb{Z}^{+}$. Our analysis can be extended to integrals on $(-\infty,\beta]$ as long as the measure $\mu(\zeta)$ satisfies $\int_{-\infty}^{\beta}|d\mu(\zeta)|<\infty$, as assumed in [21].
This study is motivated by [21], which provided an upper bound on the convergence factor of EKSM for approximating $f(A)b$. Our work concerns a generalization of EKSM that replaces the zero pole used in EKSM with a fixed nonzero pole $s$; hence, we call it the flexible extended Krylov subspace method (F-EKSM). Like EKSM, this algorithm can apply a three-term recurrence to enlarge the subspace for symmetric matrices, whereas a full orthogonalization process may be necessary for adaptive RKSM regardless of the symmetry of the matrix. Beckermann and Reichel [26] studied the asymptotic convergence rate of RKSM with different pole selections for approximating $f(A)b$ for Markov functions via the Faber transform; however, they did not provide explicit expressions for the optimal pair of cyclic poles or the corresponding rate of convergence, which could be obtained by solving a quartic equation analytically. In this paper, we derive explicit expressions for the optimal pole $s$ and the corresponding convergence factor using an analytic tool different from that of [26], based on [21]. While our bounds on the convergence factor do not appear to be numerically as tight as the bounds in [26], our poles and bounds are given by explicit expressions; in addition, our pole usually leads to faster convergence for discrete Laplacian matrices based on finite difference discretizations and for many practical nonsymmetric matrices.
We explore the optimal pole s to achieve the lowest upper bound on the asymptotic convergence factor, which is guaranteed to outperform EKSM on symmetric positive definite (SPD) matrices. For nonsymmetric matrices with an elliptic numerical range, we provide numerical evidence to demonstrate the advantage of F -EKSM over EKSM in convergence rate. Numerical experiments show that F -EKSM converges at least as rapidly as the upper bound suggests. In practice, if the linear systems needed for constructing rational Krylov subspaces can be solved efficiently by LU factorizations, then F -EKSM outperforms EKSM in both time and storage cost over a wide range of matrices, and it could run considerably faster than adaptive RKSM for many problems. In this paper, we only consider factorization-based direct methods for solving the inner linear systems; for the use and implications of iterative linear solvers see, e.g., [27].
Rational Krylov subspace methods may exhibit superlinear convergence for approximating f ( A ) b . As the algorithms proceed and more rational Ritz values converge to the exterior eigenvalues of A, the ‘effective spectrum’ of A not covered by the converged Ritz values shrinks, leading to gradual acceleration of the convergence. This analysis has been performed for EKSM applied to the 1D discrete Laplacian ([28], Section 5.2) based on the result from [21], leading to a sharper explicit bound on the convergence. The same idea could be explored with F -EKSM; however, it is not considered here, as we did not observe superlinear convergence in our experiments. This was probably because the effective spectrum of our large test matrices did not shrink quickly enough to exhibit convergence speedup before the stopping criterion was satisfied.
Though F-EKSM is closely connected to EKSM, we emphasize that the convergence of F-EKSM cannot be derived directly from that of EKSM applied to a shifted matrix. Admittedly, with the same vector $b$, F-EKSM with a pole $s$ applied to $A$ and EKSM applied to $I-A/s$ both generate the same subspaces $Q_{2m+1}^{(s)}(A,b)=(I-A/s)^{-m}\,\mathrm{span}\{b,Ab,\ldots,A^{2m}b\}$; however, the existing theory of EKSM [21] can only provide a bound on the convergence factor for approximating $f(I-A/s)b$, which is not what is needed and has no obvious relationship with the convergence for approximating $f(A)b$ from the same subspace. Our analysis is based on a special min–min optimization instead of the results of EKSM applied to a shifted matrix.
The remainder of this paper is organized as follows. In Section 2, we review rational Krylov subspace methods and introduce F-EKSM. In Section 3, we discuss the implementation of F-EKSM. In Section 4, we analyze the linear convergence factor of F-EKSM and provide the optimal pole with which the lowest upper bound on the convergence factor can be achieved for SPD matrices; in addition, we numerically explore the optimal pole and the convergence factor of F-EKSM for nonsymmetric matrices with an elliptic numerical range. In Section 5, we consider a variant of RKSM that applies a few fixed cyclic poles to provide faster approximations than F-EKSM for certain challenging nonsymmetric matrices. In Section 6, we show the results of numerical experiments for the different methods on a variety of matrices. Our conclusions are provided in Section 7, followed by the proofs of several lemmas in the appendices.

2. Rational Krylov Subspace Methods and F -EKSM

For a wide range of matrix function approximation problems, polynomial Krylov subspace methods converge very slowly [29,30]. To speed up convergence, a more efficient approach is to apply rational Krylov subspace methods; see, e.g., [31,32] and references therein.
The procedure of RKSM is outlined as follows. Starting with an initial nonzero vector $b$ that generates $Q_1(A,b)=\mathrm{span}\{v_1\}$, where $v_1=b/\|b\|$, RKSM keeps expanding the subspaces to search for increasingly accurate approximate solutions to the problem of interest. In order for RKSM to expand the current subspace $Q_m(A,b)$ to $Q_{m+1}(A,b)$, we apply the linear operator $(\gamma_m I-\eta_m A)^{-1}(\alpha_m I-\beta_m A)$ to a vector $u\in Q_m(A,b)\setminus Q_{m-1}(A,b)$. To build an orthonormal basis $\{v_1,v_2,\ldots,v_{m+1}\}$ of the enlarged subspace $Q_{m+1}(A,b)$, we may choose $u=v_m$ and adopt the modified Gram–Schmidt orthogonalization, obtaining
$$(\gamma_m I-\eta_m A)^{-1}(\alpha_m I-\beta_m A)\,v_m=\sum_{i=1}^{m+1}h_{im}v_i,$$
where $h_{im}=v_i^{*}(\gamma_m I-\eta_m A)^{-1}(\alpha_m I-\beta_m A)v_m$. To ensure that the linear operator is well-defined and nonsingular, we require that $|\gamma_m|^2+|\eta_m|^2\neq 0$, $|\alpha_m|^2+|\beta_m|^2\neq 0$, $(\gamma_m,\eta_m)\neq(\alpha_m,\beta_m)$ up to a scaling factor, and that $\alpha_m/\beta_m$ and $\gamma_m/\eta_m$ (if $\beta_m$ and $\eta_m$ are nonzero) not be eigenvalues of $A$. The use of four parameters $(\alpha_m,\beta_m,\gamma_m,\eta_m)$ provides the flexibility to accommodate both the zero ($\gamma_m=0$, $\eta_m=1$) and the infinite ($\gamma_m=1$, $\eta_m=0$) poles in a unified framework. The expansion of rational Krylov subspaces does not have to be based on the last orthonormal basis vector $u=v_m$, as in (2); there are alternative ways to choose the continuation vector to expand the subspaces; see, e.g., [33].
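As a concrete illustration of (2), the following MATLAB fragment performs one RKSM expansion step with modified Gram–Schmidt. It is illustrative only (not code from the paper) and assumes that the sparse matrix A, its size n, the current orthonormal basis V with m columns, and the scalars alpha_m, beta_m, gamma_m, eta_m are already defined.

```matlab
% One RKSM expansion step (2): apply the shift-inverse operator to v_m and
% orthogonalize against the current basis with modified Gram-Schmidt.
w = (gamma_m*speye(n) - eta_m*A) \ ((alpha_m*speye(n) - beta_m*A) * V(:,end));
h = zeros(m+1, 1);
for i = 1:m
    h(i) = V(:,i)' * w;          % h_{im} in (2)
    w = w - h(i) * V(:,i);
end
h(m+1) = norm(w);                % h_{(m+1)m}
V = [V, w / h(m+1)];             % new basis vector v_{m+1} of Q_{m+1}(A,b)
```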
The shift-inverse matrix–vector product $(\gamma_m I-\eta_m A)^{-1}(\alpha_m I-\beta_m A)v_m$ is equivalent to solving $(\gamma_m I-\eta_m A)w=(\alpha_m I-\beta_m A)v_m$ (the inner linear system) for $w$. Multiplying both sides of (2) by $\gamma_m I-\eta_m A$ and moving all terms containing $A$ to the left-hand side and all other terms to the right-hand side, we have
$$A\left(\sum_{i=1}^{m+1}\eta_m h_{im}v_i-\beta_m v_m\right)=-\alpha_m v_m+\gamma_m\sum_{i=1}^{m+1}h_{im}v_i.$$
Note that the above relation should hold for each index value $m=1,2,\ldots$; thus, it is not hard to see that (3) can be written in the following matrix form:
$$A V_{m+1}\underline{F}_m \equiv A V_{m+1}\left(\underline{H}_m\,\mathrm{diag}(\eta_1,\ldots,\eta_m)-\begin{bmatrix}\mathrm{diag}(\beta_1,\ldots,\beta_m)\\ 0_{1\times m}\end{bmatrix}\right)=V_{m+1}\left(\underline{H}_m\,\mathrm{diag}(\gamma_1,\ldots,\gamma_m)-\begin{bmatrix}\mathrm{diag}(\alpha_1,\ldots,\alpha_m)\\ 0_{1\times m}\end{bmatrix}\right)\equiv V_{m+1}\underline{G}_m,$$
where $V_{m+1}=[v_1,v_2,\ldots,v_{m+1}]$ contains the orthonormal basis vectors of the rational Krylov subspace
$$Q_{m+1}(A,v_1)=q_m(A)^{-1}K_{m+1}(A,v_1)=\prod_{k=1}^{m}(\gamma_k I-\eta_k A)^{-1}\,\mathrm{span}\{v_1,Av_1,A^2v_1,\ldots,A^m v_1\},$$
and $\underline{H}_m$, $\underline{F}_m$, and $\underline{G}_m\in\mathbb{R}^{(m+1)\times m}$ are all upper Hessenberg matrices. Specifically,
$$\underline{H}_m=\begin{bmatrix}H_m\\ h_{(m+1)m}e_m^{*}\end{bmatrix},\quad \underline{F}_m=\begin{bmatrix}F_m\\ f_{(m+1)m}e_m^{*}\end{bmatrix}=\begin{bmatrix}H_m\,\mathrm{diag}(\eta_1,\ldots,\eta_m)-\mathrm{diag}(\beta_1,\ldots,\beta_m)\\ h_{(m+1)m}\eta_m e_m^{*}\end{bmatrix},\quad \underline{G}_m=\begin{bmatrix}G_m\\ g_{(m+1)m}e_m^{*}\end{bmatrix}=\begin{bmatrix}H_m\,\mathrm{diag}(\gamma_1,\ldots,\gamma_m)-\mathrm{diag}(\alpha_1,\ldots,\alpha_m)\\ h_{(m+1)m}\gamma_m e_m^{*}\end{bmatrix},$$
where $e_m=[0,\ldots,0,1]^{*}\in\mathbb{R}^m$.
The idea of RKSM as a projection method is the same as for standard Krylov methods: first solve the Galerkin projected problem defined by $V_m^{*}AV_m=F_mG_m^{-1}$, then map the solution back from the $m$-dimensional subspace to obtain an approximate solution of the original problem involving $A$.
The parameters $\alpha_m,\beta_m,\gamma_m,\eta_m$ can be changed in each iteration to determine the poles. For example, if we set $\alpha_k=\eta_k=0$, $\gamma_k=1$, and $\beta_k=1$ for all $k\in\mathbb{Z}^{+}$, we recover the standard Krylov subspace method; if we set $\beta_{2k}=\eta_{2k-1}=1$, $\alpha_{2k-1}=\gamma_{2k}=1$, and $\alpha_{2k}=\beta_{2k-1}=\gamma_{2k-1}=\eta_{2k}=0$ for $k\in\mathbb{Z}^{+}$, it becomes EKSM. A special variant of EKSM [34,35] constructs the following extended subspaces:
$$K_{l,m}(A,b)=\mathrm{span}\{A^{-l+1}b,\ldots,A^{-1}b,\,b,\,Ab,\ldots,A^{m-1}b\}.$$
A practical choice of the two indices $l$ and $m$ leads to subspaces of the form $K_{m,\,im+1}(A,b)$ for some $i\in\mathbb{N}^{+}$, which requires an orthonormal basis for the Krylov subspaces with vectors
$$b,\ Ab,\ A^2b,\ \ldots,\ A^ib,\ A^{-1}b,\ A^{i+1}b,\ \ldots,\ A^{2i}b,\ A^{-2}b,\ A^{2i+1}b,\ \ldots.$$
However, there is no convergence theory for this special variant.
The general rational Krylov space of order $m$ is given by [36,37]:
$$Q_m(A,b)=q_{m-1}(A)^{-1}\,\mathrm{span}\{b,Ab,\ldots,A^{m-1}b\},\quad\text{where}\quad q_{m-1}(z)=(\gamma_1-\eta_1 z)(\gamma_2-\eta_2 z)\cdots(\gamma_{m-1}-\eta_{m-1}z),$$
with $\gamma_i,\eta_i$ prescribed in (2). For EKSM, the rational Krylov subspace of dimension $2m+1$ is
$$Q_{2m+1}^{(E)}(A,b)=K_{m+1}(A,b)+K_{m+1}(A^{-1},b)=A^{-m}\,\mathrm{span}\{b,Ab,\ldots,A^{2m}b\}.$$
EKSM applies the operators $A^{-1}$ and $A$ in an alternating manner in each iteration.
For adaptive RKSM, the operation at step $m$ can be written as follows:
$$(I-A/s_m)^{-1}(A-\sigma_m I)\,v_m=\sum_{i=1}^{m+1}h_{im}v_i,$$
where $s_m$ is a nonzero pole and $\sigma_m$ is a zero of the underlying rational function. To find the optimal poles and zeros at each step, we first restrict the poles and zeros to disjoint sets $\Xi$ and $\Sigma$, respectively, where $\Sigma$ contains $W(A)$ and $\Xi\subset\mathbb{C}\setminus W(A)$ [38], and where $W(A)=\{x^{*}Ax \mid x\in\mathbb{C}^n,\ \|x\|_2=1\}$ is the numerical range of $A$. The pair $(\Sigma,\Xi)$ is called a condenser [39,40]. An analysis of RKSM considers a sequence of rational nodal functions
$$r_m(z)=\prod_{j=1}^{m}\frac{z-\sigma_j}{1-z/s_j},$$
where the zeros $\sigma_j\in\Sigma$ and the poles $s_j\in\Xi$. Adaptive RKSM tries to obtain asymptotically optimal rational functions by defining $\sigma_{j+1}$ and $s_{j+1}$ recursively with the following conditions: after choosing $\sigma_1$ and $s_1$ of minimal distance, define [38]
$$\sigma_{j+1}=\arg\max_{z\in\Sigma}|r_j(z)|,\qquad s_{j+1}=\arg\min_{z\in\Xi}|r_j(z)|.$$
The points $\{(\sigma_j,s_j)\}$ are called generalized Leja points [41,42]. In practice, we compute approximations with respect to the poles and zeros defined in (6) as the iteration proceeds; a sketch of this greedy selection on discretized candidate sets is given below. Adaptive RKSM usually converges in fewer iterations than EKSM while using a smaller approximation subspace [24,38,43]. While it usually converges in fewer iterations than variants with a few cyclic poles [32], each step of adaptive RKSM requires the solution of a shifted linear system with a new shift, which is more expensive than reusing an existing LU factorization of a coefficient matrix that has already been factorized. If the linear system at each RKSM step is solved by a direct method, adaptive RKSM therefore tends to require longer runtimes than variants with a few cyclic poles that reuse one LU factorization per distinct pole. Adaptive RKSM is most competitive if the linear systems arising at each step must be solved approximately by an iterative method and effective preconditioners can be constructed for each shifted system.
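The following MATLAB fragment is a hedged sketch of one step of the greedy recursion (6) on discretized candidate sets; it is not the authors' implementation. SigmaPts and XiPts are assumed finite samplings of Sigma and Xi, and sigma and s are assumed to already contain the initial zero sigma_1 and pole s_1.

```matlab
% One step of the generalized Leja selection (6) on discretized Sigma and Xi.
rj   = @(z) prod((z - sigma) ./ (1 - z./s));         % nodal rational function r_j(z)
valS = arrayfun(@(z) abs(rj(z)), SigmaPts);
valX = arrayfun(@(z) abs(rj(z)), XiPts);
[~, iS] = max(valS);  sigma(end+1) = SigmaPts(iS);   % sigma_{j+1} = argmax |r_j| on Sigma
[~, iX] = min(valX);  s(end+1)     = XiPts(iX);      % s_{j+1}     = argmin |r_j| on Xi
```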
In this paper, we consider generating rational Krylov subspaces with the cyclic poles $s,\infty,s,\infty,\ldots$ ($s\neq 0$), which we call the flexible extended Krylov subspace method (F-EKSM). The corresponding linear operators are $(I-A/s)^{-1}$ and $A$, applied in an alternating manner. To this end, we set $\beta_{2k}=1$, $\eta_{2k-1}=1/s$, $\alpha_{2k-1}=\gamma_k=1$, and $\alpha_{2k}=\beta_{2k-1}=\eta_{2k}=0$ for $k\in\mathbb{Z}^{+}$. The approximation space of F-EKSM is
$$Q_{2m+1}^{(s)}(A,b)=(I-A/s)^{-m}\,\mathrm{span}\{b,Ab,\ldots,A^{2m}b\}.$$
The choice of the repeated pole $s$ influences the convergence rate of F-EKSM. Our goal is to find the optimal pole $s^{*}$ for Markov-type functions of matrices such that F-EKSM achieves the lowest upper bound on the linear convergence factor. This subspace is identical to the one generated by EKSM applied to the shifted matrix $I-A/s$; the convergence theory of EKSM [21] would provide a bound on the convergence factor for approximating $f(I-A/s)b$ instead of $f(A)b$. However, this is not our concern here, as our results are derived with a special min–min optimization analysis, not from the results of EKSM applied to a shifted matrix.

3. Implementation of F -EKSM for Approximating f ( A ) b

Without loss of generality, suppose that $\|b\|_2=1$ in (1) and let the initial subspace be $\mathrm{span}\{b,(I-A/s)^{-1}b\}$. The approximation to $f(A)b$ after $(m-1)$ steps is
$$q_m=V_{2m}f(A_m)V_{2m}^{*}b=V_{2m}f(A_m)e_1,$$
where $V_{2m}\in\mathbb{C}^{n\times 2m}$ is an orthonormal set of basis vectors of the subspace
$$Q_{2m+1}^{(s)}(A,b)=K_m(A,b)+K_m\big((I-A/s)^{-1},\,(I-A/s)^{-1}b\big)$$
and $A_m=V_{2m}^{*}AV_{2m}$ denotes the restriction of the matrix $A$ to $Q_{2m+1}^{(s)}(A,b)$.
Because $K_m(A,b)=K_m(I-A/s,\,b)$, we have $Q_{2m+1}^{(s)}(A,b)=Q_{2m+1}^{(E)}(I-A/s,\,b)$. To construct this subspace, we can apply EKSM to the matrix $I-A/s$, obtain $V_{2m}=[v_1,\ldots,v_{2m}]$ as the orthonormal set of basis vectors of $Q_{2m+1}^{(E)}(I-A/s,\,b)$, and obtain
$$T_{2m}=V_{2m}^{*}(I-A/s)V_{2m}\in\mathbb{R}^{2m\times 2m},$$
which is a block upper Hessenberg matrix (see Section 3 in [22]). From (7), we obtain $A_m=V_{2m}^{*}AV_{2m}=s(I_{2m}-T_{2m})$. Following the derivation of EKSM [22], we can also derive a similar Arnoldi relation $AV_{2m}=V_{2m+2}\underline{T}_{2m}$ directly for our proposed F-EKSM, where $\underline{T}_{2m}\in\mathbb{R}^{(2m+2)\times 2m}$ is a block upper Hessenberg matrix with $2\times 2$ blocks, in which case $A_m=V_{2m}^{*}AV_{2m}=T_{2m}$, the leading $2m\times 2m$ block of $\underline{T}_{2m}$. More implementation details of RKSM can be found in [33,44]. Note that, as with EKSM, the orthogonalization cost of F-EKSM for symmetric problems can be reduced by using the block three-term recurrence to enlarge the subspace.
The error norm of F-EKSM is $\|f(A)b-V_{2m}f(A_m)e_1\|$; however, it is not directly computable because $f(A)b$ is unknown. One stopping criterion for the Arnoldi approximation is to compute $\|h_{m+1,m}e_m^{*}f(A_m)e_1\|$ [30,45,46]; however, this may not be valid for RKSM. Another possibility is to compute $\|q_m-q_{m-1}\|$, the norm of the difference between two computed approximations; see, e.g., [21,47]. Alternatively, it is possible to monitor the angle $\angle(q_m,q_{m-1})$ between the approximations in two consecutive iterations [48]. This convergence criterion is sometimes used in the literature on eigenvalue computations; see, e.g., [49,50]. The two criteria usually exhibit very similar behavior. In this section, we choose the latter.
All RKSM variants require an inner linear solver, as the action of $(\gamma_m I-\eta_m A)^{-1}$ on vectors is needed in (2). For EKSM and F-EKSM, respectively, we need the action of $A^{-1}$ and $(I-A/s)^{-1}$ on vectors. If these linear systems can be solved efficiently by direct methods, both methods need only one LU factorization, computed once and reused for all subsequent solves. Because of its adaptive poles, however, adaptive RKSM has to solve a linear system with a different coefficient matrix in every iteration. Although adaptive RKSM achieves an asymptotically optimal convergence rate, it can be more time-consuming than EKSM and F-EKSM, as a new LU factorization in each step is usually much more expensive than a linear solve using existing LU factors.
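The discussion above can be condensed into the following MATLAB sketch of F-EKSM. It is a minimal illustration under simplifying assumptions (full Gram–Schmidt instead of the block three-term recurrence, a diagonalizable projected matrix $A_m$, and an elementwise scalar function handle f), not the authors' implementation; the single LU factorization of $I-A/s$ is computed once and reused, and the angle between consecutive approximations serves as the stopping test.

```matlab
function q = feksm_sketch(A, b, s, f, maxit, tol)
% Approximate f(A)*b with F-EKSM using cyclic poles s, Inf, s, Inf, ...
% f is an elementwise function handle, e.g. f = @(z) z.^(-1/2).
n = size(A, 1);
[L, U, P, Qp] = lu(speye(n) - A/s);              % one factorization, reused
solveShift = @(x) Qp*(U\(L\(P*x)));              % action of (I - A/s)^{-1}
nb = norm(b);  V = b/nb;  qold = zeros(n, 1);  q = qold;
for k = 1:maxit
    if mod(k, 2) == 1
        w = solveShift(V(:,end));                % odd step: pole s
    else
        w = A*V(:,end);                          % even step: pole at infinity
    end
    for i = 1:size(V, 2)                         % Gram-Schmidt
        w = w - V(:,i)*(V(:,i)'*w);
    end
    if norm(w) > 1e-14, V = [V, w/norm(w)]; end
    Am = V'*(A*V);                               % projected matrix A_m
    [W, D] = eig(full(Am));                      % assumes A_m diagonalizable
    e1 = zeros(size(V, 2), 1);  e1(1) = nb;      % V'*b = ||b|| * e_1
    q  = V*(W*(f(diag(D)).*(W\e1)));             % q_m = V f(A_m) V' b
    if k > 2 && sin(subspace(q, qold)) < tol, return; end
    qold = q;
end
end
```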

4. Convergence Analysis of F -EKSM

Next, we study the optimal pole s for F -EKSM to achieve the lowest upper bound on the convergence factor through min–min optimization.

4.1. General Convergence Analysis

In this section, we explore the asymptotic convergence of F-EKSM. Consider the class of Markov-type functions $f(z)$ in (8). For any $a\ge 0$, a Markov-type function can be split into the sum of two integrals [14]:
$$f(z)=\int_{-\infty}^{0}\frac{d\mu(\zeta)}{z-\zeta},\qquad z\in\mathbb{C}\setminus(-\infty,0],$$
$$f(z)=f_1(z)+f_2(z),\qquad\text{where}\quad f_1(z)=\int_{-\infty}^{-a}\frac{d\mu(\zeta)}{z-\zeta},\quad f_2(z)=\int_{-a}^{0}\frac{d\mu(\zeta)}{z-\zeta}.$$
Here, we let $W_1:=W(A)=\{w^{*}Aw : \|w\|_2=1\}$ be the numerical range of the matrix $A$ and define $W_2:=\big\{\frac{s}{z-s}\ \big|\ z\in W_1\big\}$, where $s$ is the repeated pole for F-EKSM. We assume that $W_1$ is symmetric with respect to the real axis $\mathbb{R}$ and lies strictly in the right half of the complex plane. Then, for $s<0$, $W_2$ is symmetric with respect to the real axis $\mathbb{R}$ and lies in the left half of the complex plane. We define $\phi_i:\overline{\mathbb{C}}\setminus W_i\to\overline{\mathbb{C}}\setminus D$ as the direct Riemann mapping [51] for $W_i$ ($i=1,2$), where $D$ is the unit disk, and define $\psi_i=\phi_i^{-1}$ as the inverse Riemann mapping.
Our convergence analysis initially follows the approach in [21], then analyzes a special min–min optimization. It first uses the Faber polynomials [52], providing a rational expansion of functions for investigating the approximation behavior of F -EKSM. The main challenge is to find how the fixed real pole s impacts the convergence and to determine the optimal s to achieve the lowest upper bound on the convergence factor.
Lemma 1.
For the Markov-type function defined by (8) (where $\mu$ is a measure such that the integral converges absolutely) and some given $a>0$, the following inequalities hold for any $m\in\mathbb{N}$, $m>1$:
$$\left|f_1(z)-\sum_{k=0}^{m-1}\gamma_{1,k}F_{1,k}(z)\right|\le c_1\,|\phi_1(-a)|^{-m},\qquad z\in W_1,$$
$$\left|f_2(z)-\sum_{k=0}^{m-1}\gamma_{2,k}F_{2,k}\!\left(\frac{1}{z/s-1}\right)\right|\le c_2\,|\phi_2(z_0)|^{-m},\qquad z\in W_1,$$
where $|\phi_2(z_0)|=\min\left\{|\phi_2(-1)|,\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|\right\}$ and where $|\phi_1(-a)|$, $|\phi_2(-1)|$, and $\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|$ are all greater than 1. Here, for $i=1,2$, the $\gamma_{i,k}$ are some real numbers and $F_{i,k}$ denotes the Faber polynomial of degree $k$ associated with the Riemann mapping $\phi_i$, while $c_1$ and $c_2$ are positive real constants independent of $m$.
Proof. 
The proof is provided in Appendix A.   □
Lemma 2.
Assume that $\|b\|_2=1$. For any given $a\ge 0$ used in (9), the error of approximating $f(A)b$ by F-EKSM with the cyclic poles $s,\infty,s,\infty,\ldots$ satisfies
$$\left\|f(A)b-V_{2m}f(A_m)e_1\right\|\le c_8\left(\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}\right)^{-m}.$$
Proof. 
Let us define
$$g(z)=f_1(z)-\sum_{k=0}^{m-1}\gamma_{1,k}F_{1,k}(z),\qquad h(z)=f_2(z)-\sum_{k=0}^{m}\gamma_{2,k}F_{2,k}\!\left(\frac{1}{z/s-1}\right).$$
Because both $g$ and $h$ are analytic in $W_1$, and as $W(T_{2m})\subseteq W_1$, from Theorem 2 in [53] we have
$$\|g(A)\|,\ \|g(A_m)\|\le 11.08\max_{z\in W_1}|g(z)|,\qquad \|h(A)\|,\ \|h(A_m)\|\le 11.08\max_{z\in W_1}|h(z)|.$$
Next, we follow the proof of Theorem 3.4 in Section 3 of [21] and use the above inequality:
$$\begin{aligned}
\left\|f(A)b-V_{2m}f(A_m)e_1\right\| &= \Big\|f_1(A)b-\sum_{k=0}^{m-1}\gamma_{1,k}F_{1,k}(A)b-V_{2m}f_1(A_m)e_1+V_{2m}\sum_{k=0}^{m-1}\gamma_{1,k}F_{1,k}(A_m)e_1\\
&\qquad+f_2(A)b-\sum_{k=0}^{m}\gamma_{2,k}F_{2,k}\big(s(A-sI)^{-1}\big)b-V_{2m}f_2(A_m)e_1+V_{2m}\sum_{k=0}^{m}\gamma_{2,k}F_{2,k}\big(s(A_m-sI)^{-1}\big)e_1\Big\|\\
&=\left\|g(A)b-V_{2m}g(A_m)e_1+h(A)b-V_{2m}h(A_m)e_1\right\|\\
&\le\|g(A)\|+\|g(A_m)\|+\|h(A)\|+\|h(A_m)\|\\
&\le 2\times 11.08\left(\max_{z\in W_1}|g(z)|+\max_{z\in W_1}|h(z)|\right)\\
&\le 22.16\left(c_1|\phi_1(-a)|^{-m}+c_2\left(\min\left\{|\phi_2(-1)|,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}\right)^{-m}\right)\\
&\le c_8\left(\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}\right)^{-m}.\qquad\square
\end{aligned}$$
Remark 1.
The results of our analysis can additionally be applied to a linear combination of several Markov-type functions with monomials $z^{l}$, $l\in\mathbb{Z}^{+}$. One example of these functions is $f(z)=z^{\nu}$, $\nu\in(0,1)$, as $z^{\nu}=z\cdot z^{\nu-1}$ and $z^{\nu-1}$ is a Markov function. In addition, if the support of the underlying measure of the Markov function is a proper subset of $(-\infty,0]$, the error bound may not be sharp, and the asymptotic convergence might be superlinear; see, e.g., [28]. While this idea could be explored with F-EKSM, it is not considered here because we did not observe superlinear convergence in our experiments. This was probably because the effective spectrum of our large test matrices did not shrink quickly enough to exhibit convergence speedup before the stopping criterion was satisfied.
To find the optimal pole that achieves the lowest upper bound on the asymptotic convergence factor of F-EKSM, we need to determine $s\le 0$ such that
$$\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}$$
is maximized. Let us define
$$\rho(s,a)=1\Big/\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}.$$
Therefore, F-EKSM converges at least linearly with a convergence factor $\rho(s,a)<1$, where $\rho$ depends on $s$ and $a$. For any given pole $s\le 0$, we can find the artificial parameter $a>0$ used in (9) that minimizes $\rho(s,a)$. Let $\tilde{\rho}(s)$ be the minimized $\rho(s,a)$; then, we need to find $s\le 0$ that minimizes $\tilde{\rho}(s)$. We denote the minimized $\tilde{\rho}(s)$ by $\rho^{*}$, which is the lowest upper bound on the asymptotic convergence factor.
In summary, to find the optimal pole $s$ needed to obtain the lowest upper bound on the asymptotic convergence factor of F-EKSM, we can solve the following optimization problem:
$$\rho^{*}=\min_{s\le 0}\tilde{\rho}(s)=1\Big/\max_{s\le 0}\max_{a\ge 0}\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}.$$
The asymptotic convergence factor of F -EKSM in (13) is dependent on the Riemann mapping ϕ . The formula of ϕ is different for matrices with different numerical ranges, which leads to different values of ρ * . In the following section, we show that this problem has an analytical solution if A is symmetric positive definite.

4.2. The Symmetric Positive Definite Case

To explore the optimal pole $s$ and the corresponding bound on the convergence factor of F-EKSM, we consider a symmetric positive definite matrix $A$. Assume that $\alpha,\beta>0$ are the smallest and the largest eigenvalues of $A$, respectively. The Riemann mappings $\phi_1,\phi_2$ are
$$\phi_1(z)=\begin{cases}\dfrac{z-c}{d}+\sqrt{\left(\dfrac{z-c}{d}\right)^2-1}, & \Re(z-c)>0,\\[2mm] \dfrac{z-c}{d}-\sqrt{\left(\dfrac{z-c}{d}\right)^2-1}, & \Re(z-c)<0,\end{cases}\qquad z\in\overline{\mathbb{C}}\setminus W_1,$$
$$\phi_2(z)=\begin{cases}\dfrac{z-\hat{c}}{\hat{d}}+\sqrt{\left(\dfrac{z-\hat{c}}{\hat{d}}\right)^2-1}, & \Re\!\left(\dfrac{z-\hat{c}}{\hat{d}}\right)>0,\\[2mm] \dfrac{z-\hat{c}}{\hat{d}}-\sqrt{\left(\dfrac{z-\hat{c}}{\hat{d}}\right)^2-1}, & \Re\!\left(\dfrac{z-\hat{c}}{\hat{d}}\right)<0,\end{cases}\qquad z\in\overline{\mathbb{C}}\setminus W_2,$$
where $c=\frac{\alpha+\beta}{2}$, $d=\frac{\beta-\alpha}{2}$, $\hat{c}=\frac{1}{2}\left(\frac{s}{\alpha-s}+\frac{s}{\beta-s}\right)$, and $\hat{d}=\frac{1}{2}\left(\frac{s}{\beta-s}-\frac{s}{\alpha-s}\right)$. It follows that
$$|\phi_1(-a)|=M+\sqrt{M^2-1},\ \text{where}\ M=\frac{c+a}{d}>0;\qquad |\phi_2(-1)|=N_1+\sqrt{N_1^2-1},\ \text{where}\ N_1=\frac{1+\hat{c}}{\hat{d}}>0;\qquad \left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|=N_2+\sqrt{N_2^2-1},\ \text{where}\ N_2=\frac{\frac{s}{-a-s}-\hat{c}}{\hat{d}}.$$
Note that all three expressions are values of the function $q(t)=t+\sqrt{t^2-1}$ at different values of $t$. Therefore, to compare $|\phi_1(-a)|$, $|\phi_2(-1)|$, and $\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|$, it suffices to compare $M$, $N_1$, and $N_2$, which is much easier.
Lemma 3.
Using the notation provided in (14)–(16),
$$\max_{a\ge 0}\min\{M,N_1,N_2\}=\begin{cases}\dfrac{\alpha+\beta-2\alpha\beta/s}{\beta-\alpha}, & \text{if}\ s\in(-\infty,s_0],\\[2mm] \dfrac{\alpha+\beta-2s+2\sqrt{(\alpha-s)(\beta-s)}}{\beta-\alpha}, & \text{if}\ s\in(s_0,0],\end{cases}$$
where $s_0=-\dfrac{\sqrt{\alpha\beta}}{\kappa^{1/6}+\kappa^{-1/6}}$ and $\kappa=\beta/\alpha$ is the condition number of the matrix $A$.
where s 0 = α β κ 1 / 6 + κ 1 / 6 and κ = β / α is the condition number of matrix A.
Proof. 
The proof is provided in Appendix B.    □
We are now ready to show the major result regarding the optimal pole and the corresponding lowest upper bound on the asymptotic convergence factor of F -EKSM for approximating f ( A ) b of Markov-type functions.
Theorem 1.
Let $\tilde{\rho}(s)=\min_{a\ge 0}\rho(s,a)$ be the convergence factor of F-EKSM for approximating $f(A)b$ as defined in (12), where the matrix $A$ is symmetric positive definite. Then, for the optimization problem
$$\rho^{*}=\min_{s\le 0}\min_{a\ge 0}\rho(s,a)=\min_{s\le 0}\tilde{\rho}(s),$$
the optimal solution is
$$s^{*}=s_0=-\frac{\sqrt{\alpha\beta}}{\kappa^{1/6}+\kappa^{-1/6}}$$
and the optimal objective function value is
$$\rho^{*}=\frac{1}{Z^{*}+\sqrt{(Z^{*})^2-1}},\qquad\text{where}\quad Z^{*}=\frac{\kappa+1+2\sqrt{\kappa}\left(\kappa^{1/6}+\kappa^{-1/6}\right)}{\kappa-1}.$$
Proof. 
It is equivalent to find the optimal $s$ that solves the following problem:
$$\max_{s\le 0}T,\qquad\text{where}\quad T=\max_{a\ge 0}\min\{M,N_1,N_2\}.$$
From Lemma 3, $T$ is a piecewise function of the variable $s$, and we only need to find its maximum value for $s\in(-\infty,0]$.
For $s\in(-\infty,s_0]$, $T(s)=\dfrac{\alpha+\beta-2\alpha\beta/s}{\beta-\alpha}$ is a monotonically increasing function; therefore, $T(s)$ attains its maximum value on this interval at $s=s_0$.
For $s\in(s_0,0]$, $T(s)=\dfrac{\alpha+\beta-2s+2\sqrt{(\alpha-s)(\beta-s)}}{\beta-\alpha}$. The first derivative of $T(s)$ is
$$T'(s)=-\frac{1}{\beta-\alpha}\cdot\frac{\left(\sqrt{\alpha-s}+\sqrt{\beta-s}\right)^2}{\sqrt{(\alpha-s)(\beta-s)}}<0.$$
Because $T(s)$ decreases monotonically on $(s_0,0]$, it has its maximum value when $s=s_0$. Therefore,
$$\max_{s\le 0}T(s)=T(s_0)=\frac{\alpha+\beta-2\alpha\beta/s_0}{\beta-\alpha}=\frac{\kappa+1+2\sqrt{\kappa}\left(\kappa^{1/6}+\kappa^{-1/6}\right)}{\kappa-1}.$$
To sum up, for both $s\in(-\infty,s_0]$ and $s\in(s_0,0]$, the maximizer of $T(s)$ is $s=s_0$; consequently, it is the global optimal solution for $s\in(-\infty,0]$.
With $s=s_0$, we can now return to (13) and (16) and obtain
$$\rho^{*}=\frac{1}{\max_{s\le 0}\max_{a\ge 0}\min\left\{|\phi_1(-a)|,\,|\phi_2(-1)|,\,\left|\phi_2\!\left(\tfrac{s}{-a-s}\right)\right|\right\}}=\frac{1}{T(s_0)+\sqrt{T(s_0)^2-1}},$$
where $T(s_0)$ is provided in (18). The proof is established.    □
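For concreteness, the closed-form quantities of Theorem 1 can be evaluated directly. The following MATLAB snippet is a hedged illustration (the values of alpha and beta are example inputs, not values taken from the paper) and also evaluates the EKSM bound from Section 6.1 for comparison.

```matlab
alpha = 1;  beta = 1e4;                         % assumed spectral bounds of an SPD matrix
kappa = beta/alpha;                             % condition number
s_star = -sqrt(alpha*beta)/(kappa^(1/6) + kappa^(-1/6));            % optimal pole (17)
Zstar  = (kappa + 1 + 2*sqrt(kappa)*(kappa^(1/6) + kappa^(-1/6)))/(kappa - 1);
rho_star = 1/(Zstar + sqrt(Zstar^2 - 1));                           % F-EKSM bound (18)
Z      = (kappa + 1 + 2*sqrt(kappa))/(kappa - 1);
rho_EK = 1/(Z + sqrt(Z^2 - 1));                 % EKSM bound, equals (kappa^(1/4)-1)/(kappa^(1/4)+1)
```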

4.3. Nonsymmetric Case

Similar to the SPD case, to explore the lowest upper bound on the convergence factor of F-EKSM we consider a nonsymmetric matrix $A$ with eigenvalues located in the right half of the complex plane. Let $\alpha,\beta,\gamma>0$ and assume that the numerical range of the matrix $A$ can be covered by an ellipse centered at the point $c=\frac{\alpha+\beta}{2}$ with semi-major axis $d=\frac{\beta-\alpha}{2}$ and semi-minor axis $\gamma$.
The Riemann mapping $\phi_1$ is given by
$$\phi_1(z)=\frac{c-z}{d}\cdot\frac{1+\eta}{2}+\sqrt{\left(\frac{c-z}{d}\cdot\frac{1+\eta}{2}\right)^2-\eta},\qquad z\in\overline{\mathbb{C}}\setminus W_1,$$
where $\eta=\frac{d-\gamma}{d+\gamma}$. Although the Riemann mapping $\phi_2$ is not easy to derive explicitly, for a given $s$ we can first approximate $W_2$ by a polygon and then use the Schwarz–Christoffel mapping toolbox [54] to approximate $\phi_2$ numerically. Then, we can compare $|\phi_1(-a)|$, $|\phi_2(-1)|$, and $\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|$ for different values of $a$. Based on (13), we tested different values of $s$ to find the optimal pole such that $\max_{a\ge 0}\min\left\{|\phi_1(-a)|,|\phi_2(-1)|,\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|\right\}$ is maximized. Table 1 shows the optimal pole $s$ and the upper bound on the asymptotic convergence factor $\rho$ for matrices with different elliptic numerical ranges.
In Table 1, η = 1 indicates the SPD case; with η = 0 , the numerical range becomes a disk. It can be seen that when η decreases, the convergence factor ρ for both EKSM and F -EKSM increases, which implies that in the case of an elliptic numerical range both methods converge significantly more slowly than in the SPD case. In particular, when β = 10 4 and η = 0 , it takes about 4.5 times as many steps as are needed for the corresponding SPD case ( β = 10 4 , η = 1 ). It is worthwhile to compare these two methods with adaptive RKSM to determine whether the slowdown is severe.
Another observation from Table 1 is that the optimal pole $s^{*}$ in the nonsymmetric case is not far away from that in the SPD case. Hence, it is reasonable to approximate the optimal shift $s^{*}$ for the nonsymmetric case using the one for the SPD case (see (17)), as the actual optimal $s^{*}$ based on an accurate estimate of the numerical range is generally difficult to evaluate, if it is possible at all. In fact, for a nonsymmetric matrix $A\in\mathbb{R}^{n\times n}$, the approximation of $s^{*}$ obtained this way is exactly the optimal pole for its symmetric part $(A+A^{*})/2$. Because
$$\begin{aligned}
W(A)&=\left\{x^{H}Ax \mid x\in\mathbb{C}^n,\ x^{*}x=1\right\}=\left\{(p+qi)^{H}A(p+qi) \mid p,q\in\mathbb{R}^n,\ p^{*}p+q^{*}q=1\right\}\\
&=\left\{(p^{*}Ap+q^{*}Aq)+(p^{*}Aq-q^{*}Ap)\,i \mid p,q\in\mathbb{R}^n,\ p^{*}p+q^{*}q=1\right\},\\
W\!\left(\frac{A+A^{*}}{2}\right)&=\left\{(p+qi)^{H}\frac{A+A^{*}}{2}(p+qi) \mid p,q\in\mathbb{R}^n,\ p^{*}p+q^{*}q=1\right\}=\left\{p^{*}Ap+q^{*}Aq \mid p,q\in\mathbb{R}^n,\ p^{*}p+q^{*}q=1\right\},
\end{aligned}$$
it is clear that $W\!\left(\frac{A+A^{*}}{2}\right)=\Re\,W(A)$. If $W(A)$ has an elliptic boundary centered at the point $c=\frac{\alpha+\beta}{2}$ with semi-major axis $d=\frac{\beta-\alpha}{2}$ and semi-minor axis $\gamma$, it follows that $W\!\left(\frac{A+A^{*}}{2}\right)=\{x \mid \alpha\le x\le\beta\}$. To obtain such an approximation of $s^{*}$, we only need to run a modest number of Arnoldi steps, within an acceptable amount of time, to obtain the approximations of $\alpha$ and $\beta$ that are needed in (17).
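A hedged MATLAB sketch of this procedure is given below; the eigs options and tolerance are illustrative choices in the spirit of Section 6.3, not settings prescribed by the paper.

```matlab
% Estimate alpha and beta from the symmetric part of a nonsymmetric A, then
% reuse the SPD pole formula (17). A loose tolerance suffices, since the pole
% need not be highly accurate.
S     = (A + A')/2;                              % symmetric part of A
beta  = eigs(S, 1, 'largestreal',  'Tolerance', 1e-3);
alpha = eigs(S, 1, 'smallestreal', 'Tolerance', 1e-3);
kappa = beta/alpha;
s_star = -sqrt(alpha*beta)/(kappa^(1/6) + kappa^(-1/6));
```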

4.4. Convergence Analysis with Blaschke Product

Another convergence analysis for approximating functions of matrices can be found in [26]. Using the same notation as above and combining Theorem 5.2 with Equation (6.4) in [26], we obtain a bound of the following form:
$$\left\|f(A)b-V_{2m}f(A_m)e_1\right\|\le c\max_{y\in\phi_1((-\infty,0])}\frac{1}{|B(y)|}=c\max_{y\in\phi_1((-\infty,0])}\prod_{j=1}^{2m}\left|\frac{y-w_j}{1-w_jy}\right|,$$
where $B(y)$ is called the Blaschke product and $w_j=\phi_1(s_j)$. Using the cyclic poles $s,\infty$ as in F-EKSM, we seek $w\in[\phi_1(0),\infty]$ (with $\phi_1(0)\ge 1$) that minimizes
$$\max_{y\in\phi_1((-\infty,0])}\left|\frac{y-w}{y\,(1-wy)}\right|.$$
Note that for $y\in[\phi_1(0),\infty]$, $\left|\frac{y-w}{y(1-wy)}\right|$ achieves its maximum either at $y=\phi_1(0)$ or at $y=w+\sqrt{w^2-1}$. The problem then becomes the following optimization problem for $w$:
$$\tilde{\rho^{*}}=\min_{w\in[\phi_1(0),\infty]}\ \max\left\{\left|\frac{\phi_1(0)-w}{\phi_1(0)\left(1-w\,\phi_1(0)\right)}\right|,\ \left(w-\sqrt{w^2-1}\right)^2\right\}.$$
It can be shown that the minimum is achieved when $\left|\frac{\phi_1(0)-w}{\phi_1(0)(1-w\,\phi_1(0))}\right|=\left(w-\sqrt{w^2-1}\right)^2$. The optimal $w$ is then a root of the following fourth-order equation, namely the root that is greater than $\phi_1(0)$:
$$-4w_1^2w^4+4w_1(w_1^2+1)w^3+(w_1^2-1)^2w^2-4w_1(w_1^2+1)w+4w_1^2=0,$$
where $w_1=\phi_1(0)$.
For the symmetric positive definite case, where $\phi_1(z)$ is defined as in (14), $w_1=\frac{\sqrt{\kappa}+1}{\sqrt{\kappa}-1}$; thus, the optimal $w$ in (20) depends only on the condition number of the matrix $A$.
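Although no closed form for the optimal $w$ is available, it can be obtained numerically. The MATLAB sketch below (an illustration, not the authors' code) solves the quartic (20) with roots, selects the admissible root by evaluating the objective in (19), and then maps $w$ back to a point on the negative real axis via the inverse of (14) with $\alpha=1$, $\beta=\kappa$; this inverse-mapping step is our reading of how the pole is recovered and is not stated explicitly in the text.

```matlab
kappa = 1e4;                                      % example condition number (assumed)
w1 = (sqrt(kappa) + 1)/(sqrt(kappa) - 1);         % w1 = phi_1(0) in the SPD case
p  = [-4*w1^2, 4*w1*(w1^2+1), (w1^2-1)^2, -4*w1*(w1^2+1), 4*w1^2];   % quartic (20)
r  = roots(p);
r  = real(r(abs(imag(r)) < 1e-10 & real(r) > w1));                   % admissible real roots
obj = @(w) max(abs((w1 - w)./(w1*(1 - w*w1))), (w - sqrt(w.^2 - 1)).^2);  % objective in (19)
[rho_tilde, idx] = min(arrayfun(obj, r));
w  = r(idx);                                      % optimal w; rho_tilde is the bound (19)
c = (kappa + 1)/2;  d = (kappa - 1)/2;
s_tilde = c - d*(w + 1/w)/2;                      % negative real point with |phi_1| equal to w
```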
The convergence analysis for the optimal pole based on [26] involves the quartic equation (20), and it is difficult to find an explicit formula for the optimal pole. On the other hand, our analysis based on [21] provides an explicit formula for the optimal pole in Theorem 1. In Section 6, we compare the theoretical convergence rates and the actual performance of these two optimal poles on different benchmarks.

5. RKSM with Several Cyclic Poles for Approximating f ( A ) b

In our problem setting, where the shift-inverse matrix–vector operations of RKSM are performed by factorization-based direct linear solvers, F-EKSM usually outperforms EKSM in both subspace size and runtime. Compared with adaptive RKSM, F-EKSM often takes more steps but less time to converge for large sparse SPD matrices, although its performance in both space and time can become inferior to adaptive RKSM for certain challenging nonsymmetric problems. To improve this performance, we consider using a few more fixed repeated poles. The rationale for this strategy is to strike a balanced tradeoff between F-EKSM and the adaptive variants of RKSM, so that this variant of RKSM has modest storage and runtime costs.
For example, we can consider such a method based on four repeated poles, selected as sketched below. Starting with the optimal pole $s_1<0$ of F-EKSM from (17) and $s_2=-\beta$ (the negative of the largest real part of all eigenvalues), we apply several steps of adaptive RKSM to find and use new poles until we have found at least one pole smaller than $s_1$ and one pole greater than $s_1$ (both in terms of modulus). Among all poles obtained adaptively during this procedure, we let $s_3$ be the smallest (in modulus) and $s_4$ the largest. It is not hard to see that the adaptive RKSM steps terminate with the last pole $s_f$ being either $s_3$ or $s_4$. Our numerical experience suggests that an additional simple adjustment to $s_3$ or $s_4$ can help to improve convergence: specifically, if $s_f=s_3$, then $s_3$ is divided by a factor $\mu$; otherwise, $s_4$ is multiplied by the same factor. Experimentally, we found that $\mu=10$ provides the best overall performance. We then keep the LU factorizations associated with the four poles and, in each step, choose the pole cyclically from the set $\{s_1,s_2,s_3,s_4\}$.
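A hedged sketch of this bookkeeping is shown below. The vector adaptivePoles is a hypothetical input holding the poles produced by the preliminary adaptive RKSM steps in the order they were generated; s_star and beta are assumed available from (17) and the eigenvalue estimates.

```matlab
% Select the four cyclic poles {s1, s2, s3, s4} as described above.
s1 = s_star;                                   % optimal F-EKSM pole from (17)
s2 = -beta;                                    % negative of the largest real part
mags = abs(adaptivePoles);
k = find(cummax(mags) > abs(s1) & cummin(mags) < abs(s1), 1);   % stop once both sides are covered
used = adaptivePoles(1:k);
[~, imin] = min(abs(used));  s3 = used(imin);  % smallest adaptive pole in modulus
[~, imax] = max(abs(used));  s4 = used(imax);  % largest adaptive pole in modulus
mu = 10;                                       % adjustment factor found experimentally
if imin == numel(used), s3 = s3/mu; else, s4 = s4*mu; end       % adjust the last pole found
poles = [s1, s2, s3, s4];                      % used cyclically; one LU factorization each
```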
In fact, we can also use the convergence analysis with the Blaschke product from Section 4.4. To use four cyclic poles, we can solve the following optimization problem:
$$\min_{w_1,w_2,w_3,w_4\in[\phi_1(0),\infty]}\ \max_{y\in[\phi_1(0),\infty]}\ \prod_{j=1}^{4}\left|\frac{y-w_j}{1-w_jy}\right|.$$
It takes time to compute the optimal $w_1,w_2,w_3,w_4$ numerically for a specific problem setting, and our numerical experience shows that the resulting poles take a similar number of iterations to converge compared with the above F-EKSM variant with four poles.

6. Numerical Experiments

We tested different variants of RKSM for approximating $f(A)b$, where the functions were $f_1(z)=z^{-1/2}$, $f_2(z)=e^{-\sqrt{z}}$, $f_3(z)=\frac{\tanh\sqrt{z}}{\sqrt{z}}$, $f_4(z)=z^{-1/4}$, and $f_5(z)=\log(z)$. The first four consist of Markov-type functions and a Markov-type function multiplied by monomials $z^{l}$, $l\in\mathbb{Z}$; while the last function $f_5$ is not of Markov type, our algorithms exhibit similar behavior when approximating $f_5(A)b$ as on the other functions. All experiments were carried out in MATLAB R2019b on a laptop with 16 GB DDR4 2400 MHz memory, the Windows 10 operating system, and a 2.81 GHz Intel dual-core CPU.

6.1. Asymptotic Convergence of EKSM and F -EKSM

For a real symmetric positive definite matrix $A$, EKSM with the cyclic poles $0,\infty,0,\infty,\ldots$ converges at least linearly:
$$\left\|f(A)b-V_{2m}f(A_m)e_1\right\|\le C\rho^{m},$$
where $\rho=\frac{1}{Z+\sqrt{Z^2-1}}$ and $Z=\frac{\kappa+1+2\sqrt{\kappa}}{\kappa-1}$ (see Proposition 3.6 in [21]). Similarly, F-EKSM with the cyclic poles $s^{*},\infty,s^{*},\infty,\ldots$ converges at least linearly with factor
$$\rho^{*}=\frac{1}{Z^{*}+\sqrt{(Z^{*})^2-1}},\qquad\text{where}\quad Z^{*}=\frac{\kappa+1+2\sqrt{\kappa}\left(\kappa^{1/6}+\kappa^{-1/6}\right)}{\kappa-1}.$$
Because $Z^{*}>Z$, F-EKSM has a smaller upper bound on the convergence factor than EKSM.
For the optimal pole from (20) obtained using the Blaschke product technique, we denote the method by F-EKSM* and its optimal pole by $\tilde{s}^{*}=\phi_1^{-1}(w^{*})$, where $w^{*}$ is the optimal $w$ from (20), with the convergence factor $\tilde{\rho^{*}}$ in (19). Because $\tilde{s}^{*}$ is determined by a root of a fourth-order equation, it is difficult to give its value explicitly; thus, we list several examples in which the poles and convergence factors of both single-pole methods are computed numerically. In addition, we list the convergence factor of the shift-inverse Arnoldi method (SI) based on one fixed nonzero pole:
$$\rho_{SI}=\left(\frac{\kappa-1}{\kappa+1+2\sqrt{\kappa}}\right)^{1/4}$$
(see [26], Corollary 6.4a).
For a matrix A with the smallest eigenvalue α = 1 and largest eigenvalue β = κ , Table 2 shows the difference in the upper bounds of their convergence factors; note that the asymptotic convergence factors are independent of the function f.
It can be seen from Table 2 that F-EKSM has a lower upper bound on the asymptotic convergence factor than EKSM, with F-EKSM* having an even lower upper bound. The optimal pole $s^{*}$ for F-EKSM is roughly two to three times that of $\tilde{s}^{*}$ for F-EKSM*. The shift-inverse Arnoldi method has the largest convergence factor; thus, we did not compare it with the other methods in our later tests.
To check the asymptotic convergence factor for each method, it is necessary to know the exact vector f ( A ) b for a given matrix A and vector b in order to calculate the norm of the residual at each step. For relatively large matrices it is only possible to directly evaluate the exact f ( A ) b for diagonal matrices within a reasonable time. Because each SPD matrix is orthogonally similar to the diagonal matrix of its eigenvalues, our experiment results can be expected to reflect the behavior of EKSM, F -EKSM, and F -EKSM* applied to more general SPD matrices.
We constructed two diagonal matrices $A_1$, $A_2$. The diagonal entries of $A_1$ are $d_j=(\alpha_1+\beta_1)/2+\cos(\theta_j)(\beta_1-\alpha_1)/2$, $1\le j\le 10{,}000$, where the $\theta_j$ are uniformly distributed on the interval $[0,2\pi]$ and $\alpha_1=10^{-7}$, $\beta_1=1$. The diagonal entries of $A_2$ are $d_j=10^{-8}\rho_2^{\,j-1}$, $1\le j\le 20{,}000$, with $\rho_2=1.001$. We approximated $f_i(A_i)b$ using EKSM, F-EKSM, and F-EKSM*, where $b$ is a fixed vector with entries drawn from the standard normal distribution. The experimental results are shown in Figure 1.
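The two diagonal test matrices can be generated as in the following sketch; the exact sampling of the $\theta_j$ and the random seed are not specified in the text, so uniform random sampling and MATLAB's default generator are assumptions here.

```matlab
% Diagonal test matrices A1 (n = 10,000) and A2 (n = 20,000) from Section 6.1.
alpha1 = 1e-7;  beta1 = 1;
theta  = 2*pi*rand(10000, 1);                          % theta_j uniform on [0, 2*pi]
d1 = (alpha1 + beta1)/2 + cos(theta)*(beta1 - alpha1)/2;
A1 = spdiags(d1, 0, 10000, 10000);
rho2 = 1.001;
d2 = 1e-8 * rho2.^((0:19999)');                        % d_j = 1e-8 * rho2^(j-1)
A2 = spdiags(d2, 0, 20000, 20000);
b1 = randn(10000, 1);  b2 = randn(20000, 1);           % standard normal test vectors
```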
From the figures for different Markov-type functions, it can be seen that all methods converge with factors no worse than their theoretical bounds, which verifies the validity of the results shown in Theorem 1. Moreover, F -EKSM and F -EKSM* always converge faster than EKSM for all the test functions and matrices. In particular, for approximating f 2 ( A 2 ) b , our F -EKSM only takes about one quarter as many iterations to achieve a relative error less than 10 5 as compared to EKSM. Furthermore, the theoretical bounds are not always sharp for those methods.
For nonsymmetric matrices with elliptic numerical ranges, the theoretical convergence rate cannot be derived from an explicit formula; Table 1 shows the numerical results. To verify these results, we constructed a block diagonal matrix $A_3\in\mathbb{R}^{4901\times 4901}$ with $2\times 2$ diagonal blocks whose eigenvalues lie on the circle centered at $z=5000.5$ with radius $4999.5$. We then constructed another block diagonal matrix $A_4\in\mathbb{R}^{4901\times 4901}$ whose eigenvalues lie in the ellipse centered at $z=5000.5$ with semi-major axis $4999.5$ and semi-minor axis $714.2$. The optimal poles for F-EKSM can be computed using the strategy described in Section 4.2 ($s_3^{*}=-20.87$ and $s_4^{*}=-11.02$). Figure 2 shows the spectra of the matrices $A_i$, $A_i^{-1}$, and $(I-A_i/s_i^{*})^{-1}$ for $i=3,4$. Figure 3 shows the results for the nonsymmetric matrices. Table 1 shows the asymptotic convergence factors of EKSM and F-EKSM for the matrices $A_3$ and $A_4$: $\beta=10^4$, $\eta=1$, $\rho_{FEK^{*}}=0.65$, and $\rho_{EK}=0.82$ for $A_3$, while $\beta=10^4$, $\eta=0.75$, $\rho_{FEK^{*}}=0.84$, and $\rho_{EK}=0.95$ for $A_4$.
Similar to the results for SPD matrices, F -EKSM converges faster than EKSM for the two artificial nonsymmetric problems. The upper bounds on the convergence factor in Table 1 match the actual convergence factor quite well.

6.2. Test for Practical SPD Matrices

Next, we tested EKSM, F -EKSM, F -EKSM*, and adaptive RKSM on several SPD matrices and compared their runtimes and the dimension of their approximation subspaces. Note that for general SPD matrices the largest and smallest eigenvalues are usually not known in advance. The Lanczos method or its restarted variants can be applied to estimate them, and this computation time should be taken into consideration for F -EKSM and F -EKSM*. The variant of EKSM in (5) is not considered in this section, as there is no convergence theory to compare with the actual performance and the convergence rate largely depends on the choice of l and m in (5).
For the SPD matrices, we used Cholesky factorization with approximate minimum degree ordering to solve the linear systems involving $A$ or $I-A/s$ for all four methods. The stopping criterion for all methods was to check whether the angle between the approximate solutions obtained at two successive steps fell below a certain tolerance. EKSM, F-EKSM, and F-EKSM* all apply the Lanczos three-term recurrence with local re-orthogonalization to enlarge the subspace, whereas adaptive RKSM applies a full orthogonalization process with global re-orthogonalization.
We tested four 2D discrete Laplacian matrices of orders $128^2$, $256^2$, $512^2$, and $1024^2$ based on standard five-point stencils on a square. For all problems, the vector $b$ had entries drawn from the standard normal distribution, allowing the behavior of all four methods to be compared for matrices with different condition numbers.
Table 3 reports the runtimes and the dimensions of the rational Krylov subspaces that the four methods require for all test problems; in the table, EK, FEK, and ARK are abbreviations of EKSM, F-EKSM, and adaptive RKSM, respectively. The stopping criterion was that the angle between the approximate solutions from two successive steps was less than $\tau=10^{-9}$. The single poles $s^{*}$ for F-EKSM are $-352.26$, $-569.84$, $-915.56$, and $-1464.6$, respectively, while for F-EKSM* the poles $\tilde{s}^{*}$ are $-140.89$, $-227.29$, $-364.53$, and $-582.43$, respectively. The shortest CPU time in each line of the table is marked in bold.
With only one exception, that of $f_2(z)=e^{-\sqrt{z}}$, F-EKSM converges the fastest of the four methods in terms of wall-clock time for all the test functions and matrices with different condition numbers. While F-EKSM takes more steps than adaptive RKSM to converge, it requires fewer steps than EKSM. Furthermore, the advantage of F-EKSM becomes more pronounced for matrices with a larger condition number. Notably, the advantage of F-EKSM in computation time is stronger than its advantage in subspace dimension, because the orthogonalization cost is proportional to the square of the subspace dimension. F-EKSM* takes slightly more steps than F-EKSM to converge in these examples, and both methods have similar computation times.
The unusual behavior of all methods for $f_2(z)=e^{-\sqrt{z}}$ can be explained as follows. For these Laplacian matrices, the largest eigenvalues $\lambda_{\max}$ range from $1.3\times 10^{5}$ to $8.4\times 10^{6}$. Because $f_2(\lambda_{\min})\approx 0.0118$ and $f_2(\lambda_{\max})\le f_2(10^{5})\approx 4.6\times 10^{-138}$ (since $f_2$ decreases monotonically on $[0,\infty)$), the eigenvector components of $b$ associated with relatively large eigenvalues are eliminated from the vector $f_2(A)b$ in double precision. In fact, because $f_2(10^{3})/f_2(\lambda_{\min})\approx 1.6\times 10^{-12}$, all eigenvalues of $A$ greater than $10^{3}$ are essentially `invisible' to $f_2$ under the tolerance $\tau=10^{-12}$, and the effective condition number of all four Laplacian matrices is about $10^{3}/\lambda_{\min}\approx 51$. As a result, EKSM takes the same number of steps to converge for all these matrices, and the shift for F-EKSM and F-EKSM* computed using $\lambda_{\min}$ and $\lambda_{\max}$ of these matrices is in fact not optimal for matrices with such a small effective condition number.
The pole $s^{*}$ in (17) for an SPD matrix is optimal, as we have proved that it yields the smallest asymptotic convergence factor among all choices of a single pole. To numerically compare the behavior of different single-pole settings, we tested the matrices Lap. B and Lap. C from Table 3 with $f_1$ and $f_3$, respectively. For each problem, we ran F-EKSM with the single pole $s$ set to $s^{*}$, $2s^{*}$, $s^{*}/2$, $10s^{*}$, and $s^{*}/10$, and ran F-EKSM* with $\tilde{s}^{*}$. Figure 4 shows the experimental results. It can be seen that F-EKSM has the fastest asymptotic convergence rate among all six pole choices when the optimal single pole $s^{*}$ is used, which confirms that $s^{*}$ is indeed optimal in our experiments.

6.3. Test for Practical Nonsymmetric Matrices

We consider 18 nonsymmetric real sparse matrices, all of which have their eigenvalues in the right half of the complex plane (all of them have complex eigenvalues with positive real parts). Half are of the form $M^{-1}K$, where both $M$ and $K$ are sparse and $A=M^{-1}K$ is not formed explicitly. Table 4 reports several features of each matrix $A$: the matrix size $n$; the smallest and largest eigenvalues in absolute value, $\lambda_{\rm sm}$ and $\lambda_{\rm lm}$, respectively; the smallest and largest real parts of the eigenvalues, $\mathrm{Re}\,\lambda_{\rm sr}$ and $\mathrm{Re}\,\lambda_{\rm lr}$, respectively; and the largest imaginary part of the eigenvalues, $\mathrm{Im}\,\lambda_{\rm li}$. Note that all of the original matrices have spectra strictly in the left half of the complex plane; we simply switched their signs to make $f(A)$ well-defined for Markov-type functions.
For the single pole of F-EKSM and the initial pole of the four-pole variant (4Ps) of Section 5, we used the optimal pole for the SPD case (17) with $\alpha=\min\{\lambda_{\rm lm},\mathrm{Re}\,\lambda_{\rm sr}\}$ and $\beta=\max\{\lambda_{\rm lm},\mathrm{Re}\,\lambda_{\rm lr}\}$. The same setting of $\alpha$ and $\beta$ for building the Riemann mapping $\phi_1(z)$ in (14) was used to compute the optimal pole of F-EKSM* in (20). In particular, because precise evaluation of $\alpha$ and $\beta$ is time-consuming, we approximated them using the 'eigs' function in MATLAB, which is based on the Krylov–Schur algorithm; see, e.g., [55]. We set the residual tolerance of 'eigs' to $10^{-3}$, which allowed the largest and smallest eigenvalues of all test matrices to be found within a reasonable time. Higher accuracy in computing the eigenvalues is not required when determining the optimal pole, as the convergence of F-EKSM does not change noticeably with tiny changes in the value of the pole. Note that the single pole we used for each problem is independent of the Markov-type function; see Table 4. For the nonsymmetric matrices, we used LU factorizations to solve the linear systems with coefficient matrices $A$ or $I-A/s$ for all methods. The stopping criterion was either that the angle between the approximate solutions of two successive steps fell below a tolerance or that the dimension of the Krylov subspace reached 1000. There have been a few discussions of restarting for approximating $f(A)b$, though only for polynomial approximations based on Arnoldi-like methods; see, e.g., [17,56]. In this paper, we focus on comparing the convergence rates of several Krylov methods without restarting. We need to choose a proper tolerance for each problem: small enough to fully exhibit the rate of convergence of all methods, yet not so small as to be unattainable. The last column of Table 4 reports these tolerances, which are fixed regardless of the Markov-type function.
In Table 5, Table 6, Table 7, Table 8 and Table 9, we report the runtime and the dimension of the approximation space that the four methods require for approximating $f_i(A)b$ to the specified tolerances. The runtime includes the time spent on the evaluation of the optimal poles for F-EKSM, F-EKSM*, and the four-pole variant. The "−" symbol indicates failure to find a sufficiently accurate solution before the maximum dimension of the approximation space (1000) was reached. The shortest CPU time in each line of the tables is marked in bold. Figure 5 shows an example plot for each method, with the relative error $\sin\angle(q_{k+1},q_k)$ (where $q_k$ is the approximation to $f(A)b$ at step $k$) plotted against the dimension of the approximation space for each function.
In the ninety total cases for eighteen problems and five functions shown in Table 5, Table 6, Table 7, Table 8 and Table 9, the four-pole variant is the fastest in runtime in sixty cases and F-EKSM is the fastest in fourteen cases. Among the cases in which the four-pole variant is not the fastest, it is no more than 10% slower than the fastest method in twelve cases and 10–20% slower in eight cases.
Overall, the four-pole variant is the best in terms of runtime, though there are several exceptions. The first is tolosa, which is the only problem on which adaptive RKSM ran the fastest for all functions. For this problem, the dimension of the matrix is relatively small, which makes it efficient to perform a new linear solve at each step, as the LU cost is cheap. Moreover, for tolosa, $\mathrm{Im}\,\lambda_{\rm li}$ is close to $\lambda_{\rm lm}$, meaning that the algorithms based on repeated poles converge more slowly; see Table 1. The second exception is venkat, where the four-pole variant is not the fastest for $f_3$. In fact, for $f_3$, EKSM, F-EKSM, and F-EKSM* converge within a much smaller approximation space than for the other functions. A possible explanation is that in computer arithmetic it is difficult to accurately capture the relative change in the values of $f_3$ at small arguments, and the majority of the eigenvalues of venkat are small in absolute value. For example, $\frac{f_3(6\times 10^{-5})-f_3\left(6\times 10^{-5}(1+10^{-7})\right)}{f_3(6\times 10^{-5})}\approx 2.0\times 10^{-11}$, which means that a relative change of $10^{-7}$ in the argument of $f_3$ near $6\times 10^{-5}$ leads to a relative change of only about $2.0\times 10^{-11}$ in the function value; thus, $f_3$ fails to register such a difference in its input above the given tolerance $10^{-10}$. The third exception is convdiffA, for which EKSM is the fastest for four of the five functions. In fact, EKSM takes fewer steps to converge than F-EKSM, as can be seen at the bottom right of Figure 5. A possible reason is that $\lambda_{\rm lm}$ and $\mathrm{Re}\,\lambda_{\rm sr}$ can only be evaluated approximately by several iterations of the Arnoldi method, and sometimes their values cannot be found accurately; the 'optimal' pole based on an inaccurate $\alpha$ can then be far away from the truly optimal pole. Notably, for this exception F-EKSM* takes fewer steps to converge than EKSM, though it requires more computation time. This is because EKSM uses infinite poles; for a matrix of the form $M^{-1}K$ where $M$ is the identity matrix, no linear solve is needed for the infinite poles, only a simple matrix–vector multiplication. For the other cases in which F-EKSM or F-EKSM* runs fastest, the four-pole variant usually runs only slightly slower, as in those cases it takes less time to enlarge the approximation space than to compute additional LU factorizations for the extra repeated poles.
It is important to underscore that the runtime needed for F -EKSM with our optimal pole is less than that for F -EKSM* with an optimal pole derived based on [26] for a majority of the nonsymmetric test matrices, which is similar to the minor advantage in runtime of our F -EKSM shown in Table 3 for Laplacian matrices. In addition, the runtime of the four-pole variant suggests that if sparse LU factorization is efficient for the shifted matrices needed for RKSM, then using a small number of near optimal poles seems to be an effective way to achieve the lowest overall runtime.
In terms of the dimension of the approximation space, adaptive RKSM always needs the smallest subspace to converge, with the four-pole variant in second place except for $f_2$ on tolosa. In most cases, EKSM needs the largest subspace to converge, while in the others F-EKSM needs the largest. In the cases where both F-EKSM and the four-pole variant converge, the four-pole variant takes 7.8% to 87.6% fewer steps.
In summary, our experiments suggest that F -EKSM, F -EKSM*, and the four-pole variant are competitive in reducing the runtime of rational Krylov methods based on direct linear solvers for approximating f ( A ) b ; on the other hand, if the goal is to save storage cost, adaptive RKSM is preferable.

7. Conclusions

In this paper, we have studied an algorithm called the flexible extended Krylov subspace method ( F -EKSM) for approximating f ( A ) b for Markov-type functions. The central idea is to find an optimal pole to replace the zero pole in EKSM such that F -EKSM needs only the same single LU factorization as EKSM while converging more rapidly.
In the main theoretical contribution of this work, Theorem 1, we prove that there exists a unique optimal pole for a symmetric positive-definite matrix that guarantees the fastest convergence of F -EKSM, which always outperforms EKSM. The theorem provides a formula for both the optimal pole and an upper bound on the convergence factor. Numerical experiments show that F -EKSM is more efficient than EKSM and that it is competitive in runtime compared with adaptive RKSM if the shifted linear systems needed for rational Krylov methods are solved using a direct linear solver.
F -EKSM may lose its advantages for challenging nonsymmetric matrices because of possible failure to compute the optimal poles numerically and due to its relatively slow convergence rate for these problems. This performance can be improved by using four fixed poles chosen flexibly in the early stage of computation. Our numerical results show that the four-pole variant is the most efficient in terms of runtime for many problems.

Author Contributions

Conceptualization, S.X. and F.X.; methodology, S.X. and F.X.; software, S.X.; validation, S.X.; formal analysis, S.X. and F.X.; investigation, S.X.; resources, F.X.; data curation, F.X.; writing—original draft preparation, S.X.; writing—review and editing, F.X.; visualization, S.X.; supervision, F.X.; project administration, F.X.; funding acquisition, F.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the U.S. National Science Foundation under grants DMS-1719461, DMS-1819097, and DMS-2111496.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank Jeong-Rock Yoon for useful conversations and suggestions on the proof of the bounded rotation of $f(z)=\frac{1}{z}$. In addition, we thank the two anonymous reviewers for their helpful comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Lemma 1

Because $\phi_i:\overline{\mathbb{C}}\setminus W_i\to\overline{\mathbb{C}}\setminus D$ ($i=1,2$) and $-a\in\overline{\mathbb{C}}\setminus W_1$, $-1\in\overline{\mathbb{C}}\setminus W_2$, $\frac{s}{-a-s}\in\overline{\mathbb{C}}\setminus W_2$, we can conclude that $|\phi_1(-a)|$, $|\phi_2(-1)|$, and $\left|\phi_2\!\left(\frac{s}{-a-s}\right)\right|$ are all greater than 1.
Equation (10) has been proved in Section 3, Lemma 3.1 in [21]. For (11) involving f 2 , our proof is analogous to what was done for f 1 .
Because $\phi$ denotes the Riemann mapping, for a fixed $z\in W_1$ we can write the Faber polynomials by means of their generating function [57]:
$$\frac{1}{z-\zeta}=\frac{1}{\psi'[\phi(\zeta)]}\sum_{k=0}^{\infty}\frac{F_k(z)}{\phi(\zeta)^{k+1}},\qquad z\in W_1,\ \zeta\notin W_1.$$
We define a new variable $y=\frac{1}{z/s-1}$. Then, from (9), we have
$$f_2(z)=f_2\!\left(s\Big(1+\frac{1}{y}\Big)\right)=\int_a^0\frac{d\mu(\zeta)}{s\big(1+\frac{1}{y}\big)-\zeta}=y\int_a^0\frac{1}{s-\zeta}\,\frac{d\mu(\zeta)}{y-\frac{s}{\zeta-s}}=-\,y\int_a^0\frac{1}{s-\zeta}\,\frac{1}{\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]}\sum_{k=0}^{\infty}\frac{F_{2,k}(y)}{\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}}\,d\mu(\zeta)=-\,y\sum_{k=0}^{\infty}F_{2,k}(y)\int_a^0\frac{d\mu(\zeta)}{(s-\zeta)\,\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}}. \quad\text{(A1)}$$
Because $W_2$ is symmetric with respect to the real axis $\mathbb{R}$ and lies in the left half of the complex plane, $\psi_2$ monotonically maps $(-\infty,-1)$ onto $(-\infty,\min\{\mathbb{R}\cap W_2\})$ and $(1,+\infty)$ onto $(\max\{\mathbb{R}\cap W_2\},+\infty)$. We then have the following properties for $\psi_2$ and $\phi_2$:
$$|\phi_2(\zeta)|\ge c_3|\zeta| \quad\text{and}\quad \big|\psi_2'[\phi_2(\zeta)]\big|\ge c_4,\qquad \zeta\in\mathbb{R}\setminus W_2,$$
$$|\phi_2(\zeta)|\ge|\phi_2(z_0)|,\qquad \zeta\in(-\infty,z_0)\ \text{for}\ z_0\in(-\infty,\min\{\mathbb{R}\cap W_2\}),$$
$$|\phi_2(\zeta)|\ge|\phi_2(z_0)|,\qquad \zeta\in(z_0,\infty)\ \text{for}\ z_0\in(\max\{\mathbb{R}\cap W_2\},+\infty),$$
where c 3 , c 4 > 0 are constants independent of ζ .
It follows that the integral in the last expression of (A1) satisfies
$$\left|\int_a^0\frac{d\mu(\zeta)}{(s-\zeta)\,\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}}\right|\le\int_a^0\frac{d\mu(\zeta)}{\left|(s-\zeta)\,\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}\right|}\le\int_a^0\frac{d\mu(\zeta)}{|s-\zeta|\,c_4\,c_3\left|\frac{s}{\zeta-s}\right|\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|^{k}}\le\frac{c_5}{\displaystyle\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|^{k}}.$$
Note that for any $\zeta\in(-\infty,0)$ it is the case that $\frac{s}{\zeta-s}\in(-\infty,-1)\cup(0,\infty)\subset\bar{\mathbb{C}}\setminus W_2$. It follows that $\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|>1$.
We note that the map $f(z)=\frac{s}{z-s}$ ($s<0$) from $W_1$ to $W_2$ can be written as a composition of three maps $f=f_3\circ f_2\circ f_1$, where $f_1(z)=z-s$, $f_2(z)=1/z$, and $f_3(z)=sz$, all of which are bijective. We denote $\partial W_1$ and $\partial W_2$ as the boundaries of $W_1$ and $W_2$, respectively. Let $\Gamma_1$ be the image of $\partial W_1$ under $f_1$ and let $\Gamma_2$ be the image of $\Gamma_1$ under $f_2$ (which is the preimage of $\partial W_2$ under $f_3$). Because $W_1$ is the numerical range of $A$, it is convex and compact per the Hausdorff–Toeplitz theorem; therefore, $\partial W_1$ has a boundary rotation of $2\pi$. As $f_1$ translates $\partial W_1$ horizontally in the direction of the positive real axis, it preserves the shape of the preimage, so that $\Gamma_1$ is of bounded rotation as well. In addition, it can be shown that $f_2(z)=1/z$ maps $\Gamma_1$ to $\Gamma_2$ with bounded rotation (see details in Lemma A1 below). Finally, because $f_3(z)=sz$ preserves the shape of the preimage, $\partial W_2$ has bounded rotation. From Chapter IX, Section 3, Theorem 11 in [52], we have $\max_{y\in W_2}|F_{2,k}(y)|\le c_6$, and the following inequality holds for $y\in W_2$ and $z\notin W_2$:
$$\left|\sum_{k=m}^{\infty}\frac{F_{2,k}(y)}{\phi_2(z)^{k}}\right|\le c_6\sum_{k=m}^{\infty}|\phi_2(z)|^{-k}=\frac{c_7}{|\phi_2(z)|^{m}},$$
where c 7 is some real positive constant independent of m.
In light of the above observations, if we denote
$$\gamma_{2,k}=-\,y\int_a^0\frac{d\mu(\zeta)}{(s-\zeta)\,\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}},$$
then
$$\left|f_2(z)-\sum_{k=0}^{m-1}\gamma_{2,k}F_{2,k}\!\left(\frac{1}{z/s-1}\right)\right|=\left|y\sum_{k=m}^{\infty}F_{2,k}(y)\int_a^0\frac{d\mu(\zeta)}{(s-\zeta)\,\psi_2'\!\left[\phi_2\!\left(\frac{s}{\zeta-s}\right)\right]\phi_2\!\left(\frac{s}{\zeta-s}\right)^{k+1}}\right|\le|y|\,c_5\left|\sum_{k=m}^{\infty}\frac{F_{2,k}(y)}{\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|^{k}}\right|\le\frac{|y|\,c_7c_5}{\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|^{m}}\le\frac{c_2}{\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|^{m}},$$
where $c_2=c_7c_5$ is an upper bound of $|y|\,c_7c_5$ because $y=\frac{1}{z/s-1}\in[-1,0)$.
For $\zeta\in(a,0)$, there are two cases for deriving the minimum of $\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|$.
Case 1: if $a\ge s$, then $\frac{s}{\zeta-s}\in\left[\frac{s}{a-s},-1\right]$. Because $W_1$ lies strictly in the right half of the plane, by definition $W_2$ lies between the vertical lines $\mathrm{Re}(z)=0$ and $\mathrm{Re}(z)=-1$; thus, $\min\{\mathbb{R}\cap W_2\}\ge-1$ and we have
$$\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|=|\phi_2(-1)|.$$
Case 2: if $a<s$, then $\frac{s}{\zeta-s}\in(-\infty,-1]\cup\left[\frac{s}{a-s},+\infty\right)$; clearly, $\frac{s}{a-s}\ge0$, and because $\max\{\mathbb{R}\cap W_2\}\le0$, we have
$$\min_{\zeta\in(a,0)}\left|\phi_2\!\left(\frac{s}{\zeta-s}\right)\right|=\min\left\{|\phi_2(-1)|,\ \left|\phi_2\!\left(\frac{s}{a-s}\right)\right|\right\}=|\phi_2(z_0)|.$$
Note that the conclusion of Case 2 is valid for Case 1; therefore,
$$\left|f_2(z)-\sum_{k=0}^{m-1}\gamma_{2,k}F_{2,k}\!\left(\frac{1}{z/s-1}\right)\right|\le\frac{c_2}{|\phi_2(z_0)|^{m}}.$$
Lemma A1.
Define the mapping $f:\Gamma_1\to\Gamma_2$, $f(z)=\frac{1}{z}$, where $\Gamma_1\subset\mathbb{C}$ is the boundary of a compact convex domain that lies strictly in the right half of the complex plane and is symmetric with respect to the real axis. Let $\Gamma_1$ be the image of the interval $[0,2\pi)$ under the injection $\gamma(t)$, which is assumed to be absolutely continuous. Then, $\Gamma_2$ has bounded rotation.
Proof. 
We define $I_1\subset[0,2\pi)$ as the subset where $\gamma'(t)$ is continuous. Note that the directional angle of the tangent line to $\gamma(t)$ at $t$ is $\theta(t)=\arg[\gamma'(t)]$. The boundary rotation of $\Gamma_1$ is defined on page 270 of reference [58] as follows:
$$\mathrm{BD}(\Gamma_1)=\int_0^{2\pi}|d\theta(t)|=\int_{I_1}|d\theta(t)|+\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\theta(t)|=2\pi,$$
which is due to the convexity of the domain enclosed by Γ 1 .
Note that $\Gamma_2$ is the image of the interval $[0,2\pi)$ under the injection $(f\circ\gamma)(t)$. Similarly, the directional angle of the tangent line to $(f\circ\gamma)(t)$ at $t$ is
$$\phi(t)=\arg[(f\circ\gamma)'(t)]=\arg[f'(\gamma(t))\,\gamma'(t)]=\arg[f'(\gamma(t))]+\arg[\gamma'(t)]=\arg[f'(\gamma(t))]+\theta(t).$$
Because $f$ is a conformal mapping that preserves angles, the variation of the directional angle at the discontinuities of $\gamma'(t)$ (if any) is preserved. This can be written as $\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\phi(t)|=\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\theta(t)|$. The boundary rotation of $\Gamma_2$ can be written as
$$\mathrm{BD}(\Gamma_2)=\int_{I_1}|d\phi(t)|+\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\phi(t)|=\int_{I_1}|d\phi(t)|+\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\theta(t)|\le\int_{I_1}\big|d\arg[f'(\gamma(t))]\big|+\int_{I_1}|d\theta(t)|+\int_{\gamma^{-1}(\Gamma_1)\setminus I_1}|d\theta(t)|=\int_{I_1}\big|d\arg[f'(\gamma(t))]\big|+\mathrm{BD}(\Gamma_1),$$
where
$$\big|d\arg[f'(\gamma(t))]\big|=\big|d\arg[1/\gamma^2(t)]\big|=2\,\big|d\arg[\gamma(t)]\big|.$$
Because $\Gamma_1$ is the boundary of a compact convex domain, lies strictly in the right half of the complex plane, and is symmetric with respect to the real axis, there exists $\theta_0\in\left(0,\frac{\pi}{2}\right)$ such that
$$\max_{t\in[0,2\pi)}\arg[\gamma(t)]=\theta_0,\qquad \min_{t\in[0,2\pi)}\arg[\gamma(t)]=-\theta_0,$$
and we can select $t_+,t_-\in[0,2\pi)$ such that
$$\arg[\gamma(t_+)]=\theta_0,\qquad \arg[\gamma(t_-)]=-\theta_0.$$
Note that while these may not be unique, they split $\Gamma_1$ into two disjoint continuous branches, denoted as $\Gamma_1^{+}$ and $\Gamma_1^{-}$, on both of which $\arg[\gamma(t)]$ is monotonic (though not necessarily strictly) with respect to $t$. It follows that
$$\int_{I_1}\big|d\arg[f'(\gamma(t))]\big|=2\int_{I_1}\big|d\arg[\gamma(t)]\big|=2\left(\int_{\gamma(t)\in\Gamma_1^{+}}\big|d\arg[\gamma(t)]\big|+\int_{\gamma(t)\in\Gamma_1^{-}}\big|d\arg[\gamma(t)]\big|\right)=2(2\theta_0+2\theta_0)=8\theta_0,$$
because the two end points of $\Gamma_1^{+}$ and $\Gamma_1^{-}$ are $\gamma(t_+)$ and $\gamma(t_-)$, respectively. Here, $\theta_0<\frac{\pi}{2}$; thus, we have
$$\int_{I_1}\big|d\arg[f'(\gamma(t))]\big|<4\pi.$$
Moreover, because BD ( Γ 1 ) = 2 π , we have BD ( Γ 2 ) < 4 π + 2 π = 6 π . The claim is established. □
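The bound $\mathrm{BD}(\Gamma_2)\le\mathrm{BD}(\Gamma_1)+8\theta_0<6\pi$ can also be probed numerically. The short script below is our own sanity check (not part of the proof), with an ellipse chosen arbitrarily as $\Gamma_1$; it approximates the boundary rotation of a smooth closed curve by the total variation of its tangent angle on a fine grid and compares $\mathrm{BD}(\Gamma_2)$ with $2\pi+8\theta_0$.

```python
import numpy as np

# Gamma_1: an ellipse centered at c > 0 lying strictly in the right half-plane
c, rx, ry = 3.0, 2.0, 1.0
t = np.linspace(0.0, 2.0 * np.pi, 200_000, endpoint=False)
gamma = c + rx * np.cos(t) + 1j * ry * np.sin(t)      # gamma(t), a parametrization of Gamma_1
dgamma = -rx * np.sin(t) + 1j * ry * np.cos(t)        # gamma'(t)

def boundary_rotation(dz):
    """Total variation of the tangent angle along a smooth closed curve (grid approximation)."""
    ang = np.unwrap(np.angle(dz))
    return np.abs(np.diff(ang)).sum()

# Gamma_2 = f(Gamma_1) with f(z) = 1/z, so (f o gamma)'(t) = -gamma'(t) / gamma(t)**2
d_image = -dgamma / gamma**2

theta0 = np.abs(np.angle(gamma)).max()                # max_t |arg gamma(t)| < pi/2
print("BD(Gamma_1) / pi ≈", boundary_rotation(dgamma) / np.pi)   # ≈ 2 for a convex curve
print("BD(Gamma_2) / pi ≈", boundary_rotation(d_image) / np.pi)  # stays below 6
print("(2*pi + 8*theta0) / pi =", (2 * np.pi + 8 * theta0) / np.pi)
```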

Appendix B. Proof of Lemma 3

We first define $t=\min\{M,N_1,N_2\}$ and $T=\max_{a\le0}t$. For a fixed pole $s<0$, $M$, $N_1$, and $N_2$ are functions of the variable $a$; specifically, $M$ is a linear function of $a$, $N_1$ is a constant independent of $a$, and $N_2$ is linear in the reciprocal of a shifted value of $a$ (see (16)). Here, we are interested in their absolute values.
We can set up a Cartesian coordinate system to illustrate $M$, $N_1$, and $N_2$ as functions of $a$ and compare their values. The horizontal asymptote of the function $N_2$ is $\hat{c}/\hat{d}$. First, we need to compare this with $N_1$.
Case 1: $\left|\hat{c}/\hat{d}\right|\ge\left|1+\hat{c}/\hat{d}\right|$, i.e., $s\in\left(-\infty,-\sqrt{\alpha\beta}\,\right]$.
The illustration is shown in Figure A1. Because $N_1$ is constant and $t=\min\{M,N_1,N_2\}\le N_1$, we have $T\le N_1$; on the other hand, for $a\to-\infty$ we have $N_1<N_2<M$, so $t=N_1$ and thus $T\ge N_1$. To sum up, in this case we have $T=N_1=1+\frac{\hat{c}}{\hat{d}}=\frac{\alpha+\beta-2\alpha\beta/s}{\beta-\alpha}$.
Figure A1. Sketch map of Case 1 in proof of Lemma 3.
Case 2: $\left|\hat{c}/\hat{d}\right|<\left|1+\hat{c}/\hat{d}\right|$, i.e., $s\in\left(-\sqrt{\alpha\beta},\,0\right)$.
There is an intersection between $N_1$ and $N_2$ for $a<s$, which we denote as $p_1$. Solving the equation $N_1=N_2$ for $a$, we obtain $a=s\left(\frac{1}{2\hat{c}+1}+1\right)$. Note that $p_1$ can be either above or below the line of $M$.
First, if $p_1$ is below or on the line of $M$, then
$$\frac{c-s\left(\frac{1}{2\hat{c}+1}+1\right)}{d}\ \ge\ 1+\frac{\hat{c}}{\hat{d}}\quad\Longleftrightarrow\quad(\alpha+\beta)s^{3}-3\alpha\beta s^{2}+\alpha^{2}\beta^{2}\le0.$$
Letting $\theta(s)=(\alpha+\beta)s^{3}-3\alpha\beta s^{2}+\alpha^{2}\beta^{2}$, we need to find the interval of $s$ such that $\theta(s)\le0$. Here, $\theta(s)$ is a cubic function and its two stationary points have positive function values; thus, it has only one real root. We can use the Cardano formula [59] to obtain this real root:
$$s_0=-\frac{\alpha\beta}{\left(\kappa^{1/6}+\kappa^{-1/6}\right)\sqrt{\alpha\beta}}=-\frac{\sqrt{\alpha\beta}}{\kappa^{1/6}+\kappa^{-1/6}}.$$
Therefore, for $s\in\left[-\sqrt{\alpha\beta},\,s_0\right]$ we have $\theta(s)\le0$.
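As a quick sanity check on this closed-form root (our own verification, not part of the proof; the values of α and β below are arbitrary), one can compare it with the roots of θ computed numerically: for any 0 < α < β the cubic has exactly one real root, and that root lies in (−√(αβ), 0).

```python
import numpy as np

alpha, beta = 1.0, 1.0e4                  # arbitrary test values with 0 < alpha < beta
kappa = beta / alpha                      # ratio used in the closed-form expression below

# theta(s) = (alpha + beta) s^3 - 3 alpha beta s^2 + alpha^2 beta^2
roots = np.roots([alpha + beta, -3.0 * alpha * beta, 0.0, alpha**2 * beta**2])
real_roots = roots[np.abs(roots.imag) < 1e-6].real
assert real_roots.size == 1               # the cubic has a single real root

s0 = -np.sqrt(alpha * beta) / (kappa ** (1 / 6) + kappa ** (-1 / 6))
print(real_roots[0], s0)                  # the numerical root and the closed form agree
assert -np.sqrt(alpha * beta) < s0 < 0.0  # and s0 lies in (-sqrt(alpha*beta), 0)
```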
We denote the intersection between $M$ and $N_2$ as $p_2$. As shown in Figure A2, when $a$ is between $p_1$ and $p_2$, $T=N_1=1+\frac{\hat{c}}{\hat{d}}=\frac{\alpha+\beta-2\alpha\beta/s}{\beta-\alpha}$.
Figure A2. Sketch map of Case 2 in proof of Lemma 3: (a) when p 1 is below or on the line of M and (b) when p 1 is above the line of M.
Second, if $p_1$ is above the line of $M$, then $s\in(s_0,0]$; see Figure A2.
When M = N 2 , we have
$$\frac{-a+c}{d}=\frac{s}{a-s}\cdot\frac{\hat{c}}{\hat{d}}\quad\Longleftrightarrow\quad a^{2}-2sa+s(\alpha+\beta)-\alpha\beta=0.$$
We only need the root with $a\le0$, which is $a=s-\sqrt{(\alpha-s)(\beta-s)}$. Therefore,
$$\min\{M,N_1,N_2\}=\begin{cases}N_2, & a\in\left(-\infty,\ s-\sqrt{(\alpha-s)(\beta-s)}\right),\\ M, & a\in\left[\,s-\sqrt{(\alpha-s)(\beta-s)},\ 0\,\right].\end{cases}$$
It is clear from Figure A2 that the maximum of $t$ is achieved at point $p_2$, that is, when $a=s-\sqrt{(\alpha-s)(\beta-s)}$,
$$T=\frac{c-s+\sqrt{(\alpha-s)(\beta-s)}}{d}=\frac{\alpha+\beta-2s+2\sqrt{(\alpha-s)(\beta-s)}}{\beta-\alpha}.$$

References

  1. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2008; pp. xx+425. [Google Scholar]
  2. Schilders, W.H.A.; van der Vorst, H.A.; Rommes, J. (Eds.) Model Order Reduction: Theory, Research Aspects and Applications; Mathematics in Industry; Springer: Berlin/Heidelberg, Germany, 2008; Volume 13, pp. xii+471. [Google Scholar]
  3. Grimm, V.; Hochbruck, M. Rational approximation to trigonometric operators. BIT 2008, 48, 215–229. [Google Scholar] [CrossRef]
  4. Burrage, K.; Hale, N.; Kay, D. An efficient implicit FEM scheme for fractional-in-space reaction-diffusion equations. SIAM J. Sci. Comput. 2012, 34, A2145–A2172. [Google Scholar] [CrossRef]
  5. Bloch, J.C.; Heybrock, S. A nested Krylov subspace method to compute the sign function of large complex matrices. Comput. Phys. Commun. 2011, 182, 878–889. [Google Scholar] [CrossRef]
  6. Kressner, D.; Tobler, C. Krylov subspace methods for linear systems with tensor product structure. SIAM J. Matrix Anal. Appl. 2010, 31, 1688–1714. [Google Scholar] [CrossRef]
  7. Hochbruck, M.; Ostermann, A. Exponential Runge-Kutta methods for parabolic problems. Appl. Numer. Math. 2005, 53, 323–339. [Google Scholar] [CrossRef]
  8. Hochbruck, M.; Lubich, C. Exponential integrators for quantum-classical molecular dynamics. BIT 1999, 39, 620–645. [Google Scholar] [CrossRef]
  9. Bai, Z. Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems. Appl. Numer. Math. 2002, 43, 9–44. [Google Scholar] [CrossRef]
  10. Freund, R.W. Krylov-subspace methods for reduced-order modeling in circuit simulation. J. Comput. Appl. Math. 2000, 123, 395–421. [Google Scholar] [CrossRef]
  11. Wang, S.; de Sturler, E.; Paulino, G.H. Large-scale topology optimization using preconditioned Krylov subspace methods with recycling. Internat. J. Numer. Methods Engrg. 2007, 69, 2441–2468. [Google Scholar] [CrossRef]
  12. Biros, G.; Ghattas, O. Parallel Lagrange-Newton-Krylov-Schur methods for PDE-constrained optimization. I. The Krylov-Schur solver. SIAM J. Sci. Comput. 2005, 27, 687–713. [Google Scholar] [CrossRef]
  13. Simoncini, V. Computational methods for linear matrix equations. SIAM Rev. 2016, 58, 377–441. [Google Scholar] [CrossRef]
  14. Druskin, V.; Knizhnerman, L. Extended Krylov subspaces: Approximation of the matrix square root and related functions. SIAM J. Matrix Anal. Appl. 1998, 19, 755–771. [Google Scholar] [CrossRef]
  15. Bergamaschi, L.; Caliari, M.; Martínez, A.; Vianello, M. Comparing Leja and Krylov approximations of large scale matrix exponentials. In Proceedings of the Computational Science–ICCS 2006, Reading, UK, 28–31 May 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 685–692. [Google Scholar]
  16. Knizhnerman, L.; Simoncini, V. Convergence analysis of the extended Krylov subspace method for the Lyapunov equation. Numer. Math. 2011, 118, 567–586. [Google Scholar] [CrossRef]
  17. Eiermann, M.; Ernst, O.G. A restarted Krylov subspace method for the evaluation of matrix functions. SIAM J. Numer. Anal. 2006, 44, 2481–2504. [Google Scholar] [CrossRef]
  18. Frommer, A.; Simoncini, V. Stopping criteria for rational matrix functions of Hermitian and symmetric matrices. SIAM J. Sci. Comput. 2008, 30, 1387–1412. [Google Scholar] [CrossRef]
  19. Popolizio, M.; Simoncini, V. Acceleration techniques for approximating the matrix exponential operator. SIAM J. Matrix Anal. Appl. 2008, 30, 657–683. [Google Scholar] [CrossRef]
  20. Bai, Z.Z.; Miao, C.Q. Computing eigenpairs of Hermitian matrices in perfect Krylov subspaces. Numer. Algorithms 2019, 82, 1251–1277. [Google Scholar] [CrossRef]
  21. Knizhnerman, L.; Simoncini, V. A new investigation of the extended Krylov subspace method for matrix function evaluations. Numer. Linear Algebra Appl. 2010, 17, 615–638. [Google Scholar] [CrossRef]
  22. Simoncini, V. A new iterative method for solving large-scale Lyapunov matrix equations. SIAM J. Sci. Comput. 2007, 29, 1268–1288. [Google Scholar] [CrossRef]
  23. Güttel, S.; Knizhnerman, L. A black-box rational Arnoldi variant for Cauchy-Stieltjes matrix functions. BIT 2013, 53, 595–616. [Google Scholar] [CrossRef]
  24. Druskin, V.; Simoncini, V. Adaptive rational Krylov subspaces for large-scale dynamical systems. Syst. Control Lett. 2011, 60, 546–560. [Google Scholar] [CrossRef]
  25. Benzi, M.; Simoncini, V. Decay bounds for functions of Hermitian matrices with banded or Kronecker structure. SIAM J. Matrix Anal. Appl. 2015, 36, 1263–1282. [Google Scholar] [CrossRef]
  26. Beckermann, B.; Reichel, L. Error estimates and evaluation of matrix functions via the Faber transform. SIAM J. Numer. Anal. 2009, 47, 3849–3883. [Google Scholar] [CrossRef]
  27. Xu, S.; Xue, F. Inexact rational Krylov subspace methods for approximating the action of functions of matrices. Electron. Trans. Numer. Anal. 2023, 58, 538–567. [Google Scholar] [CrossRef]
  28. Beckermann, B.; Güttel, S. Superlinear convergence of the rational Arnoldi method for the approximation of matrix functions. Numer. Math. 2012, 121, 205–236. [Google Scholar] [CrossRef]
  29. Hochbruck, M.; Lubich, C. On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 1997, 34, 1911–1925. [Google Scholar] [CrossRef]
  30. Druskin, V.; Knizhnerman, L. Krylov subspace approximation of eigenpairs and matrix functions in exact and computer arithmetic. Numer. Linear Algebra Appl. 1995, 2, 205–217. [Google Scholar] [CrossRef]
  31. Druskin, V.; Knizhnerman, L.; Zaslavsky, M. Solution of large scale evolutionary problems using rational Krylov subspaces with optimized shifts. SIAM J. Sci. Comput. 2009, 31, 3760–3780. [Google Scholar] [CrossRef]
  32. Druskin, V.; Knizhnerman, L.; Simoncini, V. Analysis of the rational Krylov subspace and ADI methods for solving the Lyapunov equation. SIAM J. Numer. Anal. 2011, 49, 1875–1898. [Google Scholar] [CrossRef]
  33. Berljafa, M.; Güttel, S. Generalized rational Krylov decompositions with an application to rational approximation. SIAM J. Matrix Anal. Appl. 2015, 36, 894–916. [Google Scholar] [CrossRef]
  34. Jagels, C.; Reichel, L. Recursion relations for the extended Krylov subspace method. Linear Algebra Appl. 2011, 434, 1716–1732. [Google Scholar] [CrossRef]
  35. Jagels, C.; Reichel, L. The extended Krylov subspace method and orthogonal Laurent polynomials. Linear Algebra Appl. 2009, 431, 441–458. [Google Scholar] [CrossRef]
  36. Ruhe, A. Rational Krylov sequence methods for eigenvalue computation. Linear Algebra Appl. 1984, 58, 391–405. [Google Scholar] [CrossRef]
  37. Ruhe, A. Rational Krylov algorithms for nonsymmetric eigenvalue problems. In Recent Advances in Iterative Methods; The IMA Volumes in Mathematics and Its Applications; Springer: New York, NY, USA, 1994; Volume 60, pp. 149–164. [Google Scholar]
  38. Güttel, S. Rational Krylov approximation of matrix functions: Numerical methods and optimal pole selection. GAMM-Mitt. 2013, 36, 8–31. [Google Scholar] [CrossRef]
  39. Bagby, T. The modulus of a plane condenser. J. Math. Mech. 1967, 17, 315–329. [Google Scholar] [CrossRef]
  40. Gončar, A.A. The problems of E. I. Zolotarev which are connected with rational functions. Mat. Sb. 1969, 78, 640–654. [Google Scholar]
  41. Caliari, M.; Vianello, M.; Bergamaschi, L. Interpolating discrete advection-diffusion propagators at Leja sequences. J. Comput. Appl. Math. 2004, 172, 79–99. [Google Scholar] [CrossRef]
  42. Caliari, M.; Vianello, M.; Bergamaschi, L. The LEM exponential integrator for advection-diffusion-reaction equations. J. Comput. Appl. Math. 2007, 210, 56–63. [Google Scholar] [CrossRef]
  43. Druskin, V.; Lieberman, C.; Zaslavsky, M. On adaptive choice of shifts in rational Krylov subspace reduction of evolutionary problems. SIAM J. Sci. Comput. 2010, 32, 2485–2496. [Google Scholar] [CrossRef]
  44. Ruhe, A. Rational Krylov: A practical algorithm for large sparse nonsymmetric matrix pencils. SIAM J. Sci. Comput. 1998, 19, 1535–1551. [Google Scholar] [CrossRef]
  45. Botchev, M.A.; Grimm, V.; Hochbruck, M. Residual, restarting, and Richardson iteration for the matrix exponential. SIAM J. Sci. Comput. 2013, 35, A1376–A1397. [Google Scholar] [CrossRef]
  46. Saad, Y. Analysis of some Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 1992, 29, 209–228. [Google Scholar] [CrossRef]
  47. van den Eshof, J.; Hochbruck, M. Preconditioning Lanczos approximations to the matrix exponential. SIAM J. Sci. Comput. 2006, 27, 1438–1457. [Google Scholar] [CrossRef]
  48. Björck, A.; Golub, G.H. Numerical methods for computing angles between linear subspaces. Math. Comp. 1973, 27, 579–594. [Google Scholar] [CrossRef]
  49. De Sturler, E. Truncation strategies for optimal Krylov subspace methods. SIAM J. Numer. Anal. 1999, 36, 864–889. [Google Scholar] [CrossRef]
  50. Elman, H.C.; Su, T. Low-rank solution methods for stochastic eigenvalue problems. SIAM J. Sci. Comput. 2019, 41, A2657–A2680. [Google Scholar] [CrossRef]
  51. Bak, J.; Newman, D.J. Complex Analysis, 3rd ed.; Undergraduate Texts in Mathematics; Springer: New York, NY, USA, 2010; pp. xii+328. [Google Scholar]
  52. Suetin, P.K. Series of Faber Polynomials; Analytical Methods and Special Functions; Gordon and Breach Science Publishers: Amsterdam, The Netherlands, 1998; Volume 1, pp. xx+301. [Google Scholar]
  53. Crouzeix, M. Numerical range and functional calculus in Hilbert space. J. Funct. Anal. 2007, 244, 668–690. [Google Scholar] [CrossRef]
  54. Driscoll, T.A. Algorithm 843: Improvements to the Schwarz-Christoffel toolbox for MATLAB. ACM Trans. Math. Softw. 2005, 31, 239–251. [Google Scholar] [CrossRef]
  55. Stewart, G.W. A Krylov-Schur algorithm for large eigenproblems. SIAM J. Matrix Anal. Appl. 2001, 23, 601–614. [Google Scholar] [CrossRef]
  56. Frommer, A.; Güttel, S.; Schweitzer, M. Convergence of restarted Krylov subspace methods for Stieltjes functions of matrices. SIAM J. Matrix Anal. Appl. 2014, 35, 1602–1624. [Google Scholar] [CrossRef]
  57. Curtiss, J.H. Faber polynomials and the Faber series. Am. Math. Mon. 1971, 78, 577–596. [Google Scholar] [CrossRef]
  58. Duren, P.L. Univalent Functions; Fundamental Principles of Mathematical Sciences; Springer: New York, NY, USA, 1983; Volume 259, pp. xiv+382. [Google Scholar]
  59. Bronshtein, I.N.; Semendyayev, K.A.; Musiol, G.; Mühlig, H. Handbook of Mathematics, 6th ed.; Springer: Berlin/Heidelberg, Germany, 2015; pp. xliv+1207. [Google Scholar]
Figure 1. Actual and asymptotic convergence of EKSM, F -EKSM, and F -EKSM* for SPD matrices: (a) f 1 ( A 1 ) b , κ = 10 8 and (b) f 2 ( A 2 ) b , κ = 4.799 × 10 8 .
Figure 2. Spectra of several matrices: (a) spectrum of A 3 ; (b) spectrum of A 3 1 ; (c) spectrum of ( I A 3 / s 3 * ) 1 ; (d) spectrum of A 4 ; (e) spectrum of A 4 1 ; (f) spectrum of ( I A 4 / s 4 * ) 1 .
Figure 3. Performance and theoretical asymptotic convergence of EKSM and F -EKSM for nonsymmetric matrices with elliptic spectra: (a) approximating f 1 ( A 3 ) b ; (b) approximating f 4 ( A 4 ) b .
Figure 4. Convergence of F -EKSM for different setting of single poles: (a) f 1 on Lap. B and (b) f 3 on Lap. C.
Figure 5. Decay of sin ( q k , q k + 1 ) as the approximation space dimension increases: (a) f 1 on aerofoilA; (b) f 2 on matRe500E; (c) f 3 on plate; (d) f 4 on step.
Table 1. Convergence factor of EKSM and F -EKSM for matrices with different elliptical numerical ranges ( α = 1 ).
η: 1, 1, 0.75, 0.75, 0.5, 0.5, 0.25, 0.25, 0, 0
β: 10^2, 10^4, 10^2, 10^4, 10^2, 10^4, 10^2, 10^4, 10^2, 10^4
ρ*_FEK: 0.37, 0.65, 0.46, 0.84, 0.53, 0.87, 0.59, 0.89, 0.64, 0.91
ρ_EK: 0.52, 0.82, 0.63, 0.95, 0.71, 0.96, 0.77, 0.97, 0.82, 0.98
log ρ*/log ρ: 1.53, 2.18, 1.69, 3.45, 1.86, 3.77, 2.02, 4.11, 2.17, 4.46
s*: 3.82, 20.6, 2.96, 11.0, 2.92, 14.5, 3.27, 17.6, 3.83, 20.9
Table 2. Bounds on the asymptotic convergence factor for EKSM, F -EKSM, and F -EKSM* with optimal pole s * .
κ: 10, 10^2, 10^3, 10^4, 10^5, 10^6, 10^8, 10^10
ρ*_FEK: 0.1896, 0.3660, 0.5195, 0.6455, 0.7440, 0.8182, 0.9113, 0.9578
ρ̃*_FEK*: 0.0537, 0.1853, 0.3435, 0.4945, 0.6235, 0.7265, 0.8628, 0.9339
ρ_EK: 0.2801, 0.5195, 0.6980, 0.8182, 0.8935, 0.9387, 0.9802, 0.9937
ρ_SI: 0.4365, 0.6076, 0.7370, 0.8333, 0.8989, 0.9405, 0.9804, 0.9937
log ρ*/log ρ_EK: 1.3069, 1.5349, 1.8218, 2.1814, 2.6264, 3.1718, 4.6448, 6.8140
log ρ*/log ρ̃*: 0.5686, 0.5962, 0.6128, 0.6216, 0.6260, 0.6281, 0.6296, 0.6299
s*: 1.4714, 3.8188, 9.0909, 20.589, 45.4370, 99.010, 463.16, 2153.4
s̃*: 0.6058, 1.5527, 3.6568, 8.2269, 18.0917, 39.3540, 183.87, 854.7
Table 3. Performance of EKSM, F -EKSM, F -EKSM*, and adaptive RKSM on SPD problems.
Function | Problem | Time (s): EK, FEK, FEK*, ARK | Space dimension: EK, FEK, FEK*, ARK
f1 | Lap. A | 0.28, 0.19, 0.23, 0.70 | 52, 42, 46, 22
f1 | Lap. B | 1.10, 0.70, 0.88, 3.70 | 76, 52, 60, 25
f1 | Lap. C | 7.35, 4.49, 5.21, 19.30 | 102, 66, 76, 28
f1 | Lap. D | 43.36, 24.52, 27.82, 100.96 | 138, 84, 96, 30
f2 | Lap. A | 0.18, 0.20, 0.18, 0.94 | 32, 48, 34, 28
f2 | Lap. B | 0.48, 0.87, 0.63, 4.32 | 30, 60, 40, 32
f2 | Lap. C | 2.66, 5.55, 3.62, 27.72 | 38, 80, 54, 38
f2 | Lap. D | 10.44, 27.60, 18.10, 135.63 | 36, 96, 64, 39
f3 | Lap. A | 0.29, 0.22, 0.26, 0.69 | 52, 42, 46, 22
f3 | Lap. B | 1.17, 0.71, 0.88, 3.26 | 76, 52, 60, 25
f3 | Lap. C | 7.55, 4.49, 5.28, 19.36 | 102, 66, 76, 28
f3 | Lap. D | 43.15, 23.81, 27.61, 101.28 | 138, 84, 96, 30
f4 | Lap. A | 0.21, 0.18, 0.20, 0.71 | 50, 38, 42, 23
f4 | Lap. B | 0.88, 0.60, 0.74, 3.55 | 66, 46, 54, 27
f4 | Lap. C | 6.29, 3.90, 4.63, 20.71 | 90, 58, 68, 30
f4 | Lap. D | 36.09, 19.76, 23.76, 114.46 | 120, 72, 84, 33
f5 | Lap. A | 0.32, 0.17, 0.24, 0.67 | 48, 36, 42, 21
f5 | Lap. B | 1.04, 0.65, 0.77, 3.20 | 66, 46, 52, 24
f5 | Lap. C | 6.28, 3.81, 4.53, 18.60 | 88, 56, 66, 27
f5 | Lap. D | 35.06, 19.34, 23.90, 97.79 | 116, 70, 84, 29
Table 4. Selected features of the test problems.
Problem | Size n | λ_sm | λ_lm | Re(λ_sr) | Re(λ_lr) | Im(λ_li) | s* | Tol
aerofoilA16,388 8.41 × 10 2 1.02 × 10 3 1.04 × 10 2 1.01 × 10 3 8.12 × 10 2 1.185 10 11
aerofoilB23,560 2.73 × 10 1 2.63 × 10 2 2.73 × 10 1 1.43 × 10 2 2.60 × 10 2 2.447 10 13
matRe500A3595 4.00 × 10 1 1.21 × 10 2 2.01 × 10 1 1.21 × 10 2 1.12 × 10 2 1.805 10 12
matRe500B9391 3.00 × 10 1 5.22 × 10 2 4.04 × 10 1 5.22 × 10 2 1.98 × 10 2 3.331 10 12
matRe500C22,385 2.78 × 10 1 1.56 × 10 3 4.49 × 10 1 1.56 × 10 3 2.70 × 10 2 4.666 10 12
matRe500D50,527 2.84 × 10 1 5.11 × 10 3 3.28 × 10 1 5.11 × 10 3 3.76 × 10 2 7.163 10 12
matRe500E110,620 2.62 × 10 1 1.38 × 10 4 3.07 × 10 1 1.38 × 10 4 5.37 × 10 2 9.548 10 12
obstacle37,168 2.84 × 10 1 2.95 × 10 5 2.91 × 10 2 2.95 × 10 5 1.50 × 10 2 6.273 10 10
plate37,507 1.00 × 10 2 8.71 × 10 4 7.35 × 10 4 8.71 × 10 4 1.32 × 10 2 2.021 10 9
tolosa4000 1.18 × 10 + 1 4.84 × 10 3 1.56 × 10 1 1.45 × 10 3 4.62 × 10 3 6.772 10 7
raefsky321,200 6.63 × 10 6 7.99 × 10 5 6.63 × 10 6 7.99 × 10 5 1.42 × 10 0 0.033 10 6
step96,307 5.87 × 10 3 2.18 × 10 4 5.87 × 10 3 2.18 × 10 4 6.61 × 10 1 0.903 10 9
cavity37,507 7.53 × 10 3 2.51 × 10 7 4.99 × 10 3 2.51 × 10 7 7.16 × 10 2 11.25 10 8
convdiffA146,689 2.37 × 10 + 1 5.69 × 10 2 unknown 5.62 × 10 2 3.73 × 10 2 41.32 10 13
convdiffB146,689 7.88 × 10 + 1 1.35 × 10 3 unknown 3.98 × 10 2 1.33 × 10 3 317.2 10 13
convdiffC146,689 2.14 × 10 + 2 3.84 × 10 3 unknown 3.63 × 10 2 3.83 × 10 3 117.4 10 13
gt01r7980 5.87 × 10 1 1.96 × 10 4 1.33 × 10 1 2.84 × 10 3 1.96 × 10 4 18.09 10 11
venkat62,424 5.50 × 10 5 1.08 × 10 1 5.50 × 10 5 1.08 × 10 1 2.21 × 10 0 0.003 10 11
Table 5. Performance of five rational Krylov subspace methods for the function f 1 ( z ) = z 1 / 2 .
Problem | Time (s): EK, FEK, FEK*, ARK, 4Ps | Space dimension: EK, FEK, FEK*, ARK, 4Ps
aerofoilA 29.59 10.43 13.22 23.36 5 . 63 5042783305494
aerofoilB 6.34 4 . 70 5.34 26.03 5.59 2061381564475
matRe500A 0.98 0.80 0.95 1.60 0 . 77 1401001204167
matRe500B 2.81 2 . 22 2.73 6.21 2.37 1701161424580
matRe500C 7.17 5.39 6.56 17.48 5 . 21 2141461724688
matRe500D 19.06 16.37 16.18 51.95 12 . 80 2642002005092
matRe500E 54.03 45.71 43.07 139.99 30 . 97 3242482365392
obstacle 28.04 14.58 19.74 36.10 10 . 65 37622629050104
plate 25.16 19.91 23.43 51.20 15 . 61 3161982403585
tolosa 1.44 1.14 1.47 0 . 18 0.59 1741541643176
raefsky3 15.79 6 . 36 2848
step 67.41 45.12 52.46 110.92 36 . 01 3641922244069
cavity 173.99 59.17 39.21 19 . 12 93854042134
convdiffA 16 . 75 32.12 19.06 36.87 20.05 108128783961
convdiffB 19.47 23.25 14.80 33.62 14 . 22 118120863967
convdiffC 17.70 21.86 14.18 34.58 13 . 59 112116823963
GT01R 32.07 18.49 8.74 12.58 4 . 64 60047232863151
venkat 60.62 37.67 13.51 30.22 8 . 72 5284002104069
Table 6. Performance of five rational Krylov subspace methods for the function f 2 ( z ) = e z .
Problem | Time (s): EK, FEK, FEK*, ARK, 4Ps | Space dimension: EK, FEK, FEK*, ARK, 4Ps
aerofoilA 36.69 8.55 15.60 22.82 5 . 77 5282503545598
aerofoilB 7.26 4 . 41 5.94 28.71 5.34 2181261685071
matRe500A 1.14 0.88 1.08 1.82 0 . 75 1501101304568
matRe500B 3.43 2 . 37 2.95 6.75 2.65 1841261504988
matRe500C 8.50 5 . 57 7.56 21.21 5.72 22815019257100
matRe500D 22.45 15.13 17.15 57.99 13 . 44 29018621056100
matRe500E 61.06 42.70 49.31 168.76 33 . 15 35023426463104
obstacle 31.66 15.23 22.72 40.76 9 . 16 4082343125683
plate 35.45 24.05 26.59 64.00 15 . 42 3882442684481
tolosa 2.41 2.08 2.22 0 . 32 6.92 21019419846233
raefsky3 42.25 110.27 17.21 8 . 10 5488043196
step 89.53 46.87 61.80 119.95 36 . 88 4342042784482
cavity 164.15 56.39 46.52 16 . 65 82450450102
convdiffA 21.92 31.98 20.91 42.67 20 . 55 126128864464
convdiffB 23.26 25.79 15.45 43.14 14 . 01 130128924864
convdiffC 22.93 23.13 15.55 41.18 13 . 64 128120924664
GT01R 50.13 17.77 10.52 12.87 3 . 24 65044035268115
venkat 60.80 15.39 14.05 33.22 8 . 58 5162282144367
Table 7. Performance of five rational Krylov subspace methods for the function f 3 ( z ) = tanh z z .
Problem | Time (s): EK, FEK, FEK*, ARK, 4Ps | Space dimension: EK, FEK, FEK*, ARK, 4Ps
aerofoilA 55.22 10.67 21.92 22.07 5 . 14 4962363405476
aerofoilB 8.53 4 . 41 6.37 27.94 5.70 2101181584975
matRe500A 1.54 1.09 1.49 1.70 0 . 79 1401041244259
matRe500B 3.91 2 . 52 3.28 6.53 2.68 1721121384680
matRe500C 8.77 5 . 59 7.52 17.73 5.75 2081361744792
matRe500D 24.04 13.09 17.81 55.73 12 . 67 2661502005488
matRe500E 60.38 33.71 45.53 143.02 30 . 38 3221742385492
obstacle 35.95 16.59 23.63 35.95 11 . 04 37622629050104
plate 39.00 25.73 32.25 61.05 17 . 28 36223628242101
tolosa 2.57 1.99 2.30 0 . 19 0.80 1741541643176
raefsky3 75.56 173.31 17.84 9 . 05 55673231100
step 86.80 46.22 62.19 142.57 36 . 27 3821922685177
cavity 45.50 96.82 46.38 20 . 37 37654850138
convdiffA 17 . 06 29.98 19.30 35.20 18.25 108120783751
convdiffB 20.37 23.29 14 . 13 35.40 14.61 118120863967
convdiffC 18.79 21.79 13 . 39 33.17 13.73 112114823863
GT01R 83.07 7.72 15.42 12.26 5 . 42 60624232866151
venkat 2 . 48 5.13 5.08 39.07 8.27 6664645059
Table 8. Performance of five rational Krylov subspace methods for the function f 4 ( z ) = z 1 / 4 .
Problem | Time (s): EK, FEK, FEK*, ARK, 4Ps | Space dimension: EK, FEK, FEK*, ARK, 4Ps
aerofoilA 42.22 10.18 17.33 23.64 5 . 65 5142543445894
aerofoilB 7.24 4 . 51 5.65 28.03 6.04 2041221564987
matRe500A 1.12 0.88 1.08 1.81 0 . 82 1381001204471
matRe500B 3.14 2 . 35 2.88 6.98 2.42 1681141405079
matRe500C 7.63 5 . 17 6.72 20.17 5.27 2101341705487
matRe500D 19.84 14.03 16.77 58.09 12 . 62 2581661985688
matRe500E 53.91 38.53 43.11 167.40 29 . 82 3162062326388
obstacle 30.11 16.19 22.36 47.98 11 . 67 37623429465116
plate 39.41 26.07 32.72 82.89 18 . 73 38825430856121
tolosa 1.87 1.68 1.86 0 . 15 0.68 1641501583068
raefsky3 48.82 103.27 23.58 7 . 60 5267264286
step 91.72 46.64 59.19 168.31 36 . 99 4342042666286
cavity 128.72 81.85 69.43 25 . 41 71257675190
convdiffA 15 . 64 29.43 18.71 36.12 19.66 104118763859
convdiffB 19.18 19.98 14 . 31 37.46 16.27 116110864275
convdiffC 17.22 19.49 13 . 40 36.65 15.15 110106824171
GT01R 51.81 16.50 11.90 13.62 5 . 27 60838833670163
venkat 82.40 25.55 17.10 37.61 8 . 95 5522942264867
Table 9. Performance of five rational Krylov subspace methods for the function f 5 ( z ) = log ( z ) .
Problem | Time (s): EK, FEK, FEK*, ARK, 4Ps | Space dimension: EK, FEK, FEK*, ARK, 4Ps
aerofoilA 40.81 10.75 18.15 23.09 5 . 89 5162563445594
aerofoilB 7.39 4 . 53 5.87 27.81 5.98 2061241564983
matRe500A 1.38 1.01 1.25 1.79 0 . 86 1401021224464
matRe500B 3.45 2 . 45 3.07 6.76 2.60 1681141404880
matRe500C 7.87 5 . 26 7.00 19.28 5.41 2101341705288
matRe500D 20.17 14.70 16.95 50.55 12 . 73 2601741984988
matRe500E 54.72 41.43 43.04 145.27 30 . 49 3182222325492
obstacle 30.77 16.28 22.89 40.04 11 . 73 37623029455112
plate 34.92 25.61 32.11 72.58 18 . 98 36225030249117
tolosa 2.63 1.94 2.15 0 . 23 0.92 1701541603076
raefsky3 17.76 8 . 07 3188
step 90.39 46.88 59.46 137.08 37 . 45 4342042665185
cavity 157.56 80.11 56.39 24 . 96 79658261178
convdiffA 15 . 85 29.75 18.80 35.57 19.58 104120763859
convdiffB 18.64 20.67 13 . 72 35.90 15.65 114112844071
convdiffC 17.59 19.75 13 . 43 33.00 14.42 110108823867
GT01R 47.93 20.42 12.02 12.34 5 . 83 61242633663163
venkat 72.81 28.59 15.32 36.06 8 . 78 5423242184768