1. Introduction
We consider computing an arbitrary singular value of a tensor sum:

$$T := I_n \otimes I_m \otimes A + I_n \otimes B \otimes I_\ell + C \otimes I_m \otimes I_\ell \in \mathbb{R}^{\ell mn \times \ell mn}, \quad (1)$$

where $A \in \mathbb{R}^{\ell \times \ell}$, $B \in \mathbb{R}^{m \times m}$, $C \in \mathbb{R}^{n \times n}$, $I_n$ is the $n \times n$ identity matrix, and the symbol “$\otimes$” denotes the Kronecker product. The tensor sum T arises from a finite difference discretization of three-dimensional constant coefficient partial differential equations (PDEs) defined as follows:

$$-\Delta u + \boldsymbol{a} \cdot \nabla u + cu = g \quad \text{in } \Omega = (0,1) \times (0,1) \times (0,1), \qquad u = 0 \quad \text{on } \partial\Omega, \quad (2)$$

where $\boldsymbol{a} = (a_1, a_2, a_3)^\top$, $\nabla u = (u_x, u_y, u_z)^\top$, and the symbol “$\cdot$” denotes the dot product. If $a_1 = a_2 = a_3 = 0$, then T is symmetric. The matrix T tends to be too large even if A, B, and C are not. Hence it is difficult to compute singular values of T with regard to the memory requirement.
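To make the memory issue concrete, the following minimal NumPy sketch (toy sizes of our choosing, not from the original formulation) forms T of Equation (1) explicitly via Kronecker products; the $(\ell mn)^2$ storage of this explicit form is exactly what the methods below avoid.

```python
import numpy as np

# Minimal sketch (hypothetical toy sizes): forming the tensor sum T of Eq. (1)
# explicitly. Feasible only for tiny l, m, n; for l = m = n = 100 the explicit
# T would be 10^6 x 10^6 and could not be stored.
l, m, n = 3, 4, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((l, l))
B = rng.standard_normal((m, m))
C = rng.standard_normal((n, n))

Il, Im, In = np.eye(l), np.eye(m), np.eye(n)
T = (np.kron(np.kron(In, Im), A)      # I_n (x) I_m (x) A
     + np.kron(np.kron(In, B), Il)    # I_n (x) B (x) I_l
     + np.kron(np.kron(C, Im), Il))   # C (x) I_m (x) I_l
print(T.shape)  # (60, 60), i.e., (l*m*n, l*m*n)
```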
Previous studies [1,2] provided methods for computing the maximum and minimum singular values of T. With those methods, one can compute only the maximum and minimum singular values of T, without a shift; in contrast, the present work computes an arbitrary singular value of T by introducing a shift. The previous studies are based on the Lanczos bidiagonalization method (see, e.g., [3]), which computes the maximum and minimum singular values of a matrix. For insights on the Lanczos bidiagonalization method, see, e.g., [4,5,6]. The Lanczos bidiagonalization method for T was implemented using tensors and their operations to reduce the memory requirement.
The Lanczos method with the shift-and-invert technique (see, e.g., [3]) is widely known for computing an arbitrary eigenvalue $\lambda$ of a symmetric matrix $M$. This method solves the shift-and-invert eigenvalue problem

$$(M - \tilde{\lambda} I)^{-1}\boldsymbol{x} = \frac{1}{\lambda - \tilde{\lambda}}\,\boldsymbol{x},$$

where $\boldsymbol{x}$ is the eigenvector of $M$ corresponding to $\lambda$, and $\tilde{\lambda}$ is a shift point set near $\lambda$ ($\tilde{\lambda} \neq \lambda$). Since this eigenvalue problem has $1/(\lambda - \tilde{\lambda})$ as the eigenvalue of largest magnitude when $\lambda$ is the eigenvalue closest to $\tilde{\lambda}$, the method is effective for computing the desired eigenvalue $\lambda$ near $\tilde{\lambda}$. For successful work using the shift-and-invert technique, see, e.g., [7,8,9,10,11,12,13].
Therefore, we obtain a method for computing an arbitrary singular value of T based on the shift-and-invert Lanczos method. The method solves the following shift-and-invert eigenvalue problem:

$$(T^\top T - \tilde{\sigma}^2 I)^{-1}\boldsymbol{x} = \frac{1}{\sigma^2 - \tilde{\sigma}^2}\,\boldsymbol{x},$$

where $\sigma$ is the desired singular value of T, $\boldsymbol{x}$ is the corresponding right-singular vector, and $\tilde{\sigma}$ is a shift close to $\sigma$ ($\tilde{\sigma} \neq \sigma$). This shift-and-invert Lanczos method requires the solution of large-scale linear systems with the coefficient matrix $T^\top T - \tilde{\sigma}^2 I$. Here, $T^\top T - \tilde{\sigma}^2 I$ can be a dense matrix with $(\ell mn)^2$ elements even though T is a sparse matrix with at most $\ell mn(\ell + m + n)$ nonzero elements when A, B, and C are dense.
Since it is difficult, with regard to the memory requirement, to apply a direct method, e.g., the Cholesky decomposition, which needs to generate the matrix $T^\top T - \tilde{\sigma}^2 I$ explicitly, the preconditioned conjugate gradient (PCG) method (see, e.g., [14]) is applied. Even so, it is still difficult in terms of memory requirements to implement this shift-and-invert Lanczos method and the PCG method naively, since the size of T grows rapidly with the sizes of A, B, and C.
We propose the following two techniques in this paper: (1) efficient implementations of the shift-and-invert Lanczos method for the eigenvalue problem of $(T^\top T - \tilde{\sigma}^2 I)^{-1}$ and of the PCG method for $T^\top T - \tilde{\sigma}^2 I$ using three-dimensional arrays (third-order tensors) and the $n$-mode products (see, e.g., [15]); (2) preconditioning matrices based on the eigendecomposition and the Schur decomposition of T for faster convergence of the PCG method. Finally, we show the effectiveness of the proposed method through numerical experiments.
2. Preliminaries of Tensor Operations
A tensor is a multidimensional array. In particular, the third-order tensor $\mathcal{X} \in \mathbb{R}^{\ell \times m \times n}$ plays an important role in this paper. In the rest of this section, the definitions of some tensor operations are shown. For more details, see, e.g., [15].
Firstly, the sum, difference, inner product, and norm for $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{\ell \times m \times n}$ are defined as follows:

$$(\mathcal{X} \pm \mathcal{Y})_{ijk} := x_{ijk} \pm y_{ijk}, \qquad \langle \mathcal{X}, \mathcal{Y} \rangle := \sum_{k=1}^{n}\sum_{j=1}^{m}\sum_{i=1}^{\ell} x_{ijk}\, y_{ijk}, \qquad \|\mathcal{X}\| := \sqrt{\langle \mathcal{X}, \mathcal{X} \rangle},$$

where $x_{ijk}$ denotes the $(i,j,k)$ element of $\mathcal{X}$. Secondly, the $n$-mode products of a tensor $\mathcal{X} \in \mathbb{R}^{\ell \times m \times n}$ and matrices $A \in \mathbb{R}^{\ell \times \ell}$, $B \in \mathbb{R}^{m \times m}$, $C \in \mathbb{R}^{n \times n}$ are defined as

$$(\mathcal{X} \times_1 A)_{ijk} := \sum_{s=1}^{\ell} a_{is} x_{sjk}, \qquad (\mathcal{X} \times_2 B)_{ijk} := \sum_{s=1}^{m} b_{js} x_{isk}, \qquad (\mathcal{X} \times_3 C)_{ijk} := \sum_{s=1}^{n} c_{ks} x_{ijs}.$$

Finally, the vec and ten operators are the following maps between a vector space $\mathbb{R}^{\ell mn}$ and a tensor space $\mathbb{R}^{\ell \times m \times n}$:

$$\mathrm{vec}: \mathbb{R}^{\ell \times m \times n} \to \mathbb{R}^{\ell mn}, \qquad \mathrm{ten}: \mathbb{R}^{\ell mn} \to \mathbb{R}^{\ell \times m \times n},$$

with $(\mathrm{vec}(\mathcal{X}))_{i+(j-1)\ell+(k-1)\ell m} = x_{ijk}$ and $\mathrm{ten}(\mathrm{vec}(\mathcal{X})) = \mathcal{X}$. The vec operator vectorizes a tensor by combining all column vectors (mode-1 fibers) of the tensor into one long vector. Conversely, the ten operator reshapes one long vector back into a tensor.
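As a self-contained illustration, the following NumPy sketch implements these operations; the function names mirror the operators above, while the implementation choices (Fortran-order reshape matching the index map $x_{ijk} \mapsto i + (j-1)\ell + (k-1)\ell m$) are ours. It also checks the Kronecker-product/mode-product relation that later sections rely on.

```python
import numpy as np

def vec(X):                    # stack the mode-1 fibers into one long vector
    return X.reshape(-1, order="F")

def ten(x, shape):             # inverse of vec
    return x.reshape(shape, order="F")

def mode_product(X, M, mode):  # n-mode product X x_n M (mode = 0, 1, 2)
    Y = np.tensordot(M, X, axes=(1, mode))  # contracted axis moves to front
    return np.moveaxis(Y, 0, mode)

l, m, n = 3, 4, 2
rng = np.random.default_rng(1)
X = rng.standard_normal((l, m, n))
A, B, C = (rng.standard_normal((k, k)) for k in (l, m, n))

# Relation used throughout: (C (x) B (x) A) vec(X) = vec(X x_1 A x_2 B x_3 C)
lhs = np.kron(np.kron(C, B), A) @ vec(X)
rhs = vec(mode_product(mode_product(mode_product(X, A, 0), B, 1), C, 2))
print(np.allclose(lhs, rhs))   # True
```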
3. Shift-and-Invert Lanczos Method for an Arbitrary Singular Value over Tensor Space
This section gives an algorithm for computing an arbitrary singular value of the tensor sum T. Let $\sigma$ and $\boldsymbol{x}$ be a desired singular value of T and the corresponding right-singular vector, respectively. Then, the eigenvalue problem for $T^\top T$ is written as $T^\top T\boldsymbol{x} = \sigma^2\boldsymbol{x}$. Here, introducing a shift $\tilde{\sigma}$ ($\approx \sigma$, $\tilde{\sigma} \neq \sigma$), the shift-and-invert eigenvalue problem is

$$(T^\top T - \tilde{\sigma}^2 I)^{-1}\boldsymbol{x} = \frac{1}{\sigma^2 - \tilde{\sigma}^2}\,\boldsymbol{x}. \quad (3)$$

The shift-and-invert Lanczos method (see, e.g., [3]) computes the singular value $\sigma$ nearest to $\tilde{\sigma}$ based on Equation (3). Reconstructing this method over the tensor space $\mathbb{R}^{\ell \times m \times n}$, we obtain Algorithm 1, whose memory requirement is $O(n^3)$ when $\ell = m = n$.
Algorithm 1: Shift-and-invert Lanczos method for an arbitrary singular value over tensor space
1: Choose an initial tensor $\mathcal{V}_1 \in \mathbb{R}^{\ell \times m \times n}$ with $\|\mathcal{V}_1\| = 1$;
2: $\mathcal{V}_0 = \mathcal{O}$, $\beta_0 = 0$;
3: for $k = 1, 2, \ldots$ until convergence do
4: $\mathcal{W} = \mathrm{ten}\big[(T^\top T - \tilde{\sigma}^2 I)^{-1}\,\mathrm{vec}(\mathcal{V}_k)\big]$
5: (computed by Algorithm 3 or 4 in Section 4)
6: $\alpha_k = \langle \mathcal{W}, \mathcal{V}_k \rangle$;
7: $\mathcal{W} = \mathcal{W} - \alpha_k\mathcal{V}_k - \beta_{k-1}\mathcal{V}_{k-1}$;
8: $\beta_k = \|\mathcal{W}\|$;
9: $\mathcal{V}_{k+1} = \mathcal{W}/\beta_k$;
10: end for
11: Approximate singular value $\sigma \approx \sqrt{\tilde{\sigma}^2 + 1/\theta_{\max}}$, where $\theta_{\max}$ is the maximum eigenvalue of the tridiagonal matrix $T_k$.
At step k, we have the following relation from Algorithm 1:

$$(T^\top T - \tilde{\sigma}^2 I)^{-1} V_k = V_k T_k + \beta_k\,\mathrm{vec}(\mathcal{V}_{k+1})\,\boldsymbol{e}_k^\top,$$

where $V_k := [\mathrm{vec}(\mathcal{V}_1), \ldots, \mathrm{vec}(\mathcal{V}_k)]$ and $T_k$ is the $k \times k$ symmetric tridiagonal matrix with diagonal entries $\alpha_1, \ldots, \alpha_k$ and off-diagonal entries $\beta_1, \ldots, \beta_{k-1}$. To implement Algorithm 1, we need to iteratively solve the linear system

$$(T^\top T - \tilde{\sigma}^2 I)\,\boldsymbol{w} = \mathrm{vec}(\mathcal{V}_k), \quad (4)$$

whose coefficient matrix is $T^\top T - \tilde{\sigma}^2 I$; that is, the memory requirement is $O(n^6)$ when $\ell = m = n$ if the coefficient matrix is formed explicitly. Here, the convergence rate of the shift-and-invert Lanczos method depends on the ratio of the gaps among the maximum, the second maximum, and the minimum eigenvalues $\theta_1 > \theta_2 \geq \theta_{\min}$ of $(T^\top T - \tilde{\sigma}^2 I)^{-1}$ as follows: $(\theta_1 - \theta_2)/(\theta_2 - \theta_{\min})$.
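The following dense-algebra sketch (toy size and shift chosen by us for illustration, not from the paper's experiments) runs the iteration of Algorithm 1 with a direct inner solve in place of the PCG method of Section 4, illustrating Equation (3) only.

```python
import numpy as np

# Toy shift-and-invert Lanczos sketch: K = (T^T T - shift^2 I)^{-1} is formed
# densely here, which defeats the memory savings; it only illustrates Eq. (3).
rng = np.random.default_rng(2)
N = 60
T = rng.standard_normal((N, N))
s = np.sort(np.linalg.svd(T, compute_uv=False))
shift = s[N // 2] - 1e-2            # slightly below the median singular value

K = np.linalg.inv(T.T @ T - shift**2 * np.eye(N))
v = rng.standard_normal(N); v /= np.linalg.norm(v)
v_old, beta = np.zeros(N), 0.0
alphas, betas = [], []
for k in range(30):
    w = K @ v - beta * v_old        # lines 4-5 and 7 of Algorithm 1
    alpha = w @ v                   # line 6
    w -= alpha * v                  # line 7
    v_old, beta = v, np.linalg.norm(w)
    v = w / beta                    # lines 8-9
    alphas.append(alpha); betas.append(beta)
Tk = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
theta_max = np.linalg.eigvalsh(Tk).max()
print(np.sqrt(shift**2 + 1.0 / theta_max), s[N // 2])  # both ~ the median
```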
In the next section, we consider solving the linear system (4) with a memory requirement of $O(n^3)$ when $\ell = m = n$.
4. Preconditioned Conjugate Gradient (PCG) Method over Tensor Space
This section provides an efficient solver for Equation (4) using tensors. The linear system is rewritten as $(T^\top T - \tilde{\sigma}^2 I)\boldsymbol{v} = \boldsymbol{r}$, where $\boldsymbol{v} := \boldsymbol{w}$ and $\boldsymbol{r} := \mathrm{vec}(\mathcal{V}_k)$. Then we solve this system, where $\boldsymbol{v}$ and $\boldsymbol{r}$ are the unknown and known vectors, respectively. Since the coefficient matrix is symmetric positive definite, we can use the preconditioned conjugate gradient method (PCG method; see, e.g., [14]), which is one of the most widely used solvers. However, it is difficult to apply the method naively due to the complex nonzero structure of the coefficient matrix $T^\top T - \tilde{\sigma}^2 I$. To apply the PCG method, we consider transforming the linear system by the eigendecomposition and the complex Schur decomposition of T, as shown in the next subsections.
4.1. PCG Method by the Eigendecomposition
Firstly, T is decomposed into $T = XDX^{-1}$, where X and D are a matrix whose column vectors are eigenvectors of T and a diagonal matrix with the eigenvalues of T, respectively. Then, since T is real, it follows that

$$T^\top T - \tilde{\sigma}^2 I = T^{\mathsf{H}} T - \tilde{\sigma}^2 I = X^{-\mathsf{H}}\big(\bar{D} X^{\mathsf{H}} X D - \tilde{\sigma}^2 X^{\mathsf{H}} X\big)X^{-1},$$

where $\bar{D}$ is the complex conjugate of D. We rewrite the above linear system into $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$, where $\tilde{A} := \bar{D} X^{\mathsf{H}} X D - \tilde{\sigma}^2 X^{\mathsf{H}} X$, $\tilde{\boldsymbol{v}} := X^{-1}\boldsymbol{v}$, and $\tilde{\boldsymbol{r}} := X^{\mathsf{H}}\boldsymbol{r}$. Here, X is easily computed from the small matrices $X_A$, $X_B$, and $X_C$, whose column vectors are eigenvectors of A, B, and C, as follows: $X = X_C \otimes X_B \otimes X_A$. Moreover, the eigenvalues of T in D are obtained as the sums of the eigenvalues of A, B, and C.
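The Kronecker structure of X and D can be checked directly; the following sketch (toy sizes and variable names of our choosing) verifies that $X = X_C \otimes X_B \otimes X_A$ together with the eigenvalue sums diagonalizes T.

```python
import numpy as np

rng = np.random.default_rng(3)
l, m, n = 3, 4, 2
A, B, C = (rng.standard_normal((k, k)) for k in (l, m, n))
dA, XA = np.linalg.eig(A)
dB, XB = np.linalg.eig(B)
dC, XC = np.linalg.eig(C)

X = np.kron(np.kron(XC, XB), XA)
# d_{ijk} = lambda_i^(A) + lambda_j^(B) + lambda_k^(C), in vec (Fortran) order
d = (dA[:, None, None] + dB[None, :, None]
     + dC[None, None, :]).reshape(-1, order="F")

Il, Im, In = np.eye(l), np.eye(m), np.eye(n)
T = (np.kron(np.kron(In, Im), A) + np.kron(np.kron(In, B), Il)
     + np.kron(np.kron(C, Im), Il))
print(np.allclose(T @ X, X * d))   # True: T X = X D, i.e., T = X D X^{-1}
```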
The PCG method for solving $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$ is shown in Algorithm 2. Since this algorithm computes $\tilde{\boldsymbol{v}} = X^{-1}\boldsymbol{v}$, we need to compute $\boldsymbol{v} = X\tilde{\boldsymbol{v}}$ afterwards. Section 4.1.1 proposes a preconditioning matrix, and Section 4.1.2 provides efficient computations using tensors.
Algorithm 2: PCG method over vector space for $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$
1: Choose an initial guess $\tilde{\boldsymbol{v}}_0$ and a preconditioning matrix $M$;
2: $\boldsymbol{r}_0 = \tilde{\boldsymbol{r}} - \tilde{A}\tilde{\boldsymbol{v}}_0$;
3: $\boldsymbol{p}_0 = \boldsymbol{z}_0 = M^{-1}\boldsymbol{r}_0$;
4: for $k = 0, 1, \ldots$ until convergence do
5: $\boldsymbol{q}_k = \tilde{A}\boldsymbol{p}_k$;
6: $\alpha_k = \langle \boldsymbol{r}_k, \boldsymbol{z}_k \rangle / \langle \boldsymbol{q}_k, \boldsymbol{p}_k \rangle$;
7: $\tilde{\boldsymbol{v}}_{k+1} = \tilde{\boldsymbol{v}}_k + \alpha_k\boldsymbol{p}_k$;
8: $\boldsymbol{r}_{k+1} = \boldsymbol{r}_k - \alpha_k\boldsymbol{q}_k$;
9: $\boldsymbol{z}_{k+1} = M^{-1}\boldsymbol{r}_{k+1}$;
10: $\beta_k = \langle \boldsymbol{r}_{k+1}, \boldsymbol{z}_{k+1} \rangle / \langle \boldsymbol{r}_k, \boldsymbol{z}_k \rangle$;
11: $\boldsymbol{p}_{k+1} = \boldsymbol{z}_{k+1} + \beta_k\boldsymbol{p}_k$;
12: end for
13: Obtain an approximate solution $\tilde{\boldsymbol{v}}$;
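For reference, a compact, self-contained version of Algorithm 2 might look as follows (Python; `matvec` and `prec` are callables applying $\tilde{A}$ and $M^{-1}$, so neither matrix is ever formed; the tolerance and iteration cap are illustrative defaults, not the paper's settings).

```python
import numpy as np

def pcg(matvec, b, prec, x0, tol=1e-10, maxit=500):
    """Preconditioned CG (Algorithm 2): solve A x = b with Hermitian
    positive definite A given by matvec, preconditioner M^{-1} by prec."""
    x = x0.copy()
    r = b - matvec(x)                 # line 2: initial residual
    z = prec(r)                       # line 3: z = M^{-1} r
    p = z.copy()
    rz = np.vdot(r, z)
    for _ in range(maxit):
        q = matvec(p)                 # line 5
        alpha = rz / np.vdot(p, q)    # line 6
        x = x + alpha * p             # line 7
        r = r - alpha * q             # line 8
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = prec(r)                   # line 9
        rz_new = np.vdot(r, z)
        p = z + (rz_new / rz) * p     # lines 10-11
        rz = rz_new
    return x
```

Algorithms 3 and 4 below are exactly this loop with the vectors replaced by tensors and with `matvec` and `prec` realized by mode products and elementwise divisions.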
4.1.1. Preconditioning Matrix
Algorithm 2 solves the preconditioned system $M^{-1}\tilde{A}\tilde{\boldsymbol{v}} = M^{-1}\tilde{\boldsymbol{r}}$, where M is a preconditioning matrix. M must satisfy the following two conditions: (1) the condition number of $M^{-1}\tilde{A}$ is close to 1; (2) the matrix-vector multiplication with $M^{-1}$ is easily computed. Therefore, we propose a preconditioning matrix based on the eigendecomposition of T:

$$M := \bar{D}D - \tilde{\sigma}^2 I. \quad (5)$$

Since M is a diagonal matrix, the second condition is satisfied. Moreover, if T is symmetric, X is a unitary matrix, that is, $X^{\mathsf{H}}X = I$. In the case of a symmetric matrix T, we obtain $M = \tilde{A}$. Namely, the proposed matrix satisfies the first condition when T is symmetric. So, even if T is not exactly symmetric, if T is almost symmetric, then we can expect the preconditioning matrix M to be effective.
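The effect of Equation (5) can be observed numerically; the sketch below (toy sizes, a hypothetical skew perturbation of size eps, and a shift placed below the smallest $|d_i|$, all chosen purely for illustration) compares condition numbers with and without the preconditioner.

```python
import numpy as np

rng = np.random.default_rng(4)
k, eps = 4, 1e-3
def almost_symmetric(sz):
    S = rng.standard_normal((sz, sz)); K = rng.standard_normal((sz, sz))
    return (S + S.T) + eps * (K - K.T)   # symmetric part + small skew part

A, B, C = almost_symmetric(k), almost_symmetric(k), almost_symmetric(k)
I = np.eye(k)
T = (np.kron(np.kron(I, I), A) + np.kron(np.kron(I, B), I)
     + np.kron(np.kron(C, I), I))
d, X = np.linalg.eig(T)                  # T = X D X^{-1}, formed directly here
shift = 0.9 * np.abs(d).min()            # illustrative shift, shift^2 < |d_i|^2
XhX = X.conj().T @ X
At = np.conj(np.diag(d)) @ XhX @ np.diag(d) - shift**2 * XhX   # A~
M = np.conj(d) * d - shift**2            # diagonal of Eq. (5)
print(np.linalg.cond(At))                # without preconditioning
print(np.linalg.cond(At / M[:, None]))   # cond(M^{-1} A~): ~1 for small eps
```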
4.1.2. Efficient Implementation of Algorithm 2 by the Eigendecomposition
Similarly to the derivation of Algorithm 1, to improve the implementation of Algorithm 2, we reconstruct the $\ell mn$-dimensional vectors into tensors via the ten operator as follows: $\tilde{\mathcal{V}} := \mathrm{ten}(\tilde{\boldsymbol{v}})$, $\tilde{\mathcal{R}} := \mathrm{ten}(\tilde{\boldsymbol{r}})$, $\mathcal{P} := \mathrm{ten}(\boldsymbol{p})$, $\mathcal{Q} := \mathrm{ten}(\boldsymbol{q})$, and $\mathcal{Z} := \mathrm{ten}(\boldsymbol{z})$. Most computations on vectors are simply transformed into computations on tensors because of the linearity of the ten operator.
In the rest of this section, we show the computations of $\tilde{A}\boldsymbol{p}$ and $M^{-1}\boldsymbol{r}$, which are required in the PCG method, using the 1-, 2-, and 3-mode products for tensors and the definition of T. First, from the definitions of $\tilde{A}$ and X,

$$\tilde{A}\boldsymbol{p} = \bar{D}\,X^{\mathsf{H}}X(D\boldsymbol{p}) - \tilde{\sigma}^2 X^{\mathsf{H}}X\boldsymbol{p}, \qquad X^{\mathsf{H}}X = (X_C^{\mathsf{H}}X_C) \otimes (X_B^{\mathsf{H}}X_B) \otimes (X_A^{\mathsf{H}}X_A)$$

holds. Let $\mathcal{D} := \mathrm{ten}(\boldsymbol{d})$, where $\boldsymbol{d} := \mathrm{diag}(D)$ returns an $\ell mn$-dimensional column vector with the diagonals of D. Then, $d_{ijk} = \lambda_i^{(A)} + \lambda_j^{(B)} + \lambda_k^{(C)}$, where $\lambda_i^{(A)}$, $\lambda_j^{(B)}$, and $\lambda_k^{(C)}$ denote the eigenvalues of A, B, and C. Note that $\mathrm{ten}(D\boldsymbol{p}) = \mathcal{D} * \mathcal{P}$ since D is diagonal. Using the relation between the Kronecker product and the mode products via the ten operator, we compute

$$\mathrm{ten}(\tilde{A}\boldsymbol{p}) = \bar{\mathcal{D}} * \big[(\mathcal{D} * \mathcal{P}) \times_1 (X_A^{\mathsf{H}}X_A) \times_2 (X_B^{\mathsf{H}}X_B) \times_3 (X_C^{\mathsf{H}}X_C)\big] - \tilde{\sigma}^2\,\mathcal{P} \times_1 (X_A^{\mathsf{H}}X_A) \times_2 (X_B^{\mathsf{H}}X_B) \times_3 (X_C^{\mathsf{H}}X_C), \quad (6)$$

where “*” denotes the elementwise product.
Next, from the definition of the diagonal matrix M in Equation (5), we easily obtain

$$\big(\mathrm{ten}(M^{-1}\boldsymbol{r})\big)_{ijk} = \frac{(\mathrm{ten}(\boldsymbol{r}))_{ijk}}{|d_{ijk}|^2 - \tilde{\sigma}^2}. \quad (7)$$

Here, let $\mathcal{M} := \bar{\mathcal{D}} * \mathcal{D} - \tilde{\sigma}^2\mathcal{E}$, where $\mathcal{E}$ is the tensor whose elements are all one. Then it follows that $M = \mathrm{diag}(\mathrm{vec}(\mathcal{M}))$, and $\mathrm{ten}(M^{-1}\boldsymbol{r})$ is computed by the elementwise division of $\mathrm{ten}(\boldsymbol{r})$ by $\mathcal{M}$ as in Equation (7). Note that M is nonsingular since the shift satisfies $\tilde{\sigma}^2 \neq |d_{ijk}|^2$ for all $(i,j,k)$.
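The tensor-space formula (6) can be checked against dense arithmetic; the following sketch does so (toy sizes, an arbitrary illustrative shift, and our helper name `modeprod`).

```python
import numpy as np

rng = np.random.default_rng(5)
l, m, n = 3, 4, 2
A, B, C = (rng.standard_normal((k, k)) for k in (l, m, n))
dA, XA = np.linalg.eig(A); dB, XB = np.linalg.eig(B); dC, XC = np.linalg.eig(C)

def modeprod(X, M, ax):
    return np.moveaxis(np.tensordot(M, X, axes=(1, ax)), 0, ax)

D = dA[:, None, None] + dB[None, :, None] + dC[None, None, :]  # eigenvalue tensor
GA, GB, GC = XA.conj().T @ XA, XB.conj().T @ XB, XC.conj().T @ XC
shift = 0.5                                # arbitrary illustrative shift

def apply_Atilde(P):                       # ten(A~ vec(P)) by Eq. (6)
    g = lambda Q: modeprod(modeprod(modeprod(Q, GA, 0), GB, 1), GC, 2)
    return np.conj(D) * g(D * P) - shift**2 * g(P)

P = rng.standard_normal((l, m, n)) + 0j
X = np.kron(np.kron(XC, XB), XA)           # dense reference
dv = D.reshape(-1, order="F")
At = (np.conj(np.diag(dv)) @ X.conj().T @ X @ np.diag(dv)
      - shift**2 * X.conj().T @ X)
print(np.allclose(At @ P.reshape(-1, order="F"),
                  apply_Atilde(P).reshape(-1, order="F")))  # True
```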
As shown in Algorithm 3, the PCG method can be implemented using the preconditioning matrix M and the aforementioned computations, where the linear system $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$ is solved over tensor space with $\tilde{\mathcal{V}} = \mathrm{ten}(\tilde{\boldsymbol{v}})$ and $\tilde{\mathcal{R}} = \mathrm{ten}(\tilde{\boldsymbol{r}})$. Algorithm 3 requires only the small matrices A, B, and C and tensors in $\mathbb{R}^{\ell \times m \times n}$ (or $\mathbb{C}^{\ell \times m \times n}$). Therefore, the memory requirement is $O(n^3)$ in the case of $\ell = m = n$.
4.2. PCG Method by the Schur Decomposition
Firstly, the Schur decomposition of T is $T = QRQ^{\mathsf{H}}$, where R and Q are upper triangular and unitary matrices, respectively. Then, since T is real,

$$T^\top T - \tilde{\sigma}^2 I = Q\big(R^{\mathsf{H}}R - \tilde{\sigma}^2 I\big)Q^{\mathsf{H}}.$$

The linear system thus becomes $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$, where $\tilde{A} := R^{\mathsf{H}}R - \tilde{\sigma}^2 I$, $\tilde{\boldsymbol{v}} := Q^{\mathsf{H}}\boldsymbol{v}$, and $\tilde{\boldsymbol{r}} := Q^{\mathsf{H}}\boldsymbol{r}$. The PCG method for $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$ is shown in Algorithm 2. R and Q are obtained from the complex Schur decompositions of A, B, and C as follows: $Q = Q_C \otimes Q_B \otimes Q_A$ and $R = I_n \otimes I_m \otimes R_A + I_n \otimes R_B \otimes I_\ell + R_C \otimes I_m \otimes I_\ell$ from the definition of T, where $A = Q_A R_A Q_A^{\mathsf{H}}$, $B = Q_B R_B Q_B^{\mathsf{H}}$, and $C = Q_C R_C Q_C^{\mathsf{H}}$ by the Schur decompositions of A, B, and C.
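The factor structure claimed here is easy to verify numerically; the sketch below (toy sizes; scipy.linalg.schur provides the complex Schur decompositions) checks that Q and R assembled from the small factors reproduce T and that R is upper triangular.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(6)
l, m, n = 3, 4, 2
A, B, C = (rng.standard_normal((k, k)) for k in (l, m, n))
RA, QA = schur(A, output="complex")   # A = QA RA QA^H
RB, QB = schur(B, output="complex")
RC, QC = schur(C, output="complex")

Il, Im, In = np.eye(l), np.eye(m), np.eye(n)
Q = np.kron(np.kron(QC, QB), QA)
R = (np.kron(np.kron(In, Im), RA) + np.kron(np.kron(In, RB), Il)
     + np.kron(np.kron(RC, Im), Il))
T = (np.kron(np.kron(In, Im), A) + np.kron(np.kron(In, B), Il)
     + np.kron(np.kron(C, Im), Il))
print(np.allclose(R, np.triu(R)))           # R is upper triangular
print(np.allclose(T, Q @ R @ Q.conj().T))   # T = Q R Q^H
```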
Algorithm 3: PCG method over tensor space for the 5th line of Algorithm 1 [proposed inner algorithm using the eigendecomposition]
1: Choose an initial tensor $\tilde{\mathcal{V}}_0$ and compute $\tilde{\mathcal{R}}_0 = \tilde{\mathcal{R}} - \mathrm{ten}(\tilde{A}\,\mathrm{vec}(\tilde{\mathcal{V}}_0))$ by Equation (6), where $\tilde{\mathcal{R}} = \mathcal{V}_k \times_1 X_A^{\mathsf{H}} \times_2 X_B^{\mathsf{H}} \times_3 X_C^{\mathsf{H}}$;
2: $\mathcal{Z}_0 = \tilde{\mathcal{R}}_0 / \mathcal{M}$ (elementwise) by Equation (7);
3: $\mathcal{P}_0 = \mathcal{Z}_0$;
4: for $k = 0, 1, \ldots$ until convergence do
5: $\mathcal{Q}_k = \mathrm{ten}(\tilde{A}\,\mathrm{vec}(\mathcal{P}_k))$ by Equation (6);
6: $\alpha_k = \langle \tilde{\mathcal{R}}_k, \mathcal{Z}_k \rangle / \langle \mathcal{Q}_k, \mathcal{P}_k \rangle$;
7: $\tilde{\mathcal{V}}_{k+1} = \tilde{\mathcal{V}}_k + \alpha_k\mathcal{P}_k$;
8: $\tilde{\mathcal{R}}_{k+1} = \tilde{\mathcal{R}}_k - \alpha_k\mathcal{Q}_k$;
9: $\mathcal{Z}_{k+1} = \tilde{\mathcal{R}}_{k+1} / \mathcal{M}$ by Equation (7);
10: $\beta_k = \langle \tilde{\mathcal{R}}_{k+1}, \mathcal{Z}_{k+1} \rangle / \langle \tilde{\mathcal{R}}_k, \mathcal{Z}_k \rangle$;
11: $\mathcal{P}_{k+1} = \mathcal{Z}_{k+1} + \beta_k\mathcal{P}_k$;
12: end for
13: Obtain an approximate solution $\tilde{\mathcal{V}}$;
14: $\mathcal{W} = \tilde{\mathcal{V}} \times_1 X_A \times_2 X_B \times_3 X_C$;
4.2.1. Preconditioning Matrix
A preconditioning matrix for $\tilde{A} = R^{\mathsf{H}}R - \tilde{\sigma}^2 I$ should satisfy the conditions in Section 4.1.1. Therefore, we propose the following preconditioning matrix based on the Schur decomposition:

$$M := \bar{D}_R D_R - \tilde{\sigma}^2 I,$$

where $D_R$ is the diagonal matrix with the diagonals of R. Since M is also a diagonal matrix, the above second condition is satisfied. Moreover, if T is symmetric, R is a diagonal matrix, that is, $R = D_R$. Therefore, $M = \tilde{A}$ in the case of a symmetric matrix T. From this, we expect that the preconditioning matrix M is effective if T is not symmetric but almost symmetric.
4.2.2. Efficient Implementation of Algorithm 2 by the Schur Decomposition
We show the computations of $\tilde{A}\boldsymbol{p}$ and $M^{-1}\boldsymbol{r}$ for the PCG method over tensor space using the 1-, 2-, and 3-mode products for tensors and the definition of T. First, from the definitions of $\tilde{A}$ and R, we have $\tilde{A}\boldsymbol{p} = R^{\mathsf{H}}(R\boldsymbol{p}) - \tilde{\sigma}^2\boldsymbol{p}$. Therefore,

$$\mathrm{ten}(R\boldsymbol{p}) = \mathcal{P} \times_1 R_A + \mathcal{P} \times_2 R_B + \mathcal{P} \times_3 R_C, \qquad \mathrm{ten}(\tilde{A}\boldsymbol{p}) = \mathcal{Q}' \times_1 R_A^{\mathsf{H}} + \mathcal{Q}' \times_2 R_B^{\mathsf{H}} + \mathcal{Q}' \times_3 R_C^{\mathsf{H}} - \tilde{\sigma}^2\mathcal{P},$$

where $\mathcal{Q}' := \mathrm{ten}(R\boldsymbol{p})$. Next, from $M = \bar{D}_R D_R - \tilde{\sigma}^2 I$, we easily obtain the analogue of Equation (7). Similarly to Section 4.1.2, let $\boldsymbol{d}_R := \mathrm{diag}(D_R)$ and $\mathcal{D}_R := \mathrm{ten}(\boldsymbol{d}_R)$. Then, we have $(d_R)_{ijk} = r_{ii}^{(A)} + r_{jj}^{(B)} + r_{kk}^{(C)}$, where $r_{ii}^{(A)}$, $r_{jj}^{(B)}$, and $r_{kk}^{(C)}$ denote the diagonal entries of $R_A$, $R_B$, and $R_C$. $M^{-1}\boldsymbol{r}$ is computed by Equation (7) with $\mathcal{D}$ replaced by $\mathcal{D}_R$.
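As in Section 4.1.2, the mode-product evaluation of $\tilde{A}\boldsymbol{p}$ can be verified against a dense reference; the sketch below does so (toy sizes, an illustrative shift, and scipy's Schur decomposition).

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(7)
l, m, n = 3, 4, 2
A, B, C = (rng.standard_normal((k, k)) for k in (l, m, n))
RA = schur(A, output="complex")[0]
RB = schur(B, output="complex")[0]
RC = schur(C, output="complex")[0]

def modeprod(X, M, ax):
    return np.moveaxis(np.tensordot(M, X, axes=(1, ax)), 0, ax)

def apply_R(P, MA, MB, MC):   # ten(R vec(P)) = P x_1 RA + P x_2 RB + P x_3 RC
    return modeprod(P, MA, 0) + modeprod(P, MB, 1) + modeprod(P, MC, 2)

shift = 0.5
P = rng.standard_normal((l, m, n)) + 0j
AtP = apply_R(apply_R(P, RA, RB, RC),             # R^H (R p) - shift^2 p
              RA.conj().T, RB.conj().T, RC.conj().T) - shift**2 * P

Il, Im, In = np.eye(l), np.eye(m), np.eye(n)      # dense reference
R = (np.kron(np.kron(In, Im), RA) + np.kron(np.kron(In, RB), Il)
     + np.kron(np.kron(RC, Im), Il))
ref = (R.conj().T @ R - shift**2 * np.eye(l*m*n)) @ P.reshape(-1, order="F")
print(np.allclose(ref, AtP.reshape(-1, order="F")))  # True
```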
As shown in Algorithm 4, the PCG method can be implemented using the preconditioning matrix M and the aforementioned computations, where the linear system $\tilde{A}\tilde{\boldsymbol{v}} = \tilde{\boldsymbol{r}}$ is solved over tensor space with $\tilde{\mathcal{V}} = \mathrm{ten}(\tilde{\boldsymbol{v}})$ and $\tilde{\mathcal{R}} = \mathrm{ten}(\tilde{\boldsymbol{r}})$. Algorithm 4 requires only the small matrices A, B, and C and tensors in $\mathbb{R}^{\ell \times m \times n}$ (or $\mathbb{C}^{\ell \times m \times n}$); namely, it does not require the large matrix T. Therefore, the memory requirement is $O(n^3)$ in the case of $\ell = m = n$.
Algorithm 4: PCG method over tensor space for the 5th line of Algorithm 1 [proposed inner algorithm using the Schur decomposition]
1: Choose an initial tensor $\tilde{\mathcal{V}}_0$ and compute $\tilde{\mathcal{R}}_0 = \tilde{\mathcal{R}} - \mathrm{ten}(\tilde{A}\,\mathrm{vec}(\tilde{\mathcal{V}}_0))$, where $\tilde{\mathcal{R}} = \mathcal{V}_k \times_1 Q_A^{\mathsf{H}} \times_2 Q_B^{\mathsf{H}} \times_3 Q_C^{\mathsf{H}}$;
2: $\mathcal{Z}_0 = \tilde{\mathcal{R}}_0 / \mathcal{M}$ (elementwise), with $\mathcal{M} = \bar{\mathcal{D}}_R * \mathcal{D}_R - \tilde{\sigma}^2\mathcal{E}$;
3: $\mathcal{P}_0 = \mathcal{Z}_0$;
4: for $k = 0, 1, \ldots$ until convergence do
5: $\mathcal{Q}_k = \mathrm{ten}(\tilde{A}\,\mathrm{vec}(\mathcal{P}_k))$, computed via the mode products with $R_A$, $R_B$, and $R_C$ above;
6: $\alpha_k = \langle \tilde{\mathcal{R}}_k, \mathcal{Z}_k \rangle / \langle \mathcal{Q}_k, \mathcal{P}_k \rangle$;
7: $\tilde{\mathcal{V}}_{k+1} = \tilde{\mathcal{V}}_k + \alpha_k\mathcal{P}_k$;
8: $\tilde{\mathcal{R}}_{k+1} = \tilde{\mathcal{R}}_k - \alpha_k\mathcal{Q}_k$;
9: $\mathcal{Z}_{k+1} = \tilde{\mathcal{R}}_{k+1} / \mathcal{M}$;
10: $\beta_k = \langle \tilde{\mathcal{R}}_{k+1}, \mathcal{Z}_{k+1} \rangle / \langle \tilde{\mathcal{R}}_k, \mathcal{Z}_k \rangle$;
11: $\mathcal{P}_{k+1} = \mathcal{Z}_{k+1} + \beta_k\mathcal{P}_k$;
12: end for
13: Obtain an approximate solution $\tilde{\mathcal{V}}$;
14: $\mathcal{W} = \tilde{\mathcal{V}} \times_1 Q_A \times_2 Q_B \times_3 Q_C$;
5. Numerical Experiments
This section provides the results of numerical experiments using Algorithm 1 with Algorithm 3 and Algorithm 1 with Algorithm 4. There are two purposes of these experiments: (1) to confirm convergence to the singular value of T nearest to the shift by Algorithm 1, and (2) to confirm the effectiveness of the proposed preconditioning matrices in Algorithms 3 and 4. For comparison, the results using Algorithms 3 and 4 in the case of $M = I$ are also given as the results of the CG method. All initial guesses in Algorithms 1, 3, and 4 are tensors with random entries. The stopping criterion used in Algorithm 1 was $\beta_k |\boldsymbol{e}_k^\top \boldsymbol{y}_{\max}| < \text{tol}$, where $\boldsymbol{y}_{\max}$ is the eigenvector corresponding to the maximum eigenvalue of $T_k$ and $\boldsymbol{e}_k$ denotes the k-th canonical basis vector of the k-dimensional vector space. Algorithms 3 and 4 were stopped when either the relative residual norm fell below a tolerance or the maximum number of iterations was reached.
All computations were carried out using MATLAB R2021a on a workstation with a 3.7 GHz Xeon processor and 128 GB of RAM.
In the following subsection, we show the results of computing the 5th maximum, the median, and the 5th minimum singular values $\sigma$ of the test matrices T. For all the cases, for the first purpose, we set the shift value in Algorithm 1 to a perturbed singular value $\tilde{\sigma} := \bar{\sigma}$ (8), where the $\bar{\sigma}$'s are perturbed values of the singular values $\sigma$ of T, the latter computed by the svd function in MATLAB.
Test matrices T in Equation (1) are obtained from a seven-point central difference discretization of the PDE (2) in $\Omega$ over an $n \times n \times n$ grid. The test matrices T in Equation (1), whose size is $n^3 \times n^3$, are generated from Equation (9), in which A, B, and C are built from the symmetric and skew-symmetric matrices given below.
Numerical Results
In all tables, the number of iterations of the shift-and-invert Lanczos method (“the Lanczos method” hereafter) and the average number of iterations of the CG or PCG method based on the eigendecomposition or the Schur decomposition are summarized. “Not converged” denotes that Algorithm 3 or 4 did not converge.
We show the first results, for the case of an almost symmetric matrix T, with the parameter values in Equation (9) and the shift (8). From Table 1, Table 2 and Table 3, the numbers of iterations of the Lanczos method were almost the same for either inner algorithm. Focusing on the effectiveness of the proposed preconditioning matrix M, the numbers of iterations of both PCG methods were less than 19 regardless of the size of T. On the other hand, the numbers of iterations of both CG methods increased linearly with the size of T. From these facts, the preconditioning matrix M is effective in the case of an almost symmetric matrix T. Moreover, the number of iterations of the shift-and-invert Lanczos method for the median singular value is larger than for the other singular values, since the gap between the maximum and second maximum eigenvalues of $(T^\top T - \tilde{\sigma}^2 I)^{-1}$ is smaller for the median singular value of T than in the other cases.
Here, the running times for the cases in Table 1 are summarized in Table 4. All running times of the PCG methods were shorter than those of the CG method. Moreover, the running times of the PCG methods of Algorithms 3 and 4 were similar, since the computational complexities of these algorithms are similar. Thus, the running time is strongly correlated with the number of iterations of Algorithms 3 and 4.
In addition, the convergence histories of the relative residual norms for the cases in Table 1 and Table 2 are shown in Figure 1 and Figure 2. Figure 2 shows that the relative residual norms decreased unsteadily when the number of iterations of the shift-and-invert Lanczos method is not small.
Next, we show the second results, for the case of a slightly symmetric matrix T, with the parameter values in Equation (9) and the shift (8). From Table 5, both PCG methods did not converge when computing the 5th maximum singular value of the slightly symmetric matrix T. It seems that the linear system for this case is ill-conditioned, since the perturbation used in the shift (8) is much smaller than the 5th maximum singular value of the matrix. In Appendix A, we show the results using a relative shift, which removes the effect of the magnitude of the singular values.
Table 6 shows that Algorithm 1 with Algorithm 4, that is, the combination based on the Schur decomposition, was more robust than Algorithm 1 with Algorithm 3, the combination based on the eigendecomposition. From Table 7, both PCG methods converged regardless of n, and the numbers of iterations of both PCG methods were less than those of both CG methods. Namely, it seems that the preconditioning matrix M can be effective in the case of a slightly symmetric matrix T when computing the 5th minimum and median singular values of T.
Finally, we show the third results, for the case of a marginally symmetric matrix T, with the parameter values in Equation (9) and the shift (8). Both PCG methods did not converge when computing the 5th maximum singular value of T, as shown in Table 8, similarly to Table 5. Moreover, computations of the median singular value of T sometimes did not converge, as seen from Table 9. In Table 10, all methods converged for the 5th minimum singular value of T. The numbers of iterations of the PCG method with the proposed preconditioning matrix were less than those of the CG method in most cases. It seems that the preconditioning matrix M can be effective in the case of a marginally symmetric matrix T when computing the 5th minimum singular value of T.
6. Conclusions
We considered computing an arbitrary singular value of a tensor sum. The shift-and-invert Lanczos method and the PCG method were reconstructed over tensor space. We proposed two diagonal preconditioning matrices: (1) one whose diagonal is built from the eigenvalues given by the eigendecomposition, and (2) one whose diagonal is built from the diagonal entries of the upper triangular matrix given by the Schur decomposition. These preconditioning matrices can be effective if the tensor sum is almost symmetric.
From the numerical results, we confirmed that the proposed method reduces the memory requirement unconditionally. The numbers of iterations of the PCG method with the proposed preconditioning matrices were reduced in most cases for the almost and slightly symmetric matrices. Moreover, they were also reduced in certain cases for the marginally symmetric matrix.
For future work, we will consider a robust preconditioning matrix for slightly or marginally symmetric tensor sums, a suitable preconditioning matrix for non-symmetric tensor sums, parallel implementations of the proposed algorithms, and finding real-life applications.