1. Introduction
Consider the linear matrix equation
$$AXB = C, \qquad (1)$$
where $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$, $C \in \mathbb{R}^{m \times n}$, and $X \in \mathbb{R}^{p \times q}$ is the unknown matrix. Such problems arise in many practical applications such as surface fitting in computer-aided geometric design (CAGD), signal and image processing, photogrammetry, etc.; see, for example, [1,2,3,4] and the large body of literature therein. If (1) is consistent, $X_\star = A^\dagger C B^\dagger$ is the minimum Frobenius norm solution. If (1) is inconsistent, $X_\star = A^\dagger C B^\dagger$ is the minimum Frobenius norm least-squares solution.
When the matrices A and B are small and dense, direct methods based on QR factorizations are attractive [5,6]. However, for large A and B, iterative methods have attracted a lot of attention [7,8,9,10,11]. Recently, Du et al. [12] proposed the randomized block coordinate descent (RBCD) method for solving the matrix least-squares problem $\min_X \|C - AXB\|_F$ without a strong convexity assumption. This method requires the matrix B to have full row rank. Wu et al. [13] introduced two kinds of Kaczmarz-type methods to solve the consistent matrix equation $AXB = C$: the relaxed greedy randomized Kaczmarz (ME-RGRK) method and the maximal weighted residual Kaczmarz (ME-MWRK) method. Although the row and column index selection strategy is time-consuming, the ideas of these two methods are suitable for solving large-scale consistent matrix equations.
In this paper, the randomized Kaczmarz method [14] and the randomized extended Kaczmarz method [15] are used to solve the consistent and the inconsistent matrix equation (1), respectively, using only matrix–vector products. All the results in this paper hold over the complex field, but for the sake of simplicity, we only discuss them in terms of the real number field.
In this paper, we denote $A^T$, $A^\dagger$, $\operatorname{rank}(A)$, $\mathcal{R}(A)$, $\|A\|_F$ and $\langle A, B \rangle = \operatorname{trace}(A^T B)$ as the transpose, the Moore–Penrose generalized inverse, the rank of A, the column space of A, the Frobenius norm of A and the inner product of two matrices A and B, respectively. For an integer $m \geq 1$, let $[m] = \{1, 2, \ldots, m\}$. We use I to denote the identity matrix whose order is clear from the context. In addition, for a given matrix $G \in \mathbb{R}^{m \times n}$, $G_{i,:}$, $G_{:,j}$, $\sigma_{\max}(G)$ and $\sigma_{\min}(G)$ are used to denote the ith row, the jth column, the maximum singular value and the smallest nonzero singular value of G, respectively. Let $\mathbb{E}_k$ denote the expected value conditional on the first k iterations, that is, $\mathbb{E}_k[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, i_1, j_1, \ldots, i_{k-1}, j_{k-1}]$, where $i_s$ and $j_s$ are the row and the column chosen at the sth iteration. Let the conditional expectation with respect to the random row index be $\mathbb{E}_k^i[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, \ldots, i_{k-1}, j_{k-1}, j_k]$ and that with respect to the random column index be $\mathbb{E}_k^j[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, \ldots, i_{k-1}, j_{k-1}, i_k]$. By the law of total expectation, it holds that $\mathbb{E}_k[\cdot] = \mathbb{E}_k^j\big[\mathbb{E}_k^i[\cdot]\big]$.
The organization of this paper is as follows. In Section 2, we discuss the randomized block Kaczmarz method (ME-RBK) for finding the minimal F-norm solution ($A^\dagger C B^\dagger$) of the consistent matrix equation (1). In Section 3, we discuss the randomized extended block Kaczmarz method (ME-REBK) for finding the minimal F-norm least-squares solution of the matrix equation (1). In Section 4, some numerical examples are provided to illustrate the effectiveness of our new methods. Finally, some brief concluding remarks are given in Section 5.
2. The Randomized Block Kaczmarz Method for the Consistent Equation
At the kth iteration, the Kaczmarz method randomly selects a row $A_{i,:}$ of A and performs an orthogonal projection of the current estimate matrix $X^{(k)}$ onto the corresponding hyperplane $\{X \mid A_{i,:} X B = C_{i,:}\}$, that is,
$$X^{(k+1)} = \arg\min_{X \in \mathbb{R}^{p \times q}} \|X - X^{(k)}\|_F^2 \quad \text{s.t.} \quad A_{i,:} X B = C_{i,:}. \qquad (2)$$
The Lagrangian function of the constrained optimization problem (2) is
$$L(X, \lambda) = \|X - X^{(k)}\|_F^2 + \big\langle \lambda,\, A_{i,:} X B - C_{i,:} \big\rangle, \qquad (3)$$
where $\lambda \in \mathbb{R}^{1 \times n}$ is a Lagrangian multiplier. Via matrix differentiation, we obtain the gradient of $L$ and set it to zero to find the stationary matrix:
$$\begin{cases} \nabla_X L = 2\big(X - X^{(k)}\big) + A_{i,:}^T\,\lambda\,B^T = 0, \\ A_{i,:} X B = C_{i,:}. \end{cases} \qquad (4)$$
Using the first equation of (4), we have $X = X^{(k)} - \frac{1}{2} A_{i,:}^T \lambda B^T$. Substituting this into the second equation of (4), we can obtain $\lambda = \frac{2}{\|A_{i,:}\|_2^2}\big(A_{i,:} X^{(k)} B - C_{i,:}\big)\big(B^T B\big)^\dagger$. So, using $(B^T B)^\dagger B^T = B^\dagger$, the projected randomized block Kaczmarz (ME-PRBK) method for solving $AXB = C$ iterates as
$$X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\big(C_{i,:} - A_{i,:} X^{(k)} B\big) B^\dagger. \qquad (5)$$
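In code, one ME-PRBK step can be sketched as follows, following iteration (5) as reconstructed above and assuming the pseudoinverse $B^\dagger$ has been formed once in advance (the function name and signature are ours):

```python
import numpy as np

def me_prbk_step(X, A, B, B_pinv, C, i):
    """One ME-PRBK update, iteration (5): project the current iterate onto
    {X : A[i, :] X B = C[i, :]}; B_pinv = np.linalg.pinv(B) is reused."""
    Ai = A[i, :]
    residual = C[i, :] - Ai @ X @ B          # 1 x n row residual
    return X + np.outer(Ai, residual @ B_pinv) / (Ai @ Ai)
```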
However, in practice, it is very expensive to calculate the pseudoinverse of large-scale matrices. Next, we generalize the average block Kaczmarz method [16] for solving linear systems to matrix equations.
At the kth step, we obtain the approximate solution $X^{(k+1)}$ by projecting the current estimate $X^{(k)}$ onto the hyperplane $\{X \mid A_{i,:} X B_{:,j} = C_{i,j}\}$. Using the Lagrangian multiplier method, we can obtain the following Kaczmarz method for $A_{i,:} X B_{:,j} = C_{i,j}$:
$$X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T \big(C_{i,j} - A_{i,:} X^{(k)} B_{:,j}\big) B_{:,j}^T}{\|A_{i,:}\|_2^2\, \|B_{:,j}\|_2^2}.$$
Inspired by the idea of the average block Kaczmarz algorithm for $Ax = b$, we consider the average block Kaczmarz method for $AXB = C$ with respect to B:
$$X^{(k+1)} = X^{(k)} + \alpha\, \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} \sum_{j=1}^{n} \omega_j\, \frac{\big(C_{i,j} - A_{i,:} X^{(k)} B_{:,j}\big) B_{:,j}^T}{\|B_{:,j}\|_2^2},$$
where $\alpha > 0$ is the stepsize and $\omega_j$ are the weights that satisfy $\omega_j \geq 0$ and $\sum_{j=1}^{n} \omega_j = 1$. If $\omega_j = \|B_{:,j}\|_2^2 / \|B\|_F^2$, then
$$\sum_{j=1}^{n} \omega_j\, \frac{\big(C_{i,j} - A_{i,:} X^{(k)} B_{:,j}\big) B_{:,j}^T}{\|B_{:,j}\|_2^2} = \frac{\big(C_{i,:} - A_{i,:} X^{(k)} B\big) B^T}{\|B\|_F^2}.$$
Setting the probability of selecting the ith row to be proportional to its squared norm, we obtain the following randomized block Kaczmarz iteration:
$$X^{(k+1)} = X^{(k)} + \alpha\, \frac{A_{i,:}^T \big(C_{i,:} - A_{i,:} X^{(k)} B\big) B^T}{\|A_{i,:}\|_2^2\, \|B\|_F^2}, \qquad (6)$$
where i is selected with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$. We describe this method as Algorithm 1, which is called the ME-RBK algorithm.
Algorithm 1 Randomized Block Kaczmarz Method for $AXB = C$ (ME-RBK)
- Input: $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$, $C \in \mathbb{R}^{m \times n}$, $\alpha \in (0, 2)$, $X^{(0)} \in \mathbb{R}^{p \times q}$
- 1: for $k = 0, 1, 2, \ldots$ do
- 2: Pick i with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$
- 3: Compute $X^{(k+1)} = X^{(k)} + \alpha\, \frac{A_{i,:}^T (C_{i,:} - A_{i,:} X^{(k)} B) B^T}{\|A_{i,:}\|_2^2 \|B\|_F^2}$
- 4: end for
We arrange the computational process of calculating $X^{(k+1)}$ in Table 1; each iteration only costs $O(q(p + n))$ floating-point operations (flops) if the square of the row norm of A has been calculated in advance.
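A compact NumPy sketch of Algorithm 1, under the sampling and update rules reconstructed above (the fixed iteration budget and the zero initial matrix are our simplifications):

```python
import numpy as np

def me_rbk(A, B, C, alpha=1.0, iters=50_000, seed=0):
    """Sketch of Algorithm 1 (ME-RBK) with iteration (6)."""
    rng = np.random.default_rng(seed)
    m, p = A.shape
    q = B.shape[0]
    X = np.zeros((p, q))                      # X^(0) = 0 satisfies the range conditions
    row_sq = np.einsum('ij,ij->i', A, A)      # ||A_{i,:}||_2^2, precomputed once
    probs = row_sq / row_sq.sum()             # p_i = ||A_{i,:}||_2^2 / ||A||_F^2
    b_fro2 = np.linalg.norm(B, 'fro') ** 2
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        Ai = A[i, :]
        residual = C[i, :] - Ai @ X @ B       # 1 x n row residual
        X += alpha * np.outer(Ai, residual @ B.T) / (row_sq[i] * b_fro2)
    return X
```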
Remark 1. Note that the problem of finding a solution of $AXB = C$ can be posed as the following linear least-squares problem:
$$\min_{X \in \mathbb{R}^{p \times q}} \frac{1}{2\,\|A\|_F^2\,\|B\|_F^2}\,\|C - AXB\|_F^2. \qquad (7)$$
Define the component function
$$f_i(X) = \frac{\|C_{i,:} - A_{i,:} X B\|_2^2}{2\,\|A_{i,:}\|_2^2\,\|B\|_F^2},$$
then differentiate with respect to X to obtain its gradient
$$\nabla f_i(X) = \frac{A_{i,:}^T \big(A_{i,:} X B - C_{i,:}\big) B^T}{\|A_{i,:}\|_2^2\,\|B\|_F^2}.$$
Since sampling i with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$ gives $\mathbb{E}[f_i(X)]$ equal to the objective in (7), the randomized block Kaczmarz method (6) is equivalent to one step of the stochastic gradient descent method [17] applied to (7) with stepsize α.

First, we give the following lemma, whose proof can be found in [12].
Lemma 1 ([12]). Let $A \in \mathbb{R}^{m \times p}$ and $B \in \mathbb{R}^{q \times n}$ be any nonzero matrices. Let
$$S = \big\{X \in \mathbb{R}^{p \times q} \;\big|\; X = A^T Y B^T,\; Y \in \mathbb{R}^{m \times n}\big\}.$$
For any matrix $X \in S$, it holds that
$$\|AXB\|_F^2 \geq \sigma_{\min}^2(A)\,\sigma_{\min}^2(B)\,\|X\|_F^2.$$
Remark 2. $X \in S$ means that $X_{:,j} \in \mathcal{R}(A^T)$ and $(X_{i,:})^T \in \mathcal{R}(B)$. In fact, the bound is well defined because $A \neq 0$ and $B \neq 0$, so $\sigma_{\min}(A) > 0$ and $\sigma_{\min}(B) > 0$.
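A quick numerical sanity check of the inequality in Lemma 1 for matrices of the form $X = A^T Y B^T$ (our own experiment, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
B = rng.standard_normal((4, 7))
X = A.T @ rng.standard_normal((8, 7)) @ B.T    # X in S = {A^T Y B^T}

def sigma_min(M):
    """Smallest nonzero singular value."""
    s = np.linalg.svd(M, compute_uv=False)
    return s[s > 1e-12].min()

lhs = np.linalg.norm(A @ X @ B, 'fro') ** 2
rhs = sigma_min(A) ** 2 * sigma_min(B) ** 2 * np.linalg.norm(X, 'fro') ** 2
print(lhs >= rhs)                               # expected: True
```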
In the following theorem, with the idea of the RK method [14], we will prove that $X^{(k)}$ generated by Algorithm 1 converges to the least F-norm solution of $AXB = C$.
Theorem 1. Assume $0 < \alpha < 2$. If the matrix equation (1) is consistent, the sequence $\{X^{(k)}\}$ generated by the ME-RBK method starting from the initial matrix $X^{(0)}$, in which $X^{(0)}_{:,j} \in \mathcal{R}(A^T)$ and $(X^{(0)}_{i,:})^T \in \mathcal{R}(B)$, converges linearly to $X_\star = A^\dagger C B^\dagger$ in mean square form. Moreover, the solution error in expectation for the iteration sequence $\{X^{(k)}\}$ obeys
$$\mathbb{E}\,\|X^{(k)} - X_\star\|_F^2 \;\leq\; \rho^k\,\|X^{(0)} - X_\star\|_F^2, \qquad (8)$$
where $\rho = 1 - \frac{\alpha(2-\alpha)\,\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_F^2}$ and i is picked with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$.

Proof. For $k = 0, 1, 2, \ldots$, by (6) and $C_{i,:} = A_{i,:} X_\star B$ (consistency), we have
$$X^{(k+1)} - X_\star = \big(X^{(k)} - X_\star\big) - \alpha\, \frac{A_{i,:}^T A_{i,:} \big(X^{(k)} - X_\star\big) B B^T}{\|A_{i,:}\|_2^2\, \|B\|_F^2},$$
then, using $\|A_{i,:}^T A_{i,:} (X^{(k)} - X_\star) B B^T\|_F \leq \|A_{i,:}\|_2\, \|B\|_F\, \|A_{i,:} (X^{(k)} - X_\star) B\|_2$,
$$\|X^{(k+1)} - X_\star\|_F^2 \;\leq\; \|X^{(k)} - X_\star\|_F^2 - \alpha(2-\alpha)\, \frac{\|A_{i,:} (X^{(k)} - X_\star) B\|_2^2}{\|A_{i,:}\|_2^2\, \|B\|_F^2}.$$
By taking the conditional expectation, we have
$$\mathbb{E}_k \|X^{(k+1)} - X_\star\|_F^2 \;\leq\; \|X^{(k)} - X_\star\|_F^2 - \alpha(2-\alpha)\, \frac{\|A (X^{(k)} - X_\star) B\|_F^2}{\|A\|_F^2\, \|B\|_F^2}.$$
From $X^{(0)}_{:,j} \in \mathcal{R}(A^T)$ and $(X^{(0)}_{i,:})^T \in \mathcal{R}(B)$, we have $X^{(0)} - X_\star \in S$. Noting that $X_\star \in S$ and every update increment in (6) lies in S, it is easy to show that $X^{(k)} - X_\star \in S$ through induction. Then, from Lemma 1 and $0 < \alpha < 2$, we can obtain
$$\mathbb{E}_k \|X^{(k+1)} - X_\star\|_F^2 \;\leq\; \left(1 - \frac{\alpha(2-\alpha)\,\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_F^2}\right) \|X^{(k)} - X_\star\|_F^2 = \rho\, \|X^{(k)} - X_\star\|_F^2. \qquad (9)$$
Finally, from (9), the law of total expectation and induction on the iteration index k, we obtain the estimate (8). □
Remark 3. Using a similar approach to that used in the proof of Theorem 1, we can prove that the iterate $X^{(k)}$ generated by ME-PRBK (5) satisfies the following estimate:
$$\mathbb{E}\,\|X^{(k)} - X_\star\|_F^2 \;\leq\; \rho_1^k\,\|X^{(0)} - X_\star\|_F^2, \qquad (10)$$
where $\rho_1 = 1 - \frac{\sigma_{\min}^2(A)}{\|A\|_F^2}$. Let $\rho_2$ denote the convergence factor of GRK in [18]. It is obvious that $\rho_1 \leq \rho$ for any $\alpha \in (0, 2)$, and $\rho \leq \rho_2$ when α is properly selected. This means that the convergence factor of ME-PRBK is the smallest and the factor of ME-RBK can be smaller than that of GRK when α is properly selected.
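To make the comparison concrete, the following snippet tabulates the factors numerically for a random instance, using ρ from Theorem 1 and $\rho_1$ as stated above; it simply evaluates these reconstructed formulas, so it inherits their assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((60, 20))
B = rng.standard_normal((15, 40))

def sigma_min(M):
    """Smallest nonzero singular value."""
    s = np.linalg.svd(M, compute_uv=False)
    return s[s > 1e-12].min()

fa2 = np.linalg.norm(A, 'fro') ** 2
fb2 = np.linalg.norm(B, 'fro') ** 2
for alpha in (0.5, 1.0, 1.5):
    rho = 1 - alpha * (2 - alpha) * sigma_min(A)**2 * sigma_min(B)**2 / (fa2 * fb2)
    print(f"alpha={alpha}: rho={rho:.6f}")        # alpha = 1 minimizes rho
print(f"rho_1={1 - sigma_min(A)**2 / fa2:.6f}")   # ME-PRBK factor from Remark 3
```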
3. The Randomized Extended Block Kaczmarz Method for the Inconsistent Equation
In [15,19,20], the authors proved that the Kaczmarz method does not converge to the least-squares solution of $Ax = b$ when the linear system is inconsistent. Analogously, if the matrix equation (1) is inconsistent, the above ME-PRBK method does not converge to $X_\star = A^\dagger C B^\dagger$. The following theorem gives the error bound for the inconsistent matrix equation.
Theorem 2. Assume that the consistent equation $AXB = \widehat{C}$, where $\widehat{C} = A A^\dagger C B^\dagger B$, has the solution $X_\star = A^\dagger C B^\dagger$. Let $X^{(k)}$ denote the kth iterate of the ME-PRBK method applied to the inconsistent equation $AXB = C$ for any $C \in \mathbb{R}^{m \times n}$, starting from the initial matrix $X^{(0)}$, in which $X^{(0)}_{:,j} \in \mathcal{R}(A^T)$ and $(X^{(0)}_{i,:})^T \in \mathcal{R}(B)$. In exact arithmetic, it follows that
$$\mathbb{E}\,\|X^{(k)} - X_\star\|_F^2 \;\leq\; \rho_1^k\,\|X^{(0)} - X_\star\|_F^2 + \frac{1 - \rho_1^k}{1 - \rho_1} \cdot \frac{\|\widetilde{C} B^\dagger\|_F^2}{\|A\|_F^2},$$
where $\widetilde{C} = C - \widehat{C}$ and $\rho_1$ is the factor from Remark 3.

Proof. Set $\widehat{C} = A A^\dagger C B^\dagger B$, $\widetilde{C} = C - \widehat{C}$. Let $Y^{(k+1)}$ denote the iterate of the PRBK method applied to the consistent equation $AXB = \widehat{C}$ at the kth step, that is,
$$Y^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\big(\widehat{C}_{i,:} - A_{i,:} X^{(k)} B\big) B^\dagger, \qquad \text{so that} \qquad X^{(k+1)} = Y^{(k+1)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\,\widetilde{C}_{i,:} B^\dagger. \qquad (11)$$
Since $A_{i,:} Y^{(k+1)} = A_{i,:} X_\star = \big(A A^\dagger C B^\dagger\big)_{i,:}$, the cross term vanishes and
$$\|X^{(k+1)} - X_\star\|_F^2 = \|Y^{(k+1)} - X_\star\|_F^2 + \frac{\|\widetilde{C}_{i,:} B^\dagger\|_2^2}{\|A_{i,:}\|_2^2}.$$
By taking the conditional expectation on both sides of (11), we can obtain
$$\mathbb{E}_k\|X^{(k+1)} - X_\star\|_F^2 \;\leq\; \rho_1\,\|X^{(k)} - X_\star\|_F^2 + \frac{\|\widetilde{C} B^\dagger\|_F^2}{\|A\|_F^2}.$$
The inequality is obtained using Remark 3. Applying this recursive relation iteratively, we have
$$\mathbb{E}\,\|X^{(k)} - X_\star\|_F^2 \;\leq\; \rho_1^k\,\|X^{(0)} - X_\star\|_F^2 + \sum_{l=0}^{k-1}\rho_1^{\,l}\,\frac{\|\widetilde{C} B^\dagger\|_F^2}{\|A\|_F^2} \;=\; \rho_1^k\,\|X^{(0)} - X_\star\|_F^2 + \frac{1 - \rho_1^k}{1 - \rho_1} \cdot \frac{\|\widetilde{C} B^\dagger\|_F^2}{\|A\|_F^2}.$$
This completes the proof. □
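The error floor predicted by Theorem 2 is easy to observe numerically. The following experiment (ours, not the paper's numerics) runs ME-PRBK on a generic inconsistent instance:

```python
import numpy as np

rng = np.random.default_rng(3)
m, p, q, n = 60, 20, 15, 40
A = rng.standard_normal((m, p))
B = rng.standard_normal((q, n))
C = rng.standard_normal((m, n))              # generic C: AXB = C is inconsistent

B_pinv = np.linalg.pinv(B)
X_star = np.linalg.pinv(A) @ C @ B_pinv      # least-squares solution A^+ C B^+
row_sq = np.einsum('ij,ij->i', A, A)
probs = row_sq / row_sq.sum()

X = np.zeros((p, q))
for _ in range(20_000):
    i = rng.choice(m, p=probs)
    Ai = A[i, :]
    X += np.outer(Ai, (C[i, :] - Ai @ X @ B) @ B_pinv) / row_sq[i]

print(np.linalg.norm(X - X_star, 'fro'))     # stays bounded away from zero
```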
Next, we use the idea of the randomized extended Kaczmarz method (see [20,21,22] for details) to compute the least-squares solution of the inconsistent Equation (1). At each iteration, $Z^{(k)}$ is the kth iterate of ME-RBK applied to $A^T Z = 0$ with the initial guess $Z^{(0)} = C$, and $X^{(k+1)}$ is the one-step ME-RBK update for $AXB = C - Z^{(k+1)}$. We can obtain the following randomized extended block Kaczmarz iteration:
$$Z^{(k+1)} = Z^{(k)} - \alpha\,\frac{A_{:,j} A_{:,j}^T}{\|A_{:,j}\|_2^2}\,Z^{(k)}, \qquad X^{(k+1)} = X^{(k)} + \alpha\,\frac{A_{i,:}^T \big(\big(C - Z^{(k+1)}\big)_{i,:} - A_{i,:} X^{(k)} B\big) B^T}{\|A_{i,:}\|_2^2\,\|B\|_F^2}, \qquad (12)$$
where $\alpha$ is the step size, and i and j are selected with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$ and $\hat{p}_j = \|A_{:,j}\|_2^2 / \|A\|_F^2$, respectively. The cost of each iteration of this method is $O(q(p + n))$ flops for updating $X^{(k)}$ and $O(mn)$ flops for updating $Z^{(k)}$ if the squares of the row norms and the column norms of A have been calculated in advance. We describe this method as Algorithm 2, which is called the ME-REBK algorithm.
Algorithm 2 Randomized Extended Block Kaczmarz Method for $AXB = C$ (ME-REBK)
- Input: $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$, $C \in \mathbb{R}^{m \times n}$, $\alpha \in (0, 2)$, $X^{(0)} \in \mathbb{R}^{p \times q}$, $Z^{(0)} = C$
- 1: for $k = 0, 1, 2, \ldots$ do
- 2: Pick j with probability $\hat{p}_j = \|A_{:,j}\|_2^2 / \|A\|_F^2$
- 3: Compute $Z^{(k+1)} = Z^{(k)} - \alpha\,\frac{A_{:,j} A_{:,j}^T}{\|A_{:,j}\|_2^2} Z^{(k)}$
- 4: Pick i with probability $p_i = \|A_{i,:}\|_2^2 / \|A\|_F^2$
- 5: Compute $X^{(k+1)} = X^{(k)} + \alpha\,\frac{A_{i,:}^T ((C - Z^{(k+1)})_{i,:} - A_{i,:} X^{(k)} B) B^T}{\|A_{i,:}\|_2^2 \|B\|_F^2}$
- 6: end for
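A NumPy sketch of Algorithm 2 under iteration (12) as reconstructed above; the fixed iteration budget and variable names are ours:

```python
import numpy as np

def me_rebk(A, B, C, alpha=1.0, iters=50_000, seed=0):
    """Sketch of Algorithm 2 (ME-REBK) with iteration (12)."""
    rng = np.random.default_rng(seed)
    m, pdim = A.shape
    q = B.shape[0]
    X = np.zeros((pdim, q))
    Z = C.copy()                              # Z^(0) = C
    row_sq = np.einsum('ij,ij->i', A, A)      # squared row norms of A
    col_sq = np.einsum('ij,ij->j', A, A)      # squared column norms of A
    p_row = row_sq / row_sq.sum()
    p_col = col_sq / col_sq.sum()
    b_fro2 = np.linalg.norm(B, 'fro') ** 2
    for _ in range(iters):
        j = rng.choice(pdim, p=p_col)         # column step: drive Z toward (I - AA^+)C
        Aj = A[:, j]
        Z -= alpha * np.outer(Aj, Aj @ Z) / col_sq[j]
        i = rng.choice(m, p=p_row)            # row step: ME-RBK update with C - Z^(k+1)
        Ai = A[i, :]
        residual = (C[i, :] - Z[i, :]) - Ai @ X @ B
        X += alpha * np.outer(Ai, residual @ B.T) / (row_sq[i] * b_fro2)
    return X
```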
Theorem 3. Assume $0 < \alpha < 2$. Let $Z^{(k)}$ denote the kth iteration of ME-RBK applied to $A^T Z = 0$ starting from the initial matrix $Z^{(0)} = C$. Then, $Z^{(k)}$ converges linearly to $\widetilde{C} = (I - A A^\dagger) C$ in mean square form, and the solution error in expectation for the iteration sequence $Z^{(k)}$ obeys
$$\mathbb{E}\,\|Z^{(k)} - \widetilde{C}\|_F^2 \;\leq\; \widetilde{\rho}^{\,k}\,\|Z^{(0)} - \widetilde{C}\|_F^2, \qquad (13)$$
where $\widetilde{\rho} = 1 - \frac{\alpha(2-\alpha)\,\sigma_{\min}^2(A)}{\|A\|_F^2}$ and the jth column of A is selected with probability $\hat{p}_j = \|A_{:,j}\|_2^2 / \|A\|_F^2$. Proof. In Theorem 1, replacing A with $A^T$, B with I and C with 0, we can prove Theorem 3 based on the result of Theorem 1. For the sake of conciseness, we omit the proof process. □
Theorem 4. Assume $0 < \alpha < 2$. The sequence $\{X^{(k)}\}$ is generated using the ME-REBK method for $AXB = C$, starting from the initial matrices $X^{(0)}$ and $Z^{(0)} = C$, where $X^{(0)}_{:,j} \in \mathcal{R}(A^T)$ and $(X^{(0)}_{i,:})^T \in \mathcal{R}(B)$. For any $\varepsilon > 0$, it holds that
$$\mathbb{E}\,\|X^{(k)} - X_\star\|_F^2 \;\leq\; (1+\varepsilon)^k \rho^k\,\|X^{(0)} - X_\star\|_F^2 + \frac{(1+\varepsilon^{-1})\,\alpha^2\,\sigma_{\max}^2(B)}{\|A\|_F^2\,\|B\|_F^4}\,\|A A^\dagger C\|_F^2 \sum_{l=1}^{k} \big((1+\varepsilon)\rho\big)^{k-l}\,\widetilde{\rho}^{\,l},$$
where $\rho$ and $\widetilde{\rho}$ are the factors of Theorems 1 and 3, and i, j are picked with probability $p_i = \|A_{i,:}\|_2^2/\|A\|_F^2$ and $\hat{p}_j = \|A_{:,j}\|_2^2/\|A\|_F^2$, respectively.

Proof. Let $X^{(k)}$ denote the kth iteration of the ME-REBK method for $AXB = C$, and let $\widetilde{X}^{(k+1)}$ be the one-step Kaczmarz update for the matrix equation $AXB = \widehat{C}$ from $X^{(k)}$, i.e.,
$$\widetilde{X}^{(k+1)} = X^{(k)} + \alpha\,\frac{A_{i,:}^T \big(\widehat{C}_{i,:} - A_{i,:} X^{(k)} B\big) B^T}{\|A_{i,:}\|_2^2\,\|B\|_F^2}. \qquad (14)$$
For any $\varepsilon > 0$, via the triangle inequality and Young's inequality, we can obtain
$$\|X^{(k+1)} - X_\star\|_F^2 \;\leq\; (1+\varepsilon)\,\|\widetilde{X}^{(k+1)} - X_\star\|_F^2 + (1+\varepsilon^{-1})\,\|X^{(k+1)} - \widetilde{X}^{(k+1)}\|_F^2. \qquad (15)$$
By taking the conditional expectation on both sides of (15), we have
$$\mathbb{E}_k\|X^{(k+1)} - X_\star\|_F^2 \;\leq\; (1+\varepsilon)\,\mathbb{E}_k\|\widetilde{X}^{(k+1)} - X_\star\|_F^2 + (1+\varepsilon^{-1})\,\mathbb{E}_k\|X^{(k+1)} - \widetilde{X}^{(k+1)}\|_F^2. \qquad (16)$$
From Theorem 3, we have
$$\mathbb{E}\,\|Z^{(k+1)} - \widetilde{C}\|_F^2 \;\leq\; \widetilde{\rho}^{\,k+1}\,\|A A^\dagger C\|_F^2, \qquad (17)$$
where $\widetilde{C} = (I - A A^\dagger) C$; hence, from (12) and (14), and noting that $(\widehat{C} - C + \widetilde{C}) B^T = 0$, it follows that
$$\mathbb{E}\,\|X^{(k+1)} - \widetilde{X}^{(k+1)}\|_F^2 \;\leq\; \frac{\alpha^2\,\sigma_{\max}^2(B)}{\|A\|_F^2\,\|B\|_F^4}\,\widetilde{\rho}^{\,k+1}\,\|A A^\dagger C\|_F^2.$$
Then, by using Theorem 1 applied to the consistent equation $AXB = \widehat{C}$, we can obtain
$$\mathbb{E}_k\|\widetilde{X}^{(k+1)} - X_\star\|_F^2 \;\leq\; \rho\,\|X^{(k)} - X_\star\|_F^2, \qquad (18)$$
then combining (16)–(18) and applying the resulting recursion iteratively yields the claimed estimate. This completes the proof. □
Remark 4. Replacing $\alpha B^T / \|B\|_F^2$ in the X-update of (12) with $B^\dagger$, we obtain the following projection-based randomized extended block Kaczmarz method (ME-PREBK) iteration:
$$X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\Big(\big(C - Z^{(k+1)}\big)_{i,:} - A_{i,:} X^{(k)} B\Big) B^\dagger.$$