Article

On the Convergence of the Randomized Block Kaczmarz Algorithm for Solving a Matrix Equation

College of Science, China University of Petroleum, Qingdao 266580, China
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(21), 4554; https://doi.org/10.3390/math11214554
Submission received: 8 October 2023 / Revised: 30 October 2023 / Accepted: 3 November 2023 / Published: 5 November 2023
(This article belongs to the Section Computational and Applied Mathematics)

Abstract

A randomized block Kaczmarz method and a randomized extended block Kaczmarz method are proposed for solving the matrix equation AXB = C, where the matrices A and B may be full rank or rank deficient. These are iterative methods that require no matrix–matrix multiplication and are especially suitable for solving large-scale matrix equations. It is theoretically proved that these methods converge to the solution or least-squares solution of the matrix equation. The numerical results show that these methods are more efficient than the existing algorithms for high-dimensional matrix equations.

1. Introduction

Consider the linear matrix equation
$$ A X B = C, \qquad (1) $$
where $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$ and $C \in \mathbb{R}^{m \times n}$. Such problems arise in many practical applications such as surface fitting in computer-aided geometric design (CAGD), signal and image processing, photogrammetry, etc.; see, for example, [1,2,3,4] and the large body of literature therein. If $AXB = C$ is consistent, $X^* = A^{\dagger} C B^{\dagger}$ is the minimum Frobenius-norm solution; if it is inconsistent, $X^* = A^{\dagger} C B^{\dagger}$ is the minimum Frobenius-norm least-squares solution. When the matrices A and B are small and dense, direct methods based on QR factorizations are attractive [5,6]. However, for large A and B, iterative methods have attracted a lot of attention [7,8,9,10,11]. Recently, Du et al. [12] proposed the randomized block coordinate descent (RBCD) method for the matrix least-squares problem $\min_{X \in \mathbb{R}^{p \times q}} \|C - AXB\|_F$ without a strong convexity assumption; this method requires the matrix B to have full row rank. Wu et al. [13] introduced two Kaczmarz-type methods for the consistent matrix equation $AXB = C$: relaxed greedy randomized Kaczmarz (ME-RGRK) and maximal weighted residual Kaczmarz (ME-MWRK). Although their row and column index selection strategies are time-consuming, the ideas of these two methods are suitable for solving large-scale consistent matrix equations.
In this paper, the randomized Kaczmarz method [14] and the randomized extended Kaczmarz method [15] are extended to solve the consistent and the inconsistent matrix equation (1), using only matrix–vector products.
All results in this paper also hold over the complex field; for the sake of simplicity, we state them over the real field only.
In this paper, we denote by $A^T$, $A^{\dagger}$, $r(A)$, $\mathcal{R}(A)$, $\|A\|_F = \sqrt{\operatorname{trace}(A^T A)}$ and $\langle A, B \rangle_F = \operatorname{trace}(A^T B)$ the transpose, the Moore–Penrose generalized inverse, the rank of A, the column space of A, the Frobenius norm of A and the inner product of two matrices A and B, respectively. For an integer $n \geq 1$, let $[n] = \{1, 2, \ldots, n\}$. We use I to denote the identity matrix whose order is clear from the context. In addition, for a given matrix $G = (g_{ij}) \in \mathbb{R}^{m \times n}$, we use $G_{i,:}$, $G_{:,j}$, $\sigma_{\max}(G)$ and $\sigma_{\min}(G)$ to denote the ith row, the jth column, the maximum singular value and the smallest nonzero singular value of G, respectively. Let $\mathbb{E}_k$ denote the expected value conditional on the first k iterations, that is, $\mathbb{E}_k[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, i_1, j_1, \ldots, i_{k-1}, j_{k-1}]$, where $i_s$ and $j_s$ ($s = 0, 1, \ldots, k-1$) are the row and column indices chosen at the sth iteration. Let the conditional expectation with respect to the random row index be $\mathbb{E}_k^i[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, i_1, j_1, \ldots, i_{k-1}, j_{k-1}, j_k]$ and with respect to the random column index be $\mathbb{E}_k^j[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, i_1, j_1, \ldots, i_{k-1}, j_{k-1}, i_k]$. By the law of total expectation, it holds that $\mathbb{E}_k[\cdot] = \mathbb{E}_k^i[\mathbb{E}_k^j[\cdot]]$.
The organization of this paper is as follows. In Section 2, we discuss the randomized block Kaczmarz method (ME-RBK) for finding the minimal F-norm solution $A^{\dagger} C B^{\dagger}$ of the consistent matrix equation (1). In Section 3, we discuss the randomized extended block Kaczmarz method (ME-REBK) for finding the minimal F-norm least-squares solution of the matrix equation (1). In Section 4, some numerical examples are provided to illustrate the effectiveness of our new methods. Finally, some brief concluding remarks are given in Section 5.

2. The Randomized Block Kaczmarz Method for Consistent Equation

At the kth iteration, the Kaczmarz method randomly selects a row $i \in [m]$ of A and performs an orthogonal projection of the current estimate $X^{(k)}$ onto the corresponding hyperplane $H_i = \{X \in \mathbb{R}^{p \times q} \mid A_{i,:} X B = C_{i,:}\}$, that is,
$$ \min_{X \in \mathbb{R}^{p \times q}} \frac{1}{2} \|X - X^{(k)}\|_F^2 \quad \text{s.t.} \quad A_{i,:} X B = C_{i,:}. \qquad (2) $$
The Lagrangian function of the constrained optimization problem (2) is
$$ L(X, Y) = \frac{1}{2} \|X - X^{(k)}\|_F^2 + \langle Y, A_{i,:} X B - C_{i,:} \rangle, \qquad (3) $$
where $Y \in \mathbb{R}^{1 \times n}$ is a Lagrangian multiplier. Via matrix differentiation, we obtain the gradient of $L(X, Y)$ and set $\nabla L(X, Y) = 0$ to find the stationary matrix:
$$ \nabla_X L(X, Y)\big|_{X^{(k+1)}} = X^{(k+1)} - X^{(k)} + A_{i,:}^T Y B^T = 0, \qquad \nabla_Y L(X, Y)\big|_{X^{(k+1)}} = A_{i,:} X^{(k+1)} B - C_{i,:} = 0. \qquad (4) $$
Using the first equation of (4), we have $X^{(k+1)} = X^{(k)} - A_{i,:}^T Y B^T$. Substituting this into the second equation of (4), we obtain $Y = \frac{1}{\|A_{i,:}\|_2^2} (A_{i,:} X^{(k)} B - C_{i,:}) (B^T B)^{\dagger}$. So, the projected randomized block Kaczmarz (ME-PRBK) method for solving $AXB = C$ iterates as
$$ X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} (C_{i,:} - A_{i,:} X^{(k)} B) B^{\dagger}. \qquad (5) $$
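To make the update (5) concrete, the following MATLAB lines perform one ME-PRBK step; this is a minimal sketch assuming A, B, C, the current iterate Xk and a sampled row index i are already in the workspace, and all variable names are ours rather than the paper's.

% One ME-PRBK update (5), assuming A (m x p), B (q x n), C (m x n),
% the current iterate Xk (p x q) and a row index i are given.
Bpinv = pinv(B);                                 % n x q pseudoinverse; precompute once in practice
ai    = A(i, :);                                 % selected row of A, 1 x p
res   = C(i, :) - ai * Xk * B;                   % 1 x n block residual
Xk1   = Xk + (ai' * (res * Bpinv)) / norm(ai)^2; % rank-one projection update, p x q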
However, in practice, it is very expensive to calculate the pseudoinverse of large-scale matrices. Next, we generalize the average block Kaczmarz method [16] for linear systems to the matrix equation.
At the kth step, we obtain the approximate solution $X^{(k+1)}$ by projecting the current estimate $X^{(k)}$ onto the hyperplane $H_{i,j} = \{X \in \mathbb{R}^{p \times q} \mid A_{i,:} X B_{:,j} = C_{i,j}\}$. Using the Lagrangian multiplier method, we obtain the following Kaczmarz method for $AXB = C$:
$$ X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T (C_{i,j} - A_{i,:} X^{(k)} B_{:,j}) B_{:,j}^T}{\|A_{i,:}\|_2^2\, \|B_{:,j}\|_2^2}. $$
Inspired by the idea of the average block Kaczmarz algorithm for $Ax = b$, we consider the following average block Kaczmarz method for $AXB = C$ with respect to B:
$$ X^{(k+1)} = X^{(k)} + \lambda \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} \sum_{j=1}^{n} v_j^k \frac{(C_{i,j} - A_{i,:} X^{(k)} B_{:,j}) B_{:,j}^T}{\|B_{:,j}\|_2^2}, $$
where $\lambda$ is the stepsize and the $v_j^k$ are weights satisfying $v_j^k \geq 0$ and $\sum_{j=1}^{n} v_j^k = 1$. If $v_j^k = \frac{\|B_{:,j}\|_2^2}{\|B\|_F^2}$, then
$$ X^{(k+1)} = X^{(k)} + \frac{\lambda}{\|B\|_F^2} \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} (C_{i,:} - A_{i,:} X^{(k)} B) B^T. $$
Setting $\alpha = \frac{\lambda}{\|B\|_F^2} > 0$, we obtain the following randomized block Kaczmarz iteration:
$$ X^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T \big( C_{i,:} - (A_{i,:} X^{(k)}) B \big) B^T, \quad k = 0, 1, 2, \ldots, \qquad (6) $$
where i is selected with probability $p_i = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$. We describe this method as Algorithm 1, which is called the ME-RBK algorithm.
Algorithm 1 Randomized Block Kaczmarz Method for $AXB = C$ (ME-RBK)
Input: $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$, $C \in \mathbb{R}^{m \times n}$, $X^{(0)} = 0 \in \mathbb{R}^{p \times q}$
1: for $k = 0, 1, 2, \ldots$ do
2:    Pick i with probability $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$
3:    Compute $X^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T \big( C_{i,:} - (A_{i,:} X^{(k)}) B \big) B^T$
4: end for
We arrange the computational process of calculating $X^{(k+1)}$ in Table 1, which costs only $4q(n+p) + p + 1 - 2q$ floating-point operations (flops) if the squared row norms of A have been calculated in advance.
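As an illustration of Algorithm 1 and of the staged computation in Table 1, here is a minimal MATLAB sketch of one ME-RBK iteration; the sampling step uses a cumulative-sum search to avoid toolbox dependencies, and all names (rowNorm2, Xk, alpha, etc.) are ours.

% One ME-RBK iteration (Algorithm 1), staged as in Table 1.
% Assumes A (m x p), B (q x n), C (m x n), Xk (p x q) and the stepsize alpha are given.
rowNorm2 = sum(A.^2, 2);                                  % squared row norms of A (computed once)
i  = find(cumsum(rowNorm2) >= rand * sum(rowNorm2), 1);   % pick i with prob. ||A(i,:)||^2 / ||A||_F^2
y1 = A(i, :) * Xk;                                        % 1 x q
y2 = C(i, :) - y1 * B;                                    % 1 x n
y3 = y2 * B';                                             % 1 x q
y4 = (alpha / rowNorm2(i)) * A(i, :)';                    % p x 1
Xk1 = Xk + y4 * y3;                                       % rank-one update, p x q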
Remark 1. 
Note that the problem of finding a solution of $AXB = C$ can be posed as the following linear least-squares problem:
$$ \min_{X \in \mathbb{R}^{p \times q}} \frac{1}{2} \|AXB - C\|_F^2 = \min_{X \in \mathbb{R}^{p \times q}} \frac{1}{2} \sum_{i=1}^{m} \|A_{i,:} X B - C_{i,:}\|_2^2. \qquad (7) $$
Define the component function
$$ f_i(X) = \frac{1}{2} \|A_{i,:} X B - C_{i,:}\|_2^2 $$
and differentiate with respect to X to obtain its gradient
$$ \nabla f_i(X) = A_{i,:}^T (A_{i,:} X B - C_{i,:}) B^T. $$
Therefore, the randomized block Kaczmarz method (6) is equivalent to one step of the stochastic gradient descent method [17] applied to (7) with stepsize $\frac{\alpha}{\|A_{i,:}\|_2^2}$.
First, we give the following lemma, whose proof can be found in [12].
Lemma 1 
([12]). Let $A \in \mathbb{R}^{m \times p}$ and $B \in \mathbb{R}^{q \times n}$ be any nonzero matrices. Let
$$ \mathcal{M} = \{ M \in \mathbb{R}^{p \times q} \mid \exists\, Y \in \mathbb{R}^{m \times n} \ \text{s.t.}\ M = A^T Y B^T \}. $$
For any matrix $M \in \mathcal{M}$, it holds that
$$ \|AMB\|_F^2 \geq \sigma_{\min}^2(A)\, \sigma_{\min}^2(B)\, \|M\|_F^2. $$
Remark 2. 
$M \in \mathcal{M}$ means that $M_{:,j} \in \mathcal{R}(A^T)$, $j = 1, 2, \ldots, q$, and $(M_{i,:})^T \in \mathcal{R}(B)$, $i = 1, 2, \ldots, p$. In fact, $\mathcal{M}$ is well defined because $0 \in \mathcal{M}$ and $A^{\dagger} C B^{\dagger} \in \mathcal{M}$.
In the following theorem, using the idea of the RK method [14], we prove that the sequence $X^{(k)}$ generated by Algorithm 1 converges to the least F-norm solution of $AXB = C$.
Theorem 1. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. If the matrix equation (1) is consistent, the sequence $\{X^{(k)}\}$ generated by the ME-RBK method, starting from an initial matrix $X^{(0)} \in \mathbb{R}^{p \times q}$ with $X_{:,j}^{(0)} \in \mathcal{R}(A^T)$, $j = 1, 2, \ldots, q$, and $(X_{i,:}^{(0)})^T \in \mathcal{R}(B)$, $i = 1, 2, \ldots, p$, converges linearly to $A^{\dagger} C B^{\dagger}$ in mean square form. Moreover, the solution error in expectation for the iteration sequence $X^{(k)}$ obeys
$$ \mathbb{E}\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \rho^k\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2, \qquad (8) $$
where $\rho = 1 - \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \sigma_{\min}^2(A)\, \sigma_{\min}^2(B)$, and $i \in [m]$ is picked with probability $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$.
Proof. 
For $k = 0, 1, 2, \ldots$, by (6) and $A_{i,:} A^{\dagger} C B^{\dagger} B = C_{i,:}$ (consistency), we have
$$ X^{(k+1)} - A^{\dagger} C B^{\dagger} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T (C_{i,:} - A_{i,:} X^{(k)} B) B^T - A^{\dagger} C B^{\dagger} = (X^{(k)} - A^{\dagger} C B^{\dagger}) - \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T, $$
then
$$ \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 = \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{\alpha^2}{\|A_{i,:}\|_2^4} \|A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T\|_F^2 - \frac{2\alpha}{\|A_{i,:}\|_2^2} \big\langle X^{(k)} - A^{\dagger} C B^{\dagger},\, A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T \big\rangle_F. $$
It follows from
$$ \begin{aligned} \frac{\alpha^2}{\|A_{i,:}\|_2^4} \|A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T\|_F^2 &= \frac{\alpha^2}{\|A_{i,:}\|_2^4} \operatorname{trace}\big( B B^T (X^{(k)} - A^{\dagger} C B^{\dagger})^T A_{i,:}^T A_{i,:} A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T \big) \\ &= \frac{\alpha^2}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( B B^T (X^{(k)} - A^{\dagger} C B^{\dagger})^T A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T \big) \\ &= \frac{\alpha^2}{\|A_{i,:}\|_2^2} \|A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T\|_2^2 \quad (\text{by } \operatorname{trace}(u u^T) = \|u\|_2^2 \text{ for any vector } u) \\ &\leq \frac{\alpha^2 \|B\|_2^2}{\|A_{i,:}\|_2^2} \|A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B\|_2^2 \quad (\text{by } \|u^T B^T\|_2 = \|B u\|_2 \leq \|B\|_2 \|u\|_2) \end{aligned} $$
and
$$ \frac{2\alpha}{\|A_{i,:}\|_2^2} \big\langle X^{(k)} - A^{\dagger} C B^{\dagger},\, A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T \big\rangle_F = \frac{2\alpha}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( B^T (X^{(k)} - A^{\dagger} C B^{\dagger})^T A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B \big) = \frac{2\alpha}{\|A_{i,:}\|_2^2} \|A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B\|_2^2 $$
that
$$ \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 - \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A_{i,:}\|_2^2} \|A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B\|_2^2. $$
By taking the conditional expectation, we have
$$ \mathbb{E}_k \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 - \mathbb{E}_k \Big[ \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A_{i,:}\|_2^2} \|A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B\|_2^2 \Big] = \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 - \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A\|_F^2} \|A (X^{(k)} - A^{\dagger} C B^{\dagger}) B\|_F^2. $$
From $X^{(0)} \in \mathcal{M}$ and $A^{\dagger} C B^{\dagger} \in \mathcal{M}$, we have $X^{(0)} - A^{\dagger} C B^{\dagger} \in \mathcal{M}$. Noting $A_{i,:}^T = A^T I_{:,i}$, it is easy to show by induction that $X^{(k)} - A^{\dagger} C B^{\dagger} \in \mathcal{M}$. Then, from Lemma 1 and $0 < \alpha < \frac{2}{\|B\|_2^2}$, we obtain
$$ \mathbb{E}_k \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 - \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \sigma_{\min}^2(A)\, \sigma_{\min}^2(B)\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 = \Big( 1 - \frac{2\alpha - \alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \sigma_{\min}^2(A)\, \sigma_{\min}^2(B) \Big) \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2. \qquad (9) $$
Finally, from (9) and induction on the iteration index k, we obtain the estimate (8).    □
Remark 3. 
Using an approach similar to that in the proof of Theorem 1, we can prove that the iterate $X^{(k)}$ generated by ME-PRBK (5) satisfies the following estimate:
$$ \mathbb{E}\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \hat{\rho}^k\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2, \qquad (10) $$
where $\hat{\rho} = 1 - \frac{\sigma_{\min}^2(A)\, \sigma_{\min}^2(B)}{\|A\|_F^2\, \|B\|_2^2}$. The convergence factor of GRK in [18] is $\rho_{GRK} = 1 - \frac{\sigma_{\min}^2(A)\, \sigma_{\min}^2(B)}{\|A\|_F^2\, \|B\|_F^2}$. It is obvious that
$$ \rho_{GRK} > \min_{0 < \alpha < 2/\|B\|_2^2} \rho = 1 - \frac{\sigma_{\min}^2(A)\, \sigma_{\min}^2(B)}{\|A\|_F^2\, \|B\|_2^2} = \hat{\rho}, $$
and $\rho < \rho_{GRK}$ when $\frac{1 - \sqrt{1 - \|B\|_2^2/\|B\|_F^2}}{\|B\|_2^2} < \alpha < \frac{1 + \sqrt{1 - \|B\|_2^2/\|B\|_F^2}}{\|B\|_2^2}$. This means that the convergence factor of ME-PRBK is the smallest, and the factor of ME-RBK can be smaller than that of GRK when $\alpha$ is properly selected.
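The comparison in Remark 3 is easy to check numerically for a given pair (A, B); the following MATLAB sketch (our own code, not from the paper) computes the three factors, using the stepsize $\alpha = 1.8/\|B\|_2^2$ that is adopted later in Section 4.

% Convergence factors of Remark 3 for given matrices A and B.
sA = svd(A);  sB = svd(B);
sminA2 = min(sA(sA > 1e-12))^2;      % smallest nonzero singular value of A, squared
sminB2 = min(sB(sB > 1e-12))^2;      % smallest nonzero singular value of B, squared
B22    = sB(1)^2;                    % ||B||_2^2
AF2    = norm(A, 'fro')^2;  BF2 = norm(B, 'fro')^2;
alpha   = 1.8 / B22;                                               % stepsize used in Section 4
rhoRBK  = 1 - (2*alpha - alpha^2 * B22) * sminA2 * sminB2 / AF2;   % factor of ME-RBK (Theorem 1)
rhoPRBK = 1 - sminA2 * sminB2 / (AF2 * B22);                       % factor of ME-PRBK
rhoGRK  = 1 - sminA2 * sminB2 / (AF2 * BF2);                       % factor of GRK [18]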

3. The Randomized Extended Block Kaczmarz Method for Inconsistent Equation

In [15,19,20], the authors proved that the Kaczmarz method does not converge to the least-squares solution of $Ax = b$ when the system is inconsistent. Analogously, if the matrix equation (1) is inconsistent, the above ME-PRBK method does not converge to $A^{\dagger} C B^{\dagger}$. The following theorem gives an error bound for the inconsistent matrix equation.
Theorem 2. 
Assume that the consistent equation $AXB = C$ has a solution $X^* = A^{\dagger} C B^{\dagger}$. Let $\hat{X}^{(k)}$ denote the kth iterate of the ME-PRBK method applied to the inconsistent equation $AXB = C + W$ for any $W \in \mathbb{R}^{m \times n}$, starting from an initial matrix $X^{(0)} \in \mathbb{R}^{p \times q}$ with $X_{:,j}^{(0)} \in \mathcal{R}(A^T)$, $j = 1, 2, \ldots, q$, and $(X_{i,:}^{(0)})^T \in \mathcal{R}(B)$, $i = 1, 2, \ldots, p$. In exact arithmetic, it follows that
$$ \mathbb{E}\, \|\hat{X}^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \hat{\rho}^k\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{1 - \hat{\rho}^k}{1 - \hat{\rho}} \cdot \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2}. $$
Proof. 
Set $H_i = \{X \mid A_{i,:} X B = C_{i,:}\}$ and $\hat{H}_i = \{X \mid A_{i,:} X B = C_{i,:} + W_{i,:}\}$. Let Y denote the iterate obtained by applying one PRBK step to the consistent equation $AXB = C$ at the kth step, that is,
$$ Y = \hat{X}^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} (C_{i,:} - A_{i,:} \hat{X}^{(k)} B) B^{\dagger}, $$
so that $\hat{X}^{(k+1)} - Y = \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} W_{i,:} B^{\dagger}$. It follows from
$$ \begin{aligned} \big\langle Y - A^{\dagger} C B^{\dagger},\, \hat{X}^{(k+1)} - Y \big\rangle_F &= \Big\langle Y - A^{\dagger} C B^{\dagger},\, \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2} W_{i,:} B^{\dagger} \Big\rangle_F = \frac{1}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( (B^{\dagger})^T W_{i,:}^T A_{i,:} (Y - A^{\dagger} C B^{\dagger}) \big) \\ &= \frac{1}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( W_{i,:}^T (A_{i,:} Y - A_{i,:} A^{\dagger} C B^{\dagger}) (B^{\dagger})^T \big) = \frac{1}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( W_{i,:}^T (A_{i,:} Y - A_{i,:} A^{\dagger} C B^{\dagger}) B (B^T B)^{\dagger} \big) \quad (\text{by } (B^{\dagger})^T = B (B^T B)^{\dagger}) \\ &= \frac{1}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( W_{i,:}^T (A_{i,:} Y B - A_{i,:} A^{\dagger} C B^{\dagger} B) (B^T B)^{\dagger} \big) = 0 \end{aligned} $$
and
$$ \|\hat{X}^{(k+1)} - Y\|_F^2 = \frac{\|A_{i,:}^T W_{i,:} B^{\dagger}\|_F^2}{\|A_{i,:}\|_2^4} = \frac{1}{\|A_{i,:}\|_2^4} \operatorname{trace}\big( (B^{\dagger})^T W_{i,:}^T A_{i,:} A_{i,:}^T W_{i,:} B^{\dagger} \big) = \frac{\|W_{i,:} B^{\dagger}\|_2^2}{\|A_{i,:}\|_2^2} $$
that
$$ \|\hat{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 = \|Y - A^{\dagger} C B^{\dagger}\|_F^2 + \|\hat{X}^{(k+1)} - Y\|_F^2 = \|Y - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{\|W_{i,:} B^{\dagger}\|_2^2}{\|A_{i,:}\|_2^2}. \qquad (11) $$
By taking the conditional expectation on both sides of (11), we obtain
$$ \mathbb{E}_k \|\hat{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 = \mathbb{E}_k \|Y - A^{\dagger} C B^{\dagger}\|_F^2 + \mathbb{E}_k \Big[ \frac{\|W_{i,:} B^{\dagger}\|_2^2}{\|A_{i,:}\|_2^2} \Big] \leq \hat{\rho}\, \|\hat{X}^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2}. $$
The inequality is obtained using Remark 3. Applying this recursive relation iteratively, we have
$$ \begin{aligned} \mathbb{E}\, \|\hat{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 &\leq \hat{\rho}\, \mathbb{E}\, \|\hat{X}^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2} \leq \hat{\rho}^2\, \mathbb{E}\, \|\hat{X}^{(k-1)} - A^{\dagger} C B^{\dagger}\|_F^2 + (\hat{\rho} + 1) \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2} \\ &\leq \cdots \leq \hat{\rho}^{k+1}\, \|\hat{X}^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2 + (\hat{\rho}^k + \cdots + \hat{\rho} + 1) \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2} = \hat{\rho}^{k+1}\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2 + \frac{1 - \hat{\rho}^{k+1}}{1 - \hat{\rho}} \cdot \frac{\|W B^{\dagger}\|_F^2}{\|A\|_F^2}. \end{aligned} $$
This completes the proof.    □
Next, we use the idea of the randomized extended Kaczmarz method (see [20,21,22] for details) to compute the least-squares solution of the inconsistent Equation (1). At each iteration, $Z^{(k)}$ is the kth iterate of ME-RBK applied to $A^T Z B^T = 0$ with the initial guess $Z^{(0)}$, and $X^{(k)}$ is the one-step ME-RBK update for $AXB = C - Z^{(k)}$. We obtain the following randomized extended block Kaczmarz iteration:
$$ \begin{aligned} Z^{(k+1)} &= Z^{(k)} - \frac{\alpha}{\|A_{:,j}\|_2^2} A_{:,j} \big( ((A_{:,j}^T Z^{(k)}) B^T) B \big), \\ X^{(k+1)} &= X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T \big( C_{i,:} - Z_{i,:}^{(k+1)} - (A_{i,:} X^{(k)}) B \big) B^T, \end{aligned} \qquad (12) $$
where $\alpha > 0$ is the stepsize, and i and j are selected with probabilities $p_i = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$ and $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$, respectively. The cost of each iteration of this method is $4n(q+m) + m + 1 - 2n - q$ flops for updating $Z^{(k)}$ and $(4q+1)(n+p) + 1 - 2q$ flops for updating $X^{(k)}$, provided the squared row and column norms of A have been calculated in advance. We describe this method as Algorithm 2, which is called the ME-REBK algorithm.
Algorithm 2 Randomized Extended Block Kaczmarz Method for $AXB = C$ (ME-REBK)
Input: $A \in \mathbb{R}^{m \times p}$, $B \in \mathbb{R}^{q \times n}$, $C \in \mathbb{R}^{m \times n}$, $X^{(0)} = 0 \in \mathbb{R}^{p \times q}$, $Z^{(0)} = C$
1: for $k = 0, 1, 2, \ldots$ do
2:    Pick j with probability $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$
3:    Compute $Z^{(k+1)} = Z^{(k)} - \frac{\alpha}{\|A_{:,j}\|_2^2} A_{:,j} \big( ((A_{:,j}^T Z^{(k)}) B^T) B \big)$
4:    Pick i with probability $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$
5:    Compute $X^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T \big( C_{i,:} - Z_{i,:}^{(k+1)} - (A_{i,:} X^{(k)}) B \big) B^T$
6: end for
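A compact MATLAB sketch of one ME-REBK iteration (Algorithm 2) is given below; the squared row and column norms of A would be precomputed in practice, and the variable names are ours.

% One ME-REBK iteration (Algorithm 2 / iteration (12)).
% Assumes A (m x p), B (q x n), C (m x n), Xk (p x q), Zk (m x n) and alpha are given.
rowNorm2 = sum(A.^2, 2);   colNorm2 = sum(A.^2, 1);             % squared row/column norms of A
j   = find(cumsum(colNorm2) >= rand * sum(colNorm2), 1);        % column index, prob. ||A(:,j)||^2 / ||A||_F^2
Zk1 = Zk - (alpha / colNorm2(j)) * A(:, j) * (((A(:, j)' * Zk) * B') * B);   % Z-update
i   = find(cumsum(rowNorm2) >= rand * sum(rowNorm2), 1);        % row index, prob. ||A(i,:)||^2 / ||A||_F^2
res = C(i, :) - Zk1(i, :) - (A(i, :) * Xk) * B;                 % 1 x n residual
Xk1 = Xk + (alpha / rowNorm2(i)) * A(i, :)' * (res * B');       % X-update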
Theorem 3. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. Let $Z^{(k)}$ denote the kth iterate of ME-RBK applied to $A^T Z B^T = 0$, starting from an initial matrix $Z^{(0)} \in \mathbb{R}^{m \times n}$ with $Z_{:,j}^{(0)} \in C_{:,j} + \mathcal{R}(A)$, $j = 1, 2, \ldots, n$, and $(Z_{i,:}^{(0)})^T \in (C_{i,:})^T + \mathcal{R}(B^T)$, $i = 1, 2, \ldots, m$. Then, $\{Z^{(k)}\}$ converges linearly to $C - A A^{\dagger} C B^{\dagger} B$ in mean square form, and the solution error in expectation for the iteration sequence $Z^{(k)}$ obeys
$$ \mathbb{E}\, \|Z^{(k)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 \leq \rho^k\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2, \qquad (13) $$
where the jth column of A is selected with probability $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$.
Proof. 
Replacing A with $A^T$, B with $B^T$ and C with 0 in Theorem 1, we can prove Theorem 3 directly from that result. For the sake of conciseness, we omit the details. □
Theorem 4. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. The sequence $\{X^{(k)}\}$ is generated by the ME-REBK method for $AXB = C$, starting from initial matrices $X^{(0)} \in \mathbb{R}^{p \times q}$ and $Z^{(0)} \in \mathbb{R}^{m \times n}$, where $X_{:,j}^{(0)} \in \mathcal{R}(A^T)$, $j = 1, 2, \ldots, q$, $(X_{i,:}^{(0)})^T \in \mathcal{R}(B)$, $i = 1, 2, \ldots, p$, $Z_{:,j}^{(0)} \in C_{:,j} + \mathcal{R}(A)$, $j = 1, 2, \ldots, n$, and $(Z_{i,:}^{(0)})^T \in (C_{i,:})^T + \mathcal{R}(B^T)$, $i = 1, 2, \ldots, m$. For any $\varepsilon > 0$, it holds that
$$ \mathbb{E}\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \frac{(1+\varepsilon)^{k+1} - (1+\varepsilon)}{\varepsilon^2} \cdot \frac{\alpha^2 \|B\|_2^2\, \rho^k}{\|A\|_F^2}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 + (1+\varepsilon)^k \rho^k\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2, \qquad (14) $$
where $i \in [m]$ and $j \in [p]$ are picked with probabilities $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$ and $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$, respectively.
Proof. 
Let $X^{(k)}$ denote the kth iterate of the ME-REBK method for $AXB = C$, and let $\tilde{X}^{(k+1)}$ be the one-step Kaczmarz update for the matrix equation $AXB = A A^{\dagger} C B^{\dagger} B$ from $X^{(k)}$, i.e.,
$$ \tilde{X}^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T (A_{i,:} A^{\dagger} C B^{\dagger} B - A_{i,:} X^{(k)} B) B^T. $$
We have
$$ \tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger} = X^{(k)} - A^{\dagger} C B^{\dagger} - \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T $$
and
$$ X^{(k+1)} - \tilde{X}^{(k+1)} = \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T (C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B) B^T. $$
For any $\varepsilon > 0$, via the triangle inequality and Young's inequality, we obtain
$$ \begin{aligned} \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 &= \|(X^{(k+1)} - \tilde{X}^{(k+1)}) + (\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger})\|_F^2 \leq \big( \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F + \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F \big)^2 \\ &= \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 + 2 \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F\, \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F \\ &\leq \Big( 1 + \frac{1}{\varepsilon} \Big) \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1 + \varepsilon) \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2. \end{aligned} \qquad (15) $$
By taking the conditional expectation on both sides of (15), we have
$$ \mathbb{E}_k \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \Big( 1 + \frac{1}{\varepsilon} \Big) \mathbb{E}_k \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1 + \varepsilon)\, \mathbb{E}_k \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2. \qquad (16) $$
It follows from
$$ \begin{aligned} \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 &= \Big\| \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T (C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B) B^T \Big\|_F^2 \\ &= \frac{\alpha^2}{\|A_{i,:}\|_2^2} \operatorname{trace}\big( B (C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B)^T (C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B) B^T \big) \\ &\leq \frac{\alpha^2 \|B\|_2^2}{\|A_{i,:}\|_2^2} \|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B\|_2^2 \end{aligned} $$
that
$$ \mathbb{E}_k \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 \leq \alpha^2 \|B\|_2^2\, \mathbb{E}_k^j \mathbb{E}_k^i \Big[ \frac{\|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B\|_2^2}{\|A_{i,:}\|_2^2} \Big] = \alpha^2 \|B\|_2^2\, \mathbb{E}_k^j \Big[ \frac{1}{\|A\|_F^2} \sum_{i=1}^{m} \|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:} A^{\dagger} C B^{\dagger} B\|_2^2 \Big] = \frac{\alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \mathbb{E}_k \|Z^{(k+1)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2. $$
By Theorem 3, it yields
$$ \mathbb{E}\, \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 \leq \frac{\alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \mathbb{E}\, \|Z^{(k+1)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 \leq \frac{\alpha^2 \|B\|_2^2}{\|A\|_F^2}\, \rho^{k+1}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2. \qquad (17) $$
From $X^{(0)} - A^{\dagger} C B^{\dagger} \in \mathcal{M}$, we have $X^{(k)} - A^{\dagger} C B^{\dagger} \in \mathcal{M}$. Then, by using Theorem 1, we obtain
$$ \mathbb{E}_k \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 = \mathbb{E}_k \Big\| X^{(k)} - A^{\dagger} C B^{\dagger} - \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T A_{i,:} (X^{(k)} - A^{\dagger} C B^{\dagger}) B B^T \Big\|_F^2 \leq \rho\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2, $$
then
$$ \mathbb{E}\, \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \leq \rho\, \mathbb{E}\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2. \qquad (18) $$
Combining (16)–(18) yields
$$ \begin{aligned} \mathbb{E}\, \|X^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 &\leq \Big( 1 + \frac{1}{\varepsilon} \Big) \mathbb{E}\, \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1 + \varepsilon)\, \mathbb{E}\, \|\tilde{X}^{(k+1)} - A^{\dagger} C B^{\dagger}\|_F^2 \\ &\leq \Big( 1 + \frac{1}{\varepsilon} \Big) \frac{\alpha^2 \|B\|_2^2\, \rho^{k+1}}{\|A\|_F^2}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 + (1 + \varepsilon)\, \rho\, \mathbb{E}\, \|X^{(k)} - A^{\dagger} C B^{\dagger}\|_F^2 \\ &\leq \Big( 1 + \frac{1}{\varepsilon} \Big) \frac{\alpha^2 \|B\|_2^2\, \rho^{k+1}}{\|A\|_F^2}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 \big[ 1 + (1 + \varepsilon) \big] + (1 + \varepsilon)^2 \rho^2\, \mathbb{E}\, \|X^{(k-1)} - A^{\dagger} C B^{\dagger}\|_F^2 \\ &\leq \cdots \leq \Big( 1 + \frac{1}{\varepsilon} \Big) \frac{\alpha^2 \|B\|_2^2\, \rho^{k+1}}{\|A\|_F^2}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 \sum_{i=0}^{k} (1 + \varepsilon)^i + (1 + \varepsilon)^{k+1} \rho^{k+1}\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2 \\ &= \frac{(1 + \varepsilon)^{k+2} - (1 + \varepsilon)}{\varepsilon^2} \cdot \frac{\alpha^2 \|B\|_2^2\, \rho^{k+1}}{\|A\|_F^2}\, \|Z^{(0)} - (C - A A^{\dagger} C B^{\dagger} B)\|_F^2 + (1 + \varepsilon)^{k+1} \rho^{k+1}\, \|X^{(0)} - A^{\dagger} C B^{\dagger}\|_F^2. \end{aligned} $$
This completes the proof. □
Remark 4. 
Replacing $B^T$ in (12) with $B^{\dagger}$, we obtain the following projection-based randomized extended block Kaczmarz method (ME-PREBK) iteration:
$$ \begin{aligned} Z^{(k+1)} &= Z^{(k)} - \frac{\alpha}{\|A_{:,j}\|_2^2} A_{:,j} \big( ((A_{:,j}^T Z^{(k)}) B^T) (B^{\dagger})^T \big), \\ X^{(k+1)} &= X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T \big( C_{i,:} - Z_{i,:}^{(k+1)} - (A_{i,:} X^{(k)}) B \big) B^{\dagger}. \end{aligned} \qquad (19) $$
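In a MATLAB implementation, the only change relative to the ME-REBK sketch given after Algorithm 2 is that the trailing transposes are replaced by a pseudoinverse computed once before the loop; a sketch under the same assumptions and naming as above:

% ME-PREBK updates (19): as in (12) but with pinv(B) in place of the trailing B'.
Bp  = pinv(B);                                                   % n x q, computed once before the loop
Zk1 = Zk - (alpha / colNorm2(j)) * A(:, j) * (((A(:, j)' * Zk) * B') * Bp');  % Z-update
res = C(i, :) - Zk1(i, :) - (A(i, :) * Xk) * B;                  % 1 x n residual
Xk1 = Xk + (alpha / rowNorm2(i)) * A(i, :)' * (res * Bp);        % X-update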

4. Numerical Experiments

In this section, we present some experimental results of the proposed algorithms for solving various matrix equations and compare them with the ME-RGRK and ME-MWRK methods in [13] for consistent matrix equations and with the RBCD method in [12] for inconsistent matrix equations. All experiments were carried out in MATLAB (version R2020a) on a DESKTOP-8CBRR86 with an Intel(R) Core(TM) i7-4712MQ CPU @ 2.30 GHz, 8 GB of RAM and Windows 10.
All computations start from the initial guess $X^{(0)} = 0$ and are terminated once the relative error (RE) of the solution, defined by
$$ \mathrm{RE} = \frac{\|X^{(k)} - X^*\|_F^2}{\|X^*\|_F^2} $$
at the current iterate $X^{(k)}$, satisfies $\mathrm{RE} < 10^{-6}$ or the number of iterations exceeds the maximum K = 50,000, where $X^* = A^{\dagger} C B^{\dagger}$. We report the average number of iterations (denoted as "IT") and the average computing time in seconds (denoted as "CPU") over 20 repeated trial runs of the corresponding method. Three examples are tested, and A and B are generated as follows.
  • Type I: For given m, p, q, n, the entries of A and B are generated from a standard normal distribution, i.e., A = randn(m, p), B = randn(q, n).
  • Type II: As in [18], for given m, p and $r_1 = \mathrm{rank}(A)$, we construct the matrix A as $A = U_1 D_1 V_1^T$, where $U_1 \in \mathbb{R}^{m \times r_1}$ and $V_1 \in \mathbb{R}^{p \times r_1}$ are column-orthogonal matrices and $D_1 \in \mathbb{R}^{r_1 \times r_1}$ is a diagonal matrix whose first $r_1 - 2$ diagonal entries are uniformly distributed numbers in $[\sigma_{\min}(A), \sigma_{\max}(A)]$ and whose last two diagonal entries are $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$. The matrix B is generated in the same way with parameters q, n, $r_2 = \mathrm{rank}(B)$ (a MATLAB sketch of this construction is given after Table 2).
  • Type III: The real-world sparse matrices come from the Florida sparse matrix collection [23]. Table 2 lists the features of these matrices.
Table 2. The detailed features of sparse matrices from [23].
| Name | Size | Rank | Sparsity |
| ash219 | 219 × 85 | 85 | 2.3529% |
| ash958 | 958 × 292 | 292 | 0.68493% |
| divorce | 50 × 9 | 9 | 50% |
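For reference, a possible MATLAB construction of the Type II test matrices is given below; this is a sketch under our own naming and assumptions, since the paper describes the construction only in words.

% Type II test matrix: A = U1 * D1 * V1' with rank r1 and prescribed extreme singular values.
% Assumes m, p, r1, smin and smax (= sigma_min(A), sigma_max(A)) are given.
[U1, ~] = qr(randn(m, r1), 0);                             % m x r1, column-orthogonal
[V1, ~] = qr(randn(p, r1), 0);                             % p x r1, column-orthogonal
d = [smin + (smax - smin) * rand(r1 - 2, 1); smax; smin];  % first r1-2 entries uniform in [smin, smax]
A = U1 * diag(d) * V1';                                    % rank-r1 matrix with the prescribed spectrum
% B is generated in the same way with parameters q, n and r2.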

4.1. Consistent Matrix Equation

Given A and B, we set $C = A X^* B$ with $X^* = \mathrm{randn}(p, q)$ to construct a consistent matrix equation. First, we test the impact of $\alpha$ in the ME-RBK method on the experimental results. Figure 1 plots IT and CPU versus "para" for the different matrices in Table 3, where para = 0.1:0.1:1.9, so that $\alpha = \mathrm{para}/\|B\|_2^2$ satisfies $0 < \alpha < \frac{2}{\|B\|_2^2}$ as required in Theorem 1. From Figure 1, it can be seen that the number of iteration steps and the running time decrease as the parameter increases; however, at para = 1.9, both IT and CPU begin to increase again. The same behavior occurs when solving consistent or inconsistent equations with the matrices in Table 4 and Table 5. Therefore, we set $\alpha = 1.8/\|B\|_2^2$ in all experiments.
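Putting the pieces together, a minimal MATLAB driver for one consistent Type I experiment might look as follows; this is our own sketch, with illustrative problem sizes and variable names.

% Minimal ME-RBK driver for a consistent Type I test problem.
m = 100; p = 40; q = 40; n = 100;
A = randn(m, p);  B = randn(q, n);
C = A * randn(p, q) * B;                            % consistent right-hand side
Xref = pinv(A) * C * pinv(B);                       % reference solution X* used in RE
alpha = 1.8 / norm(B)^2;                            % alpha = 1.8 / ||B||_2^2
rowNorm2 = sum(A.^2, 2);
Xk = zeros(p, q);
for k = 1:50000
    i  = find(cumsum(rowNorm2) >= rand * sum(rowNorm2), 1);
    Xk = Xk + (alpha / rowNorm2(i)) * A(i, :)' * ((C(i, :) - (A(i, :) * Xk) * B) * B');
    if norm(Xk - Xref, 'fro')^2 / norm(Xref, 'fro')^2 < 1e-6    % stop when RE < 10^-6
        break
    end
end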
In Table 3, Table 4 and Table 5, we report the average IT and CPU of the ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK methods for solving consistent equations. In these tables, the item ">" indicates that the number of iteration steps exceeds the maximum (50,000), and the item "-" indicates that the method does not converge.
From these tables, we can see that the ME-RBK and ME-PRBK methods vastly outperform the ME-RGRK and ME-MWRK methods in terms of both IT and CPU times regardless of whether the matrices A and B are full column/row rank or not. As the matrix dimension increases, the CPU time of the ME-RBK and ME-PRBK methods increases slowly, while the running time of ME-RGRK and ME-MWRK increases dramatically.
In addition, when the matrix size is small, the ME-PRBK method is competitive because the pseudoinverse is inexpensive and the number of iteration steps is small. When the matrix size is large, computing the pseudoinverse becomes expensive, and the ME-RBK method becomes more attractive because it avoids the pseudoinverse altogether (see the last line in Table 3).

4.2. Inconsistent Matrix Equation

To construct an inconsistent matrix equation, we set $C = A X^* B + R$, where $X^*$ and R are random matrices generated by $X^* = \mathrm{randn}(p, q)$ and $R = \delta \cdot \mathrm{randn}(m, n)$ with $\delta \in (0, 1)$. Numerical results of the RBCD, IME-REBK and IME-PREBK methods are listed in Table 6, Table 7 and Table 8. From these tables, we can see that the IME-PREBK method is better than the RBCD method in terms of IT and CPU time, especially when $\sigma_{\max}/\sigma_{\min}$ is large (see the last line of Table 7). The IME-REBK method is not competitive when B has full row rank because it needs to solve two equations. However, when B does not have full row rank, the RBCD method does not converge, while the IME-REBK and IME-PREBK methods do.

5. Conclusions

In this paper, we have proposed a randomized block Kaczmarz algorithm for solving the consistent matrix equation and its extended version for the inconsistent case. Theoretically, we have proved that the proposed algorithms converge linearly to the unique minimal F-norm solution or least-squares solution (i.e., $A^{\dagger} C B^{\dagger}$) without requiring A and B to have full column/row rank. The numerical results show the effectiveness of the algorithms. Since the proposed algorithms only require one row or one column of A at each iteration, without any matrix–matrix product, they are suitable for scenarios where the matrix A is too large to fit in memory or matrix multiplication is considerably expensive.

Author Contributions

Conceptualization, L.X.; Methodology, L.X. and W.B.; Validation, L.X. and W.L.; Writing—Original Draft Preparation, L.X.; Writing—Review and Editing, L.X., W.B. and W.L.; Software, L.X.; Visualization, L.X. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability Statement

The datasets that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors are thankful to the referees for their constructive comments and valuable suggestions, which have greatly improved the original manuscript of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rauhala, U.A. Introduction to array algebra. Photogramm. Eng. Remote Sens. 1980, 46, 177–192. [Google Scholar]
  2. Regalia, P.A.; Mitra, S.K. Kronecker products, unitary matrices and signal processing applications. SIAM Rev. 1989, 31, 586–613. [Google Scholar] [CrossRef]
  3. Liu, M.Z.; Li, B.J.; Guo, Q.J.; Zhu, C.; Hu, P.; Shao, Y. Progressive iterative approximation for regularized least square bivariate B-spline surface fitting. J. Comput. Appl. Math. 2018, 327, 175–187. [Google Scholar] [CrossRef]
  4. Liu, Z.Y.; Li, Z.; Ferreira, C.; Zhang, Y.L. Stationary splitting iterative methods for the matrix equation AXB=C. Appl. Math. Comput. 2020, 378, 125195. [Google Scholar] [CrossRef]
  5. Fausett, D.W.; Fulton, C.T. Large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1994, 15, 219–227. [Google Scholar] [CrossRef]
  6. Zha, H.Y. Comments on large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1995, 16, 1172. [Google Scholar] [CrossRef]
  7. Peng, Z.Y. An iterative method for the least squares symmetric solution of the linear matrix equation AXB = C. Appl. Math. Comput. 2005, 170, 711–723. [Google Scholar] [CrossRef]
  8. Ding, F.; Liu, P.X.; Ding, J. Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 2008, 197, 41–50. [Google Scholar] [CrossRef]
  9. Huang, G.X.; Yin, F.; Guo, K. An iterative method for the skew-symmetric solution and the optimal approximate solution of the matrix equation AXB = C. J. Comput. Appl. Math. 2008, 212, 231–244. [Google Scholar] [CrossRef]
  10. Wang, X.; Li, Y.; Dai, L. On hermitian and skew-hermitian splitting iteration methods for the linear matrix equation AXB = C. Comput. Math. Appl. 2013, 65, 657–664. [Google Scholar] [CrossRef]
  11. Shafiei, S.G.; Hajarian, M. Developing Kaczmarz method for solving Sylvester matrix equations. J. Franklin Inst. 2022, 359, 8991–9005. [Google Scholar] [CrossRef]
  12. Du, K.; Ruan, C.C.; Sun, X.H. On the convergence of a randomized block coordinate descent algorithm for a matrix least squares problem. Appl. Math. Lett. 2022, 124, 107689. [Google Scholar] [CrossRef]
  13. Wu, N.C.; Liu, C.Z.; Zuo, Q. On the Kaczmarz methods based on relaxed greedy selection for solving matrix equation AXB = C. J. Comput. Appl. Math. 2022, 413, 114374. [Google Scholar] [CrossRef]
  14. Strohmer, T.; Vershynin, R. A randomized Kaczmarz algorithm with exponential convergence. J. Fourier. Anal. Appl. 2009, 15, 262–278. [Google Scholar] [CrossRef]
  15. Zouzias, A.; Freris, N.M. Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix. Anal. Appl. 2013, 34, 773–793. [Google Scholar] [CrossRef]
  16. Necoara, I. Faster randomized block Kaczmarz algorithms. SIAM J. Matrix Anal. Appl. 2019, 40, 1425–1452. [Google Scholar]
  17. Nemirovski, A.; Juditsky, A.; Lan, G.; Shapiro, A. Robust stochastic approximation approach to stochastic programming. SIAM J. Optimiz. 2009, 19, 1574–1609. [Google Scholar] [CrossRef]
  18. Niu, Y.Q.; Zheng, B. On global randomized block Kaczmarz algorithm for solving large-scale matrix equations. arXiv 2022, arXiv:2204.13920. [Google Scholar]
  19. Needell, D. Randomized Kaczmarz solver for noisy linear systems. BIT Numer. Math. 2010, 50, 395–403. [Google Scholar] [CrossRef]
  20. Ma, A.; Needell, D.; Ramdas, A. Convergence properties of the randomized extended Gauss–Seidel and Kaczmarz methods. SIAM J. Matrix Anal. Appl. 2015, 36, 1590–1604. [Google Scholar] [CrossRef]
  21. Du, K. Tight upper bounds for the convergence of the randomized extended Kaczmarz and Gauss-Seidel algorithms. Numer. Linear Algebra Appl. 2019, 26, e2233. [Google Scholar] [CrossRef]
  22. Du, K.; Si, W.T.; Sun, X.H. Randomized extended average block Kaczmarz for solving least squares. SIAM J. Sci. Comput. 2020, 42, A3541–A3559. [Google Scholar] [CrossRef]
  23. Davis, T.A.; Hu, Y. The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 2011, 38, 1–25. [Google Scholar]
Figure 1. IT (left) and CPU (right) for different values of para in ME-RBK for consistent matrix equations with the different matrices in Table 3.
Table 1. The complexities of computing $X^{(k+1)}$ in ME-RBK.
| Step | $y_1 = A_{i,:} X^{(k)}$ | $y_2 = C_{i,:} - y_1 B$ | $y_3 = y_2 B^T$ | $y_4^T = \frac{\alpha}{\|A_{i,:}\|_2^2} A_{i,:}^T$ | $Y_1 = y_4^T y_3$ | $X^{(k)} + Y_1$ |
| Flops | $(2p-1)q$ | $(2q-1)n + n$ | $(2n-1)q$ | $1 + p$ | $pq$ | $pq$ |
Table 3. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type I.
| No. | m | p | q | n | | ME-RGRK | ME-MWRK | ME-RBK | ME-PRBK |
| 1 | 100 | 40 | 40 | 100 | IT | 49,707 | 27,579 | 7834.5 | 1152.8 |
| | | | | | CPU | 0.78 | 2.01 | 0.20 | 0.04 |
| 2 | 40 | 100 | 100 | 40 | IT | > | 49,332.7 | 6334.7 | 1507.2 |
| | | | | | CPU | > | 2.15 | 0.33 | 0.08 |
| 3 | 500 | 100 | 100 | 500 | IT | > | 32,109 | 4021.8 | 1866.1 |
| | | | | | CPU | > | 57.61 | 0.52 | 0.19 |
| 4 | 1000 | 200 | 300 | 2000 | IT | > | > | 6429.6 | 4450.4 |
| | | | | | CPU | > | > | 3.95 | 0.72 |
Table 4. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type II.
| m | p | r1 | [σmin(A), σmax(A)] | q | n | r2 | [σmin(B), σmax(B)] | | ME-RGRK | ME-MWRK | ME-RBK | ME-PRBK |
| 100 | 40 | 20 | [1, 2] | 40 | 100 | 40 | [1, 2] | IT | 4361.0 | 1987.1 | 503.1 | 332.4 |
| | | | | | | | | CPU | 0.06 | 0.17 | 0.01 | 0.008 |
| 100 | 40 | 20 | [1, 5] | 40 | 100 | 20 | [1, 5] | IT | 22,423.2 | 6439.8 | 9307.6 | 1056.3 |
| | | | | | | | | CPU | 0.33 | 0.57 | 0.20 | 0.02 |
| 1000 | 200 | 100 | [1, 2] | 100 | 1000 | 50 | [1, 2] | IT | 20,055.5 | 7047.8 | 2587.4 | 1674.3 |
| | | | | | | | | CPU | 78.45 | 70.56 | 0.42 | 0.23 |
| 1000 | 100 | 50 | [1, 5] | 200 | 1000 | 200 | [1, 5] | IT | > | > | 18,898.3 | 2833.6 |
| | | | | | | | | CPU | > | > | 3.61 | 0.42 |
Table 5. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type III.
| A | B | | ME-RGRK | ME-MWRK | ME-RBK | ME-PRBK |
| divorce | ash219^T | IT | 43,927.8 | 14,164.4 | 10,993.4 | 3873.5 |
| | | CPU | 1.15 | 1.35 | 0.36 | 0.13 |
| divorce | ash219 | IT | 40,198.7 | 17,251.4 | 11,557.4 | 3124.7 |
| | | CPU | 0.63 | 0.80 | 0.43 | 0.11 |
| ash219 | ash958^T | IT | > | > | 6042.3 | 2267.0 |
| | | CPU | > | > | 1.04 | 0.36 |
| ash219 | ash958 | IT | > | > | 5745.4 | 2114.2 |
| | | CPU | > | > | 1.22 | 0.42 |
Table 6. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type I.
| m | p | q | n | | RBCD | IME-REBK | IME-PREBK |
| 100 | 40 | 40 | 100 | IT | 17,212.2 | 21,270.5 | 2469.4 |
| | | | | CPU | 0.69 | 1.57 | 0.19 |
| 40 | 100 | 100 | 40 | IT | - | 24,708.3 | 2174.6 |
| | | | | CPU | - | 2.38 | 0.21 |
| 500 | 100 | 100 | 500 | IT | 6059.1 | 7352.8 | 2512.3 |
| | | | | CPU | 4.97 | 7.28 | 2.70 |
| 1000 | 200 | 300 | 2000 | IT | 14,209.2 | 12,490.4 | 5183.5 |
| | | | | CPU | 152.32 | 246.56 | 99.68 |
Table 7. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type II.
| m | p | r1 | [σmin(A), σmax(A)] | q | n | r2 | [σmin(B), σmax(B)] | | RBCD | IME-REBK | IME-PREBK |
| 100 | 40 | 20 | [1, 2] | 40 | 100 | 40 | [1, 2] | IT | 1035.9 | 762.2 | 384.5 |
| | | | | | | | | CPU | 0.04 | 0.05 | 0.03 |
| 100 | 40 | 20 | [1, 5] | 40 | 100 | 20 | [1, 5] | IT | - | 16,224.7 | 1507.8 |
| | | | | | | | | CPU | - | 1.23 | 0.12 |
| 1000 | 200 | 100 | [1, 2] | 100 | 1000 | 50 | [1, 2] | IT | - | 4067.5 | 2217.8 |
| | | | | | | | | CPU | - | 24.23 | 13.67 |
| 1000 | 200 | 100 | [1, 5] | 100 | 1000 | 100 | [1, 5] | IT | > | 48,269.6 | 4328.0 |
| | | | | | | | | CPU | > | 365.78 | 30.25 |
Table 8. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type III.
| A | B | | RBCD | IME-REBK | IME-PREBK |
| divorce | ash219^T | IT | > | 20,026.3 | 4308.4 |
| | | CPU | > | 2.26 | 0.48 |
| divorce | ash219 | IT | - | 19,199.1 | 4026.2 |
| | | CPU | - | 1.49 | 0.31 |
| ash219 | ash958^T | IT | 22,313.4 | 10,823.5 | 2561.8 |
| | | CPU | 15.69 | 19.03 | 5.89 |
| ash219 | ash958 | IT | - | 10,020.7 | 2363.5 |
| | | CPU | - | 15.83 | 4.72 |

