1. Introduction
Matrix equations arise frequently in mathematics and engineering applications, such as control theory, signal processing and computational mathematics [1,2,3,4]. For example, the forward and backward periodic Sylvester matrix equations (PSMEs) are an indispensable part of pole assignment and the design of state observers for linear discrete periodic systems [5]. Thus, studying computational methods for matrix equations has become an important subject in the fields of computational mathematics and control. For matrix equations, computing exact solutions is very meaningful for many practical problems. However, in many applications, such as the stability analysis of control systems, it is usually not necessary to compute the exact solution, as an approximate solution is sufficient. Therefore, the study of iterative solutions to matrix equations has attracted many researchers [6,7,8,9,10,11,12].
One of the important ways to study the approximate solutions of matrix equations is to establish iterative methods. In recent years, researchers have proposed a great number of iterative methods for different kinds of Sylvester matrix equations. For the generalized coupled Sylvester matrix equations in Equation (1), where the coefficient matrices and the unknown matrices are of compatible dimensions, Ding and Chen [13] applied the hierarchical identification principle and introduced the block-matrix inner product to propose the gradient-based iterative (GI) algorithm. Based on the idea of the GI algorithm, some researchers established improved versions of it and investigated their convergence properties [14,15]. To improve the convergence rate of the GI algorithm, Zhang [16] proposed the residual-norm steepest descent (RNSD), conjugate gradient normal equation (CGNE) and biconjugate gradient stabilized (Bi-CGSTAB) algorithms to solve Equation (1). Subsequently, Zhang [17] constructed the full-column-rank, full-row-rank and reduced-rank gradient-based algorithms, whose main idea is to construct an objective function and use a gradient search. Afterward, Zhang and Yin [18] developed the conjugate gradient least squares (CGLS) algorithm for Equation (1), which converges within a finite number of iteration steps in the absence of round-off errors.
The generalized coupled Sylvester-conjugate matrix equations in Equation (2), where the coefficient matrices and the unknown matrices are of compatible dimensions, can be regarded as the generalization of Equation (1) in the complex field. For Equation (2), Wu et al. [19] extended the GI algorithm and derived a sufficient condition for its convergence. Since the sufficient condition in [19] is somewhat conservative, Huang and Ma [20] established sufficient and necessary conditions for the convergence of the GI algorithm based on the properties of the real representation of a complex matrix and the vec operator. They also made use of a different definition of the real representation to derive another sufficient and necessary condition for the convergence of the GI algorithm. In [21], Huang and Ma introduced relaxation factors into the GI algorithm and proposed two relaxed gradient-based iterative (RGI) algorithms. They proved that the RGI algorithms are convergent under suitable restrictions in light of the real representation of a complex matrix and the vec operator. Quite recently, Wang et al. [22] developed a cyclic gradient-based iterative (CGI) algorithm by introducing a modular operator, which differs from previous iterative methods. The most remarkable advantage of the CGI algorithm is that less information is used in each iteration update, which helps save memory and improve efficiency.
In addition, the generalized coupled Sylvester-transpose matrix equations in Equation (3), where the coefficient matrices and the unknown matrices are of compatible dimensions, are related to fault detection, observer design and so forth. Due to the important role of these equations in several applied problems, numerous methods have been developed to solve them. For example, Song et al. [23] constructed the GI algorithm for Equation (3) by using the principle of hierarchical identification. According to the ranks of the related matrices of Equation (3), Huang and Ma [24] recently developed three relaxed gradient-based iterative (RGI) algorithms by minimizing objective functions.
In [25], Beik et al. considered the matrix equations in Equation (4), in which the known coefficient matrices have suitable dimensions over the complex field, the unknowns form a group of matrices to be determined, and the conjugate transposes of the unknown matrices also appear. Equation (4) is quite general and includes several kinds of Sylvester matrix equations; it can be viewed as a general form of the aforementioned Equations (1)–(3). By using the hierarchical identification principle, the authors of [25] put forward the GI algorithm over a group of reflexive (anti-reflexive) matrices.
Inspired by the above work, this paper focuses on the iterative solution of the generalized coupled conjugate and transpose Sylvester matrix equations in Equation (5), where the coefficient matrices are known and the remaining matrices are the unknowns to be determined. For a special case of Equation (5), Wang et al. [26] presented the relaxed gradient-based iterative (RGI) algorithm, which has four system parameters. Note that Equations (1)–(3) are special cases of Equation (5), and thus the results obtained in this work contain the existing ones in [20,23,24,27]. Owing to the fact that the convergence speed of the GI algorithm is slow in many cases, it is quite meaningful to further improve its numerical performance for Equation (5). According to [21,24,26,28,29], the relaxation technique can ameliorate the numerical behavior of the existing GI-like algorithms, which motivated us to apply the weighted relaxation technique to the GI algorithm. By using different step size factors and the weighted technique, we construct the weighted, relaxed gradient-based iterative (WRGI) algorithm for solving Equation (5). The WRGI algorithm contains s relaxation factors; when all of them are equal, it reduces to the GI algorithm proposed in [25], for which the optimal convergence factor was not derived. Compared with the GI algorithm in [25], the WRGI algorithm proposed in this paper can attain higher computational efficiency, and its convergence properties are analyzed in detail, including the convergence conditions, the optimal parameters and the corresponding optimal convergence factor. By adjusting the values of the relaxation factors, the proposed WRGI algorithm achieves a faster convergence rate than the GI one, which makes it more suitable for solving matrix equations arising in control theory, signal processing, computational mathematics and other applications. The main contributions of this paper are given below:
The rest of this paper is organized as follows. In Section 2, some definitions and previous results are given, and the previously proposed GI algorithm is reviewed. In Section 3, we first design a new algorithm, referred to as the WRGI algorithm, to solve Equation (5). Then, we prove that the WRGI algorithm is convergent for any initial matrices under proper conditions, and we explicitly give the optimal step factor such that the convergence rate of the WRGI algorithm is maximized. Section 4 gives two numerical examples to demonstrate the effectiveness and advantages of the proposed WRGI algorithm. Finally, we give some concluding remarks in Section 5.
2. Preliminaries
In this section, as a matter of convenience for discussing the main results of this paper, we describe the following notation, which will be used throughout. Let $\mathbb{C}^{m\times n}$ be the set of all $m\times n$ complex matrices. For a given matrix $A\in\mathbb{C}^{m\times n}$, some related notations are the following:
$\bar{A}$ denotes the conjugate of the matrix A;
$A^{T}$ stands for the transpose of the matrix A;
$A^{H}$ represents the conjugate transpose of the matrix A;
$A^{-1}$ denotes the inverse of the matrix A;
$\|A\|$ denotes the Frobenius norm of the matrix A;
$\|A\|_{2}$ denotes the spectral norm of the matrix A;
$\sigma_{\max}(A)$ indicates the maximum singular value of the matrix A;
$\sigma_{\min}(A)$ indicates the minimum singular value of the matrix A;
$\rho(A)$ stands for the spectral radius of the matrix A;
$\lambda(A)$ stands for the spectrum of the matrix A;
$\operatorname{rank}(A)$ stands for the rank of the matrix A.
Moreover, some useful definitions and lemmas are given below:

Definition 1 ([30]). For two matrices $E=(e_{ij})\in\mathbb{C}^{m\times n}$ and $F\in\mathbb{C}^{p\times q}$, the Kronecker product of the matrices E and F is defined as follows:
$$E\otimes F=\begin{pmatrix}e_{11}F & e_{12}F & \cdots & e_{1n}F\\ e_{21}F & e_{22}F & \cdots & e_{2n}F\\ \vdots & \vdots & & \vdots\\ e_{m1}F & e_{m2}F & \cdots & e_{mn}F\end{pmatrix}\in\mathbb{C}^{mp\times nq}.$$

Definition 2 ([27]). Let $e_i^{(n)}$ be the n-dimensional column vector whose ith element is one and whose other elements are zero. Then, the vec permutation matrix can be defined as follows:
$$P(m,n)=\sum_{i=1}^{m}\sum_{j=1}^{n}\Bigl(e_i^{(m)}\bigl(e_j^{(n)}\bigr)^{T}\Bigr)\otimes\Bigl(e_j^{(n)}\bigl(e_i^{(m)}\bigr)^{T}\Bigr),$$
which satisfies $\operatorname{vec}(A^{T})=P(m,n)\operatorname{vec}(A)$ for every $A\in\mathbb{C}^{m\times n}$.

Definition 3 ([30]). Let $A=[a_1,a_2,\ldots,a_n]\in\mathbb{C}^{m\times n}$, with $a_i$ being the ith column of A. The vector-stretching function of A is defined as follows:
$$\operatorname{vec}(A)=\bigl[a_1^{T},a_2^{T},\ldots,a_n^{T}\bigr]^{T}\in\mathbb{C}^{mn}.$$

Lemma 1 ([20]). Let $CXD=F$ be a matrix equation, with the matrices C and D being of full-column rank and full-row rank, respectively. Then, the iterative solution produced by the GI iteration process
$$X(k+1)=X(k)+\mu\,C^{H}\bigl(F-CX(k)D\bigr)D^{H}$$
converges to the exact solution of $CXD=F$ for any initial matrix if and only if
$$0<\mu<\frac{2}{\sigma_{\max}^{2}(C)\,\sigma_{\max}^{2}(D)}.$$
In addition, the optimal step size is
$$\mu_{\mathrm{opt}}=\frac{2}{\sigma_{\max}^{2}(C)\,\sigma_{\max}^{2}(D)+\sigma_{\min}^{2}(C)\,\sigma_{\min}^{2}(D)}.$$

Lemma 2 ([30]). Let E, X and F be matrices with compatible dimensions. Then, we have the following:
- (i) $\operatorname{vec}(EXF)=(F^{T}\otimes E)\operatorname{vec}(X)$;
- (ii) $\operatorname{vec}(EX^{T}F)=(F^{T}\otimes E)\,P(m,n)\operatorname{vec}(X)$ for $X\in\mathbb{C}^{m\times n}$.
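As a concrete check of these tools, the following NumPy sketch verifies the identity in Lemma 2 (i) and the defining property of the vec permutation matrix on random complex data; the helper names vec and vec_perm are illustrative choices rather than the paper's notation.

```python
import numpy as np

def vec(A):
    # Column-stacking operator of Definition 3.
    return A.reshape(-1, order="F")

def vec_perm(m, n):
    # Vec permutation matrix of Definition 2: P(m, n) @ vec(A) == vec(A.T)
    # for every m-by-n matrix A.
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            P[i * n + j, j * m + i] = 1.0
    return P

rng = np.random.default_rng(0)
m, n, p, q = 3, 4, 4, 2
E = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
X = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
F = rng.standard_normal((p, q)) + 1j * rng.standard_normal((p, q))

# Lemma 2 (i): vec(E X F) = (F^T kron E) vec(X).
assert np.allclose(vec(E @ X @ F), np.kron(F.T, E) @ vec(X))

# Definition 2: vec(X^T) = P(n, p) vec(X) for an n-by-p matrix X.
assert np.allclose(vec(X.T), vec_perm(n, p) @ vec(X))
```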
Lemma 3 ([19]). Let $A,B\in\mathbb{C}^{m\times n}$. If $A+B$ is real, then $A+B=\bar{A}+\bar{B}$.

Next, we introduce two definitions of the real representation of a complex matrix. For $A\in\mathbb{C}^{m\times n}$, A can be uniquely expressed as $A=A_1+A_2\,i$, where $A_1,A_2\in\mathbb{R}^{m\times n}$. We define the operators $\sigma(\cdot)$ and $\varphi(\cdot)$ as follows:
$$\sigma(A)=\begin{pmatrix}A_1 & A_2\\ -A_2 & A_1\end{pmatrix},\qquad \varphi(A)=\begin{pmatrix}A_1 & A_2\\ A_2 & -A_1\end{pmatrix}.\tag{11}$$
It can be seen from Equation (11) that the sizes of $\sigma(A)$ and $\varphi(A)$ are twice that of A. Then, by combining Equation (11) with the definition of the Frobenius norm, we can obtain
$$\|\sigma(A)\|^{2}=\|\varphi(A)\|^{2}=2\|A\|^{2}.\tag{12}$$
Aside from that, for the identity matrix $I_n$ of order n, we also define auxiliary matrices that will be used in Lemma 4 below.
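A short numerical illustration of the two operators and of the norm relation in Equation (12) is given below, assuming the block forms stated in Equation (11); the helper names sigma and phi are illustrative.

```python
import numpy as np

def sigma(A):
    # First real representation: sigma(A) = [[A1, A2], [-A2, A1]],
    # where A = A1 + A2*i as in Equation (11).
    A1, A2 = A.real, A.imag
    return np.block([[A1, A2], [-A2, A1]])

def phi(A):
    # Second real representation: phi(A) = [[A1, A2], [A2, -A1]].
    A1, A2 = A.real, A.imag
    return np.block([[A1, A2], [A2, -A1]])

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
B = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))

# Equation (12): the squared Frobenius norm doubles under either map.
assert np.isclose(np.linalg.norm(sigma(A))**2, 2 * np.linalg.norm(A)**2)
assert np.isclose(np.linalg.norm(phi(A))**2, 2 * np.linalg.norm(A)**2)

# sigma is multiplicative: sigma(A B) = sigma(A) @ sigma(B).
assert np.allclose(sigma(A @ B), sigma(A) @ sigma(B))
```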
The properties of the real representation of several complex matrices are given by the following lemma:

Lemma 4 ([21]). Let and . Then, the following statements hold:
- (1)
- (2) If is nonsingular, then .
- (3) .
- (4) If , then it holds that and .

The lemma below gives the norm relationship between a block matrix and its submatrices:

Lemma 5 ([31]). Let B be a block-partitioned matrix with the form in Equation (13), and let the other matrix factor be of a proper size. Then, the corresponding norm inequality holds; if B is a vector, it holds with equality.

The following result gives a norm inequality between the q-norms of a block matrix and its submatrices, which is essential for analyzing the convergence of the WRGI algorithm:

Lemma 6 ([21]). Let B be a partitioned matrix with the form in Equation (13), and let the orders of the matrices be compatible. Then, the inequality holds for any induced q-norm.

3. The Weighted, Relaxed Gradient-Based Iterative (WRGI) Algorithm and Its Convergence Analysis
In this section, we first propose the weighted, relaxed gradient-based iterative (WRGI) algorithm to solve Equation (5) based on the hierarchical identification principle. Then, we discuss the convergence properties of the WRGI algorithm, including the convergence conditions, the optimal step size and the corresponding optimal convergence factor.

First, by applying the hierarchical identification principle, we define the intermediate matrices in Equations (14)–(17). For the sake of the following discussions, we introduce additional notation, with which Equations (14)–(17) can be written as Equations (18)–(21). Therefore, Equation (5) can be decomposed into the matrix equations in Equations (22)–(25). By applying Lemma 1 to Equations (22)–(25), we can construct the recursive forms in Equations (26)–(29), where μ is a step size factor.
For convenience, we define some further notation. Substituting Equations (18)–(21) into Equations (26)–(29), respectively, and then replacing the unknown matrices by their current estimates gives a group of partial update formulas. By taking the average of the partial updates in the above equations, one can obtain an intermediate iterative algorithm. Then, we construct the final calculation forms of the iterates by introducing suitable relaxation parameters and using the balanced strategies, with the weights being the weighted relaxation factors.
Remark 1. We apply Lemma 1 to Equations (22)–(25) and then use the same step size factor μ to establish the iterative sequences in Equations (26)–(29), which is conducive to deducing the convergence conditions of the proposed algorithm. It is noteworthy that we could also use different step size factors in Equations (26)–(29) and design a new algorithm, but it may then be difficult to derive its convergence conditions. Using different step size factors to construct the iterative sequences and proposing a new algorithm will be a focus of our future work. Based on the above discussions, we obtain the following weighted, relaxed gradient-based iterative (WRGI) algorithm (Algorithm 1) to solve the generalized coupled conjugate and transpose Sylvester matrix equations.
Algorithm 1: The weighted, relaxed gradient-based iterative (WRGI) algorithm

Step 1: Given the coefficient matrices, the step size factor μ, a stopping tolerance and the relaxation parameters, choose the initial matrices and set k := 1;
Step 2: If the termination condition is satisfied, then stop; otherwise, go to Step 3;
Step 3: Compute the new iterates by the update formulas above;
Step 4: Set k := k + 1 and return to Step 2.
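To make the structure of Algorithm 1 concrete, the following minimal Python sketch applies the same weighted-update principle to the single-unknown special case $AXB+CX^{T}D=F$ with real data. The function name wrgi_step, the weights w1 and w2 and the conservative step size heuristic are illustrative choices rather than the paper's notation, and the sketch omits the conjugate terms of Equation (5).

```python
import numpy as np

def wrgi_step(X, A, B, C, D, F, mu, w1=0.5, w2=0.5):
    # One sweep for A X B + C X^T D = F (real data, single unknown).
    # Hierarchical identification: each term yields its own partial update
    # from the shared residual, and the updates are combined with weights
    # (assumed to sum to one); w1 = w2 = 1/2 is the plain GI averaging,
    # while unequal weights give the relaxed, WRGI-style variant.
    R = F - A @ X @ B - C @ X.T @ D
    X1 = X + mu * A.T @ R @ B.T          # update driven by the A X B term
    X2 = X + mu * (C.T @ R @ D.T).T      # update driven by the C X^T D term
    return w1 * X1 + w2 * X2

rng = np.random.default_rng(2)
n = 6
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))
X_true = rng.standard_normal((n, n))
F = A @ X_true @ B + C @ X_true.T @ D

# Conservative step size: with w1 = w2 = 1/2 this makes the sweep a descent
# step on ||F - A X B - C X^T D||_F^2, so the residual cannot increase.
mu = 1.0 / (np.linalg.norm(A, 2)**2 * np.linalg.norm(B, 2)**2
            + np.linalg.norm(C, 2)**2 * np.linalg.norm(D, 2)**2)

X = np.zeros((n, n))
for k in range(20000):
    X = wrgi_step(X, A, B, C, D, F, mu)
    if np.linalg.norm(F - A @ X @ B - C @ X.T @ D) <= 1e-8 * np.linalg.norm(F):
        break
print(k + 1, np.linalg.norm(X - X_true))
```

With w1 = w2 = 1/2, one sweep mirrors the averaging step preceding the relaxed weighting; unequal weights play the role of the relaxation factors, whose admissible ranges are characterized in Theorems 2 and 3 below.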
In the following, we discuss the convergence properties of the WRGI algorithm by applying the properties of the real representation of a complex matrix and the Kronecker product of matrices. For convenience, we introduce the matrices V in Equation (30) and R in Equation (31).

Theorem 1. Equation (5) has a unique solution if and only if R is of full-column rank and the corresponding linear system is consistent. In this case, the unique solution is given by Equation (32), and the corresponding homogeneous matrix equations have only the zero solution, where R is defined by Equation (31).

Proof. Applying the real representation of the complex matrix to Equation (5) leads to Equation (33). By using the Kronecker products of the matrices and the vector-stretching operator in the above equations, we obtain a system that can be equivalently transformed into a linear system in the stretched unknown vector. Therefore, Equation (33) has a unique solution if and only if R has full-column rank and the linear system is consistent. In this case, the unique solution in Equation (32) for Equation (5) can be obtained. The conclusions follow immediately. □
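The solvability criterion of Theorem 1 can be checked numerically by assembling the coefficient matrix of the stretched linear system, exactly as in the proof. The sketch below does this for the real special case $AXB+CX^{T}D=F$, reusing the vec permutation matrix of Definition 2; the helper names are illustrative.

```python
import numpy as np

def vec(A):
    return A.reshape(-1, order="F")

def vec_perm(m, n):
    # P(m, n) @ vec(A) == vec(A.T) for any m-by-n matrix A.
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            P[i * n + j, j * m + i] = 1.0
    return P

rng = np.random.default_rng(3)
n = 4
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))

# vec(A X B + C X^T D) = [kron(B^T, A) + kron(D^T, C) @ P(n, n)] vec(X),
# so the equation has a unique solution iff this stacked matrix has
# full column rank (cf. Theorem 1).
R = np.kron(B.T, A) + np.kron(D.T, C) @ vec_perm(n, n)
print(np.linalg.matrix_rank(R) == n * n)   # True for generic data

X_true = rng.standard_normal((n, n))
F = A @ X_true @ B + C @ X_true.T @ D
X = np.linalg.solve(R, vec(F)).reshape((n, n), order="F")
assert np.allclose(X, X_true)
```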
Based on the properties of matrix norms and Algorithm 1, we establish a sufficient condition for the convergence of the proposed WRGI algorithm in the following theorem:

Theorem 2. Suppose that Equation (5) has a unique solution. Then, the iterative sequences generated by Algorithm 1 converge to this solution for any initial matrices, provided that μ satisfies the stated bound.

Proof. Define the error matrices as in Equation (34). It follows from Algorithm 1 that the errors satisfy the recursions leading to Equation (36). Taking the Frobenius norm in Equation (36) and using the properties of the norm leads to Equation (37). It is not difficult to verify that the relevant matrix sums are real. Then, by Lemma 3, the identities in Equations (38) and (39) hold. With the relations in Equations (38) and (39), it follows from Equation (37) that Equation (40) holds. By making use of the relation in Equation (12), as well as Lemmas 2 and 4, we obtain an estimate in which V is defined by Equation (30). By applying the properties of the 2-norms of matrices and vectors, we arrive at Equation (42). Aside from that, in light of the relation in Equation (12), we deduce Equation (43). Substituting Equation (43) into Equation (42) leads to Equation (44). By combining Equations (40) and (44) and changing the order of summation, we derive a bound on the accumulated error norms. If the parameter μ is chosen to satisfy the stated condition, this bound implies that the error norms are summable. According to the convergence theorem for series, the error norms then tend to zero. Since Equation (5) has a unique solution, it follows from the definition of the error matrices in Equation (34) that the iterates converge to this solution. The proof is completed. □
In the sequel, by applying the properties of the real representation of a complex matrix and the vector-stretching operator, we study the necessary and sufficient condition for the convergence of the WRGI algorithm. This result is stated in Theorem 3:

Theorem 3. Suppose that Equation (5) has a unique solution. Then, the iterative sequences generated by Algorithm 1 converge to this solution for any initial matrices if and only if the condition in Equation (46) holds.

Proof. For convenience, we introduce some auxiliary notation. From Equation (36) and the definition in Equation (34) of the error matrices in Theorem 2, we obtain Equation (47). Then, by combining the real representation with Equation (47) and applying Lemma 4, we obtain Equation (48). By taking the vector-stretching operator on both sides of Equation (48) and using Lemma 2, we arrive at Equation (49). This gives the result in Equation (50), where R is defined by Equation (31). Equation (50) implies that the sufficient and necessary condition for the convergence of Algorithm 1 (the WRGI algorithm) is that the spectral radius of the iteration matrix is less than one. Since the relevant matrix is symmetric, its spectral radius coincides with its 2-norm, so the convergence condition is equivalent to Equation (52). It follows from Equation (52), after simple computations, that we arrive at the condition in Equation (46), which completes the proof. □
Remark 2. When the relaxation parameters of the WRGI algorithm are all equal, the condition in Equation (46) reduces to a simpler condition.

Based on Theorem 3, we will study the optimal step size and the corresponding optimal convergence factor of the WRGI algorithm in the following theorem:
Theorem 4. Assume that the conditions of Theorem 2 are valid. Then, the convergence factor of the WRGI algorithm admits an explicit bound. In addition, there exists an optimal convergence factor for which the convergence rate is maximized; the maximized rate is given in Equation (55).

Proof. Since the relevant matrix is symmetric, its spectral radius equals its 2-norm, as in Equation (56). In light of Equations (50) and (56), one obtains Equation (57). Combining the relations in Equations (12) and (57) results in Equation (58). It then follows from Equation (58) that Equation (59) holds. It can be seen from Equation (59) that the smaller the convergence factor is, the faster the convergence rate of the WRGI algorithm. Direct calculations show that the convergence factor is minimized if and only if the condition in Equation (60) holds, from which one can deduce Equation (61). Substituting Equation (61) into Equation (59) yields Equation (55). The proof is completed. □
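For intuition about the optimal factor, the following sketch replays the same optimization in the simplest setting of Lemma 1: for $CXD=F$, the spectral radius of the iteration matrix $I-\mu\,(DD^{T}\otimes C^{T}C)$ is minimized over a grid of μ and compared with the closed-form optimum. The variable names are illustrative, and the full Theorem 4 statement concerns the WRGI iteration matrix instead.

```python
import numpy as np

rng = np.random.default_rng(4)
C = rng.standard_normal((6, 4))   # generically of full-column rank
D = rng.standard_normal((4, 6))   # generically of full-row rank

# vec form of the GI recursion for C X D = F: e_{k+1} = (I - mu * W) e_k.
W = np.kron(D @ D.T, C.T @ C)
lam = np.linalg.eigvalsh(W)       # W is symmetric positive definite here

# Closed-form optimum in the spirit of Lemma 1 and Theorem 4.
smaxC, sminC = np.linalg.svd(C, compute_uv=False)[[0, -1]]
smaxD, sminD = np.linalg.svd(D, compute_uv=False)[[0, -1]]
mu_opt = 2.0 / (smaxC**2 * smaxD**2 + sminC**2 * sminD**2)

def rho(mu):
    # Spectral radius of I - mu * W, via the eigenvalues of W.
    return np.max(np.abs(1.0 - mu * lam))

grid = np.linspace(1e-6, 2.0 / lam.max(), 2001)
mu_grid = grid[np.argmin([rho(g) for g in grid])]
print(mu_opt, mu_grid, rho(mu_opt))  # the grid minimizer sits near mu_opt
```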
Although the convergence conditions of the WRGI algorithm are given in Theorems 2 and 3, the Kronecker products and the real representations of the system matrices are involved in computing the quantities that appear there. This may bring about a high-dimensionality problem and lead to a high computational cost. To overcome this drawback, we give a sufficient condition for the convergence of the proposed WRGI algorithm below.
Corollary 1. Suppose that Equation (5) has a unique solution. Then, the iterative sequences generated by Algorithm 1 converge to this solution for arbitrary initial matrices if the condition below holds.

Proof. According to Lemmas 4–6 and the corresponding norm bounds, we obtain Equation (62). Substituting Equation (62) into the relation in Equation (46) gives the conclusion of this corollary. □
4. Numerical Experiments
In this section, two numerical examples are provided to verify the effectiveness and advantages of the proposed WRGI algorithm for solving Equation (5) in terms of the number of iterations (IT) and the computing time in seconds (CPU). All computations were performed in MATLAB R2018b on a personal computer with a 3.20 GHz AMD Ryzen 7 5800H CPU and 16.0 GB of memory.

Example 1 ([32]). Consider the generalized coupled conjugate and transpose Sylvester matrix equations with the given coefficient matrices; it can be verified that this matrix equation has a unique solution. We take the initial iterative matrices and set the termination condition with a constant ξ. Here, the prescribed maximum number of iterations is 20,000, and X(k) represents the kth iteration solution.
It follows from Remark 2 that the WRGI algorithm reduces to the GI one [25] when the relaxation factors are all equal. In this case, according to Theorem 4, the optimal convergence factor can be computed. However, the optimal convergence factor calculated by Theorem 4 may not minimize the IT of the GI algorithm, the reason being that there are errors in the calculation process. Thus, the parameter adopted in the GI algorithm is the experimentally found optimal one, which minimizes the IT of the GI algorithm; the experimental optimal parameters were determined by debugging for each tested value of ξ. Aside from that, the optimal parameter of the WRGI algorithm can be obtained in terms of Theorem 4. Owing to the existence of computational errors, the parameter in the WRGI algorithm is likewise adopted to be the experimentally found one, as demonstrated above; through experimental debugging, the experimental optimal parameters of the WRGI algorithm, together with the corresponding relaxation factors, were determined for each tested value of ξ.
Table 1 lists the numerical results of the GI and WRGI algorithms. As observed in the comparisons in Table 1, both tested algorithms converged in all cases, and their IT increased gradually as the value of ξ decreased. Additionally, it can be seen from Table 1 that the proposed WRGI algorithm performed better than the GI one with respect to the IT and CPU times. This implies that by applying the weighted relaxation technique, the WRGI algorithm can achieve better numerical behavior than the GI one.

To further validate the advantage of the WRGI algorithm, Figure 1, Figure 2, Figure 3 and Figure 4 depict the convergence curves of the GI and WRGI algorithms for four different values of ξ. These curves show that the WRGI algorithm is convergent and that, except in one case, its convergence speed is faster than that of the GI one. Additionally, the advantage of the WRGI algorithm becomes more pronounced as the value of ξ decreases. This is consistent with the results in Table 1.
Example 2. We consider the generalized coupled conjugate and transpose Sylvester matrix equations with the given coefficient matrices; these matrix equations have a unique solution. In this example, we chose the initial matrices and took the termination condition with a positive number ξ, the prescribed maximum number of iterations was 20,000, and X(k) denotes the kth iteration solution.
When the relaxation factors are all equal, the WRGI algorithm degenerates into the GI one, and the optimal convergence factor can be calculated by Theorem 4; in the remaining cases, the optimal parameter of the WRGI algorithm is likewise given by Theorem 4. Due to the existence of computational errors, as in Example 1, the parameters used in the GI and WRGI algorithms for Example 2 are the experimentally found optimal ones, which minimize the IT of the tested algorithms. Through experimental debugging, the experimental optimal parameters of the GI and WRGI algorithms were determined for each tested value of ξ.
In Table 2, we compare the numerical results of the GI and WRGI algorithms for solving the generalized coupled conjugate and transpose Sylvester matrix equations in Example 2 with respect to two different values of ξ. Since the IT of the algorithms exceeded the maximum number of iterations in some of the tested settings, we do not list the corresponding numerical results here. From the numerical results listed in Table 2, we can conclude that as the value of ξ decreases, the IT and CPU times of the tested algorithms increase, and the IT of the proposed WRGI algorithm varies over a smaller range than that of the GI one. In addition, the WRGI algorithm outperformed the GI one from the point of view of computing efficiency, and its advantage became more pronounced as ξ decreased.
Figure 5 depicts the convergence curves of the GI and WRGI algorithms for Example 2 for the two tested values of ξ. It follows from Figure 5 that the error curves of the WRGI algorithm decrease faster than those of the GI one, which means that the proposed WRGI algorithm is superior to the GI one in terms of IT.