1. Introduction
The Sylvester matrix equation is the equation $AX + XB = C$, where all matrices are complex, the matrices $A$, $B$, and $C$ are given, and $X$ is unknown. Special cases of the equation already appear in introductory courses on linear algebra, e.g., as the matrix form $Ax = b$ of a system of linear equations, the equation $Ax = 0$ defining the nullspace of $A$, the normal equation $A^{*}Ax = A^{*}b$ determining the solution to a linear least squares problem, the equation $Ax = \lambda x$ defining eigenvalues and eigenvectors of $A$, the equation $AX = XA = I$ defining the inverse $X$ of a square matrix $A$, and the equation $AB = BA$ defining commuting matrices (see [1] and [2], Chapter 16).
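Even in the classical complex case, the equation $AX + XB = C$ reduces, via vectorization, to an ordinary linear system through the standard identity $\operatorname{vec}(AXB) = (B^{T}\otimes A)\operatorname{vec}(X)$. The following NumPy sketch illustrates this on randomly generated data (the matrices and sizes are our own illustrative choices):

```python
import numpy as np

# AX + XB = C  <=>  (I_n ⊗ A + B^T ⊗ I_m) vec(X) = vec(C),
# where vec stacks columns; a unique solution exists iff A and -B
# have no common eigenvalue (generically true for random data).
rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))

K = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
x = np.linalg.solve(K, C.flatten(order="F"))   # column-major vec
X = x.reshape((m, n), order="F")

print(np.allclose(A @ X + X @ B, C))  # True
```

For large matrices, the $mn \times mn$ Kronecker system is expensive; the Bartels–Stewart algorithm (available, e.g., as `scipy.linalg.solve_sylvester`) is the usual practical method.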
The Sylvester matrix equation has numerous applications in systems and control theory, signal processing, image restoration, engineering, and differential equations (see [3,4] and [5], Section 1, for a concise review of the literature and of methods for solving this equation). To present a concrete example, let us consider the restoration of images, i.e., the reconstruction or estimation of the original image based on its noisy or degraded version. As described in [6], in the presence of white Gaussian noise and under suitable assumptions on a two-dimensional image, the minimum mean square error estimate of the original image is a solution of a Sylvester matrix equation $AX + XB = C$, where the matrix $A$ is defined with the help of the covariance matrix of the vector of samples from the image in the vertical direction, $B$ is defined similarly but with respect to the horizontal direction, and $C$ is defined with the help of a vector of the noisy image (see [6] for particulars). An interesting and unexpected appearance of a Sylvester equation in ring theory is presented in [7]. Recall that an element $a$ of a unital and not necessarily commutative ring $R$ is said to be suitable if, for every left ideal $L$ of the ring $R$ such that $Ra + L = R$, there exists an idempotent $e \in Ra$ with $1 - e \in L$. The ring $R$ is called an exchange ring if all elements of $R$ are suitable; the study of such rings is an important topic of research in ring theory. In [7], Khurana, Lam, and Nielsen presented a new criterion for the suitability of an element. They proved that an element $A$ of the ring $M_n(R)$ of $n$-by-$n$ matrices over $R$ is suitable if and only if there exists an idempotent matrix $B$ such that the Sylvester matrix equation $XA - BX = I$ is solvable in $M_n(R)$ (in [7], the result is presented in the case where $n = 1$). In [7], the authors expressed their “total surprise” that studying the solvability of the Sylvester equation $XA - BX = I$ with $B$ idempotent turned out to be precisely equivalent to studying when $A$ is suitable over the ring $R$.
Also, conjugate versions and generalizations of the Sylvester matrix equation are extensively studied (see, e.g., [8,9] for more information); in the rest of this paragraph and the next one, we list some of them. The matrix equation $AX + \overline{X}B = C$, where $\overline{X}$ denotes the matrix obtained by taking the complex conjugate of each element of $X$, is called the normal Sylvester-conjugate matrix equation. The matrix equation $X - A\overline{X}B = C$ is called the Kalman–Yakubovich-conjugate matrix equation (also known as the Stein-conjugate matrix equation; see [10]). The matrix equation $X - A\overline{X}B = CY$ is called the Yakubovich-conjugate matrix equation ([11,12]).
For positive integers k and l, let $\mathbb{C}^{k\times l}$ denote the set of k-by-l complex matrices. In [13], Wu, Duan, Fu, and Wu studied the so-called generalized Sylvester-conjugate matrix equation where , , and are known matrices, whereas and are the matrices to be determined. When and , this matrix equation becomes the normal Sylvester-conjugate matrix equation; when and , it becomes the Kalman–Yakubovich-conjugate matrix equation; when and , it becomes the Yakubovich-conjugate matrix equation. Moreover, when , the matrix Equation (1) becomes the nonhomogeneous Yakubovich-conjugate matrix equation investigated in [11]; when , the matrix Equation (1) becomes the extended Sylvester-conjugate matrix equation investigated in [14]; when (and X is interchanged with ), the matrix Equation (1) becomes the nonhomogeneous Sylvester-conjugate matrix equation investigated in [8], Section 3, and furthermore, if , it becomes the homogeneous Sylvester-conjugate matrix equation investigated in [8], Section 2. Hence, Equation (1) unifies many important conjugate versions of the Sylvester matrix equation.
In [9], Wu, Feng, Liu, and Duan proposed a unified approach to solving a more general class of Sylvester-polynomial-conjugate matrix equations that includes the matrix Equation (1) and the Sylvester polynomial matrix equation (see [15]) as special cases. To present the main result of [9], which is done in Theorem 1 below, we first recall some definitions and notation introduced in [9,16] (alternatively, see [17], pp. 98, 99, 368, 389; we refer the reader to [17], Chapter 10, for more detailed information on Sylvester-polynomial-conjugate matrix equations). These definitions and notation may look a bit complicated at first glance, but we have to cite them in order to present the main result of [9], which was our motivation for this paper and which we generalize broadly by putting it in a new context. In Section 3, we express these definitions and notation in the language of matrices over skew polynomial rings, and from this new perspective, they become clear and easy to understand. A reader who does not wish to become familiar with these specific objects now can move to the paragraph after Theorem 1 and possibly return to the skipped text later.
For any complex matrix , square complex matrix F, and non-negative integer k, the matrix is defined inductively by with , and the matrix is defined to be The set of polynomials over in the indeterminate s is denoted by , and its elements are called complex polynomial matrices. Given , , and , the Sylvester-conjugate sum is defined as For complex polynomial matrices and , their conjugate product is defined as
In [9], the authors investigated a general type of complex matrix equations, which they called the Sylvester-polynomial-conjugate matrix equation, where , and are known matrices, and and are the unknown matrices to be determined. It is easy to see that, by using the Sylvester-conjugate sum (2), Equation (4) can be written as where and Let us note that if , and , then and thus, for such , and , Equation (5) becomes the generalized Sylvester-conjugate matrix Equation (1). Thus, each method for solving the polynomial matrix Equation (5) automatically provides a method for solving the matrix Equation (1) and, hence, the conjugate variants of the Sylvester matrix equation listed in the first four paragraphs of this section.
Recall from [18] that polynomial matrices and are left coprime in the framework of the conjugate product if there exists a polynomial matrix such that is invertible with respect to the conjugate product ⊛ and . Below, we recall the main result of [9], which provides a complete solution of the Sylvester-polynomial-conjugate matrix Equation (5) in the case where and are left coprime.
Theorem 1 ([
9], Theorem 2).
Let and be left coprime in the framework of the conjugate product. Hence, there exist polynomial matrices , , , such that Then, for any polynomial matrix and matrices and , a pair satisfies the equation if and only if where is an arbitrarily chosen parameter matrix.

Throughout the paper, the set of positive integers is denoted by $\mathbb{N}$, the imaginary unit of the field of complex numbers is denoted by $\mathrm{i}$, and the imaginary basis elements of the division ring of quaternions are denoted by $\mathrm{i}$, $\mathrm{j}$, $\mathrm{k}$.
In this paper, we put the Sylvester-conjugate matrix Equation (6) in a much more general context of matrices over skew polynomial rings. Recall that if R is a (not necessarily commutative) ring and $\sigma$ is an endomorphism of the ring R, then the skew polynomial ring $R[s;\sigma]$ consists of polynomials over R in one indeterminate s (i.e., polynomials of the form $a_0 + a_1 s + \cdots + a_n s^n$ with $a_0, a_1, \ldots, a_n \in R$) that are added in the obvious way and multiplied formally subject to the rule $sa = \sigma(a)s$ for any $a \in R$, along with the rules of distributivity and associativity. Clearly, if $\sigma$ is the identity map $\mathrm{id}_R$ of R, then the ring $R[s;\sigma]$ coincides with the usual polynomial ring $R[s]$, and thus, the usual polynomial ring is a special case of the skew polynomial ring construction. Skew polynomial rings are a well-known tool in algebra for producing examples of asymmetry between ring-theoretic objects defined by multiplication on the left and their counterparts defined by multiplication on the right.
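A minimal sketch of the skew multiplication rule, taking $R = \mathbb{C}$ and $\sigma$ the complex conjugation (the coefficient-list representation below is our own ad hoc choice for illustration):

```python
# Skew polynomials over C with sigma = complex conjugation, represented as
# coefficient lists [a0, a1, ...] meaning a0 + a1*s + a2*s^2 + ...
# The rule s*a = sigma(a)*s gives (a*s^i)(b*s^j) = a*sigma^i(b)*s^(i+j).

def sigma(z):
    return z.conjugate()

def sigma_pow(z, i):
    return z if i % 2 == 0 else sigma(z)   # conjugation is an involution

def skew_mul(p, q):
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * sigma_pow(b, i)
    return out

# s * i = conj(i) * s = -i * s, so the ring is noncommutative:
print(skew_mul([0, 1], [1j]) == [0, -1j])  # True   (s * i  = -i*s)
print(skew_mul([1j], [0, 1]) == [0, 1j])   # True   (i * s  =  i*s)
```

Here $\sigma^2 = \mathrm{id}$, so only the parity of the exponent matters; for a general endomorphism one would iterate `sigma` i times.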
The main advantage of our approach to Sylvester-conjugate matrix equations via skew polynomial rings lies in the freedom of choosing both the ring R and its endomorphism $\sigma$. For instance, as we see in Section 4, to obtain the matrix Equation (4), it suffices to take $R = \mathbb{C}$ and $\sigma$ the complex conjugation (i.e., $\sigma(z) = \bar{z}$ for any $z \in \mathbb{C}$). On the other hand, by taking $R = \mathbb{C}$ with $\sigma = \mathrm{id}_{\mathbb{C}}$, we obtain the non-conjugate version of (4) and non-conjugate versions of all the matrix equations mentioned in the first four paragraphs of this section. Moreover, to obtain $\mathrm{j}$-conjugate versions of these Sylvester-like matrix equations (which are well studied in the literature; see Section 4), it suffices to take for R the division ring of quaternions $\mathbb{H}$ and for $\sigma$ the $\mathrm{j}$-conjugation (i.e., $\sigma(q) = -\mathrm{j}q\mathrm{j}$ for any $q \in \mathbb{H}$).
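A quick sanity check, under the assumption that the $\mathrm{j}$-conjugation is $\sigma(q) = -\mathrm{j}q\mathrm{j}$ (conjugation by the unit $\mathrm{j}$): the map fixes $1$ and $\mathrm{j}$, negates $\mathrm{i}$ and $\mathrm{k}$, and is multiplicative, hence a ring automorphism usable in the skew polynomial construction:

```python
# Quaternions as 4-tuples (a, b, c, d) = a + b*i + c*j + d*k,
# multiplied by the Hamilton product.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

J = (0, 0, 1, 0)

def sigma(q):  # conjugation by j: q -> -j q j = j^(-1) q j, since j^(-1) = -j
    return qmul(qmul((0, 0, -1, 0), q), J)

# sigma fixes the real and j components and negates the i and k components:
print(sigma((1, 2, 3, 4)))  # (1, -2, 3, -4)

# sigma is multiplicative, sigma(pq) = sigma(p)sigma(q), as conjugation by a unit:
p, q = (1, 2, 3, 4), (5, -1, 2, 0)
print(sigma(qmul(p, q)) == qmul(sigma(p), sigma(q)))  # True
```

Since $\sigma$ is conjugation by a unit, it is automatically an automorphism; the integer test data above verifies this exactly, with no floating-point issues.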
The paper is organized as follows. In Section 2, we present a general approach to equations of the form (6) based on groupoids and vector spaces. In Section 3, we apply the result of Section 2 to matrices over skew polynomial rings, obtaining Theorem 3, which describes all solutions to equations of the form (6) in the case where and are left coprime. As immediate consequences, in Section 4, we obtain Theorem 1 along with its version for the Sylvester-polynomial-$\mathrm{j}$-conjugate matrix equation over quaternions. In particular, we develop some ideas of [9].
2. Main Result
Recall that a groupoid is a set M together with a binary operation ⊕ on M. If are groupoids, then a map is called a groupoid homomorphism if for any .
Let be groupoids (soon, it will become clear why they are doubly indexed), and let be vector spaces over a field K. Assume that for any , an operation is given whose value at a pair is denoted by . In this section, we consider the following problem of solving equations of the form (5):
(∗)
Problem: Given , and , find all and such that

Obviously, the structure consisting of groupoids and vector spaces , for which we have formulated Problem (∗), is too poor to provide a satisfactory solution. In the theorem below, we enrich the structure appropriately, obtaining a solution to Problem (∗).
Theorem 2. Let be groupoids with operations commonly denoted by ⊕, and let be finite-dimensional vector spaces over a field K. Assume that for any , operations and are given such that
- (1)
The operation ⊠ induces groupoid homomorphisms with respect to the first operand and linear maps with respect to the second operand, that is,
- (a)
for any , and ;
- (b)
for any , , and ;
- (2)
for any , and .
Let and be such that for some , , , and , the following conditions hold:
- (i)
for any ;
- (ii)
for any ;
- (iii)
for any
Then, for any , a pair satisfies the equation if and only if

Proof. To prove the result, we consider the following two maps:
Let us note that is just the left side of Equation (7) that we want to solve, and by (1)(b), both and are linear maps. For any , by using (i), (1)(a), and (2), we obtain and thus, is a particular solution of Equation (7).

To find all solutions of (7), we first note that is an injection. Indeed, if , then , and thus, by using (ii), (1)(a), (2), and (1)(b), we obtain as desired.

Next, we show that . Note that for any , by using (2), (1)(a), and (iii), we obtain and follows. Furthermore, since is injective, is surjective (by (8)), and are finite-dimensional, we obtain Thus, the desired equality follows.

Now, for any , by using (8) and the equality , we obtain the following equivalences: which completes the proof. □
3. An Application to Matrices over Skew Polynomial Rings
Let R be a (possibly noncommutative) ring with unity, and let $\sigma$ be a ring endomorphism of R. Then the set of polynomials over R in one indeterminate s, with the usual addition of polynomials and with multiplication subject to the rule $sa = \sigma(a)s$ for any $a \in R$ (along with distributivity and associativity), is a ring, called the skew polynomial ring and denoted by $R[s;\sigma]$ (see, e.g., [19], p. 10). Thus, elements of $R[s;\sigma]$ are polynomials of the form with the usual (i.e., coefficientwise) addition and with multiplication given by For any and , we put , i.e., is the matrix obtained by applying $\sigma$ to each entry of A. We denote by the set of polynomials over in the indeterminate s with the usual addition of polynomials and with multiplication of any polynomials and performed analogously as in (9), i.e.,
Let us note that for any matrix , the monomial can be seen as the matrix of monomials , and thus, can be viewed as the set of matrices over the skew polynomial ring , with addition and multiplication induced by those in the ring ; this is why elements of are called polynomial matrices.
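Under this identification, multiplying polynomial matrices reduces to the scalar skew rule $(A_i s^i)(B_j s^j) = A_i\,\sigma^i(B_j)\,s^{i+j}$ applied coefficient by coefficient. A NumPy sketch with $\sigma$ the entrywise complex conjugation (the coefficient-list representation is our own):

```python
import numpy as np

# Polynomial matrices as lists of coefficient arrays: A(s) = A[0] + A[1]*s + ...
# With sigma = entrywise conjugation, (A_i s^i)(B_j s^j) = A_i * sigma^i(B_j) * s^(i+j).

def conj_pow(M, i):
    return M if i % 2 == 0 else np.conj(M)   # conjugation is an involution

def skew_matmul(A, B):
    out = [np.zeros((A[0].shape[0], B[0].shape[1]), dtype=complex)
           for _ in range(len(A) + len(B) - 1)]
    for i, Ai in enumerate(A):
        for j, Bj in enumerate(B):
            out[i + j] += Ai @ conj_pow(Bj, i)
    return out

I = np.eye(2, dtype=complex)
Z = np.array([[1j, 0], [0, -1j]])
zero = np.zeros((2, 2), dtype=complex)

# s*Z = conj(Z)*s, so (I*s) * Z differs from Z * (I*s):
left = skew_matmul([zero, I], [Z])
right = skew_matmul([Z], [zero, I])
print(np.allclose(left[1], np.conj(Z)), np.allclose(right[1], Z))  # True True
```

The noncommutativity visible in the last two lines is exactly the phenomenon that distinguishes the conjugate product ⊛ from the ordinary product of polynomial matrices.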
Let T be a ring, let , and let and . We say that the matrices A and B are left coprime if there exists an invertible matrix such that for the block matrix we have (cf. [17], Theorem 9.20)

Let us partition U and as and with , , and . Hence, which implies that , and

By the foregoing discussion, if polynomial matrices , are left coprime, then there exist polynomial matrices , , , and such that

We use this observation in the following theorem, which solves Problem (∗) for left coprime matrices over a skew polynomial ring.
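In block form, the observation can be sketched as follows (the block labels $P$, $Q$, $G$, $H$ below are our own hypothetical notation, not the paper's):

```latex
% Left coprimality via an invertible matrix U over T:
\[
U \begin{pmatrix} A \\ B \end{pmatrix}
  = \begin{pmatrix} I_n \\ 0 \end{pmatrix},
\qquad
U = \begin{pmatrix} P & Q \\ G & H \end{pmatrix}
\quad\Longrightarrow\quad
P A + Q B = I_n, \qquad G A + H B = 0 .
\]
```

Thus, a left coprime pair $(A, B)$ always admits a Bézout-type identity $PA + QB = I_n$ together with an annihilating pair $(G, H)$, which is the shape of the four polynomial matrices used in the theorem below.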
Theorem 3. Let R be a ring with an endomorphism σ such that R is a finite-dimensional vector space over a field K. Let , and assume that for any an operation is given with the following properties:
- (1)
- (a)
for any , and .
- (b)
for any , , and .
- (2)
for any , , , and .
- (3)
for any and .
Let polynomial matrices and be left coprime. Hence, there exist polynomial matrices , , , satisfying Equation (11). Then, for any matrix , a pair satisfies the equation if and only if for some .

Proof. We apply Theorem 2 with , with ⊕ being the usual addition of polynomial matrices and ⊙ being the skew multiplication (10) of polynomial matrices. We only need to show that assumption (2) of Theorem 2 is satisfied, since all the other assumptions of Theorem 2 clearly hold. To that end, let , , and . Below, using properties (1) and (2) along with Formula (10), we derive the desired equality which completes the proof. □
We present examples of operations ⊠ on matrices over skew polynomial rings to which Theorem 3 can be applied.
Example 1. Let R be a ring with an endomorphism σ such that R is a vector space over a field K. Let be an operation with the following properties:
- (1)
- (a)
for any and ;
- (b)
for any , and ;
- (2)
for any and ;
- (3)
for any .
Let , and p be positive integers. Below, we show how one can extend the operation ⧅ to an operation satisfying conditions (1)–(3) of Theorem 3.
Let be a polynomial with matrices as coefficients (i.e., is the -entry of ). For each m and r, let . Now, for any , we define to be the matrix in whose -entry is equal to . It is easy to verify that the extended operation satisfies conditions (1)–(3) of Theorem 3.
Example 2. Let R be a ring with an endomorphism σ such that R is a vector space over a field K, and let be an operation with the following properties:
- (1)
- (a)
for any ;
- (b)
for any and ;
- (2)
for any ;
- (3)
for any .
Let be a K-linear map such that for any . We extend the operation to an operation by setting . It is easy to verify that the extended operation satisfies conditions (1)–(3) listed in Example 1, so by the method described in Example 1, ⧅ can be further extended to an operation satisfying conditions (1)–(3) of Theorem 3.
Example 3. Let be such that . We define an operation by setting for any complex numbers (written in the algebraic form) that Then, for , , and σ the complex conjugation, the conditions (1)–(3) of Example 2 hold, and thus, by using (as described in Example 2) the map defined by with such that , the operation can be extended to an operation satisfying assumptions of Theorem 3.
Example 4. Let R be a ring with an endomorphism σ, and let be a field such that for any and . For and , we define It is clear that the operation satisfies conditions (1)–(3) of Theorem 3.
Example 5. Let R be a ring with an endomorphism σ, , and let be a field such that and for any and . For and , we define One can verify that the operation satisfies assumptions (1)–(3) of Theorem 3.
4. Solution to the Sylvester-Polynomial-Conjugate Matrix Equations over Complex Numbers and Quaternions
In Section 1, we recalled Theorem 1, the main result of [9], which gives the complete solution of the Sylvester-polynomial-conjugate matrix Equation (6) in the case where and are left coprime. Below, we obtain Theorem 1 as a direct corollary of Theorem 3.

Proof of Theorem 1. Let $\sigma \colon \mathbb{C} \to \mathbb{C}$ be the complex conjugation. Referring to the notation recalled in Section 1, it is easy to see that and the conjugate product ⊛ is just the skew multiplication (10) of polynomial matrices. Furthermore, , and thus, the Sylvester-conjugate sum is a special case of the operation ⊠ in Example 5. Hence, Theorem 1 is an immediate consequence of Theorem 3. □
Let $\mathbb{H}$ be the skew field of quaternions, that is, $\mathbb{H} = \{a + b\mathrm{i} + c\mathrm{j} + d\mathrm{k} : a, b, c, d \in \mathbb{R}\}$, with the multiplication performed subject to the rules $\mathrm{i}^2 = \mathrm{j}^2 = \mathrm{k}^2 = -1$ and $\mathrm{i}\mathrm{j} = \mathrm{k} = -\mathrm{j}\mathrm{i}$, $\mathrm{j}\mathrm{k} = \mathrm{i} = -\mathrm{k}\mathrm{j}$, $\mathrm{k}\mathrm{i} = \mathrm{j} = -\mathrm{i}\mathrm{k}$. For a quaternion matrix , the matrix is called the $\mathrm{j}$-conjugate of A (in [20], where the notion of the $\mathrm{j}$-conjugate of a quaternion matrix was introduced, it is denoted by , whereas in [21], it is denoted by ). Similarly to Sylvester-conjugate matrix equations, their counterparts for matrices over quaternions are intensively studied. For instance, the normal Sylvester-$\mathrm{j}$-conjugate matrix equation was investigated in [22,23,24,25], the Kalman–Yakubovich-$\mathrm{j}$-conjugate matrix equation was investigated in [22,26,27,28,29], and the homogeneous Sylvester-$\mathrm{j}$-conjugate matrix equation was investigated in [30]. Furthermore, the homogeneous Yakubovich-$\mathrm{j}$-conjugate matrix equation was investigated in [28], and the nonhomogeneous Yakubovich-$\mathrm{j}$-conjugate matrix equation was investigated in [31]. The two-sided generalized Sylvester matrix equation over $\mathbb{H}$ was investigated in [32].
In [21], Wu, Liu, Li, and Duan defined the notion of the $\mathrm{j}$-conjugate product of quaternion polynomial matrices. First, for any quaternion matrix A and positive integer k, they defined inductively the quaternion matrix by setting and . For two quaternion polynomial matrices and , their $\mathrm{j}$-conjugate product is defined as

Let $\sigma_{\mathrm{j}} \colon \mathbb{H} \to \mathbb{H}$ be the map defined by $\sigma_{\mathrm{j}}(q) = -\mathrm{j}q\mathrm{j}$ for any $q \in \mathbb{H}$. Then, $\sigma_{\mathrm{j}}$ is an automorphism of the division ring $\mathbb{H}$, and for any and nonnegative integer i, we have that . Hence, the $\mathrm{j}$-conjugate product (12) is simply the product of matrices over the skew polynomial ring $\mathbb{H}[s;\sigma_{\mathrm{j}}]$. Given , , and , analogously as in [9], we define the Sylvester-$\mathrm{j}$-conjugate sum as

Similarly as in the second paragraph of Section 1, one can easily see that each of the aforementioned $\mathrm{j}$-conjugate matrix equations is a special case of the polynomial equation where , , , are given and , are unknown. Since and for any and , with the use of the argument of Example 5, we can apply Theorem 3 to matrices over the skew polynomial ring $\mathbb{H}[s;\sigma_{\mathrm{j}}]$, obtaining the following result as a direct corollary.
Theorem 4. Let and be left coprime in the framework of the $\mathrm{j}$-conjugate product ⊛. Hence, there exist polynomial matrices , , , such that Then, for any matrices and , a pair satisfies the equation if and only if where is an arbitrarily chosen parameter matrix.