Article

Capacity-Achieving Input Distributions of Additive Vector Gaussian Noise Channels: Even-Moment Constraints and Unbounded or Compact Support

Department of Electrical & Computer Engineering, University of Waterloo, 200 University Ave W, Waterloo, ON N2L 3G1, Canada
*
Author to whom correspondence should be addressed.
Entropy 2023, 25(8), 1180; https://doi.org/10.3390/e25081180
Submission received: 25 July 2023 / Revised: 4 August 2023 / Accepted: 5 August 2023 / Published: 8 August 2023
(This article belongs to the Special Issue Shannon Entropy: Mathematical View)

Abstract

We investigate the support of a capacity-achieving input to a vector-valued Gaussian noise channel. The input is subjected to a radial even-moment constraint and is either allowed to take any value in R^n or is restricted to a given compact subset of R^n. It is shown that the support of the capacity-achieving distribution is composed of a countable union of submanifolds, each with a dimension of n − 1 or less. When the input is restricted to a compact subset of R^n, this union is finite. Finally, the support of the capacity-achieving distribution is shown to have Lebesgue measure 0 and to be nowhere dense in R^n.

1. Introduction

In this paper, we consider the support of the capacity-achieving input to a vector-valued channel that is subject to additive non-degenerate Gaussian noise. Vector-valued channels are used in a variety of applications, including the complex-valued inputs and outputs of quadrature channels, which have alternate representations as two-dimensional real vectors. Larger antenna arrays enable Multiple-Input Multiple-Output (MIMO) channels, which have inputs of n ≥ 1 complex components. Additionally, noise with memory can be expressed by correlated noise components in a vector-valued channel.
Throughout the paper, the average input power is bounded to limit the consumption of environmental, battery, and monetary resources. Since the output of the amplifiers used in transmitters is severely distorted when the input is too large [1,2,3] and signals that are too small can be challenging to produce, it is also of practical interest to restrict the input to an arbitrary compact set.
There has been appreciable prior effort dedicated to understanding the capacity-achieving input to vector-valued channels subject to average power constraints and restrictions to compact sets. Nevertheless, there are significant technical challenges in working with vector-valued inputs. Therefore, much of the work to this point has been limited to either one-dimensional channels [4] or spherically symmetric channels [5,6,7,8], where the latter case ensures that the capacity-achieving distribution can be expressed as a univariate function of the radius. However, this restriction limits the scope of study to channels in which the input is constrained to a ball and the noise components are independent and identically distributed. In this paper, the only assumption made on the Gaussian noise distribution is that it is non-degenerate. We consider both cases, those in which inputs are restricted to arbitrary compact sets and those in which inputs are allowed to take any value in R n .
The power of a vector-valued signal is equivalent to the second moment of its Euclidean norm. A constraint on the fourth moment then has the practical interpretation of limiting the second moment of the instantaneous power. Furthermore, imposing a moment constraint of order 2 k ensures that the tails of the input distribution decay at least as quickly as a degree 2 k monomial. Therefore, increasing the even-moment constraint penalizes large inputs without imposing a strict cutoff. This motivates us to generalize the average power constraint by limiting some even moment of the input’s Euclidean norm.
The results in this paper apply to any combination of input constraints described above, except for the special case where the input is allowed to take any value in R^n and is subject to a second-moment constraint. This case reduces to a classical result in which the capacity-achieving distribution is known to be Gaussian. For all other cases, we show that the support of the capacity-achieving distribution is contained in a countable union of i-dimensional submanifolds, where i ranges over {0, …, n − 1}. Furthermore, this union is finite when the input is restricted to a compact set. We then show that the support of the capacity-achieving distribution is a nowhere dense set with Lebesgue measure 0.
The paper is organized as follows. We first give a review of prior work in Section 2. Section 3.1, Section 3.2 and Section 3.3 provide intermediary results prior to the main results in Section 3.4. Section 4 concludes the paper.

2. Prior Work

Dating back to Shannon’s work in [9], much of the research on continuous channels has focused on average power (equivalently, second-moment) constraints on the input. A transmitter’s inability to produce arbitrarily large powers then led to the consideration of additional peak power constraints, modeled by restricting the input almost surely to compact sets.
The first major result on amplitude-constrained channels considers a scalar Additive White Gaussian Noise (AWGN) model, both with and without a variance constraint [4]. In each case, the support of the capacity-achieving distribution has a finite number of points.
The use of the Identity Theorem for functions of a single complex variable is key to the argument of [4] and many papers that follow. The theorem can be applied to any univariate analytic function that has an accumulation point of zeros. By contrast, the Identity Theorem in n complex dimensions requires an analytic function with an open set of zeros in C n . Therefore, to apply the Identity Theorem directly for n > 1 , a random vector with support containing an open subset of C n must be considered. It was suspected by some authors that, since R n is not open in C n , no topological assumption on the support of the capacity-achieving distribution would be sufficient for this purpose [5,10,11]. Therefore, many papers restrict their models to ones that maintain spherical symmetry so that the capacity-achieving distribution can be expressed as a one-dimensional function of radius (e.g., Refs. [5,6,7,8]).
In [5] and [6], the results of [4] are extended to multivariate spherically symmetric channels, in n dimensions and 2 dimensions, respectively. In both papers, the inputs are subject to average and peak radial constraints. It is shown that it is optimal to concentrate the input on a finite number of concentric shells.
In [7] and [8], the number and positioning of optimal concentric shells under a peak radial constraint is studied. In [7], the least restrictive amplitude constraint for which the optimal distribution is concentrated on a single sphere is found. In [8], it is shown that the number of shells grows at most quadratically in the amplitude constraint. A similar result is found for n = 1 under an additional average power constraint.
While much of the prior work has focused on spherically symmetric channels, some research has considered spherically asymmetric channels. The case of inputs constrained to arbitrary compact sets and subject to a finite number of quadratic cost constraints as well as non-degenerate multivariate Gaussian noise is considered in [12]. It is concluded that the support of the capacity-achieving distribution must be “sparse”—that is, there must exist a not identically zero analytic function that is 0 on the support of the capacity-achieving distribution. Assuming otherwise leads to a contradiction by the n-dimensional Identity Theorem and Fourier analysis. These results, while quite general, do not consider either inputs of unbounded support or inputs subject to higher-moment constraints. Furthermore, outside of the special cases of n = 1 and spherically symmetric channels, they do not explore a characterization of sparse sets in R n .
In [13], MIMO channels with inputs that are restricted to compact sets, yet have no average power constraints, are considered. Using the Real-Analytic Identity Theorem and steps similar to [4], it is determined that the support of the optimal input distribution is nowhere dense in R n and has Lebesgue measure 0. For the case considered in [12] that coincides with this setup, [13] gives an instance of sparsity in terms of subsets of R n , rather than analytic functions.
There has also been work dedicated to generalizing the classic quadratic average cost constraint. In [14], a scalar channel with the input subject to a combination of even-moment constraints and restrictions to compact or non-negative subsets of R is studied. It is shown that, in most of the cases considered, the support has a finite number of points.
In [15], a complex-valued non-dispersive optical channel is considered, where the input is subject to an average cost that grows super-quadratically in radius, a peak constraint, or both. The noise is taken to be circularly symmetric and, under these conditions, so is the optimal input. The number of concentric circles composing the support of the distribution is shown to be finite.
In this paper, we study an n-dimensional channel subject to non-degenerate Gaussian noise. The input can either take any value in R n or is restricted to a compact subset of R n , and its norm is subject to even-moment constraints. The noise need not be spherically symmetric.
This paper gives a characterization of the capacity-achieving distribution to spherically asymmetric channels under peak and average power constraints that improves on prior work in three respects. Firstly, when our cases overlap with [12], our characterization of the capacity-achieving distribution is more detailed than the notion of sparsity used there. Secondly, our results apply to multivariate channels with inputs subject to even-moment constraints greater than 2. Thirdly, we consider both inputs that are restricted to compact sets and those that are allowed to take any value in R n .

3. Results

In this section, we consider R^n-valued inputs subject to additive non-degenerate multivariate Gaussian noise. In Section 3.1, the capacity-achieving distribution, F*, is formulated as the solution of an optimization problem; its support is then framed in terms of the zero set of a certain real-analytic function, which depends on F* and is referred to as s(·; F*). Section 3.2 finds an equivalent expression for s(·; F*), which is an intermediary step to showing in Section 3.3 that s(·; F*) is non-constant. Section 3.4 uses the result that s(·; F*) is non-constant to show that the support of the capacity-achieving distribution is contained in a countable union of submanifolds of dimensions in the range {0, …, n − 1}. This union is finite when the input is constrained to a compact subset of R^n. It is then shown that the support of the capacity-achieving input has Lebesgue measure 0 and is nowhere dense in R^n.
Appendix A is dedicated to showing the convexity and compactness of the optimization space used in Section 3.1. Appendix B establishes a pointwise characterization of supp(F*), which justifies the definition of s(·; F*). Appendix C provides integrability results, which are used throughout the paper. Appendix D shows that the objective functional is weakly continuous, strictly concave, and weakly differentiable on the optimization space. Appendix E shows that s(·; F*) has an analytic extension to C^n. Finally, Appendix F supports Section 3.3 by finding bounds for certain functions.
As a first step towards defining the set of feasible input distributions, let F ( R n ) be the set of finite Borel measures on R n . Note that F ( R n ) is contained in the set of finite signed Borel measures on R n , which has an intrinsic vector space structure and can be equipped with a norm [16]. Since F ( R n ) lies within a normed vector space, the convexity and compactness of its subsets can be discussed.
The possibility that the transmitter is unable to produce arbitrary signals in R n is modeled by restricting the input to an alphabet A R n . Denote the set of distributions for which the associated random variable is almost surely in A by
F_n(A) ≜ {F ∈ F(R^n) | F(A) = 1}.
Two cases for A are considered:
  • A = R n ;
  • A ⊂ R^n is compact.
In addition to the restriction to A, a radial even-moment constraint is associated with the input. For k ∈ Z_{>0}, the input must belong to the set
P_n(A, k, a) ≜ {F ∈ F_n(A) | E_{X∼F}[‖X‖^{2k}] ≤ a}.
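As a side illustration (not part of the development above), membership in P_n(A, k, a) can be checked numerically for a candidate input law by estimating E[‖X‖^{2k}] via Monte Carlo; the input distribution, k, and a below are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, a = 3, 2, 10.0                       # dimension, moment order, moment budget (assumed values)

# Hypothetical candidate input: uniform on the cube [-1, 1]^n (a compact alphabet A).
samples = rng.uniform(-1.0, 1.0, size=(200_000, n))

# Estimate E[ ||X||^(2k) ] and compare with the budget a.
even_moment = np.mean(np.linalg.norm(samples, axis=1) ** (2 * k))
print(f"estimated E||X||^{2 * k} = {even_moment:.4f}")
print("candidate F lies in P_n(A, k, a):", even_moment <= a)
```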
The resulting channel model, with input X ∼ F ∈ P_n(A, k, a), is
Y = A X + N,
where Y and N ∼ N(0, Σ) are the output and noise, respectively, and A is an invertible matrix known to the transmitter and receiver. It is assumed that the noise covariance matrix Σ is positive-definite.
We will simplify the analysis of (3) by showing that no generality is lost in assuming that A = I_n, the n-dimensional identity matrix, and that Σ is diagonal. Since Σ is positive-definite and A is invertible, the positive-definite matrix A^{-1} Σ (A^{-1})^T can be diagonalized by an orthogonal matrix Q.
Now, multiplying the output Y in (3) by Q A^{-1}, the receiver obtains
Ỹ ≜ Q A^{-1} Y
= X̃ + Ñ,
where X̃ = Q X and the covariance matrix of Ñ = Q A^{-1} N is diagonal. Since Q A^{-1} is invertible, I(X; Y) = I(X̃; Ỹ). Furthermore, since Q is orthogonal,
E[‖X̃‖^{2k}] = E[‖X‖^{2k}]
and the set {Q x | x ∈ A} is merely a rotated version of A. Hence, no generality is lost by dropping the tilde and adopting the following channel model for the remainder of the paper:
Y = X + N,
where N ∼ N(0, Σ) and Σ is diagonal with entries 0 < σ_1² ≤ σ_2² ≤ ⋯ ≤ σ_n². The density of N is denoted by p_N(·).
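The reduction above is purely a linear pre-processing step at the receiver. As a quick numerical illustration (the matrices below are arbitrary examples, not taken from the paper), the following sketch builds Q from the eigendecomposition of A^{-1} Σ (A^{-1})^T and checks that Ñ = Q A^{-1} N has a diagonal covariance while the rotation leaves ‖X‖, and hence the even-moment constraint, unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.normal(size=(n, n)) + n * np.eye(n)      # an arbitrary invertible channel matrix
B = rng.normal(size=(n, n))
Sigma = B @ B.T + 0.5 * np.eye(n)                # an arbitrary positive-definite noise covariance

# Diagonalize the positive-definite matrix A^{-1} Sigma A^{-T} with an orthogonal Q.
Ainv = np.linalg.inv(A)
M = Ainv @ Sigma @ Ainv.T
eigvals, eigvecs = np.linalg.eigh(M)             # M = eigvecs @ diag(eigvals) @ eigvecs.T
Q = eigvecs.T                                    # rows are orthonormal eigenvectors of M

# Covariance of N_tilde = Q A^{-1} N is Q M Q^T, which should be diagonal.
cov_tilde = Q @ M @ Q.T
assert np.allclose(cov_tilde, np.diag(np.diag(cov_tilde)), atol=1e-10)

# The rotation preserves Euclidean norms, so the even-moment constraint is unchanged.
x = rng.normal(size=n)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
print("diagonalized noise variances:", np.diag(cov_tilde))
```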

3.1. Optimization Problem

By Theorem 3.6.2 of [17], the capacity of the channel in (7) is given by the optimization problem
C = sup_{X∼F∈P_n(A,k,a)} I(X; Y) = sup_{F∈F_n(A): E_{X∼F}[‖X‖^{2k}] ≤ a} I(X; Y).
Since the relationship between Y, X, and N is known, the mutual information is a function of the distribution of X alone. Thus, the mutual information induced between X ∼ F and Y will be denoted by I(F). Similarly, we express the even-moment constraint in terms of a functional g_k: F_n(A) → R ∪ {∞} given by
g_k(F) ≜ ∫_A ‖x‖^{2k} dF(x) − a,
where g_k(F) ≤ 0 is equivalent to E_{X∼F}[‖X‖^{2k}] ≤ a. Rewriting (8) in terms of I(·) and g_k(·) yields
C = sup_{F∈P_n(A,k,a)} I(F) = sup_{F∈F_n(A): g_k(F) ≤ 0} I(F).
Much of the appendix is dedicated to understanding properties of the problem presented in (10). It is shown in Theorem A1 that P_n(A, k, a) is convex and compact. Furthermore, by Theorems A3 and A4, I(·) is a weakly continuous and strictly concave function on P_n(A, k, a). Therefore, the supremum is achieved by a unique input distribution F* ∈ P_n(A, k, a) (see, e.g., Appendix C of [14])—that is,
C = max_{F∈P_n(A,k,a)} I(F) = max_{F∈F_n(A): g_k(F) ≤ 0} I(F) = I(F*).
We use the notation X* to describe a capacity-achieving input directly (i.e., X* ∼ F*).
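The optimization in (11) is over an infinite-dimensional space, and the remainder of the paper characterizes F* analytically rather than numerically. Purely as an illustration of the flavour of the problem (not the method of this paper), the sketch below computes the capacity of a discretized scalar (n = 1) amplitude-constrained Gaussian channel with the standard Blahut–Arimoto iteration, assuming the even-moment constraint is inactive (a taken large enough that only the compact alphabet binds); the grid sizes and parameters are arbitrary. Consistent with the prior work surveyed in Section 2, the optimizing mass typically concentrates around a few input locations.

```python
import numpy as np

A_max, sigma = 2.0, 1.0                              # hypothetical amplitude limit and noise std
x = np.linspace(-A_max, A_max, 41)                   # candidate input mass points (compact A)
y = np.linspace(-A_max - 5 * sigma, A_max + 5 * sigma, 400)

# Discretized Gaussian channel law W[j, i] ~ P(y_j | x_i); each column is a pmf over the y-grid.
W = np.exp(-0.5 * ((y[:, None] - x[None, :]) / sigma) ** 2)
W /= W.sum(axis=0, keepdims=True)

p = np.full(x.size, 1.0 / x.size)                    # initial input distribution
for _ in range(2000):                                # Blahut-Arimoto iterations
    joint = W * p[None, :]
    post = joint / joint.sum(axis=1, keepdims=True)  # posterior q(x | y)
    log_c = (W * np.log(np.maximum(post, 1e-300) / np.maximum(p[None, :], 1e-300))).sum(axis=0)
    p *= np.exp(log_c)
    p /= p.sum()

joint = W * p[None, :]
p_y = joint.sum(axis=1, keepdims=True)
capacity = np.sum(joint * np.log(W / p_y))           # I(X; Y) in nats for the final p
print(f"capacity of the discretized channel ~= {capacity:.3f} nats")
print("input points carrying mass > 1e-3:", x[p > 1e-3])
```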
Before proceeding, we require some definitions and notations. In the first definition, and throughout the paper, for x ∈ R^n and r > 0, we denote the ball of radius r centered at x by B_r(x) ⊂ R^n. We will denote the closure of B_r(x) by B̄_r(x). The output density induced by an input X ∼ F is denoted p(·; F).
Definition 1. 
Let V be a random variable with alphabet A ⊆ R^n. Then, the support of V is the set given by
supp(V) ≜ {x ∈ A | ∀ r > 0, P{V ∈ B_r(x)} > 0}.
If V has distribution F_V, we may alternatively refer to supp(F_V) ≜ supp(V).
Definition 2. 
For F ∈ F_n(A), the output differential entropy is given by
h_Y(F) ≜ −∫_{R^n} p(y; F) ln p(y; F) dy
and the marginal entropy density at x ∈ R^n is given by
h(x; F) ≜ −∫_{R^n} p_N(y − x) ln p(y; F) dy,
whenever the integrals exist.
The relationship between the differential entropy and marginal entropy density can be seen as follows. For any b > 0 and F_1, F_2 ∈ P_n(A, k, b), we have that
∫_A h(x; F_1) dF_2(x) = −∫_A ∫_{R^n} p_N(y − x) ln p(y; F_1) dy dF_2(x)
= −∫_{R^n} [∫_A p_N(y − x) dF_2(x)] ln p(y; F_1) dy
= −∫_{R^n} p(y; F_2) ln p(y; F_1) dy.
The equivalence of (15)–(17) is due to Fubini–Tonelli [18] and Lemma A4. If F = F_1 = F_2, then
∫_A h(x; F) dF(x) = h_Y(F).
Lastly, define
Q_n(A, k, a) ≜ ∪_{b ≥ a} P_n(A, k, b)
and let s(·; F*): R^n → R be given by
s(x; F*) ≜ γ(‖x‖^{2k} − a) + C + h(N) − h(x; F*)
= γ(‖x‖^{2k} − a) + C + h(N) + ∫_{R^n} p_N(y − x) ln p(y; F*) dy.
Since F* ∈ P_n(A, k, a), we have that h(x; F*) is finite for all x ∈ R^n and conclude that s(x; F*) is also finite for all x ∈ R^n. Furthermore, by Lemma A8, s(·; F*) can be extended to a complex analytic function for all z ∈ C^n; hence, it is continuous.
The remainder of Section 3.1 consists of two steps:
  • We show that F* solves (11) if and only if there exists γ ≥ 0 such that
    ∫_{R^n} h(x; F*) dF(x) − C − h(N) − γ g_k(F) ≤ 0,
    for all F ∈ Q_n(A, k, a).
  • We show that, for the choice of γ in (22), the inequality in (22) is equivalent to the condition that, for all x ∈ A,
    h(x; F*) ≤ γ(‖x‖^{2k} − a) + C + h(N),
    and if x ∈ supp(F*), then
    h(x; F*) = γ(‖x‖^{2k} − a) + C + h(N).
    For x ∈ A, (24) is satisfied if and only if s(x; F*) = 0.
To establish (22), we will first use a Lagrange multiplier to reformulate (11) as an unconstrained problem over Q n ( A , k , a ) . We will then obtain (22) by taking the weak derivative of the resulting objective functional and applying a Karush–Kuhn–Tucker condition for optimality. We choose to work in the space Q n ( A , k , a ) since, when A = R n , the functionals I ( · ) and g k ( · ) are not weakly differentiable on the larger space F n ( A ) = F n ( R n ) .
Since {F ∈ Q_n(A, k, a) | g_k(F) ≤ 0} = P_n(A, k, a), (11) can equivalently be written as
C = max_{F∈Q_n(A,k,a): g_k(F) ≤ 0} I(F) = I(F*),
where F* is the same as in (11). By Theorem A5, g_k(·) is convex. Moreover, letting F_s ∈ Q_n(A, k, a) be a Heaviside step function at 0 ∈ R^n, F_s is an interior point of the feasible region since g_k(F_s) = −a < 0. Since Q_n(A, k, a) is convex by Theorem A1, there exists γ ≥ 0 such that
C = max_{F∈Q_n(A,k,a)} J_γ(F) = J_γ(F*),
where
J_γ(F) ≜ I(F) − γ g_k(F),
and γ g_k(F*) = 0 (see, e.g., Appendix C of [14]). Furthermore, for an arbitrary b ≥ a, F* ∈ P_n(A, k, b) ⊆ Q_n(A, k, a). Therefore, for this choice of γ, we also have
C = max_{F∈P_n(A,k,b)} J_γ(F) = J_γ(F*).
By Lemmas A5 and A6, J_γ(·) has a weak derivative at F* in the direction of any F ∈ P_n(A, k, b), given by
J'_γ(F*, F) = I'(F*, F) − γ g'_k(F*, F)
= ∫_A h(x; F*) dF(x) − h_Y(F*) − γ(g_k(F) − g_k(F*))
= ∫_A h(x; F*) dF(x) − h_Y(F*) − γ g_k(F),
where (31) is due to γ g_k(F*) = 0. Substituting C = h_Y(F*) − h(N) gives
J'_γ(F*, F) = ∫_A h(x; F*) dF(x) − C − h(N) − γ g_k(F),
where the differential entropy of the noise, h(N) = ½ ln|2πeΣ|, is finite since Σ is positive definite.
Now, J_γ(·) is the difference between a strictly concave function (see Theorem A4) and a convex function (due to Theorem A5 and the non-negativity of γ). Therefore, J_γ(·) is strictly concave and F* is optimal if and only if, for all F ∈ P_n(A, k, b),
J'_γ(F*, F) = ∫_A h(x; F*) dF(x) − C − h(N) − γ g_k(F) ≤ 0.
However, b ≥ a is arbitrary and each F ∈ Q_n(A, k, a) satisfies F ∈ P_n(A, k, b) for some b ≥ a. Therefore, F* is optimal if and only if, for all F ∈ Q_n(A, k, a),
J'_γ(F*, F) = ∫_A h(x; F*) dF(x) − C − h(N) − γ g_k(F) ≤ 0,
which is the statement we sought to show in (22).
The condition (34) is on the capacity-achieving distribution F* itself. Since our objective is to characterize supp(F*), we find an equivalent condition to (34) to describe the behavior of F* at individual points in the input alphabet A. Thus, by (34) and Theorem A2, for all x ∈ A ⊆ R^n,
s(x; F*) = γ(‖x‖^{2k} − a) + C + h(N) + ∫_{R^n} p_N(y − x) ln p(y; F*) dy ≥ 0,
and if x ∈ supp(F*), then
s(x; F*) = γ(‖x‖^{2k} − a) + C + h(N) + ∫_{R^n} p_N(y − x) ln p(y; F*) dy = 0.
The rest of Section 3 is dedicated to exploiting the relationship
supp(F*) ⊆ Z(s; F*) ∩ A,
where Z(s; F*) is the zero set of s(·; F*).

3.2. Hilbert Space and Hermite Polynomial Representation

In this subsection, an equivalent expression for (36) is found by viewing the integral as an inner product in a Hilbert space and writing ln p ( · ; F * ) in terms of a Hermite polynomial basis for that space. Hermite polynomial bases are well-suited to analysis of Gaussian noise channels and they have been used in a number of information-theoretic papers (see, e.g., Refs. [14,19]).
Consider the Hilbert space
L²_{p_N}(R^n) ≜ {ξ: R^n → R | ∫_{R^n} ξ²(x) p_N(x) dx < ∞},
equipped with inner product
⟨ξ, ψ⟩_{L²_{p_N}(R^n)} ≜ ∫_{R^n} ξ(x) ψ(x) p_N(x) dx.
The inner product’s subscript is omitted when the space can be inferred.
Since the components of N are independent, with N_i having variance σ_i² > 0, the density of N factors into
p_N(y) = ∏_{i=1}^n p_{N_i}(y_i)
= ∏_{i=1}^n (1/√(2πσ_i²)) e^{−y_i²/(2σ_i²)}.
We will construct an orthogonal basis for L p N 2 ( R n ) from orthogonal bases for the spaces L p N i 2 ( R ) .
First, with Z ∼ N(0, 1), an orthogonal basis for L²_{p_Z}(R) is given by the Hermite polynomials {H_m}_{m=0}^∞ [20], which are defined through the generating function
e^{−x²/2 + yx} = Σ_{m=0}^∞ H_m(y) x^m / m!.
For any m ∈ Z_{≥0}, the m-th Hermite polynomial has degree m and a positive leading coefficient. Next, for each i ∈ {1, …, n} and m_i ∈ Z_{≥0}, define the stretched Hermite polynomials
H_{m_i}(y; σ_i²) ≜ σ_i^{−m_i} H_{m_i}(y/σ_i).
The inner product of L²_{p_{N_i}}(R) is related to that of L²_{p_Z}(R) by
⟨ξ, ψ⟩_{L²_{p_{N_i}}(R)} = ∫ (1/√(2πσ_i²)) e^{−y²/(2σ_i²)} ξ(y) ψ(y) dy
= ∫ (1/√(2π)) e^{−u²/2} ξ(σ_i u) ψ(σ_i u) du
= ⟨ξ_{σ_i}, ψ_{σ_i}⟩_{L²_{p_Z}(R)},
where ξ_{σ_i} and ψ_{σ_i} are versions of ξ and ψ that have been stretched horizontally by a factor of 1/σ_i. Substituting any H_{m_i}(·; σ_i²) and H_{l_i}(·; σ_i²) into (44) shows that the set {H_{m_i}(·; σ_i²)}_{m_i=0}^∞ is orthogonal. Furthermore, if there were a non-zero function (in an L²_{p_{N_i}}(R) sense) that had a zero L²_{p_{N_i}}(R) inner product with H_{m_i}(·; σ_i²) for each m_i, then a stretched version of this function would also have a zero L²_{p_Z}(R) inner product with H_m for each m. This would contradict the completeness of the Hermite polynomials in L²_{p_Z}(R); hence, {H_{m_i}(·; σ_i²)}_{m_i=0}^∞ forms a basis for L²_{p_{N_i}}(R). Lastly, the stretched Hermite polynomials {H_{m_i}(·; σ_i²)}_{m_i=0}^∞ have the generating function
e^{−x²/(2σ_i²) + yx/σ_i²} = Σ_{m_i=0}^∞ H_{m_i}(y; σ_i²) x^{m_i} / m_i!.
Now, L²_{p_N}(R^n) is isomorphic to the tensor product of the L²_{p_{N_i}}(R) spaces, i ∈ {1, …, n}; consequently, {H_m(·; σ²) | m ∈ Z_{≥0}^n} forms an orthogonal basis for L²_{p_N}(R^n) [21], where
H_m(y; σ²) ≜ ∏_{i=1}^n H_{m_i}(y_i; σ_i²).
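The one-dimensional orthogonality relations above are easy to verify numerically. The sketch below (an illustrative check, not part of the proof; σ is an arbitrary value) evaluates the stretched Hermite polynomials with numpy's probabilists' Hermite module and confirms that ⟨H_m(·; σ²), H_l(·; σ²)⟩ = δ_{ml} · m!/σ^{2m} under the N(0, σ²) weight, so that ρ_m = m!/⟨H_m, H_m⟩ = σ^{2m} in one dimension.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

# Gauss-Hermite(E) quadrature: sum(w * f(u)) ~= integral f(u) exp(-u^2/2) du,
# exact for polynomial integrands of degree < 2 * 60.
u, w = He.hermegauss(60)

def stretched_hermite(m, y, sigma):
    # H_m(y; sigma^2) = sigma^(-m) * He_m(y / sigma), with He_m the probabilists' Hermite polynomial.
    c = np.zeros(m + 1)
    c[m] = 1.0
    return sigma ** (-m) * He.hermeval(y / sigma, c)

def inner_product(m, l, sigma):
    # <f, g> = integral f(y) g(y) p_N(y) dy, with p_N the N(0, sigma^2) density, via y = sigma * u.
    f = stretched_hermite(m, sigma * u, sigma) * stretched_hermite(l, sigma * u, sigma)
    return np.sum(w * f) / np.sqrt(2.0 * np.pi)

sigma = 1.7   # arbitrary noise standard deviation for the check
for m in range(5):
    for l in range(5):
        expected = math.factorial(m) / sigma ** (2 * m) if m == l else 0.0
        assert abs(inner_product(m, l, sigma) - expected) < 1e-8
print("orthogonality verified: <H_m, H_m> = m!/sigma^(2m), so rho_m = sigma^(2m)")
```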
Since, by Lemma A2, ln p(·; F*) ∈ L²_{p_N}(R^n), there exist constants {c_i}_{i∈Z_{≥0}^n} for which
ln p(y; F*) = Σ_{i∈Z_{≥0}^n} c_i H_i(y; σ²),
where equality is in an L²_{p_N}(R^n) sense. Then, substituting (47), and using the notations
m! ≜ m_1! ⋯ m_n!
and
x^m ≜ x_1^{m_1} ⋯ x_n^{m_n}
for m ∈ Z_{≥0}^n and x ∈ R^n, we write
e^{−½ x^T Σ^{-1} x + x^T Σ^{-1} y} = ∏_{i=1}^n e^{−x_i²/(2σ_i²) + y_i x_i/σ_i²}
= ∏_{i=1}^n Σ_{m_i=0}^∞ H_{m_i}(y_i; σ_i²) x_i^{m_i} / m_i!
= Σ_{m∈Z_{≥0}^n} ∏_{i=1}^n H_{m_i}(y_i; σ_i²) x_i^{m_i} / m_i!
= Σ_{m∈Z_{≥0}^n} H_m(y; σ²) x^m / m!.
Substituting (49) and (55) into the integral term in (36) yields
∫_{R^n} p_N(y − x) ln p(y; F*) dy = ∫_{R^n} p_N(y) e^{−½ x^T Σ^{-1} x + x^T Σ^{-1} y} Σ_{i∈Z_{≥0}^n} c_i H_i(y; σ²) dy
= ∫_{R^n} p_N(y) [Σ_{m∈Z_{≥0}^n} (x^m/m!) H_m(y; σ²)] [Σ_{i∈Z_{≥0}^n} c_i H_i(y; σ²)] dy
= Σ_{m∈Z_{≥0}^n} (x^m/m!) Σ_{i∈Z_{≥0}^n} c_i ⟨H_m(·; σ²), H_i(·; σ²)⟩
= Σ_{m∈Z_{≥0}^n} c'_m x^m,
where c'_m = c_m / ρ_m, with
∞ > ρ_m ≜ m! / ⟨H_m(·; σ²), H_m(·; σ²)⟩ > 0.
This simplification to a polynomial will be helpful since the cost function associated with the even-moment constraint is also a polynomial. This relationship is exploited in Section 3.3.

3.3. Non-Constancy of s ( · ; F * )

Recall the relationship supp(F*) ⊆ Z(s; F*) ∩ A from (37). Since supp(F*) ≠ ∅, s(·; F*) has at least one zero; hence, it is constant if and only if Z(s; F*) = R^n. This subsection is dedicated to showing that Z(s; F*) ≠ R^n. The immediate implication is that supp(F*) is a strict subset of R^n; however, the fact that s(·; F*) is a non-zero real-analytic function will be used in Section 3.4 to prove the main results.
By way of contradiction, suppose that, for all x ∈ R^n, s(x; F*) = 0. Substituting (59) into (36), this is equivalent to
[γa − C − h(N)] − γ(Σ_{i=1}^n x_i²)^k = Σ_{m∈Z_{≥0}^n} c'_m x_1^{m_1} ⋯ x_n^{m_n},
for all x R n . The discussion proceeds in two cases: k = 1 and k > 1 .
  • Case k = 1 :
In the case that k = 1 and A = R^n, X* is known to be Gaussian [22] and there is no contradiction with (61). Therefore, for k = 1, we focus only on compact input alphabets A ⊂ R^n.
With k = 1 and A ⊂ R^n, (61) reduces to
[γa − C − h(N)] − γ Σ_{i=1}^n x_i² = Σ_{m∈Z_{≥0}^n} c'_m x_1^{m_1} ⋯ x_n^{m_n},
for all x ∈ R^n. Let e_i be the i-th row of the n × n identity matrix and let 0 ∈ Z_{≥0}^n be the all-zero vector. Since (62) holds for all x ∈ R^n, matching coefficients gives
c'_m = γa − C − h(N) if m = 0;  c'_m = −γ if m = 2e_i, i ∈ {1, …, n};  c'_m = 0 otherwise.
Since, for each i ∈ {1, …, n} and m ∈ Z_{≥0}^n, H_{m_i}(y_i; σ_i²) has degree m_i and a positive leading coefficient,
H_m(y; σ²) = ∏_{i=1}^n H_{m_i}(y_i; σ_i²)
also has degree m_i in y_i, and the unique term with total degree d(m) = m_1 + ⋯ + m_n is y_1^{m_1} ⋯ y_n^{m_n}, which has a positive coefficient. Therefore, the polynomials present in the sum are of the form
H_0(y; σ²) = κ_0 > 0
and, for i ∈ {1, …, n},
H_{2e_i}(y; σ²) = κ_{2e_i} y_i² + α_{2e_i} y_i + β_{2e_i}.
The constants κ_0 and κ_{2e_i} are positive, while α_{2e_i} and β_{2e_i} are real. Substituting this and the identity c_m = c'_m ρ_m into (49) yields
ln p(y; F*) = c'_0 ρ_0 H_0(y; σ²) + Σ_{i=1}^n c'_{2e_i} ρ_{2e_i} H_{2e_i}(y; σ²)
= κ_0 ρ_0 [γa − C − h(N)] − γ Σ_{i=1}^n ρ_{2e_i} (κ_{2e_i} y_i² + α_{2e_i} y_i + β_{2e_i}),
or equivalently,
p(y; F*) = e^{κ_0 ρ_0 [γa − C − h(N)]} ∏_{i=1}^n e^{−γ ρ_{2e_i} (κ_{2e_i} y_i² + α_{2e_i} y_i + β_{2e_i})}.
By definition, γ ≥ 0; however, γ = 0 results in a constant density on R^n, which is invalid. Thus, it must be the case that γ > 0, and the output achieved by X*, namely Y* ≜ X* + N, has independent Gaussian components. Since X* and N are independent and N is an n-variate Gaussian random variable, X* must either be an n-variate Gaussian random variable or be almost surely equal to some x_0 ∈ R^n. In the former case, X* violates the stipulation that the input alphabet A is compact and contradicts the assumption that s(·; F*) is identically 0. In the latter case, supp(F*) = {x_0} is trivial and satisfies the main results of the paper.
  • Case k > 1 :
For the case k > 1 , we derive a contradiction to (61) using results on the rate of decay of a function compared with that of its Fourier transform to conclude that s ( · ; F * ) is not identically 0.
Lemma 1. 
Let U ∈ R^n have, for some β > 0, a characteristic function satisfying
|φ_U(ω)| ≜ |E[e^{iω^T U}]| ≤ e^{−β‖ω‖²/2}
for all ω ∈ R^n. Let V be a random variable independent of U. Then, the characteristic function of W = U + V satisfies, for all ω ∈ R^n,
|φ_W(ω)| ≤ e^{−β‖ω‖²/2}.
Proof. 
By the independence of U and V, and the fact that characteristic functions have pointwise moduli upper-bounded by 1,
|φ_W(ω)| = |φ_U(ω)| |φ_V(ω)|
≤ |φ_V(ω)| e^{−β‖ω‖²/2}
≤ e^{−β‖ω‖²/2}.
Lemma 2. 
Let U ∈ R^n have, for some constant β > 0, a characteristic function satisfying
|φ_U(ω)| ≜ |E[e^{iω^T U}]| ≤ e^{−β‖ω‖²/2}
for all ω ∈ R^n. Let V be a random variable independent of U and let W = U + V have density p_W(·). If there exist positive constants α and K such that, for all x ∈ R^n,
p_W(x) ≤ K e^{−α‖x‖²},
then αβ ≤ 0.5.
Proof. 
Apply Lemma 1 and Theorem 4 of [23], noting that an identically 0 function cannot be a density. □
We make use of Lemma 2 by setting U = N, V = X*, and W = Y*, and deriving a contradiction to the assumption that s(·; F*) is identically 0. Note that, using Rayleigh quotients, the modulus of the characteristic function of N can be upper-bounded for any ω ∈ R^n by
|E[e^{iω^T N}]| = e^{−½ ω^T Σ ω} ≤ e^{−½ σ_1² ‖ω‖²}.
That is, the characteristic function of N satisfies (75) with β = σ_1².
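The bound (77) is just the Rayleigh-quotient inequality ω^T Σ ω ≥ σ_1² ‖ω‖². A quick numerical sanity check (with an arbitrary diagonal Σ, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
sigmas2 = np.array([0.5, 1.3, 2.0])              # arbitrary diagonal entries of Sigma, sorted
Sigma = np.diag(sigmas2)

for _ in range(10_000):
    w = rng.normal(size=sigmas2.size)
    cf_modulus = np.exp(-0.5 * w @ Sigma @ w)    # |E exp(i w^T N)| for N ~ N(0, Sigma)
    bound = np.exp(-0.5 * sigmas2[0] * w @ w)    # exp(-0.5 * sigma_1^2 * ||w||^2)
    assert cf_modulus <= bound * (1 + 1e-12)
print("characteristic-function bound (77) holds on all sampled frequencies")
```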
To complete the contradiction, we show that there exist α > 0.5/β and K_α > 0 such that p(·; F*) satisfies the bound in (76). The assumption that s(·; F*) is identically 0 yields (61); substituting the Multinomial Theorem,
[γa − C − h(N)] − γ Σ_{k_1+⋯+k_n=k} (k!/(k_1!⋯k_n!)) x_1^{2k_1} ⋯ x_n^{2k_n} = Σ_{m∈Z_{≥0}^n} c'_m x_1^{m_1} ⋯ x_n^{m_n}.
By coefficient matching in (78), the set of non-zero coefficients, other than c'_0, is indexed by the set
M ≜ {m ∈ Z_{≥0}^n | Σ_{i=1}^n m_i = 2k and m_i is even ∀ i ∈ {1, …, n}}.
Furthermore, for m ∈ M,
c'_m = −γ (Σ_{i=1}^n m_i/2)! / ((m_1/2)! ⋯ (m_n/2)!).
Therefore, substituting (80) into (49),
p(y; F*) = e^{κ_0 ρ_0 (γa − C − h(N))} e^{Σ_{m∈M} c'_m ρ_m H_m(y; σ²)},
for some positive constant κ_0 = H_0(y; σ²). As with the case k = 1, γ = 0 results in a constant output density over R^n and can be disregarded as a possibility. Thus, for each m ∈ M, we have
c'_m < 0.
With W = Y* and α > 0, showing that there exists K_α for which (76) holds is equivalent to showing that
p(y; F*) e^{α‖y‖²} = e^{κ_0 ρ_0 (γa − C − h(N))} e^{Σ_{m∈M} c'_m ρ_m H_m(y; σ²) + α Σ_{i=1}^n y_i²}
is bounded. This, in turn, is equivalent to showing that the polynomial in the exponent,
q_α(y) ≜ Σ_{m∈M} c'_m ρ_m H_m(y; σ²) + α Σ_{i=1}^n y_i²,
is upper bounded. We proceed by considering the degrees of the terms of q_α(y) to determine the behavior of (84) as ‖y‖ increases.
For each m ∈ Z_{≥0}^n and i ∈ {1, …, n}, H_m(y; σ²) has degree m_i in y_i. Furthermore, H_m(y; σ²) has total degree
d(m) ≜ Σ_{i=1}^n m_i
and the unique highest-degree term, y_1^{m_1} ⋯ y_n^{m_n}, has coefficient κ_m > 0. Note that, since k > 1 and by the definition of M, q_α(·) has total degree 2k ≥ 4. Hence, (84) can be rewritten as q_α(y) = q_α^{(0)}(y) + q_α^{(1)}(y) + q_α^{(2)}(y), where
q_α^{(0)}(y) ≜ Σ_{i=1}^n c'_{2ke_i} ρ_{2ke_i} κ_{2ke_i} y_i^{2k} = −γ Σ_{i=1}^n ρ_{2ke_i} κ_{2ke_i} y_i^{2k},
q_α^{(1)}(y) ≜ Σ_{m∈M, m∉{2ke_i | i∈{1,…,n}}} c'_m ρ_m κ_m y^m,
and q_α^{(2)}(y) is the sum of the remaining terms, each with a total degree of at most 2k − 2.
Note the following:
  • For each y ∈ R^n, q_α^{(0)}(y) ≤ 0 and, by Lemma A9, the minimal value of |q_α^{(0)}(y)|—evaluated on a sphere ‖y‖ = L of radius L ≥ 0—is at least γ min_{i∈{1,…,n}}{ρ_{2ke_i} κ_{2ke_i}} L^{2k}/n^k.
  • For each y ∈ R^n, we have that q_α^{(1)}(y) ≤ 0. Indeed, for each m ∈ M, c'_m < 0 by (82), ρ_m > 0, and κ_m > 0; further, for each i ∈ {1, …, n}, m_i is even.
  • The maximum value of |q_α^{(2)}(y)|, evaluated on a sphere ‖y‖ = L of radius L ≥ 1, is at most A L^{2k−2} for some A > 0—that is, each term of q_α^{(2)}(y) is either of the form α y_i² or c'_m ρ_m ν_{m,l} y^l for some m ∈ M, l ∈ Z_{≥0}^n, and ν_{m,l} ∈ R, where d(l) ≤ 2k − 2. Lemma A10 shows that the magnitudes of these terms are no more than α L² or |c'_m ρ_m ν_{m,l}| L^{d(l)}, respectively.
We conclude that, since q_α^{(0)}(y) ≤ 0 and q_α^{(1)}(y) ≤ 0 for all y ∈ R^n,
lim_{‖y‖→∞} q_α(y) = lim_{‖y‖→∞} (q_α^{(0)}(y) + q_α^{(2)}(y)) + lim_{‖y‖→∞} q_α^{(1)}(y)
≤ lim_{‖y‖→∞} (q_α^{(0)}(y) + q_α^{(2)}(y))
= −∞.
Thus, since q_α(y) is a continuous function that satisfies (90), it is bounded from above. Let M_{q,α} = sup_{y∈R^n} q_α(y) and
K_α = e^{κ_0 ρ_0 (γa − C − h(N))} e^{M_{q,α}}.
Then, for all y ∈ R^n,
p(y; F*) ≤ K_α e^{−α‖y‖²}.
Recall that, with σ_1² > 0 the smallest eigenvalue of Σ and β = σ_1², the characteristic function of N satisfies (75). Let α = 1/σ_1² and choose K_α according to (91). Then, p(y; F*) satisfies (92), yet αβ = 1 > 0.5. Hence, the bound on the characteristic function of N given by (77) and the bound on the density of Y* given in (92) contradict Lemma 2. Therefore, the coefficient-matching equation (78) cannot hold for all x ∈ R^n and we conclude that, for k > 1, s(·; F*) cannot be identically 0 on R^n.
We summarize the results of the two cases, k = 1 and k > 1 , in a theorem.
Theorem 1. 
Suppose that either
1. A ⊂ R^n is compact, or
2. A = R^n, with k ≠ 1.
Then, either supp(F*) = {x_0} for some x_0 ∈ R^n or
Z(s; F*) ≠ R^n.
An immediate consequence of Theorem 1 is that supp(F*) is a strict subset of R^n. Recall from Section 3.1 that s(·; F*) has an analytic extension to C^n. Therefore, Theorem 1 shows that supp(F*) is “sparse” in the sense used by [12]—that is, there exists a non-zero function with an analytic extension to C^n that is zero on supp(F*). However, the primary importance of Theorem 1 is as an intermediary result that is used in Section 3.4 to obtain a better understanding of the structure of supp(F*).

3.4. Main Results

In this section, we use geometry to show that supp(F*) is contained in a countable disjoint union of submanifolds of dimensions ranging over 0, …, n − 1. Furthermore, this union is finite when A is compact. We then show that supp(F*) has Lebesgue measure 0 and is nowhere dense in R^n.
The discussions in this section consider subsets of a vector’s components; so, for x ∈ R^n and i ∈ {1, …, n}, we introduce the notation
x^i ≜ (x_1, …, x_i) ∈ R^i.
Recall from (37) in Section 3.1 that
supp(F*) ⊆ Z(s; F*) ∩ A.
Since, by Lemma A8, s ( · ; F * ) has an analytic extension to C n , it is real-analytic, which motivates us to study the geometry of zero sets of real-analytic functions. We start by restating Theorem 6.3.3 of [24] to the level that is needed in this paper.
Theorem 2 
(Structure Theorem). Let ψ(·): R^n → R be a real-analytic function, where ψ(0, …, 0, x_n) is not identically 0 in x_n. After a rotation of the coordinates x_1, …, x_{n−1}, there exist constants δ_m, m ∈ {1, …, n}, such that with
Q ≜ {x ∈ R^n | |x_m| < δ_m ∀ m ∈ {1, …, n}},
we have
{x ∈ Q | ψ(x) = 0} = ∪_{i=0}^{n−1} V_i,
where V_0 is either empty or contains only the origin and V_i, i ∈ {1, …, n−1}, is a finite disjoint union of i-dimensional submanifolds—that is, for each i ∈ {1, …, n−1}, there exists n_i for which
V_i = ∪_{j=0}^{n_i} Γ_{ij},
where each Γ_{ij} is an i-dimensional submanifold. Furthermore, letting
Q^i ≜ {x^i ∈ R^i | |x_m| < δ_m ∀ m ∈ {1, …, i}},
there exists an open set Ω_{ij} ⊆ Q^i and real-analytic functions α_{ij,m}(·), m ∈ {i+1, …, n}, on Ω_{ij} for which
Γ_{ij} = {(x^i, α_{ij,i+1}(x^i), …, α_{ij,n}(x^i)) ∈ R^n | x^i ∈ Ω_{ij}}.
We apply Theorem 2 to characterize the zero set of s ( · ; F * ) in the form of (97) and obtain the following result.
Theorem 3. 
Suppose that either
1. A ⊂ R^n is compact, or
2. A = R^n, with k ≠ 1.
Then,
supp(F*) ⊆ Z(s; F*) ∩ A = A ∩ (∪_{i=0}^{n−1} T_i),
where T_0 is a countable union of isolated points and T_i, i ∈ {1, …, n−1}, is a countable disjoint union of i-dimensional submanifolds. Furthermore, if A is compact, these unions are finite.
Proof. 
First, note that, by Theorem 1, either supp(F*) = {x_0} for some x_0 ∈ R^n or s(·; F*) is not identically 0 on R^n. In the former case, the result is trivially true; so, assume that s(·; F*) is not identically 0 on R^n. Therefore, for any q ∈ ℚ^n, we can translate s(·; F*) by q and rotate its coordinate system to apply Theorem 2—that is, there exists a sufficiently small open set Q_q around q such that
Z(s; F*) ∩ Q_q = ∪_{i=0}^{n−1} V_i^q,
where the V_i^q are as in Theorem 2.
Since ℚ^n is dense in R^n,
A ⊆ ∪_{q∈ℚ^n} Q_q.
Furthermore, if A is compact, the open cover {Q_q}_{q∈ℚ^n} has a finite subcover {Q_{q_j}}_{j=1}^m—that is,
A ⊆ ∪_{j=1}^m Q_{q_j}.
Defining the index set
M ≜ ℚ^n, if A = R^n; {q_j}_{j=1}^m, if A ⊂ R^n is compact,
we obtain
Z(s; F*) ∩ A = A ∩ (∪_{q∈M} (Z(s; F*) ∩ Q_q))
= A ∩ (∪_{q∈M} ∪_{i=0}^{n−1} V_i^q)
= A ∩ (∪_{i=0}^{n−1} ∪_{q∈M} V_i^q).
Since, for each q ∈ M, V_0^q is either empty or a single point,
T_0 ≜ ∪_{q∈M} V_0^q
is a countable set of points and is finite when A is compact. Furthermore, each V_i^q, where i ∈ {1, …, n−1}, is itself a finite union of i-dimensional submanifolds. Hence,
T_i ≜ ∪_{q∈M} V_i^q
is a countable union of i-dimensional submanifolds. When A is compact, this union is also finite. □
Note that Theorem 3 agrees with the results of [5] when the cases overlap. Indeed, consider the case in which A is a ball centered at the origin, k = 1, and the noise covariance matrix Σ = t I_n, where I_n is the n × n identity matrix and t > 0. Then, [5] shows that the capacity-achieving distribution is supported on a finite number of concentric (n − 1)-spheres. Each (n − 1)-sphere is an (n − 1)-dimensional submanifold.
In the next two theorems, we show that s u p p ( F * ) has Lebesgue measure 0 and is nowhere dense in R n .
Theorem 4. 
Suppose that either
1. A ⊂ R^n is compact, or
2. A = R^n, with k ≠ 1.
Let μ(·) denote the n-dimensional Lebesgue measure. Then,
μ(supp(F*)) = 0.
Proof. 
By Theorem 3, we have
supp(F*) ⊆ A ∩ (∪_{i=0}^{n−1} T_i)
⊆ ∪_{i=0}^{n−1} ∪_{q∈M} V_i^q,
where M is countable. Note that, for each q ∈ M, V_0^q is either empty or a single point; so, μ(V_0^q) = 0. Furthermore, for each q ∈ M and i ∈ {1, …, n−1}, V_i^q is a finite disjoint union of n_i^q i-dimensional submanifolds, and for i ≤ n − 1, each submanifold has Lebesgue measure 0. Therefore, μ(supp(F*)) = 0. □
We will now define the notion of a subset being nowhere dense in its superset and show that s u p p ( F * ) is nowhere dense in R n .
Definition 3. 
Let B ⊆ R^n. A set A ⊆ B is said to be dense in B if, for every b ∈ B, there exists a sequence {a_i}_{i=0}^∞ ⊆ A such that
lim_{i→∞} a_i = b.
Definition 4. 
Let B ⊆ R^n. A set A ⊆ B is called nowhere dense in B if, for every open set U ⊆ B, A ∩ U is not dense in U.
Theorem 5. 
Suppose that either
1. A ⊂ R^n is compact, or
2. A = R^n, with k ≠ 1.
Then, supp(F*) is nowhere dense in R^n.
Proof. 
By Theorem 1, either supp(F*) = {x_0} for some x_0 ∈ R^n or
Z(s; F*) ≠ R^n.
Since, in the former case, supp(F*) is nowhere dense in R^n, assume the latter and let (115) hold. Let U ⊆ R^n be a non-empty open set; we will show the result by proving that Z(s; F*) ∩ U is not dense in U.
Fix x ∈ U. Translating s(·; F*) by x, rotating the coordinate system, and applying Theorem 2 shows that there exists a sufficiently small open set Q containing x on which
Z(s; F*) ∩ Q = ∪_{i=0}^{n−1} V_i
= V_0 ∪ (∪_{i=1}^{n−1} ∪_{j=0}^{n_i} Γ_{ij}).
It suffices to show the existence of a point of the form (x^{n−1}, u_n) ∈ U ∩ Q that is not the limit of any sequence in Z(s; F*) ∩ U ∩ Q. Let (y_m^{n−1}, y_{n,m}) be a convergent sequence in Z(s; F*) ∩ U ∩ Q, indexed by m, for which
lim_{m→∞} y_m^{n−1} = x^{n−1}.
Using the parameterization from (100), the n-th component of the sequence at index m satisfies one of the following:
  • y_{n,m} ∈ {v_n | v ∈ V_0}, or
  • for some i_m ∈ {1, …, n−1} and j_m ∈ {1, …, n_{i_m}},
    y_{n,m} = α_{i_m j_m, n}(y_m^{i_m}).
    Since each α_{i_m j_m, n}(·) is real-analytic, it is continuous. Then, for y_{n,m} satisfying (119), if lim_{m→∞} y_{n,m} exists, we have lim_{m→∞} y_{n,m} = α_{ij,n}(x^i) for some i ∈ {1, …, n−1} and j ∈ {1, …, n_i}. Since V_0 is either empty or a single point, the number of possible values for lim_{m→∞} y_{n,m} is at most
    |V_0| + Σ_{i=1}^{n−1} n_i < ∞.
However, since x ∈ U ∩ Q, where U ∩ Q is open, the set {t ∈ R | (x^{n−1}, x_n + t) ∈ U ∩ Q} is uncountable. Thus, there exists t such that (x^{n−1}, x_n + t) ∈ U ∩ Q is not the limit of any sequence in Z(s; F*) ∩ U ∩ Q. □

4. Discussion

This paper has considered vector-valued channels with additive Gaussian noise. Unlike much of the prior work in this area, the noise was not limited to having independent and identically distributed components. The support of the capacity-achieving input distribution was discussed when inputs were subjected to an even-moment radial constraint of order 2k. Furthermore, the inputs were either allowed to take any value in R^n or restricted to a compact set. When the input alphabet was the entire space, R^n, only the case k ≥ 2 was considered since, for k = 1, the optimal input distribution is well-known to be Gaussian.
The problem was framed as a convex optimization problem that was shown to be solved by a unique input distribution F*. The conditions for optimality yielded a real-analytic function s(·; F*) whose zero set contained supp(F*), the support of F*. Using the framework of an L² space weighted by the noise density, s(·; F*) was simplified and shown to be non-constant on R^n. Through geometric analysis of the zero set of s(·; F*), supp(F*) was shown to be contained in a countable union of single points and submanifolds of dimensions ranging over 1, …, n − 1. When the input alphabet was compact, this union was further shown to be finite. Finally, it was determined that supp(F*) has Lebesgue measure 0 and is nowhere dense in R^n.
This paper is an expansion of the work concerning even-moment input constraints in [14] to vector-valued channels that are not necessarily spherically symmetric. Viewed as a generalization of [12], it considers order 2 k rather than second-moment radial constraints and includes R n as a possible input alphabet. Unlike prior work, it also provides geometric results on the supports of capacity-achieving inputs to spherically asymmetric channels.

Author Contributions

Conceptualization, J.E., R.R.M. and P.M.; methodology, J.E., R.R.M. and P.M.; validation, J.E., R.R.M. and P.M.; formal analysis, J.E., R.R.M. and P.M.; investigation, J.E.; resources, J.E., R.R.M. and P.M.; writing—original draft preparation, J.E.; writing—review and editing, J.E., R.R.M. and P.M.; supervision, R.R.M. and P.M.; project administration, R.R.M. and P.M.; funding acquisition, J.E., R.R.M. and P.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC) through a CGS-M Grant (J.E.) and Discovery Grants (R.R.M. and P.M.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Convexity and Compactness of Optimization Space

Theorem A1. 
The following properties hold for sets defined in Section 3.1:
  • F n ( A ) is convex;
  • for any b > 0 , P n ( A , k , b ) is convex and compact;
  • Q n ( A , k , a ) is convex.
Proof. 
We first show the convexity of F_n(A). Let F_1, F_2 ∈ F_n(A), t ∈ [0, 1], and F_t = t F_1 + (1 − t) F_2. Then, since
supp(F_t) ⊆ supp(F_1) ∪ supp(F_2) ⊆ A,
F_n(A) is convex.
To show convexity of P_n(A, k, b) for b > 0, let F_1, F_2 ∈ P_n(A, k, b), t ∈ [0, 1], and F_t = t F_1 + (1 − t) F_2. Since P_n(A, k, b) ⊆ F_n(A), it suffices to show that F_t satisfies the radial even-moment constraint:
∫_{R^n} ‖x‖^{2k} dF_t(x) = t ∫_{R^n} ‖x‖^{2k} dF_1(x) + (1 − t) ∫_{R^n} ‖x‖^{2k} dF_2(x)
≤ t b + (1 − t) b = b.
We now show the convexity of Q_n(A, k, a). For any F_1, F_2 ∈ Q_n(A, k, a), there exists b_{F_1,F_2} > 0 for which F_1, F_2 ∈ P_n(A, k, b_{F_1,F_2}). Since P_n(A, k, b_{F_1,F_2}) is convex, t F_1 + (1 − t) F_2 ∈ P_n(A, k, b_{F_1,F_2}) ⊆ Q_n(A, k, a). Hence, Q_n(A, k, a) is convex.
It remains to show the compactness of P n ( A , k , b ) for any b > 0 . Note that the Lévy–Prokhorov metric metrizes weak convergence in F ( R n ) [25]; so, sequential compactness is equivalent to compactness. To prove compactness of P n ( A , k , b ) , we first show relative compactness, which allows us to conclude that any sequence in P n ( A , k , b ) has a subsequence that converges to some F F n ( A ) . Further, showing that F P n ( A , k , b ) will complete the proof.
Observe that each F F ( R n ) is defined on the complete separable metric space R n equipped with Euclidean distance. By Prokhorov’s Theorem (Theorem 3.2.1 of [25]), the relative compactness of P n ( A , k , b ) is equivalent to the tightness of P n ( A , k , b ) ; so, we will prove the latter.
To show the tightness of P_n(A, k, b), let X ∼ F ∈ P_n(A, k, b), ε > 0, and D = (b/ε)^{1/(2k)}. Then, applying Markov’s inequality,
P{X ∈ R^n \ B̄_D(0)} = P{‖X‖ > D}
≤ P{‖X‖^{2k} ≥ D^{2k}}
≤ E[‖X‖^{2k}] / D^{2k}
≤ b / D^{2k}
= ε.
This is a uniform upper-bound for F P n ( A , k , b ) ; so, P n ( A , k , b ) is tight and, thus, relatively compact.
By the relative compactness of P n ( A , k , b ) , any sequence { F m } m = 0 P n ( A , k , b ) has a subsequence { F m j } m j = 0 that converges weakly to some F F ( R n ) . To show compactness, we must show that F P n ( A , k , b ) .
Since each F_{m_j} ∈ P_n(A, k, b), it follows that
∫_A ‖x‖^{2k} dF_{m_j}(x) ≤ b.
By Theorem A.3.12 of [26], since ‖x‖^{2k} is non-negative and lower semicontinuous,
0 ≤ ∫_A ‖x‖^{2k} dF(x) ≤ lim inf_{j→∞} ∫_A ‖x‖^{2k} dF_{m_j}(x) ≤ b.
Therefore, the limiting distribution F satisfies the radial even-moment constraint imposed by P n ( A , k , b ) .
When A = R^n, we conclude that F ∈ P_n(A, k, b). However, for the case that A is compact, we must also show that X ∈ A almost surely. For any index m_j of the subsequence,
∫_A dF_{m_j}(x) = 1.
By the Portmanteau Theorem [27], since A is closed,
∫_A dF(x) ≥ lim sup_{j→∞} ∫_A dF_{m_j}(x) = 1. □

Appendix B. Necessary Conditions for the Capacity-Achieving Distribution

Theorem A2. 
Recall that
Q_n(A, k, a) = ∪_{b ≥ a} P_n(A, k, b).
Suppose that F* solves the optimization problem in (26), where γ ≥ 0 is the Lagrange multiplier corresponding to the problem in (11). Then, the following are equivalent:
P.1 
For every F Q n ( A , k , a ) ,
∫_A h(x; F*) dF(x) ≤ γ(∫_A ‖x‖^{2k} dF(x) − a) + C + h(N).
P.2 
For all x ∈ A,
h(x; F*) ≤ γ(‖x‖^{2k} − a) + C + h(N),
and if x ∈ supp(F*), then
h(x; F*) = γ(‖x‖^{2k} − a) + C + h(N).
Proof. 
For any F ∈ Q_n(A, k, a), integrating both sides of (A15) with respect to dF(·) yields that (P.2) implies (P.1).
It remains to be shown that (P.1) implies (P.2). Suppose this implication is false—that is, (P.1) holds but there either exists v ∈ A for which
h(v; F*) > γ(‖v‖^{2k} − a) + C + h(N),
or there exists w ∈ supp(F*) such that
h(w; F*) ≠ γ(‖w‖^{2k} − a) + C + h(N).
If (A17) holds, let b = max{‖v‖^{2k}, a} and let F(x) = ∏_{i=1}^n u_1(x_i − v_i), where u_1(·) is the Heaviside step function. Then, F ∈ P_n(A, k, b) ⊆ Q_n(A, k, a) and
∫_A h(x; F*) dF(x) − γ(∫_A ‖x‖^{2k} dF(x) − a) = h(v; F*) − γ(‖v‖^{2k} − a)
> C + h(N),
contradicting (A14). Therefore, (A17) cannot be satisfied for any x ∈ A and we are left with the alternative that there exists w ∈ supp(F*) ⊆ A for which (A18) holds—that is,
h(w; F*) < γ(‖w‖^{2k} − a) + C + h(N).
By Lemma A7, the extension of h(·; F) to C^n is continuous; hence, h(·; F) is continuous on R^n. Since ‖·‖^{2k} is continuous as well, there exists δ > 0 such that
h(x; F*) < γ(‖x‖^{2k} − a) + C + h(N)
for every x ∈ B_δ(w). Furthermore, since w ∈ supp(F*), there exists ε such that P{X* ∈ B_δ(w)} = ε > 0.
Recall from (18) that
h_Y(F*) = ∫_A h(x; F*) dF*(x),
and since γ g_k(F*) = 0 (see (27)),
γa = γ ∫_A ‖x‖^{2k} dF*(x).
Substituting (A23) and (A24), and noting that F*(B_δ(w) ∩ A^c) = 0, yields
C + h(N) − γa = h_Y(F*) − γa
= ∫_A [h(x; F*) − γ‖x‖^{2k}] dF*(x)
= ∫_{A ∪ B_δ(w)} [h(x; F*) − γ‖x‖^{2k}] dF*(x)
= ∫_{B_δ(w)} [h(x; F*) − γ‖x‖^{2k}] dF*(x) + ∫_{A \ B_δ(w)} [h(x; F*) − γ‖x‖^{2k}] dF*(x)
< ε[C + h(N) − γa] + (1 − ε)[C + h(N) − γa]
= C + h(N) − γa,
where the first term of (A29) is due to (A22) and the second is due to the contradiction derived from (A17). The above is a contradiction, which completes the proof. □

Appendix C. Integrability Results

Lemma A1. 
Let b > 0 and F ∈ P_n(R^n, k, b). Then, there exist positive constants η and κ for which
|ln p(y; F)| ≤ η‖y‖² + κ
for any y ∈ R^n.
Proof. 
With η = 2σ_1^{−2}, D = (2b)^{1/(2k)}, and
M = 1 / (2√((2π)^n |Σ|)),
let
ζ_y = M e^{−η max{‖y‖², D²}}.
Then, for any y ∈ R^n, 0 < ζ_y ≤ p(y; F) ≤ 2M. Since |ln(·)| is continuous and has no local maxima,
|ln p(y; F)| ≤ max{|ln ζ_y|, |ln(2M)|}
≤ η max{‖y‖², D²} + |ln M| + |ln(2M)|
≤ η‖y‖² + ηD² + |ln M| + |ln(2M)|.
Lemma A2. 
For any F ∈ P_n(R^n, k, a), ln p(·; F) ∈ L²_{p_N}(R^n).
Proof. 
The result follows by Lemma A1. □
Lemma A3. 
Let β ≥ α ≥ 1 and let Z be a random variable taking values in R^n. If E[‖Z‖^β] = ρ, then E[‖Z‖^α] ≤ ρ + 1.
Proof. 
Since β ≥ α ≥ 1, we have
E[‖Z‖^α] ≤ E[max{1, ‖Z‖}^α]
≤ E[max{1, ‖Z‖}^β]
= E[max{1, ‖Z‖^β}]
≤ ρ + 1.
Lemma A4. 
For any b > 0 and F_0, F_1 ∈ P_n(R^n, k, b),
∫_{R^n} |p(y; F_0) ln p(y; F_1)| dy < ∞.
Proof. 
Note that
∫_{R^n} |p(y; F_0) ln p(y; F_1)| dy = ∫_{R^n} p(y; F_0) |ln p(y; F_1)| dy
= ∫_{R^n} ∫_{R^n} p_N(y − x) |ln p(y; F_1)| dF_0(x) dy.
We proceed by first proving that
∫_{R^n} ∫_{R^n} p_N(y − x) |ln p(y; F_1)| dy dF_0(x) < ∞.
Then, since the integrand is non-negative, the Tonelli–Fubini Theorem [18] justifies interchanging the order of integration in (A44) to conclude that (A43) and, hence, the left side of (A42) is finite.
By Lemma A1, for any y ∈ R^n,
|ln p(y; F_1)| ≤ η‖y‖² + κ_b.
Initially considering only the inner integral in (A44), the substitution u = y − x yields
∫_{R^n} p_N(y − x) |ln p(y; F_1)| dy ≤ η ∫_{R^n} p_N(y − x) ‖y‖² dy + κ_b
= η ∫_{R^n} p_N(u) ‖u + x‖² du + κ_b
≤ η ∫_{R^n} p_N(u) (‖u‖ + ‖x‖)² du + κ_b
≤ η ∫_{R^n} p_N(u) (4‖u‖² + 4‖x‖²) du + κ_b
≤ 8η E[‖N‖²] + 8η ‖x‖² + κ_b,
where (A48) is due to the triangle inequality. Since E[‖N‖²] = tr(Σ) is finite, there are positive constants
κ_0 = 8η E[‖N‖²] + κ_b, and
κ_2 = 8η,
for which
∫_{R^n} p_N(y − x) |ln p(y; F_1)| dy ≤ κ_2 ‖x‖² + κ_0.
Furthermore, since k ≥ 1 and E_{F_0}[‖X‖^{2k}] ≤ b, Lemma A3 gives E_{F_0}[‖X‖²] ≤ b + 1. Substituting this and (A53) into (A44), we obtain
∫_{R^n} ∫_{R^n} p_N(y − x) |ln p(y; F_1)| dy dF_0(x) ≤ ∫_{R^n} (κ_2 ‖x‖² + κ_0) dF_0(x)
= κ_2 E_{F_0}[‖X‖²] + κ_0
≤ (b + 1) κ_2 + κ_0
< ∞.

Appendix D. Properties of the Objective Functional

The aim of this section is to discuss the weak continuity, strict concavity, and weak differentiability of the objective function,
J_γ(F) = I(F) − γ g_k(F),
for the optimization problem posed in (26). These properties are instrumental in the establishment and subsequent analysis of the convex optimization problem considered in Section 3. To support the proof of Theorem A2, we show that, for arbitrary b > 0 , the required properties hold on P n ( A , k , b ) .
Theorem A3. 
I(·) is weakly continuous on P_n(A, k, a).
Proof. 
For any F P n ( A , k , a ) , we write
I(F) = h_Y(F) − h(N)
= h_Y(F) − ½ ln|2πeΣ|,
where h Y ( F ) is finite by Lemma A4. Therefore, weak continuity of I ( · ) on P n ( A , k , a ) is equivalent to weak continuity of h Y ( · ) on P n ( A , k , a ) .
Let {F_m}_{m=0}^∞ be a sequence in P_n(A, k, a) converging weakly to F ∈ P_n(A, k, a). By the Helly–Bray Theorem, since p_N(·) is bounded and continuous,
lim_{m→∞} p(y; F_m) = lim_{m→∞} ∫_{R^n} p_N(y − x) dF_m(x)
= ∫_{R^n} p_N(y − x) dF(x)
= p(y; F),
for any y ∈ R^n. Therefore, by Scheffé’s Lemma, the sequence of densities {p(·; F_m)}_{m=0}^∞ converges in total variation to p(·; F). It suffices to show that differential entropy is uniformly continuous over {p(·; F̂) | F̂ ∈ P_n(A, k, a)} with respect to the total variation metric.
The family of densities {p(·; F̂) | F̂ ∈ P_n(A, k, a)} is uniformly upper bounded. Furthermore, the corresponding random vectors, Y = X + N for some X ∼ F̂ ∈ P_n(A, k, a), uniformly satisfy the bound
E[‖Y‖] ≤ a + 1 + E[‖N‖]
given by Lemma A3. The result follows by Theorem 1 of [28]. □
Theorem A4. 
For any b > 0 , I ( · ) is strictly concave on P n ( A , k , b ) .
Proof. 
See Appendix E of [14]. □
Theorem A5. 
g k ( · ) is convex on Q n ( A , k , a ) .
Proof. 
Let t ∈ [0, 1] and F_0, F_1 ∈ Q_n(A, k, a). Then, g_k(F_0) and g_k(F_1) are finite and
g_k(t F_0 + (1 − t) F_1) = ∫_{R^n} ‖x‖^{2k} d(t F_0)(x) + ∫_{R^n} ‖x‖^{2k} d((1 − t) F_1)(x) − a
= t(∫_{R^n} ‖x‖^{2k} dF_0(x) − a) + (1 − t)(∫_{R^n} ‖x‖^{2k} dF_1(x) − a)
= t g_k(F_0) + (1 − t) g_k(F_1).
We make use of the following notion of a derivative of a function defined on a convex set Ω [14].
Definition A1. 
Define the weak derivative of L: Ω → R at F_0 in the direction F by
L′(F_0, F) ≜ lim_{t↓0} [L((1 − t) F_0 + t F) − L(F_0)] / t,
whenever it exists.
Lemma A5. 
g_k(·) is weakly differentiable on Q_n(A, k, a). For any F_0, F ∈ Q_n(A, k, a), the weak derivative is finite and given by
g′_k(F_0, F) = g_k(F) − g_k(F_0).
Proof. 
Let t ∈ [0, 1] and F_0, F ∈ Q_n(A, k, a). Then, noting that g_k(F_0) and g_k(F) are finite,
g_k(t F + (1 − t) F_0) − g_k(F_0) = g_k(t(F − F_0) + F_0) − g_k(F_0)
= ∫_A ‖x‖^{2k} d[t(F(x) − F_0(x)) + F_0(x)] − a − (∫_A ‖x‖^{2k} dF_0(x) − a)
= t(∫_A ‖x‖^{2k} dF(x) − ∫_A ‖x‖^{2k} dF_0(x))
= t[∫_A ‖x‖^{2k} dF(x) − a − (∫_A ‖x‖^{2k} dF_0(x) − a)]
= t(g_k(F) − g_k(F_0)).
Dividing by t and taking the limit as t → 0 gives (A69). □
Lemma A6. 
I(·) is weakly differentiable on Q_n(A, k, a). For any F_0, F ∈ Q_n(A, k, a), the weak derivative at F_0 in the direction of F is given by
I′(F_0, F) = ∫_A h(x; F_0) dF(x) − h_Y(F_0).
Proof. 
The proof largely follows Appendix E of [14]. The step that requires special attention is the application of the Dominated Convergence Theorem in (27) of [14]—that is, we would like to show the integrability of
|(p(y; F) − p(y; F_0)) ln(½ p(y; F_0))| ≤ |p(y; F) ln p(y; F_0)| + |p(y; F_0) ln p(y; F_0)| + (p(y; F) + p(y; F_0)) ln 2.
Since F_0, F ∈ Q_n(A, k, a), there exists b > 0 for which F_0, F ∈ P_n(A, k, b). Then, (A77) follows by Lemma A4. □

Appendix E. Analyticity of Marginal Entropy Density

Lemma A7. 
For any b > 0 and F ∈ P_n(A, k, b), the extension of h(x; F) to z ∈ C^n given by
h(z; F) = −∫_{R^n} p_N(y − z) ln p(y; F) dy
is continuous in z.
Proof. 
Let z ∈ C^n. Fix ε > 0 and consider B̄_{C^n,ε}(z), the closed ball of radius ε around z in C^n. For any sequence {z_m}_{m=0}^∞ ⊆ C^n converging to z, there exists M_0 such that z_m ∈ B̄_{C^n,ε}(z) for each m ≥ M_0. Therefore, it suffices to show that lim_{m→∞} h(z_m; F) = h(z; F) for each sequence {z_m}_{m=0}^∞ ⊆ B̄_{C^n,ε}(z).
Since the extension of p N ( u ) to C n is continuous,
lim_{m→∞} h(z_m; F) = −lim_{m→∞} ∫_{R^n} p_N(y − z_m) ln p(y; F) dy
= −∫_{R^n} lim_{m→∞} p_N(y − z_m) ln p(y; F) dy
= −∫_{R^n} p_N(y − z) ln p(y; F) dy
= h(z; F),
where (A80) is due to the Dominated Convergence Theorem, which will be justified next.
Let y ∈ R^n and z_m = α_m + iβ_m ∈ B̄_{C^n,ε}(z). Prior to finding a dominating function for the entire integrand in (A80), we establish the following upper bound on |p_N(y − z_m)|:
|p_N(y − z_m)| = (1/√((2π)^n |Σ|)) |e^{−½(y − z_m)^T Σ^{-1}(y − z_m)}|
= (1/√((2π)^n |Σ|)) e^{−½(y − α_m)^T Σ^{-1}(y − α_m)} e^{½ β_m^T Σ^{-1} β_m}
≤ (1/√((2π)^n |Σ|)) e^{−‖y − α_m‖²/(2σ_n²)} e^{‖β_m‖²/(2σ_1²)}
≤ (e^{ε²/(2σ_1²)}/√((2π)^n |Σ|)) e^{−‖y − α_m‖²/(2σ_n²)}
≤ (e^{ε²/(2σ_1²)}/√((2π)^n |Σ|)) max_{u∈B̄_ε(z)} e^{−‖y − u‖²/(2σ_n²)}
≤ (e^{ε²/(2σ_1²)}/√((2π)^n |Σ|)) × { e^{−(1 − ε/‖y‖)² ‖y‖²/(2σ_n²)}, if y ∉ {0} ∪ B̄_ε(z); 1, if y ∈ {0} ∪ B̄_ε(z) }
≜ Θ(y).
Now, by Lemma A4,
$$|p_N(y - z_m) \ln p(y; F)| \le \Theta(y)\left(\eta \|y\|^2 + \kappa_b\right),$$
which is integrable with respect to $y$. □
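The key estimate above, which splits $|p_N(y - z_m)|$ into a real Gaussian factor in $\operatorname{Re}\{z_m\}$ and a growth factor in $\operatorname{Im}\{z_m\}$ controlled by the extreme eigenvalues of $\Sigma$, can be spot-checked numerically. In the sketch below, the covariance matrix, the sampling ranges, and the number of trials are arbitrary choices made only for illustration; $\sigma_1^2$ and $\sigma_n^2$ denote the smallest and largest eigenvalues of $\Sigma$, as in the proof.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2
A = rng.normal(size=(n, n))
Sigma = A @ A.T + np.eye(n)                 # an arbitrary non-degenerate covariance
Sinv, detS = np.linalg.inv(Sigma), np.linalg.det(Sigma)
eigvals = np.linalg.eigvalsh(Sigma)
s1sq, snsq = eigvals[0], eigvals[-1]        # smallest / largest eigenvalue

def pN_ext(u):
    """Analytic extension of the Gaussian density to complex arguments u."""
    return np.exp(-0.5 * u @ Sinv @ u) / np.sqrt((2 * np.pi) ** n * detS)

for _ in range(10_000):
    y = rng.normal(scale=3.0, size=n)
    alpha, beta = rng.normal(size=n), rng.normal(size=n)
    lhs = abs(pN_ext(y - (alpha + 1j * beta)))
    rhs = (np.exp(-0.5 * (y - alpha) @ (y - alpha) / snsq)
           * np.exp(0.5 * beta @ beta / s1sq)
           / np.sqrt((2 * np.pi) ** n * detS))
    assert lhs <= rhs * (1 + 1e-12)
print("Gaussian modulus bound held on all sampled points")
```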
Lemma A8. 
For any $F \in P_n(A,k,a)$, $h(x; F)$ has an analytic extension to an entire function on $\mathbb{C}^n$.
Proof. 
For convenience of notation, we will prove the case of n = 2 here.
Consider the extension of $h(x; F)$ to $z \in \mathbb{C}^2$:
$$h(z; F) = -\int_{\mathbb{R}^2} p_N(y - z) \ln p(y; F)\, dy$$
$$= -\int_{\mathbb{R}^2} p_{N_1}(y_1 - z_1)\, p_{N_2}(y_2 - z_2) \ln p(y; F)\, dy$$
$$= -\int_{\mathbb{R}^2} \kappa_1 e^{-\frac{(y_1 - z_1)^2}{2\sigma_1^2}}\, \kappa_2 e^{-\frac{(y_2 - z_2)^2}{2\sigma_2^2}} \ln p(y; F)\, dy,$$
where, for $i \in \{1, 2\}$,
$$\kappa_i \triangleq \frac{1}{\sqrt{2\pi\sigma_i^2}}.$$
We will show that $h((z_1, z_2); F)$ is an entire function in $z_1 \in \mathbb{C}$ for fixed $z_2$. Similarly, by the symmetry of the problem, $h((z_1, z_2); F)$ is an entire function in $z_2 \in \mathbb{C}$ for fixed $z_1$. We finally conclude, by Hartogs's Theorem [29], that $h((z_1, z_2); F)$ is entire on $\mathbb{C}^2$. Therefore, it suffices to show that $h((\cdot, z_2); F)$ is entire, for which we use Morera's Theorem.
Morera's Theorem requires that the function under consideration, in this case $h((\cdot, z_2); F)$, be continuous, which holds by Lemma A7. If, for any closed smooth curve $\theta(t)$, defined for $0 \le t \le 1$, and any fixed $z_2 \in \mathbb{C}$,
$$\int_0^1 h((\theta(t), z_2); F)\, \theta'(t)\, dt = \int_0^1 \left[-\int_{\mathbb{R}^2} \kappa_1 e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\, p_{N_2}(y_2 - z_2) \ln p((y_1, y_2); F)\, dy_1\, dy_2\right] \theta'(t)\, dt$$
$$= 0,$$
then, by Morera’s Theorem, h ( ( · , z 2 ) ; F ) is entire. Furthermore, by the Fubini–Tonelli Theorem [18], if
$$\int_0^1 \int_{\mathbb{R}^2} \left|\kappa_1 e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\, p_{N_2}(y_2 - z_2) \ln p((y_1, y_2); F)\right| dy_1\, dy_2\, |\theta'(t)|\, dt < \infty,$$
then the order of integration in (A95) can be interchanged such that integration with respect to t is performed first. Under this condition, since the extension of
$$p_{N_1}(u_1) = \kappa_1 e^{-\frac{u_1^2}{2\sigma_1^2}}$$
to $z_1 \in \mathbb{C}$ is analytic, we obtain
$$\int_0^1 \kappa_1 e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\, p_{N_2}(y_2 - z_2) \ln p((y_1, y_2); F)\, \theta'(t)\, dt = 0,$$
thereby fulfilling the condition for Morera’s Theorem in (A96). It remains only to justify the application of the Fubini–Tonelli theorem by showing (A97).
To upper bound the integrand on the left side of (A97), let α 0 , β 0 > 0 be sufficiently large such that
$$\{\theta(t) \mid 0 \le t \le 1\} \subseteq \{z \in \mathbb{C} \mid \operatorname{Re}\{z\} \in [-\alpha_0, \alpha_0],\ \operatorname{Im}\{z\} \in [-\beta_0, \beta_0]\}.$$
By Lemma A1, there exists $\kappa \ge 0$ such that, for all $y \in \mathbb{R}^2$,
$$|\ln p(y; F)| \le \eta \|y\|^2 + \kappa.$$
We proceed by splitting the integral with respect to y 1 into two intervals:
$$A_1 \triangleq (-\infty, 0), \qquad A_2 \triangleq [0, \infty).$$
For $0 \le t \le 1$, let $\theta(t) = \alpha(t) + i\beta(t)$. Let $y_1 \in A_2$ and note that, since $y_1 \ge 0$,
$$-\frac{1}{2\sigma_1^2}(y_1 - \alpha(t))^2 \le \frac{1}{2\sigma_1^2}\left(-y_1^2 + 2 y_1 \alpha_0 - (\alpha(t))^2\right)$$
$$\le \frac{1}{2\sigma_1^2}\left(-y_1^2 + 2 y_1 \alpha_0\right)$$
$$= -\frac{1}{2\sigma_1^2}(y_1 - \alpha_0)^2 + \frac{1}{2\sigma_1^2}\alpha_0^2.$$
From (A106), we obtain the upper bound
$$\left|e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\right| = \left|e^{-\frac{1}{2\sigma_1^2}\left((y_1 - \alpha(t))^2 - (\beta(t))^2 - 2i (y_1 - \alpha(t))\beta(t)\right)}\right|$$
$$= e^{\frac{1}{2\sigma_1^2}(\beta(t))^2}\, e^{-\frac{1}{2\sigma_1^2}(y_1 - \alpha(t))^2}$$
$$\le e^{\frac{1}{2\sigma_1^2}(\alpha_0^2 + \beta_0^2)}\, e^{-\frac{1}{2\sigma_1^2}(y_1 - \alpha_0)^2}.$$
Therefore, substituting (A105), for any $0 \le t \le 1$,
$$\int_0^\infty \kappa_1 \left|e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}} \ln p((y_1, y_2); F)\right| dy_1 \le \kappa_1 e^{\frac{1}{2\sigma_1^2}(\alpha_0^2 + \beta_0^2)} \int_0^\infty e^{-\frac{1}{2\sigma_1^2}(y_1 - \alpha_0)^2}\left(\eta(y_1^2 + y_2^2) + \kappa\right) dy_1$$
$$= c_0 y_2^2 + c_1,$$
for some constants $0 \le c_0, c_1 < \infty$. Applying similar reasoning when $y_1 \in A_1$ shows that there are constants $0 \le c_2, c_3 < \infty$ for which
$$\int_{-\infty}^0 \kappa_1 \left|e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}} \ln p((y_1, y_2); F)\right| dy_1 \le c_2 y_2^2 + c_3.$$
We now integrate with respect to $y_2$. For any $z_2 = \tau + i r \in \mathbb{C}$ and $0 \le t \le 1$,
$$\psi(\theta(t), z_2) \triangleq \int_{\mathbb{R}^2} \kappa_1 \left|e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\, p_{N_2}(y_2 - z_2) \ln p((y_1, y_2); F)\right| dy_1\, dy_2$$
$$= \sum_{i=1}^2 \int_{\mathbb{R}} \int_{A_i} \kappa_1 \left|e^{-\frac{(y_1 - \theta(t))^2}{2\sigma_1^2}}\right| \kappa_2 \left|e^{-\frac{(y_2 - z_2)^2}{2\sigma_2^2}}\right| \left|\ln p((y_1, y_2); F)\right| dy_1\, dy_2$$
$$\le \kappa_2 e^{\frac{r^2}{2\sigma_2^2}} \int_{\mathbb{R}} e^{-\frac{(y_2 - \tau)^2}{2\sigma_2^2}}\left[c_0 y_2^2 + c_1 + c_2 y_2^2 + c_3\right] dy_2$$
$$< \infty.$$
Therefore, $\psi(\theta(t), z_2)$ is uniformly bounded over $\{\theta(t) \mid 0 \le t \le 1\}$ and
$$\int_0^1 \left|\psi(\theta(t), z_2)\, \theta'(t)\right| dt < \infty. \qquad \square$$
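As a purely numerical illustration of the Morera-type argument (not part of the proof), one can check in a one-dimensional toy setting that the contour integral of the analytically extended marginal entropy density around a closed curve vanishes. The noise variance, input distribution, grid, and contour below are all arbitrary choices for the example.

```python
import numpy as np

sigma = 1.0
y = np.linspace(-15, 15, 6001)
dy = y[1] - y[0]

# An illustrative finitely supported input distribution F.
pts, wts = np.array([-1.0, 2.0]), np.array([0.4, 0.6])
p_out = (wts[:, None]
         * np.exp(-(y - pts[:, None]) ** 2 / (2 * sigma**2))
         / np.sqrt(2 * np.pi * sigma**2)).sum(axis=0)

def h_ext(z):
    """Analytic extension: minus the integral of p_N(y - z) ln p(y; F) dy, complex z."""
    pNz = np.exp(-(y - z) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    return -np.sum(pNz * np.log(p_out)) * dy

# A closed contour: the unit circle centred at 0.3 + 0.2i.
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
theta = 0.3 + 0.2j + np.exp(1j * t)
dtheta = 1j * np.exp(1j * t) * (t[1] - t[0])   # theta'(t) dt

contour_integral = np.sum(np.array([h_ext(c) for c in theta]) * dtheta)
print(abs(contour_integral))                    # should be numerically ~ 0
```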

Appendix F. Polynomial Bounds

Lemma A9. 
The function
$$f(L) = \min_{\|y\| = L} \sum_{i=1}^n y_i^{2k}$$
satisfies $f(L) \ge \dfrac{L^{2k}}{n^k}$.
Proof. 
First, note that, for any $L \ge 0$ and $y \in \mathbb{R}^n$ satisfying $L = \|y\|$, we have
$$\max_{i \in \{1, \ldots, n\}} y_i^2 \ge \frac{L^2}{n}.$$
Then, for any $L \ge 0$, we obtain the lower bound
$$f(L) = \min_{\|y\| = L} \sum_{i=1}^n y_i^{2k} \ge \min_{\|y\| = L}\, \max_{i \in \{1, \ldots, n\}} y_i^{2k} \ge \frac{L^{2k}}{n^k}. \qquad \square$$
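Lemma A9 can be spot-checked by sampling points on the sphere of radius $L$; the values of $n$, $k$, and $L$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, L = 4, 3, 2.5                          # illustrative dimension, moment order, radius

for _ in range(10_000):
    y = rng.normal(size=n)
    y *= L / np.linalg.norm(y)               # project onto the sphere ||y|| = L
    assert np.sum(y ** (2 * k)) >= L ** (2 * k) / n ** k - 1e-9
print("f(L) >= L^{2k} / n^k held on all sampled points")
```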
Lemma A10. 
For any $l \in \mathbb{Z}_{\ge 0}^n$, the function
$$\xi_l(L) = \max_{\|y\| = L} |y^l|$$
satisfies $\xi_l(L) \le L^{d(l)}$, where $d(l) = l_1 + \cdots + l_n$.
Proof. 
For $\|y\| = L$,
$$\xi_l(L) = \max_{\|y\| = L} |y^l| = \max_{\|y\| = L} \left|y_1^{l_1} \cdots y_n^{l_n}\right| \le \max_{\|y\| = L} L^{l_1} \cdots L^{l_n} = L^{d(l)}. \qquad \square$$
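Lemma A10 admits a similar spot check; the multi-index $l$ and radius $L$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, L = 3, 1.7
l = np.array([2, 0, 3])                      # an illustrative multi-index in Z_{>=0}^n
d = int(l.sum())                             # d(l) = l_1 + ... + l_n

for _ in range(10_000):
    y = rng.normal(size=n)
    y *= L / np.linalg.norm(y)               # a point with ||y|| = L
    assert abs(np.prod(y ** l)) <= L ** d + 1e-9
print("|y^l| <= L^{d(l)} held on all sampled points")
```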

References

  1. Alireza Banani, S.; Vaughan, R.G. Compensating for Non-Linear Amplifiers in MIMO Communications Systems. IEEE Trans. Antennas Propag. 2012, 60, 700–714.
  2. Liang, C.P.; Jong, J.H.; Stark, W.E.; East, J.R. Nonlinear amplifier effects in communications systems. IEEE Trans. Microw. Theory Tech. 1999, 47, 1461–1466.
  3. Raich, R.; Zhou, G.T. On the modeling of memory nonlinear effects of power amplifiers for communication applications. In Proceedings of the 2002 IEEE 10th Digital Signal Processing Workshop, 2002 and the 2nd Signal Processing Education Workshop, Pine Mountain, GA, USA, 16 October 2002; pp. 7–10.
  4. Smith, J.G. The information capacity of amplitude- and variance-constrained scalar Gaussian channels. Inf. Control 1971, 18, 203–219.
  5. Rassouli, B.; Clerckx, B. On the Capacity of Vector Gaussian Channels With Bounded Inputs. IEEE Trans. Inf. Theory 2016, 62, 6884–6903.
  6. Shamai, S.; Bar-David, I. The capacity of average and peak-power-limited quadrature Gaussian channels. IEEE Trans. Inf. Theory 1995, 41, 1060–1071.
  7. Dytso, A.; Al, M.; Poor, H.V.; Shamai Shitz, S. On the Capacity of the Peak Power Constrained Vector Gaussian Channel: An Estimation Theoretic Perspective. IEEE Trans. Inf. Theory 2019, 65, 3907–3921.
  8. Dytso, A.; Yagli, S.; Poor, H.V.; Shamai Shitz, S. The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points. IEEE Trans. Inf. Theory 2020, 66, 2006–2022.
  9. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  10. Sommerfeld, J.; Bjelakovic, I.; Boche, H. On the boundedness of the support of optimal input measures for Rayleigh fading channels. In Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, ON, Canada, 6–11 July 2008; pp. 1208–1212.
  11. Dytso, A.; Goldenbaum, M.; Shamai, S.; Poor, H.V. Upper and Lower Bounds on the Capacity of Amplitude-Constrained MIMO Channels. In Proceedings of the GLOBECOM 2017—2017 IEEE Global Communications Conference, Singapore, 4–8 December 2017; pp. 1–6.
  12. Chan, T.H.; Hranilovic, S.; Kschischang, F.R. Capacity-achieving probability measure for conditionally Gaussian channels with bounded inputs. IEEE Trans. Inf. Theory 2005, 51, 2073–2088.
  13. Dytso, A.; Goldenbaum, M.; Poor, H.V.; Shamai (Shitz), S. Amplitude Constrained MIMO Channels: Properties of Optimal Input Distributions and Bounds on the Capacity. Entropy 2019, 21, 200.
  14. Fahs, J.J.; Abou-Faycal, I.C. Using Hermite Bases in Studying Capacity-Achieving Distributions Over AWGN Channels. IEEE Trans. Inf. Theory 2012, 58, 5302–5322.
  15. Fahs, J.; Tchamkerten, A.; Yousefi, M.I. On the Optimal Input of the Nondispersive Optical Fiber. In Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT), Paris, France, 7–12 July 2019; pp. 131–135.
  16. Bhaskara Rao, K.; Bhaskara Rao, M. Theory of Charges; Pure and Applied Mathematics; Academic Press: New York, NY, USA, 1983.
  17. Han, T.S. Information-Spectrum Methods in Information Theory; Springer: Berlin/Heidelberg, Germany, 2003.
  18. Dudley, R.M. Integration. In Real Analysis and Probability, 2nd ed.; Cambridge Studies in Advanced Mathematics; Cambridge University Press: Cambridge, UK, 2002; pp. 114–151.
  19. Rose, K. A mapping approach to rate-distortion computation and analysis. IEEE Trans. Inf. Theory 1994, 40, 1939–1952.
  20. Johnston, W. The Weighted Hermite Polynomials Form a Basis for L2(R). Am. Math. Mon. 2014, 121, 249–253.
  21. Reed, M.; Simon, B. II—Hilbert Spaces. In Methods of Modern Mathematical Physics; Reed, M., Simon, B., Eds.; Academic Press: New York, NY, USA, 1972; pp. 36–66.
  22. Alajaji, F.; Chen, P. An Introduction to Single-User Information Theory; Springer Undergraduate Texts in Mathematics and Technology; Springer: Singapore, 2018.
  23. Sitaram, A.; Sundari, M.; Thangavelu, S. Uncertainty principles on certain Lie groups. Proc. Indian Acad. Sci. Math. Sci. 1995, 105, 135–151.
  24. Krantz, S.; Parks, H. A Primer of Real Analytic Functions; Advanced Texts Series; Birkhäuser: Boston, MA, USA, 2002.
  25. Shiryaev, A.; Chibisov, D. Probability-1; Graduate Texts in Mathematics; Springer: New York, NY, USA, 2016.
  26. Dupuis, P.; Ellis, R. A Weak Convergence Approach to the Theory of Large Deviations; Wiley Series in Probability and Statistics; Wiley: New York, NY, USA, 1997.
  27. Billingsley, P. Convergence of Probability Measures; Wiley Series in Probability and Statistics; Wiley: New York, NY, USA, 2013.
  28. Ghourchian, H.; Gohari, A.; Amini, A. Existence and Continuity of Differential Entropy for a Class of Distributions. IEEE Commun. Lett. 2017, 21, 1469–1472.
  29. Gunning, R.; Rossi, H. Analytic Functions of Several Complex Variables; AMS Chelsea Publishing: Englewood Cliffs, NJ, USA, 2009.