Article

Source Coding with a Causal Helper

Faculty of Engineering, Bar-Ilan University, Ramat-Gan 52900, Israel
Information 2020, 11(12), 553; https://doi.org/10.3390/info11120553
Submission received: 9 November 2020 / Revised: 23 November 2020 / Accepted: 25 November 2020 / Published: 27 November 2020
(This article belongs to the Special Issue Statistical Communication and Information Theory)

Abstract

A multi-terminal network is considered in which an encoder, assisted by a side-information-aided helper, describes a memoryless identically distributed source to a receiver. The encoder provides a non-causal one-shot description of the source to both the helper and the receiver. The helper, which has access to causal side information, describes the source to the receiver sequentially by sending a sequence of causal descriptions that depend on the message conveyed by the encoder and on the side-information subsequence it has observed so far. The receiver reconstructs the source causally, producing at each time unit an estimate of the current source symbol based on everything it has received so far. Given a reconstruction fidelity measure and a maximal allowed distortion, we derive the rates-distortion region for this setting and express it in terms of an auxiliary random variable. When the source and side information are drawn from an independent identically distributed Gaussian law and the fidelity measure is the squared-error distortion, we show that, for the evaluation of the rates-distortion region, it suffices to choose the auxiliary random variable to be jointly Gaussian with the source and side-information pair.

1. Introduction

In the classical source coding with decoder side information problem, the source and side information are generated by independent drawings $(X_k, Y_k)$ of the pair $(X, Y) \sim P_{XY}$. The encoder forms a description of the source sequence $X^n = (X_1, \ldots, X_n)$ using a map $f^{(n)}: \mathcal{X}^n \to \{1, \ldots, 2^{nR}\}$, while the decoder forms its reconstruction $\hat{X}^n$ based on both the side-information sequence $Y^n$ and the index $T \in \{1, \ldots, 2^{nR}\}$ conveyed by the encoder. In their seminal work [1], Wyner and Ziv derived the rate-distortion function for this setting, given a fidelity measure, when $\hat{X}^n$ may depend on $Y^n$ in an arbitrary manner. However, Wyner–Ziv source coding with non-causal decoder side information involves binning, the implementation of which is complex.
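As background, the Wyner–Ziv rate-distortion function of [1] is usually recalled in the following single-letter form (stated here only for orientation; the auxiliary variable $U$ and map $g$ below are the standard ones from [1], not quantities defined later in this paper):
$$ R_{\mathrm{WZ}}(D) \;=\; \min_{\substack{P_{U|X}:\ U - X - Y \\ \exists\, g:\ \mathbb{E}[d(X, g(U,Y))] \le D}} I(X; U \mid Y). $$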
Successive refinement for the Wyner–Ziv problem with side information $(Y, Z)$ is a variant of the Wyner–Ziv model in which the encoder provides a two-layer description of the source sequence $X^n$ to a pair of decoders. Decoder 1, which obtains just the coarse description layer, has available as non-causal side information the memoryless vector $Z^n$, while Decoder 2, which obtains both description layers, has available as non-causal side information the memoryless vector $Y^n$. It is further assumed that the reconstruction formed by Decoder 2 should incur a smaller distortion than that formed by Decoder 1. Such a model was considered in [2], wherein a complete characterization of the rates-distortion region was obtained for the case where $Z$ is a stochastically degraded version of $Y$ with respect to $X$, i.e., $X - Y - Z$ forms a Markov chain.
The work [3] studies an extension of the model of [2] in which a conference link of given capacity allows unidirectional cooperation between Decoder 1 and Decoder 2, i.e., Decoder 1 also functions as a helper. The results in [3] are only partially tight: the characterization of the encoder rates is conclusive, while the remaining gap lies in the characterization of the helper's rate. Thus, with non-causal side information at both decoders, the successive refinement for the Wyner–Ziv problem with a helper remains unsolved.
Motivated by practical delay-constrained sequential source coding with decoder side information, Weissman and El Gamal considered in [4] a scheme with causal side information at the decoder, in which the sequence of reconstructions $\hat{X}^n = (\hat{X}_1, \ldots, \hat{X}_n)$ is formed sequentially in a causal manner according to $\hat{X}_k = \hat{X}_k(T, Y^k)$, and derived the corresponding rate-distortion function. As in [1], the rate-distortion function in [4] is expressed in terms of an auxiliary random variable, leaving its optimal choice an open issue that depends on the specifics of the model. For the Gaussian setting, where $(X, Y)$ are jointly Gaussian, the authors of [4] compute an upper bound on the rate-distortion function by choosing the auxiliary random variable to be jointly Gaussian with $X$, while leaving the question of whether this choice is optimal open.
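For comparison with the Wyner–Ziv expression recalled above, the causal-side-information rate-distortion function of [4] takes the following form (again recalled from the literature only for orientation; see [4] for the precise statement). Note the absence of the subtracted term $I(U;Y)$, which is what makes binning unnecessary:
$$ R_{\mathrm{causal}}(D) \;=\; \min_{\substack{P_{U|X}:\ U - X - Y \\ \exists\, g:\ \mathbb{E}[d(X, g(U,Y))] \le D}} I(X; U). $$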
With the vision that modern network design will support the use of cooperation links in favor of reducing encoding/decoding complexity and easing network deployment constraints, this work considers an extension of the model of [4], involving a causal helper and causal side information at the decoder, described as follows. The components of a trivariate independent identically distributed (IID) finite-alphabet source $\{X_k, Y_k, Z_k\}_{k=1}^{\infty}$ are observed by three terminals. The source component $\{X_k\}_{k=1}^{\infty}$ is observed by the encoder, while the source component $\{Y_k\}_{k=1}^{\infty}$ is observed by the helper. Both the encoder and the helper describe the length-$n$ source sequence $X^n$ to the decoder, up to a given fidelity, in two steps. First, the encoder, when given $X^n$, sends a rate-$R$ description of it to both the helper and the decoder. Then, per each source symbol $Y_k$, the helper sends to the decoder a causal description depending on the message it has received from the encoder and the source subsequence $Y^k$ it has observed so far, with the aggregate rate not exceeding $R_h$. The decoder, which observes the source component $\{Z_k\}_{k=1}^{\infty}$ in a causal manner, uses, per channel use $k$, the descriptions it has received so far and the source subsequence $Z^k$ to form its reconstruction $\hat{X}_k$ of the source symbol $X_k$. Given a fidelity measure and a maximal allowed distortion, the goal is to determine the set of all rate pairs $(R, R_h)$ that satisfy the distortion constraint.
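The timing of the three terminals can be made concrete with a toy sketch. The following Python snippet is an illustration only: the particular maps enc, help_k, and dec_k are arbitrary placeholder choices, not the coding scheme analyzed in this paper. It merely enforces the causal information pattern described above.

import numpy as np

rng = np.random.default_rng(0)
n = 8

# IID trivariate source (correlated Gaussians, chosen only for illustration).
X = rng.standard_normal(n)
Y = 0.8 * X + rng.standard_normal(n)             # helper observation
Z = 0.5 * X + 0.3 * Y + rng.standard_normal(n)   # decoder side information

def enc(x):
    """One-shot encoder f^(n): here a crude 1-bit-per-symbol sign description."""
    return tuple(int(v > 0) for v in x)

def help_k(T, y_upto_k):
    """Causal helper description f_{1,k}: a function of T and Y^k only."""
    return int(y_upto_k[-1] > 0)

def dec_k(T, V_upto_k, z_upto_k):
    """Causal decoder g_k: a function of T, V^k, and Z^k only."""
    k = len(z_upto_k)
    s_enc = 1.0 if T[k - 1] else -1.0
    s_hlp = 1.0 if V_upto_k[-1] else -1.0
    return 0.5 * s_enc + 0.2 * s_hlp + 0.2 * z_upto_k[-1]   # placeholder combiner

T = enc(X)                        # step 1: non-causal one-shot description
V, Xhat = [], []
for k in range(1, n + 1):         # step 2: causal helper, then causal decoder
    V.append(help_k(T, Y[:k]))
    Xhat.append(dec_k(T, V, Z[:k]))
print("empirical squared-error distortion:", float(np.mean((X - np.array(Xhat)) ** 2)))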
Causal decoder side information has also been considered in the context of successive refinement. With the aim of reducing encoder/decoder complexity, a two-layer description model with successive refinement was considered in [5], under the setting that the side information is available causally at each of the decoders. A single-letter characterization of the rates-distortion region is obtained in [5], irrespective of the relative ordering of the side-information quality at the decoders. Furthermore, the direct part of [5] demonstrates that, similarly to [4], with causal side information at the decoders the optimal code avoids binning, and hence its implementation is practically appealing. The extension of the model of [5] with a causal helper has recently been studied in [6].

2. Problem Formulation

Formally, our problem can be stated as follows. A discrete memoryless source (DMS) $(\mathcal{X}, P_X)$ is an infinite sequence $\{X_i\}_{i=1}^{\infty}$ of independent copies of a random variable $X$ taking values in the finite set $\mathcal{X}$ with generic law $P_X$. Similarly, a triple source $(\mathcal{X}\mathcal{Y}\mathcal{Z}, P_{XYZ})$ is an infinite sequence of independent copies of the triplet of random variables $(X, Y, Z)$, taking values in the finite sets $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$, respectively, with generic joint law $P_{XYZ}$. Since our goal is to encode the source $X$, let $\hat{\mathcal{X}}$ denote any finite reconstruction alphabet and let $d: \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ be a single-letter distortion measure. The vector distortion measure is defined as
$$ d(\mathbf{x}, \hat{\mathbf{x}}) = \frac{1}{n} \sum_{i=1}^{n} d(x_i, \hat{x}_i), \qquad \mathbf{x} \in \mathcal{X}^n,\ \hat{\mathbf{x}} \in \hat{\mathcal{X}}^n. $$
A system diagram appears in Figure 1.
Definition 1.
An $\big(n, M^{(n)}, \prod_{k=1}^{n} L_k^{(n)}, D\big)$ code for the source $X$ with causal side information (SI) $(Y, Z)$ and a causal helper consists of:
1. 
An encoder mapping
$$ f^{(n)}: \mathcal{X}^n \to \{1, \ldots, M^{(n)}\}. \qquad \text{(1)} $$
2. 
A unidirectional conference between the helper and the decoder consisting of a sequence of causal descriptions
$$ f_{1,k}^{(n)}: \{1, \ldots, M^{(n)}\} \times \mathcal{Y}^k \to \{1, \ldots, L_k^{(n)}\}, \qquad k = 1, \ldots, n, $$
and
3. 
A sequence $g_1^{(n)}, \ldots, g_n^{(n)}$ of decoder reconstructions
$$ g_k^{(n)}: \{1, \ldots, M^{(n)}\} \times \{1, \ldots, L_1^{(n)}\} \times \cdots \times \{1, \ldots, L_k^{(n)}\} \times \mathcal{Z}^k \to \hat{\mathcal{X}}, \qquad k = 1, \ldots, n, $$
such that
$$ \mathbb{E}\, d\Big[ X^n,\ \Big( g_1^{(n)}\big( f^{(n)}(X^n),\, f_{1,1}^{(n)}(f^{(n)}(X^n), Y_1),\, Z_1 \big), \ldots, g_k^{(n)}\big( f^{(n)}(X^n),\, \{ f_{1,j}^{(n)}(f^{(n)}(X^n), Y^j) \}_{j=1}^{k},\, Z^k \big), \ldots, g_n^{(n)}\big( f^{(n)}(X^n),\, \{ f_{1,j}^{(n)}(f^{(n)}(X^n), Y^j) \}_{j=1}^{n},\, Z^n \big) \Big) \Big] \le D. \qquad \text{(4)} $$
The rate tuple $(R, R_h)$ of the code is
$$ R = \frac{1}{n} \log M^{(n)}, \qquad R_h = \frac{1}{n} \sum_{k=1}^{n} \log L_k^{(n)}. $$
Given a non-negative distortion $D$, the tuple $(R, R_h)$ is said to be $D$-achievable for $X$ with causal SI $(Y, Z)$ if, for any $\delta > 0$, $\epsilon > 0$, and sufficiently large $n$, there exists an $\big(n, \exp[n(R + \delta)], \exp[n(R_h + \delta)], D + \epsilon\big)$ code for the source $X$ with causal SI $(Y, Z)$ and a causal helper. The collection of all $D$-achievable rate tuples is the achievable source-coding region and is denoted by $\mathcal{R}(D)$.
In this work, we provide a single-letter characterization of $\mathcal{R}(D)$. In contrast to [5], a consequence of Definition 1 is that $\mathcal{R}(D)$ may depend on the joint law $P_{XYZ}$ of the triple, and not only on the marginal laws $P_{XY}$ and $P_{XZ}$. This is due to the fact that the decoder acquires, in addition to its private side information $Z$, some additional side information on $Y$ via the conference link. As a result, the expectation in (4), which takes into account the mappings $g_k^{(n)}(\cdot)$, is taken over the joint law $P_{XYZ}$ and not just over the marginal law $P_{XZ}$, as is the case in [5].
Finally, although not of finite alphabet, of particular interest to us is the Gaussian source. This is a memoryless source where $(X, Y, Z)$ are centered jointly Gaussian, with each pair $(X_i, Y_i)$ drawn such that $P_{XY}$ satisfies
$$ Y_i = \rho X_i + W_i, \qquad \text{(6)} $$
where $\rho > 0$ is a fixed constant and $X_i \sim \mathcal{N}(0, 1)$ and $W_i \sim \mathcal{N}(0, 1)$ are mutually independent. Moreover, $Z_i$ is drawn according to
$$ Z_i = a X_i + b Y_i + T_i, \qquad \text{(7)} $$
where $a$ and $b$ are real numbers and $T_i \sim \mathcal{N}(0, 1)$ is independent of $(X_i, Y_i)$. Furthermore, in this case, the fidelity measure will be
$$ \mathbb{E}\, d\big[ X^n, \hat{X}^n \big] \triangleq \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big( X_i - \hat{X}_i \big)^2, $$
in which case we may restrict the reproduction functions $g_k^{(n)}$ to be the MMSE estimates of $X_k$, $k = 1, \ldots, n$, given $f^{(n)}(X^n)$, the sequence of causal descriptions $\{ f_{1,j}^{(n)}(f^{(n)}(X^n), Y^j) \}_{j=1}^{k}$, and the side information $Z^k$. That is,
$$ \hat{X}_k = \mathbb{E}\Big[ X_k \,\Big|\, f^{(n)}(X^n),\ \{ f_{1,j}^{(n)}(f^{(n)}(X^n), Y^j) \}_{j=1}^{k},\ Z^k \Big]. $$
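As a numerical sanity check of the Gaussian model and the squared-error/MMSE fidelity above, the following sketch estimates $X$ by linear MMSE from the side information $Z$ alone; this is an illustration only (it ignores the encoder and helper messages entirely and is not the coding scheme analyzed in this paper). In the jointly Gaussian case the linear MMSE estimate coincides with the conditional expectation.

import numpy as np

rng = np.random.default_rng(1)
n, rho, a, b = 100_000, 0.9, 0.5, 0.3

X = rng.standard_normal(n)
W = rng.standard_normal(n)
T = rng.standard_normal(n)
Y = rho * X + W                       # helper observation  Y = rho*X + W
Z = a * X + b * Y + T                 # decoder side information  Z = a*X + b*Y + T

# Linear MMSE estimate of X from Z (equals E[X | Z] for jointly Gaussian pairs):
cov_xz = a + b * rho                                  # E[X Z], with Var(X) = 1
var_z = a**2 + b**2 * (rho**2 + 1) + 1 + 2 * a * b * rho
Xhat = (cov_xz / var_z) * Z

print("empirical MSE :", float(np.mean((X - Xhat) ** 2)))
print("theoretical   :", 1 - cov_xz**2 / var_z)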
In the Gaussian network setting, our focus will be on determining the optimal choice of the auxiliary random variable by means of which the rates-distortion region is defined. Specifically, we will show that choosing it to be jointly Gaussian with X is optimal.

3. Main Results

Given a maximal allowed distortion $D$, define $\mathcal{R}^*(D)$ to be the set of all rate pairs $(R, R_h)$ for which there exist random variables $(U, V)$, taking values in finite alphabets $\mathcal{U}$, $\mathcal{V}$, respectively, such that the following hold (a toy numerical evaluation of these quantities is sketched after the list):
  • $U - X - (Y, Z)$ forms a Markov chain.
  • Conditioned on $U$, $X - Y - V$ forms a Markov chain.
  • There exist deterministic maps
    $$ \tilde{f}_1: \mathcal{U} \times \mathcal{Y} \to \mathcal{V}, \qquad g: \mathcal{U} \times \mathcal{V} \times \mathcal{Z} \to \hat{\mathcal{X}} $$
    such that, with $V \triangleq \tilde{f}_1(U, Y)$,
    $$ \mathbb{E}\, d\big( X, g(U, V, Z) \big) \le D. \qquad \text{(10)} $$
  • The alphabets $\mathcal{U}$, $\mathcal{V}$ satisfy
    $$ |\mathcal{U}| \le |\mathcal{X}| + 2, \qquad |\mathcal{V}| \le |\mathcal{X}|\,(|\mathcal{X}| + 2) + 1. $$
  • The rates $R$ and $R_h$ satisfy
    $$ R \ge I(X; U), \qquad \text{(12a)} $$
    $$ R_h \ge H(V \mid U) = I(Y; V \mid U). \qquad \text{(12b)} $$
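To make these single-letter quantities concrete, the following sketch evaluates $I(X;U)$, $H(V \mid U)$, and the expected distortion for a binary symmetric toy example with Hamming distortion and no decoder side information $Z$. The test channel $P_{U|X}$ and the maps f1 and g below are arbitrary illustrative choices satisfying the Markov constraints, not the optimizing choices.

import numpy as np
from itertools import product

p, q = 0.1, 0.25            # Y = X xor Bern(p); test channel U = X xor Bern(q)

# Joint pmf P(x, y, u) of the binary toy example.
P = np.zeros((2, 2, 2))
for x, y, u in product(range(2), repeat=3):
    P[x, y, u] = 0.5 * (p if y != x else 1 - p) * (q if u != x else 1 - q)

def H(pmf):
    pmf = pmf[pmf > 0]
    return -np.sum(pmf * np.log2(pmf))

# Deterministic helper and decoder maps (arbitrary illustrative choices).
f1 = lambda u, y: y           # V = f1(U, Y): forward the side-information bit
g  = lambda u, v: v           # reconstruction ignores U in this toy choice

# R >= I(X; U) = H(X) + H(U) - H(X, U)
Pxu = P.sum(axis=1)
R = H(Pxu.sum(axis=1)) + H(Pxu.sum(axis=0)) - H(Pxu.ravel())

# R_h >= H(V | U); here V = Y, so H(V | U) = H(Y, U) - H(U)
Pyu = P.sum(axis=0)
Rh = H(Pyu.ravel()) - H(Pyu.sum(axis=0))

# Expected Hamming distortion E d(X, g(U, V))
D = sum(P[x, y, u] for x, y, u in product(range(2), repeat=3)
        if g(u, f1(u, y)) != x)

print(f"R >= {R:.3f} bits, R_h >= {Rh:.3f} bits, distortion = {D:.3f}")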
Our first result is a single-letter characterization of the rates-distortion region.
Theorem 1.
$$ \mathcal{R}(D) = \mathcal{R}^{*}(D). $$
Proof. 
See Section 4.1. □
Remark 1.
The converse holds as well for the setting where the causal helper and the reconstructor benefit from causal disclosure, i.e., are cognizant of the past realizations of the source sequence, and hence are allowed to depend also on $X^{k-1}$ when forming $f_{1,k}^{(n)}$ and $\hat{X}_k$, respectively. That is,
$$ f_{1,k}^{(n)} = f_{1,k}^{(n)}\big( f^{(n)}(X^n), Y^k, X^{k-1} \big), \qquad \hat{X}_k = g_k^{(n)}\Big( f^{(n)}(X^n),\ \big\{ f_{1,j}^{(n)}( f^{(n)}(X^n), Y^j, X^{j-1} ) \big\}_{j=1}^{k},\ Z^k, X^{k-1} \Big). $$

3.1. The Gaussian Setting with Z = ∅

To simplify the presentation, we consider first the Gaussian setting with $Z = \emptyset$. In this case, the region $\mathcal{R}(D)$ is defined as the set of rate pairs $(R, R_h)$ for which there exist real-valued random variables $(U, V)$ such that
  • $U - X - Y$ forms a Markov chain.
  • Conditioned on $U$, $X - Y - V$ forms a Markov chain.
  • There exist deterministic maps
    $$ \tilde{f}_1: \mathcal{U} \times \mathcal{Y} \to \mathcal{V}, \qquad g: \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}} $$
    such that, with $V \triangleq \tilde{f}_1(U, Y)$,
    $$ D \ge \mathbb{E}\big( X - g(U, V) \big)^2 \ge \sigma^2_{X \mid U V}. \qquad \text{(15)} $$
  • The rates $R$ and $R_h$ satisfy (12a) and (12b), respectively.
Our second result characterizes the optimal choice of $P_{XU}$ in the Gaussian setting.
Theorem 2.
For the evaluation of $\mathcal{R}(D)$ when $(X, Y)$ are Gaussian, it suffices to assume that $(X, U)$ are jointly Gaussian.
Proof. 
For the treatment to follow, let us define the Gaussian channel $Y = \varrho X + W$, where $X$ has an arbitrary law with $\mathbb{E}[X^2] \le 1$, $W \sim \mathcal{N}(0, 1)$ is independent of $X$, and $\varrho > 0$ is the channel signal-to-noise ratio. We shall denote by $G_{Y|X}(\varrho)$ the conditional law of $Y$ given $X$ for this additive Gaussian model. Furthermore, the notation $(U_n, X_n, Y_n) \sim P_{U_n X_n}\, G_{Y_n|X_n}(\varrho)$ implies that $(U_n, X_n) \sim P_{U_n X_n}$ and $Y_n = \varrho X_n + W_n$ with $W_n \sim \mathcal{N}(0, 1)$ independent of $(U_n, X_n)$. Using this notation, the law $P_{UXYV}$ defining $\mathcal{R}(D)$ (see also (45) ahead) may equivalently be expressed as
$$ P_{XUYV} = P_{XU}\; G_{Y|X}(\varrho)\; P_{V|YU}. \qquad \text{(16)} $$
Henceforth, we denote a law which factors as in (16) by $X - Y - V \mid U$, i.e., conditioned on $U$, $X - Y - V$ forms a Markov chain and, furthermore, $P_{Y|X} = G_{Y|X}(\varrho)$ independently of the rest.
Define the region given by (12a) and (12b) subject to constraint (15) as $\mathcal{O}_K$, where the subscript $K$ denotes the covariance constraint (15). The region $\mathcal{O}_K$ is a closed convex set.
In line with [7,8], we define a λ -parametrized family of functions which are related to the sum rate associated with R ( D ) .
Fix some $\lambda > 1$ and consider the minimization of the $\lambda$-sum rate defined as
$$ \min_{(R, R_h) \in \mathcal{O}_K} R + \lambda R_h. \qquad \text{(17)} $$
Observe that
$$ \min_{(R, R_h) \in \mathcal{O}_K} R + \lambda R_h \;\overset{(a)}{\ge}\; \inf_{\substack{P_{XUYV}:\ X - Y - V \mid U \\ \sigma^2_{X \mid U V} \le D}} I(X; U) + \lambda\, I(Y; V \mid U), \qquad \text{(18)} $$
where $(a)$ follows from the lower bounds (12a) and (12b). Since the marginal law of $X$ in our model (6) is Gaussian, the differential entropy $h(X)$ is fixed; thus, for the minimization of (18) over a law of the form (16), we define the following functional of $P_{XU}$ (i.e., of the conditional law $P_{X|U}$):
$$ s_\lambda(X, \varrho \mid U) \triangleq -h(X \mid U) + \inf_{\substack{V:\ X - Y - V \mid U \\ \sigma^2_{X \mid U V} \le D}} \lambda\, I(Y; V \mid U). \qquad \text{(19)} $$
Thus, the minimum in (17) may be expressed as
$$ V_\lambda(\varrho) \triangleq \inf_{P_{XU}:\ \mathbb{E}[X^2] \le 1} s_\lambda(X, \varrho \mid U). \qquad \text{(20)} $$
Henceforth, the set of laws $P_{UXYV}$ defined by (16) which satisfy $\sigma^2_{X \mid U V} \le D$ will be denoted by $\mathcal{Q}$ and will be referred to as the feasible set.
As shown below, with a proper choice of $\lambda$, the functional (19) exhibits a "pair grouping" property with respect to the input $X$, in the sense that the value of $s_\lambda$ does not increase under this operation. Having established that, we follow the same steps as in the proof of ([9] Theorem 9) to establish that the objective (20) is attained when $P_{X|U}$ is Gaussian. More specifically,
  • Lemma 1 shows that the value of s λ “improves” under the pair grouping operation.
  • With the proper time-sharing of two distributions attaining the infimum in (20) and satisfying the extremal property defined in (23) ahead, Lemma 2 proves that the pair grouping operation exhibits Gaussianity in the sense of Bernstein’s characterization.
Lemma 1.
Let $P_{XU}$ be a law on $\mathcal{X} \times \mathcal{U}$, let $P_{UXY} = P_{XU}\, G_{Y|X}(\varrho)$, and let $(U_1, X_1, Y_1)$ and $(U_2, X_2, Y_2)$ be two independent copies of $(U, X, Y)$. Define
$$ X^{+} \triangleq \frac{X_1 + X_2}{\sqrt{2}}, \qquad X^{-} \triangleq \frac{X_1 - X_2}{\sqrt{2}}, \qquad \text{(21)} $$
and similarly
$$ Y^{+} \triangleq \frac{Y_1 + Y_2}{\sqrt{2}}, \qquad Y^{-} \triangleq \frac{Y_1 - Y_2}{\sqrt{2}}. $$
With $\mathbf{U} \triangleq (U_1, U_2)$, there exists some $\lambda^{*} > 1$ such that, for any $\lambda \ge \lambda^{*}$, the following inequalities hold:
$$ 2\, s_\lambda(X, \varrho \mid U) \ge s_\lambda(X^{+}, \varrho \mid X^{-}, \mathbf{U}) + s_\lambda(X^{-}, \varrho \mid Y^{+}, \mathbf{U}), \qquad \text{(22a)} $$
$$ 2\, s_\lambda(X, \varrho \mid U) \ge s_\lambda(X^{+}, \varrho \mid Y^{-}, \mathbf{U}) + s_\lambda(X^{-}, \varrho \mid X^{+}, \mathbf{U}). \qquad \text{(22b)} $$
Proof. 
See Section 4.2. □
Assume that the infimum in (20) is attained, and let $\mathcal{P}$ denote the subset of laws $P_{XU}$ achieving the minimum. Suppose further that there exists a law $P_{XU} \in \mathcal{P}$ such that, for $(Y, X, U) \sim P_{XU}\, G_{Y|X}(\varrho)$ and for any other $P_{X'U'} \in \mathcal{P}$ with $(Y', X', U') \sim P_{X'U'}\, G_{Y|X}(\varrho)$,
$$ h(Y \mid U) - h(X \mid U) \le h(Y' \mid U') - h(X' \mid U'). \qquad \text{(23)} $$
Denote the value of the LHS of (23) by $g^{*}(\varrho)$.
Lemma 2.
Fix $\epsilon > 0$. Let $P_{XU}$ be an admissible law such that
$$ s_\lambda(X, \varrho \mid U) \le V_\lambda(\varrho) + \epsilon, \qquad \text{(24a)} $$
$$ h(Y \mid U) - h(X \mid U) \le g^{*}(\varrho) + \epsilon. \qquad \text{(24b)} $$
Then there exists a law $(\tilde{Y}, \tilde{X}, \tilde{U}) \sim P_{\tilde{X}\tilde{U}}\, G_{Y|X}(\varrho)$ satisfying
$$ s_\lambda(\tilde{X}, \varrho \mid \tilde{U}) \le V_\lambda(\varrho) + 2\epsilon \qquad \text{(25)} $$
and
$$ h(\tilde{Y} \mid \tilde{U}) - h(\tilde{X} \mid \tilde{U}) + \tfrac{1}{2}\, I(X_1 + X_2;\ X_1 - X_2 \mid Y_1, Y_2, U_1, U_2) \le g^{*}(\varrho) + \epsilon, \qquad \text{(26)} $$
where $(U_1, X_1, Y_1)$ and $(U_2, X_2, Y_2)$ denote independent copies of $(U, X, Y)$.
Proof. 
See Section 4.3. □
Inequality (26) combined with assumption (23) and Lemma 6 in ([9] Section VI.B) establish the following.
Lemma 3.
There exists a sequence $\{(X_n, U_n)\}$ such that, for each $n \ge 1$, $(X_n, U_n)$ is feasible,
$$ \lim_{n \to \infty} s_\lambda(X_n, \varrho \mid U_n) = V_\lambda(\varrho), \qquad \text{(27)} $$
and there exists a feasible law $P_{X_* U_*}$ on $\mathbb{R} \times \mathcal{U}$ such that
$$ (X_n, U_n) \xrightarrow{\ \mathcal{D}\ } (X_*, U_*) \sim P_{X_* U_*}, $$
and, with $\mathbf{U}_n \triangleq (U_{1,n}, U_{2,n})$ and $\mathbf{Y}_n \triangleq (Y_{1,n}, Y_{2,n})$, where the tuples $(U_{1,n}, X_{1,n}, Y_{1,n})$ and $(U_{2,n}, X_{2,n}, Y_{2,n})$ are two independent copies of $(U_n, X_n, Y_n) \sim P_{U_n X_n}\, G_{Y_n|X_n}(\varrho)$, we have
$$ \liminf_{n \to \infty} I\big( X_{1,n} + X_{2,n};\ X_{1,n} - X_{2,n} \,\big|\, \mathbf{Y}_n, \mathbf{U}_n = \mathbf{u} \big) = 0 \qquad \text{for } P_{U_*} \times P_{U_*}\text{-a.e. } \mathbf{u}. \qquad \text{(28)} $$
Finally, we apply to identity (28) the extended form of Bernstein's theorem (see [10,11]), which is proved in ([9] Appendix A), to conclude that $X_{1,*} \mid \{U_{1,*} = u_1\}$ (i.e., the conditional random variable $X_{1,*}$ conditioned on $U_{1,*} = u_1$) and $X_{2,*} \mid \{U_{2,*} = u_2\}$ are independent Gaussian random variables of equal variance.
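For orientation, the classical form of Bernstein's characterization (see [10,11]) reads as follows; the extended form used above, proved in ([9] Appendix A), adapts this characterization to the limiting statement (28):
$$ X_1 \perp X_2 \ \text{ and } \ (X_1 + X_2) \perp (X_1 - X_2) \;\Longrightarrow\; X_1 \text{ and } X_2 \text{ are Gaussian with equal variances.} $$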
Since the marginal of X is Gaussian, with no loss in generality, we may choose ( X , U ) to be jointly Gaussian, which establishes our claim in Theorem 2. □

3.2. The Gaussian Setting with Decoder Side-Information Z

In the Gaussian setting where the decoder side information $Z$ is non-void, using our previous definition $Y = \rho X + W$, let
$$ Z = a X + b Y + T $$
for a pair of real numbers $(a, b)$, where $T \sim \mathcal{N}(0, 1)$ is independent of $(X, Y)$ and of $U$. For arbitrary $X$ with $\mathbb{E}[X^2] \le 1$, we define by $\tilde{G}_{YZ|X}(\varrho, a, b)$ the conditional law obtained by forming the pair $(Y, Z)$, given $X$, via the additive independent Gaussian pair $(W, T)$ as described above.
The rates-distortion region $\mathcal{R}(D)$ is defined as in Theorem 1 with
$$ D \ge \mathbb{E}\, d\big( X, g(U, V, Z) \big) = \mathbb{E}\big( X - g(U, V, Z) \big)^2 \ge \sigma^2_{X \mid U V Z}, $$
and it is evaluated over a law of the form
$$ P_{XUYZV} = P_{XU}\; \tilde{G}_{YZ|X}(\varrho, a, b)\; P_{V|YU}. \qquad \text{(31)} $$
Consequently, for the minimization in (20), we consider the functional
$$ s_\lambda(X, \varrho \mid U) \triangleq -h(X \mid U) + \inf_{\substack{V:\ X - Y - V \mid U \\ \sigma^2_{X \mid U V Z} \le D}} \lambda\, I(Y; V \mid U) $$
under a law of the form (31). Next, in the analogue of Lemma 1, for $P_{XU}$ a law on $\mathcal{X} \times \mathcal{U}$ and $P_{UXYZ} = P_{XU}\, \tilde{G}_{YZ|X}(\varrho, a, b)$, we let $(U_1, X_1, Y_1, Z_1)$ and $(U_2, X_2, Y_2, Z_2)$ be two independent copies of $(U, X, Y, Z)$.
Upon defining $(X^{-}, X^{+})$ and $(Y^{-}, Y^{+})$ as before, we also define
$$ Z^{+} \triangleq \frac{Z_1 + Z_2}{\sqrt{2}}, \qquad Z^{-} \triangleq \frac{Z_1 - Z_2}{\sqrt{2}}. $$
It can be verified that
$$ Z^{+} = a X^{+} + b Y^{+} + \tfrac{1}{\sqrt{2}}(T_1 + T_2) \triangleq a X^{+} + b Y^{+} + T^{+}, \qquad Z^{-} = a X^{-} + b Y^{-} + \tfrac{1}{\sqrt{2}}(T_1 - T_2) \triangleq a X^{-} + b Y^{-} + T^{-}, $$
where $T^{-}$ and $T^{+}$ are independent. Furthermore, the pair $(T^{-}, T^{+})$ is independent of $(\mathbf{U}, X^{-}, X^{+}, Y^{-}, Y^{+})$ and is equal in distribution to the pair $(T_1, T_2)$. Thus, the simultaneous unitary transformation $(Y_1, Y_2) \to (Y^{-}, Y^{+})$ and $(Z_1, Z_2) \to (Z^{-}, Z^{+})$ preserves the Gaussian nature of the channel, and the law factors according to
$$ P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} Z^{-} Z^{+} V} = P_{X^{-} X^{+} \mathbf{U}}\; \tilde{G}_{Y^{-} Z^{-} \mid X^{-}}(\varrho, a, b)\; \tilde{G}_{Y^{+} Z^{+} \mid X^{+}}(\varrho, a, b)\; P_{V \mid \mathbf{U} Y^{-} Y^{+}}. $$
Observe that, with the choice of $U \triangleq \mathbf{U} = (U_1, U_2)$ and $V = (V_1, V_2)$, where $(U_1, X_1, Y_1, Z_1, V_1)$ and $(U_2, X_2, Y_2, Z_2, V_2)$ are two independent copies of $(\tilde{U}, X, Y, Z, \tilde{V})$ such that $X - Y - \tilde{V} \mid \tilde{U}$ and $\sigma^2_{X \mid \tilde{U} \tilde{V} Z} \le D$, i.e., $P_{\tilde{U} X Y Z \tilde{V}}$ is feasible for the minimization in (20), we have
$$ \sigma^2_{X^{+} \mid \mathbf{U} Y^{-} Z^{+} V} \le \sigma^2_{X^{+} \mid \mathbf{U} V Z^{+}} \le D, \qquad \sigma^2_{X^{-} \mid \mathbf{U} Y^{+} Z^{-} V} \le \sigma^2_{X^{-} \mid \mathbf{U} V Z^{-}} \le D. $$
Thus, Lemma 1 holds for this setting with decoder side-information Z as well.

4. Proofs

4.1. Proof of Theorem 1

Converse: Assume that the pair $(R, R_h)$ is $D$-achievable. For $j = 1, \ldots, n$, with the convention $Z^0 \triangleq \emptyset$, define the random variables
$$ T \triangleq f^{(n)}(X^n), $$
$$ V_j \triangleq f_{1,j}^{(n)}\big( f^{(n)}(X^n), Y^j \big), \qquad j = 1, \ldots, n, $$
$$ U_j \triangleq \big( T, X^{j-1}, Y^{j-1}, Z^{j-1} \big), \qquad j = 1, \ldots, n. \qquad \text{(36c)} $$
The rate R is lower bounded as follows
$$ \begin{aligned} n R &\ge \log M^{(n)} \ge H(T) = I(X^n; T) = \sum_{k=1}^{n} I(X_k; T \mid X^{k-1}) \\ &\overset{(a)}{=} \sum_{k=1}^{n} I(X_k; T, X^{k-1}) \overset{(b)}{=} \sum_{k=1}^{n} I(X_k; T, X^{k-1}, Y^{k-1}, Z^{k-1}) = \sum_{k=1}^{n} I(X_k; U_k). \end{aligned} \qquad \text{(37)} $$
Here, $(a)$ follows since $X^n$ is memoryless, and $(b)$ follows since $X_k - (T, X^{k-1}) - (Y^{k-1}, Z^{k-1})$ forms a Markov chain.
We may now lower bound $R_h$ as follows:
$$ \begin{aligned} n R_h &\ge \sum_{k=1}^{n} \log L_k^{(n)} \ge H(V_1, V_2, \ldots, V_n) \ge H(V_1, V_2, \ldots, V_n \mid T) = \sum_{k=1}^{n} H(V_k \mid T, V^{k-1}) \\ &\ge \sum_{k=1}^{n} H(V_k \mid T, V^{k-1}, X^{k-1}, Y^{k-1}, Z^{k-1}) \overset{(c)}{=} \sum_{k=1}^{n} H(V_k \mid T, X^{k-1}, Y^{k-1}, Z^{k-1}) \\ &\overset{(c)}{=} \sum_{k=1}^{n} I(Y_k; V_k \mid T, X^{k-1}, Y^{k-1}, Z^{k-1}) = \sum_{k=1}^{n} I(Y_k; V_k \mid U_k), \end{aligned} \qquad \text{(38)} $$
where ( c ) follows since V j is a deterministic function of ( T , Y j ) .
The sum-rate can be lower bounded as follows
$$ \begin{aligned} n (R + R_h) &\ge \log M^{(n)} + \sum_{k=1}^{n} \log L_k^{(n)} \ge H(T) + H(V_1, \ldots, V_n) \ge H(T) + H(V_1, \ldots, V_n \mid T) \\ &= I(T; X^n) + H(V_1, \ldots, V_n \mid T) = \sum_{k=1}^{n} I(X_k; T \mid X^{k-1}) + H(V_1, \ldots, V_n \mid T) \\ &\overset{(d)}{\ge} \sum_{k=1}^{n} \big[ I(X_k; U_k) + I(Y_k; V_k \mid U_k) \big], \end{aligned} $$
where ( d ) follows by equality (37) and inequality (38).
Draw $J$ uniformly from $\{1, \ldots, n\}$, independently of $\{(X_k, Y_k, Z_k, V_k, U_k)\}_{k=1}^{n}$, and define the random variables $U = (U_J, J)$, $V = V_J$, $Z = Z_J$, $Y = Y_J$, and $X = X_J$. Using $J$, we may express (37) as follows:
$$ R \ge \frac{1}{n} \sum_{k=1}^{n} I(X_k; U_k) = I(X_J; U_J \mid J) = I(X_J; U_J, J) - I(X_J; J) = I(X_J; U_J, J) = I(X; U), \qquad \text{(40)} $$
and we may express (38) as follows
$$ R_h \ge \frac{1}{n} \sum_{k=1}^{n} I(Y_k; V_k \mid U_k) = I(Y_J; V_J \mid U_J, J) = I(Y; V \mid U) = H(V \mid U). \qquad \text{(41)} $$
With regard to the expected distortion, we may write
$$ \begin{aligned} D &\ge \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\big[ d(X_i, \hat{X}_i) \big] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\Big[ d\big( X_i,\ g_i^{(n)}(T, V_1, \ldots, V_i, Z^i) \big) \Big] \\ &\overset{(e)}{\ge} \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\Big[ d\big( X_i,\ g_i^{*}(T, X^{i-1}, Y^{i-1}, Z^{i-1}, V_i, Z_i) \big) \Big] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\Big[ d\big( X_i,\ g_i^{*}(U_i, V_i, Z_i) \big) \Big] \\ &= \mathbb{E}\big[ d\big( X_J,\ g(U_J, J, V_J, Z_J) \big) \big] = \mathbb{E}\big[ d\big( X,\ g(U, V, Z) \big) \big]. \end{aligned} \qquad \text{(42)} $$
Step $(e)$ is justified as follows. Since $V_1, \ldots, V_{i-1}$ are deterministic functions of $(T, Y^{i-1})$,
$$ (X^{i-1}, Z^{i-1}, Z_i) - (T, Y^{i-1}) - (V_1, \ldots, V_{i-1}) $$
is a Markov chain and, given $(U_k, V_k, Z_k)$, the tuple $(T, V_1, \ldots, V_k, Z^k)$ is a deterministic function of the conditioning and is therefore conditionally independent of $X_k$. As a consequence, Lemma 1 in ([12] Section II.B) guarantees the existence of a reconstruction $\hat{X}_k^{*}(U_k, V_k, Z_k)$ which dominates $\hat{X}_k$ in the sense that
$$ \mathbb{E}\, d\big( X_k,\ \hat{X}_k^{*}(U_k, V_k, Z_k) \big) \le \mathbb{E}\, d\big( X_k,\ \hat{X}_k(T, V_1, \ldots, V_k, Z_k, Z^{k-1}) \big). $$
This observation interpreted as the “data processing inequality” for estimation has already been made in ([12] Lemma 1).
Furthermore,
$$ V_J = f_{1,J}^{(n)}\big( f^{(n)}(X^n), Y^J \big) = f_{1,J}^{(n)}\big( f^{(n)}(X^n), Y^{J-1}, Y_J \big) = \tilde{f}_{1,J}^{(n)}\big( T, X^{J-1}, Y^{J-1}, Z^{J-1}, Y_J \big) = \tilde{f}_{1,J}^{(n)}\big( U_J, J, Y_J \big). $$
By (1) and the memoryless property of the sequence $(X_k, Y_k, Z_k)$, $k = 1, \ldots, n$, one can verify the Markov relation $U_k - X_k - (Y_k, Z_k)$, which implies the Markov relation $U - X - (Y, Z)$. Similarly, the definitions of $U_k$ and $V_k$ and the memoryless property of $(X_k, Y_k)$, $k = 1, \ldots, n$, imply that, conditioned on $U_k$, $X_k - Y_k - V_k$ forms a Markov chain.
Thus, conditioned on $U$, $X - Y - V$ forms a Markov chain, and hence
$$ P_{XUYV} = P_{XU}\; P_{Y|X}\; P_{V|YU}, \qquad \text{(44)} $$
where $P_{Y|X}$ denotes the conditional law induced by the marginal law $P_{XY}$. The combination of (40)–(42) and (44), together with the above Markov relations, establishes the converse.
We shall now obtain an alternative characterization of the lower bound (38). For a law
$$ P_{UXZYV} = P_{U}\; P_{X|U}\; P_{ZY|X}\; P_{V|YU}, \qquad \text{(45)} $$
and its induced conditional law $P_{XZYV|U}$, let
$$ Q\big( P_{XZYV|U},\ \tilde{D} \big) \triangleq \inf_{\substack{g^{*}:\ \mathcal{U} \times \mathcal{V} \times \mathcal{Z} \to \hat{\mathcal{X}}\ :\\ \mathbb{E}[\, d(X, g^{*}(U, V, Z)) \mid U = u \,] \le \tilde{D}}} H(V \mid U = u), $$
and let $\bar{Q}(P_{XZYV|U}, \cdot)$ denote the lower convex envelope of $Q(P_{XZYV|U}, \cdot)$.
Define
$$ Q_s\big( P_{UXZYV}, D \big) \triangleq \inf_{\rho(u):\ \int \rho(u)\, dP_U(u) \le D}\ \int_{\mathcal{U}} \bar{Q}\big( P_{XZYV|U},\ \rho(u) \big)\, dP_U(u); $$
then, by ([13] Section III.C, Lemma 1), $Q_s(P_{UXZYV}, \cdot)$ is convex.
Note that
$$ H\big( V_k \mid T, X^{k-1}, Y^{k-1}, Z^{k-1} \big) = H\big( f_{1,k}^{(n)}(T, Y^{k-1}, Y_k) \,\big|\, U_k \big) = \int H\big( f_{1,k}^{(n)}(t, y^{k-1}, Y_k) \,\big|\, U_k = u_k \big)\, d\mu(u_k). \qquad \text{(46)} $$
The integrand on the RHS of (46) is the entropy of the scalar quantizer $V_k \triangleq f_{1,k}^{(n)}(T, Y^{k-1}, Y_k)$ conditioned on $U_k = u_k$, where $U_k$ is defined in (36c). Now, conditioned on $U_k = u_k$, Lemma 1 in ([12] Section II.B) ensures that
$$ \mathbb{E}\Big[ d\big( X_k,\ \hat{X}_k( t, y^{k-1}, f_{1,k}^{(n)}(t, y^{k-1}, Y_k), Z^k ) \big) \,\Big|\, U_k = u_k \Big] \ge \mathbb{E}\Big[ d\big( X_k,\ g_k^{*}(U_k, V_k, Z_k) \big) \,\Big|\, U_k = u_k \Big]. $$
Consequently, we may lower bound the RHS of (46) as follows
$$ \begin{aligned} \int H\big( f_{1,k}^{(n)}(t, y^{k-1}, Y_k) \,\big|\, U_k = u_k \big)\, d\mu(u_k) &\overset{(a)}{\ge} \int \bar{Q}\big( P_{XZYV|U},\ \mathbb{E}[\, d(X_k, \hat{X}_k) \mid U_k = u_k \,] \big)\, d\mu(u_k) \\ &\ge \int \bar{Q}\big( P_{XZYV|U},\ \mathbb{E}[\, d(X_k, g(u, V, z)) \mid U_k = u \,] \big)\, d\mu(z \mid u)\, d\mu(u) \\ &\overset{(b)}{\ge} \int \bar{Q}\big( P_{XZYV|U},\ \mathbb{E}[\, d(X_k, g(u, V, Z)) \mid U_k = u \,] \big)\, d\mu(u) \\ &\ge Q_s\big( P_{UXZYV}, D \big), \end{aligned} \qquad \text{(47)} $$
where in $(a)$ we used the definition of $Q$ and in $(b)$ its convexity. The lower bound (47) may be interpreted as follows. Fix $\rho: \mathcal{U} \to \mathbb{R}_{+}$ and consider, for each $u \in \mathcal{U}$, time sharing of at most two scalar quantizers for the "source" $P_{V|U=u}$ attaining a distortion level $\rho(u)$. The optimal helper time-shares side-information-dependent scalar quantizers of $V_k$ (at most two per side-information symbol $U_k$), while the reconstruction at the decoder is a function of $(U, V, Z)$.
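The time-sharing interpretation can be illustrated numerically. The sketch below works under simplified assumptions that are not taken from the paper: a small discrete helper-observation alphabet, Hamming distortion measured on that observation itself, and a brute-force enumeration of deterministic scalar quantizers. For one fixed conditioning symbol u, it computes the entropy-distortion points of all deterministic quantizers and the improvement obtained by time-sharing pairs of them (the lower convex envelope).

import numpy as np
from itertools import product

# A toy conditional "source" distribution given U = u, on a 4-letter alphabet.
p_y = np.array([0.4, 0.3, 0.2, 0.1])
A = len(p_y)

def entropy(pmf):
    pmf = pmf[pmf > 0]
    return -np.sum(pmf * np.log2(pmf))

# Enumerate all deterministic quantizers q: {0..3} -> {0..3}; the reconstruction
# equals the quantizer output, with Hamming distortion d(y, q(y)) = 1{y != q(y)}.
points = []
for q in product(range(A), repeat=A):
    dist = sum(p_y[y] for y in range(A) if q[y] != y)
    pmf_v = np.zeros(A)
    for y in range(A):
        pmf_v[q[y]] += p_y[y]
    points.append((dist, entropy(pmf_v)))

def envelope(points, D_target):
    """Best description rate at distortion D_target using one quantizer or a
    time-shared pair of quantizers (a chord of the entropy-distortion points)."""
    best = min(h for d, h in points if d <= D_target)
    for (d1, h1), (d2, h2) in product(points, repeat=2):
        if d1 < d2 and d1 <= D_target <= d2:
            t = (D_target - d1) / (d2 - d1)
            best = min(best, (1 - t) * h1 + t * h2)
    return best

for D in (0.0, 0.1, 0.3):
    print(f"rho(u) = {D:.1f}  ->  min time-shared H(V | U=u) = {envelope(points, D):.3f} bits")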
Direct: To establish the achievability of $\mathcal{R}^{*}(D)$, consider the following codebook construction. The codebook $\mathcal{A} = \{ \mathbf{u}_1, \ldots, \mathbf{u}_M \}$, $\mathbf{u}_k \in \mathcal{U}^n$, is obtained by drawing the $n$-length sequences $\mathbf{u}_k$ independently and uniformly over $T_{P_U}^{\delta}$. (For the definition of $T_{P_U}^{\delta}$, the set of $\delta$-strongly typical $n$-sequences corresponding to a marginal law $P_U$, and a few properties of these sequences, see [14,15,16].)
Given the source sequence x, f ( n ) ( x ) is defined as follows.
  • If $\mathbf{x} \in T_{P_X}^{\delta}$, the encoder searches for the first sequence $\mathbf{u}_k = \mathbf{u}$ in $\mathcal{A}$ such that $(\mathbf{x}, \mathbf{u}) \in T_{P_{XU}}^{2\delta}$ and sets $f^{(n)}(\mathbf{x}) = k$.
  • If $\mathbf{x} \notin T_{P_X}^{\delta}$, or if there is no $\mathbf{u}_k \in \mathcal{A}$ such that $(\mathbf{x}, \mathbf{u}_k) \in T_{P_{XU}}^{2\delta}$, an encoding error is declared.
Given $f^{(n)}(\mathbf{x}) = k$, the helper forms the sequence of descriptions $V_i = \tilde{f}_1(U_{k,i}, Y_i)$, $V_i \in \{1, \ldots, L_i\}$, which is sent causally to the decoder.
Decoding: Given $f^{(n)}(\mathbf{x}) = k$ as well as the subsequence $V_1, \ldots, V_i$, the decoder forms the reconstruction $\hat{X}_i = g(U_{k,i}, V_i, Z_i)$, $i \in \{1, \ldots, n\}$.
Given that $(\mathbf{U}, \mathbf{X})$ are jointly typical, since $(X, Y, Z)$ is memoryless, the Markov lemma guarantees that, for large $n$, with high probability $(\mathbf{U}, \mathbf{X}, \mathbf{Y}, \mathbf{Z})$ are also jointly typical. Since $X - (U, Y) - V$ forms a Markov chain, by the Markov lemma, with high probability, $(\mathbf{X}, \mathbf{V})$ as well as $(\mathbf{X}, \mathbf{V}, \mathbf{Z})$ are jointly typical. Thus, with high probability, $(\mathbf{X}, \hat{\mathbf{X}})$ are jointly typical, and hence the distortion constraint (10) is fulfilled for large $n$. That the sequence $V_1, \ldots, V_n$ can be described at a conditional entropy rate satisfying (12b) and (47) can be established along the same lines as the proof of the direct part of Theorem 2 in ([13] Section III.C). Finally, standard error-probability analysis verifies that, with high probability, $(\mathbf{U}, \mathbf{X})$ are jointly typical as long as (12a) holds.
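The codebook construction above admits a compact numerical illustration. The following sketch is a toy, with a binary source, an empirical-distribution notion of typicality, and an arbitrarily chosen test channel $P_{U|X}$; these specific choices are illustrative assumptions, not parameters from the paper. It draws a random codebook of U-sequences and performs the joint-typicality encoding step.

import numpy as np

rng = np.random.default_rng(2)
n, R, delta = 20, 0.6, 0.05
q = 0.2                                   # illustrative test channel: U = X xor Bern(q)
M = 2 ** int(n * R)                       # codebook size, roughly 2^{nR}

# Target joint law of (X, U) for P_X = Bern(1/2) and the chosen P_{U|X}.
P_XU = np.array([[0.5 * (1 - q), 0.5 * q],
                 [0.5 * q,       0.5 * (1 - q)]])

def joint_type(x, u):
    """Empirical joint distribution of a pair of binary sequences."""
    t = np.zeros((2, 2))
    for a, b in zip(x, u):
        t[a, b] += 1.0 / len(x)
    return t

# Random codebook: M length-n sequences drawn IID from the U marginal.
codebook = rng.choice(2, size=(M, n), p=P_XU.sum(axis=0))

# Encoding: pick the first codeword whose joint type with x is close to P_XU.
x = rng.integers(0, 2, size=n)
k = next((j for j in range(M)
          if np.abs(joint_type(x, codebook[j]) - P_XU).max() <= 2 * delta), None)
print("encoding", "failed" if k is None else f"succeeded, f(x) = {k}")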

4.2. Proof of Lemma 1

If $Y_i = \rho X_i + W_i$, $i = 1, 2$, then
$$ Y^{+} = \frac{\rho}{\sqrt{2}}\,(X_1 + X_2) + \frac{1}{\sqrt{2}}\,(W_1 + W_2) \triangleq \rho X^{+} + W^{+}, $$
$$ Y^{-} = \frac{\rho}{\sqrt{2}}\,(X_1 - X_2) + \frac{1}{\sqrt{2}}\,(W_1 - W_2) \triangleq \rho X^{-} + W^{-}, $$
where $W^{-}$ and $W^{+}$ are independent and the pair $(W^{-}, W^{+})$ is equal in distribution to the pair $(W_1, W_2)$. Thus, the unitary transformation $(Y_1, Y_2) \to (Y^{-}, Y^{+})$ preserves the Gaussian nature of the channel, and the law factors according to (see (53) ahead)
$$ P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V} = P_{X^{-} X^{+} \mathbf{U}}\; G_{Y^{-} \mid X^{-}}(\varrho)\; G_{Y^{+} \mid X^{+}}(\varrho)\; P_{V \mid \mathbf{U} Y^{-} Y^{+}}. \qquad \text{(49)} $$
To show (22a), consider the sequence of identities
$$ \begin{aligned} \lambda I(Y^{+} Y^{-}; V \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) &= \lambda I(Y^{+}; V \mid \mathbf{U}) + \lambda I(Y^{-}; V \mid \mathbf{U} Y^{+}) - h(X^{-} \mid \mathbf{U} Y^{+}) - h(X^{+} \mid \mathbf{U} X^{-}) - I(X^{-}; Y^{+} \mid \mathbf{U}) \\ &= \lambda I(Y^{+}; V \mid \mathbf{U} X^{-}) + \lambda I(Y^{-}; V \mid \mathbf{U} Y^{+}) - h(X^{+} \mid \mathbf{U} X^{-}) - h(X^{-} \mid \mathbf{U} Y^{+}) \\ &\quad + (\lambda - 1)\big[ I(Y^{+}; X^{-} \mid \mathbf{U}) - I(Y^{+}; X^{-} \mid \mathbf{U} V) \big] - I(Y^{+}; X^{-} \mid \mathbf{U} V). \end{aligned} \qquad \text{(50)} $$
Moreover, to show (22b), consider the sequence of identities
$$ \begin{aligned} \lambda I(Y^{+} Y^{-}; V \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) &= \lambda I(Y^{-}; V \mid \mathbf{U}) + \lambda I(Y^{+}; V \mid \mathbf{U} Y^{-}) - h(X^{+} \mid \mathbf{U} Y^{-}) - h(X^{-} \mid \mathbf{U} X^{+}) - I(X^{+}; Y^{-} \mid \mathbf{U}) \\ &= \lambda I(Y^{-}; V \mid \mathbf{U} X^{+}) + \lambda I(Y^{+}; V \mid \mathbf{U} Y^{-}) - h(X^{+} \mid \mathbf{U} Y^{-}) - h(X^{-} \mid \mathbf{U} X^{+}) \\ &\quad + (\lambda - 1)\big[ I(Y^{-}; X^{+} \mid \mathbf{U}) - I(Y^{-}; X^{+} \mid \mathbf{U} V) \big] - I(Y^{-}; X^{+} \mid \mathbf{U} V). \end{aligned} \qquad \text{(51)} $$
Starting with (50), consider the difference
$$ I(Y^{+}; X^{-} \mid \mathbf{U}) - I(Y^{+}; X^{-} \mid \mathbf{U} V) = I(X^{-}; V \mid \mathbf{U}) - I(X^{-}; V \mid \mathbf{U} Y^{+}) \qquad \text{(52)} $$
under a law of the form
$$ P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V} = P_{X^{-} X^{+} \mathbf{U}}\; P_{Y^{-} \mid X^{-}}\; P_{Y^{+} \mid X^{+}}\; P_{V \mid \mathbf{U} Y^{-} Y^{+}}, \qquad \text{(53)} $$
i.e., such that $X^{-} - Y^{-} - V \mid (\mathbf{U}, Y^{+})$ forms a Markov chain (see also ([9] Section VI.A, Lemma 4)).
We distinguish between the two cases:
(1)
In the case that $I(X^{-}; Y^{+} \mid \mathbf{U} V) = 0$, the non-negativity of mutual information implies that the expression (52) is non-negative; hence, the inequalities (63) ahead hold for any $\lambda > 1$.
(2)
In the case that $I(X^{-}; Y^{+} \mid \mathbf{U} V) > 0$, we first prove that, for the set of laws which are feasible for the optimization problem (20), the expression (52) is strictly positive.
Observe that, with the choice of $U \triangleq \mathbf{U} = (U_1, U_2)$ and $V = (V_1, V_2)$, where $(U_1, X_1, Y_1, V_1)$ and $(U_2, X_2, Y_2, V_2)$ are two independent copies of $(\tilde{U}, X, Y, \tilde{V})$ such that $X - Y - \tilde{V} \mid \tilde{U}$ and $\sigma^2_{X \mid \tilde{U} \tilde{V}} \le D$, i.e., $P_{\tilde{U} X Y \tilde{V}} \in \mathcal{Q}$, we have
$$ \sigma^2_{X^{+} \mid \mathbf{U} Y^{-} V} \le \sigma^2_{X^{+} \mid \mathbf{U} V} = \tfrac{1}{2}\, \sigma^2_{X_1 \mid U_1 V_1} + \tfrac{1}{2}\, \sigma^2_{X_2 \mid U_2 V_2} \le D, \qquad \sigma^2_{X^{-} \mid \mathbf{U} Y^{+} V} \le \sigma^2_{X^{-} \mid \mathbf{U} V} = \tfrac{1}{2}\, \sigma^2_{X_1 \mid U_1 V_1} + \tfrac{1}{2}\, \sigma^2_{X_2 \mid U_2 V_2} \le D. \qquad \text{(54)} $$
Thus, the unitary transformation $(Y_1, Y_2) \to (Y^{-}, Y^{+})$ picks a pair of independent copies of a law $P_{\tilde{U} X Y \tilde{V}} \in \mathcal{Q}$ and "creates" a pair of laws, $X^{-} - Y^{-} - V \mid (\mathbf{U}, Y^{+})$ and $X^{+} - Y^{+} - V \mid (\mathbf{U}, Y^{-})$, which factor jointly according to (53) with
$$ P_{Y^{-} \mid X^{-}} = G_{Y^{-} \mid X^{-}}(\varrho), \qquad P_{Y^{+} \mid X^{+}} = G_{Y^{+} \mid X^{+}}(\varrho), $$
and hence are symmetric w.r.t. the inputs $X^{-}$ and $X^{+}$. We shall denote the latter set of laws by $\mathcal{P}^{*}$ and, as shown above, $\mathcal{P}^{*} \subseteq \mathcal{Q}$.
Remark 2.
Suppose that $(\tilde{U}, X, Y) \sim P_{X \tilde{U}}\, G_{Y|X}(\varrho)$ with $\tilde{U} \in \mathcal{U}$; then, for both laws $X^{-} - Y^{-} - V \mid (\mathbf{U}, Y^{+})$ and $X^{+} - Y^{+} - V \mid (\mathbf{U}, Y^{-})$, we have $(\mathbf{U}, Y^{+}) \in \mathcal{U} \times \mathcal{U} \times \mathbb{R}$ and $(\mathbf{U}, Y^{-}) \in \mathcal{U} \times \mathcal{U} \times \mathbb{R}$. Since the image of the map
$$ P_{X \tilde{U}} \ \mapsto\ \big( \mathbb{E}[X^2],\ \sigma^2_{X \mid \tilde{U} \tilde{V}},\ s_\lambda(X, \varrho \mid \tilde{U}) \big) \qquad \text{(55)} $$
is a convex set, a standard dimensionality-reduction argument can be used to establish the existence of a law $P_{XU}$, where $P_U$ is supported on a finite set, that achieves any point in the image of the map (55) (see ([9] Section IV, Remark 2)).
By rate-distortion theory, the constraint $\sigma^2_{X \mid U, V} \le D$ with both $U$ and $V$ non-void implies that
$$ I(X; U) < I(X; U V), \qquad \text{i.e.,} \qquad I(X; V \mid U) > 0. \qquad \text{(56)} $$
Now, for any $P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V}$ as per (49), conditioned on $(\mathbf{U}, Y^{+})$, the random variable $V$ is dependent on $Y^{-}$, i.e., $I(Y^{-}; V \mid \mathbf{U} Y^{+}) > 0$. As a consequence, since the mutual information $I(X^{-}; Y^{-})$ is strictly positive and, conditioned on $(\mathbf{U}, Y^{+})$, $X^{-} - Y^{-} - V$ forms a Markov chain,
$$ I(X^{-}; V \mid \mathbf{U} Y^{+}) > 0. \qquad \text{(57)} $$
Since a law of the form (53) dictates
$$ I(X^{-}; V \mid \mathbf{U} Y^{-} Y^{+}) = 0, \qquad \text{(58)} $$
the combination of (56)–(58) yields that
$$ I(X^{-}; V \mid \mathbf{U}) \ge I(X^{-}; V \mid \mathbf{U} Y^{+}) > I(X^{-}; V \mid \mathbf{U} Y^{-} Y^{+}) = 0. \qquad \text{(59)} $$
Thus, the conditional mutual information $I(X^{-}; V \mid \mathbf{U})$ is non-increasing under the conditioning on $(Y^{-}, Y^{+})$.
By the symmetry of the pair of laws $X^{-} - Y^{-} - V \mid (\mathbf{U}, Y^{+})$ and $X^{+} - Y^{+} - V \mid (\mathbf{U}, Y^{-})$ induced by $P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V}$, conditioned on $(\mathbf{U}, Y^{-})$, the random variable $V$ is dependent on $Y^{+}$; hence,
$$ I(Y^{+}; V \mid \mathbf{U} Y^{-}) > 0. \qquad \text{(60)} $$
Now
$$ P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V} = P_{X^{-} X^{+} \mathbf{U}}\; P_{Y^{-} \mid X^{-}}\; P_{Y^{+} \mid X^{+}}\; P_{V \mid \mathbf{U} Y^{-} Y^{+}} = P_{X^{-} X^{+} \mathbf{U}}\; P_{Y^{+} \mid X^{+}}\; P_{V Y^{-} \mid \mathbf{U} X^{-} Y^{+}}, $$
hence
$$ P_{\mathbf{U} X^{-} X^{+} Y^{+} V} = \int_{Y^{-}} P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V} = \int_{Y^{-}} P_{X^{-} X^{+} \mathbf{U}}\; P_{Y^{+} \mid X^{+}}\; P_{V Y^{-} \mid \mathbf{U} X^{-} Y^{+}} = P_{X^{-} X^{+} \mathbf{U}}\; P_{Y^{+} \mid X^{+}}\; P_{V \mid \mathbf{U} X^{-} Y^{+}}. $$
However, with the latter factorization, if indeed $I(X^{-}; V \mid \mathbf{U}) = I(X^{-}; V \mid \mathbf{U} Y^{+})$, i.e.,
$$ I(X^{-}; V \mid \mathbf{U}) = I(X^{-}; V \mid \mathbf{U} Y^{+}) > I(X^{-}; V \mid \mathbf{U} Y^{-} Y^{+}) = 0, $$
then, since the conditional mutual information $I(X^{-}; V \mid \mathbf{U})$ is non-increasing under the conditioning on $(Y^{-}, Y^{+})$, it follows that $P_{V \mid \mathbf{U} X^{-} Y^{+}} = P_{V \mid \mathbf{U} X^{-}}$, which is in contradiction with (59). Consequently, for $P_{\mathbf{U} X^{-} X^{+} Y^{-} Y^{+} V} \in \mathcal{P}^{*}$, $I(X^{-}; V \mid \mathbf{U}) - I(X^{-}; V \mid \mathbf{U} Y^{+}) \triangleq \Delta > 0$, thus establishing that the expression (52) is positive.
Consequently, there exists some $\lambda^{*} > 1$, namely any $\lambda^{*} \ge I(Y^{+}; X^{-} \mid \mathbf{U} V)/\Delta + 1$, such that, for any $\lambda \ge \lambda^{*}$,
$$ (\lambda - 1)\big[ I(Y^{+}; X^{-} \mid \mathbf{U}) - I(Y^{+}; X^{-} \mid \mathbf{U} V) \big] - I(Y^{+}; X^{-} \mid \mathbf{U} V) > 0. \qquad \text{(62)} $$
The combination of (50) and (62) implies that, for any $\lambda \ge \lambda^{*}$,
$$ \begin{aligned} \lambda I(Y^{+} Y^{-}; V \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) &\overset{(a)}{\ge} -\big[ h(X^{+} \mid \mathbf{U} X^{-}) + h(X^{-} \mid \mathbf{U} Y^{+}) \big] + \lambda \big[ I(Y^{+}; V \mid \mathbf{U} X^{-}) + I(Y^{-}; V \mid \mathbf{U} Y^{+}) \big] \\ &\overset{(b)}{\ge} s_\lambda(X^{+}, \varrho \mid X^{-}, \mathbf{U}) + s_\lambda(X^{-}, \varrho \mid Y^{+}, \mathbf{U}). \end{aligned} \qquad \text{(63)} $$
Here, $(a)$ follows from (50) and (62), and $(b)$ holds since the set of laws over which the infimum on the RHS of (63) is evaluated is not empty. Indeed, the choice of $U \triangleq \mathbf{U} = (U_1, U_2)$ and $V = (V_1, V_2)$, where $(U_1, X_1, Y_1, V_1)$ and $(U_2, X_2, Y_2, V_2)$ are two independent copies of $(\tilde{U}, X, Y, \tilde{V})$ such that $X - Y - \tilde{V} \mid \tilde{U}$ and $\sigma^2_{X \mid \tilde{U} \tilde{V}} \le D$, i.e., $P_{\tilde{U} X Y \tilde{V}} \in \mathcal{Q}$ (hence it is feasible for $s_\lambda(X, \varrho \mid U)$ in (19)), satisfies (54) and therefore belongs to the feasible set.
On the other hand, since both mappings $(X_1, X_2) \to (X^{+}, X^{-})$ and $(Y_1, Y_2) \to (Y^{+}, Y^{-})$ are invertible, with $\mathbf{U} = (U_1, U_2)$ we have
$$ \begin{aligned} \lambda I(Y^{+} Y^{-}; V \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) &= \lambda I(Y_1 Y_2; V \mid \mathbf{U}) - h(X_1 X_2 \mid \mathbf{U}) \\ &= \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U} Y_1) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U} X_1) \\ &\overset{(a)}{=} \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U} Y_1) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U}) \\ &= \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U}) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U}) - \lambda\big[ I(Y_2; V \mid \mathbf{U}) - I(Y_2; V \mid \mathbf{U} Y_1) \big] \\ &= \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U}) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U}) - \lambda\big[ I(Y_2; V \mid \mathbf{U}) - h(Y_2 \mid \mathbf{U} Y_1) + h(Y_2 \mid \mathbf{U} V Y_1) \big] \\ &\overset{(b)}{=} \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U}) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U}) - \lambda\big[ I(Y_2; V \mid \mathbf{U}) - h(Y_2 \mid \mathbf{U}) + h(Y_2 \mid \mathbf{U} V Y_1) \big] \\ &= \lambda I(Y_1; V \mid \mathbf{U}) + \lambda I(Y_2; V \mid \mathbf{U}) - h(X_1 \mid \mathbf{U}) - h(X_2 \mid \mathbf{U}) + \lambda I(Y_2; Y_1 \mid \mathbf{U} V). \end{aligned} \qquad \text{(64)} $$
Here, ( a ) follows since, conditioned on U , X 1 and X 2 are independent, and ( b ) follows since, conditioned on U , Y 1 and Y 2 are independent.
When $U \triangleq \mathbf{U} = (U_1, U_2)$ and $V = (V_1, V_2)$, where $(U_1, X_1, Y_1, V_1)$ and $(U_2, X_2, Y_2, V_2)$ are two independent copies of $(\tilde{U}, X, Y, \tilde{V})$ such that $X - Y - \tilde{V} \mid \tilde{U}$ and $\sigma^2_{X \mid \tilde{U} \tilde{V}} \le D$, then $Y_2 - (\mathbf{U}, V) - Y_1$ forms a Markov chain. Thus, $I(Y_1; Y_2 \mid \mathbf{U} V) = 0$ and the RHS of (64) becomes $\lambda I(Y_1; V_1 \mid U_1) + \lambda I(Y_2; V_2 \mid U_2) - h(X_1 \mid U_1) - h(X_2 \mid U_2)$, thus establishing that
$$ \inf_{\substack{V:\ X - Y - V \mid U \\ \sigma^2_{X \mid U V} \le D}} \Big\{ -h(X_1, X_2 \mid \mathbf{U}) + \lambda\, I(Y_1, Y_2; V \mid \mathbf{U}) \Big\} \le \sum_{i=1}^{2}\ \inf_{\substack{V:\ X_i - Y_i - V \mid U_i \\ \sigma^2_{X_i \mid U_i V} \le D}} \Big\{ -h(X_i \mid U_i) + \lambda\, I(Y_i; V \mid U_i) \Big\} = 2\, s_\lambda(X, \varrho \mid U), \qquad \text{(65)} $$
where the inequality follows since the infimum on the LHS is taken over a larger set than that on the RHS. The combination of (63) and (65) establishes (22a) for $\lambda \ge \lambda^{*}$.
Now, returning to (51), consider the difference
$$ I(Y^{-}; X^{+} \mid \mathbf{U}) - I(Y^{-}; X^{+} \mid \mathbf{U} V) = I(X^{+}; V \mid \mathbf{U}) - I(X^{+}; V \mid \mathbf{U} Y^{-}) \qquad \text{(66)} $$
under the law (53), i.e., such that $X^{+} - Y^{+} - V \mid (\mathbf{U}, Y^{-})$ forms a Markov chain. By rate-distortion theory, the constraint $\sigma^2_{X^{+} \mid U, V} \le D$ with both $U$ and $V$ non-void implies that
$$ I(X^{+}; U) < I(X^{+}; U V), \qquad \text{i.e.,} \qquad I(X^{+}; V \mid U) > 0. $$
Moreover, an argument similar to that leading to the conclusion that (52) is non-negative establishes that (66) is non-negative.
Consequently, there exists some $\lambda^{*} > 1$ such that, for any $\lambda \ge \lambda^{*}$,
$$ (\lambda - 1)\big[ I(Y^{-}; X^{+} \mid \mathbf{U}) - I(Y^{-}; X^{+} \mid \mathbf{U} V) \big] - I(Y^{-}; X^{+} \mid \mathbf{U} V) > 0. \qquad \text{(68)} $$
In addition, the combination of (51) and (68) implies that, for any $\lambda \ge \lambda^{*}$,
$$ \lambda I(Y^{+} Y^{-}; V \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \ge s_\lambda(X^{-}, \varrho \mid X^{+}, \mathbf{U}) + s_\lambda(X^{+}, \varrho \mid Y^{-}, \mathbf{U}). \qquad \text{(69)} $$
The combination of (69) and (65) establishes (22b) for $\lambda \ge \lambda^{*}$.

4.3. Proof of Lemma 2

Consider the tuple $(X^{+}, X^{-}, Y^{+}, Y^{-}, \mathbf{U})$ as defined in Lemma 1, constructed from independent copies of $(U, X, Y) \sim P_{XU}\, G_{Y|X}(\varrho)$. Under the transformation (21), $X^{+}$ and $X^{-}$ preserve the variance of $X_i$, $i = 1, 2$, and $W^{+}$ and $W^{-}$ preserve the variance of $W_i$, $i = 1, 2$. Furthermore, as shown in the proof of Lemma 1, the unitary transformation $(Y_1, Y_2) \to (Y^{-}, Y^{+})$ picks a pair of independent copies of a law $P_{UXYV} \in \mathcal{Q}$ and "creates" a pair of laws, $X^{-} - Y^{-} - V \mid (\mathbf{U}, Y^{+})$ and $X^{+} - Y^{+} - V \mid (\mathbf{U}, Y^{-})$, which factor jointly according to (53) with
$$ P_{Y^{-} \mid X^{-}} = G_{Y^{-} \mid X^{-}}(\varrho), \qquad P_{Y^{+} \mid X^{+}} = G_{Y^{+} \mid X^{+}}(\varrho), $$
and hence are symmetric w.r.t. the inputs $X^{-}$ and $X^{+}$, and both are in the feasible set. Similarly, the unitary transformation $(Y_1, Y_2) \to (Y^{-}, Y^{+})$ picks a pair of independent copies of a law $P_{UXYV} \in \mathcal{Q}$ and "creates" a pair of laws, $X^{-} - Y^{-} - V \mid (\mathbf{U}, X^{+})$ and $X^{+} - Y^{+} - V \mid (\mathbf{U}, X^{-})$, which are both in the feasible set (see the factorization (61)).
Consequently, by (20), each of the quantities $s_\lambda(X^{+}, \varrho \mid X^{-}, \mathbf{U})$, $s_\lambda(X^{-}, \varrho \mid Y^{+}, \mathbf{U})$, $s_\lambda(X^{-}, \varrho \mid X^{+}, \mathbf{U})$, and $s_\lambda(X^{+}, \varrho \mid Y^{-}, \mathbf{U})$ is at least $V_\lambda(\varrho)$. The assumption $P_{XU} \in \mathcal{P}$ leads, by (22) of Lemma 1, to the opposite conclusion. Therefore, all four quantities are equal and coincide with the minimum, i.e.,
$$ s_\lambda(X^{-}, \varrho \mid Y^{+}, \mathbf{U}) = s_\lambda(X^{+}, \varrho \mid Y^{-}, \mathbf{U}) = V_\lambda(\varrho), $$
$$ s_\lambda(X^{-}, \varrho \mid X^{+}, \mathbf{U}) = s_\lambda(X^{+}, \varrho \mid X^{-}, \mathbf{U}) = V_\lambda(\varrho). $$
Let $B \sim \mathrm{Ber}(1/2)$ be a Bernoulli random variable with values in the set $\{+, -\}$, independent of $(X^{-}, X^{+}, Y^{-}, Y^{+}, \mathbf{U})$. Let $\bar{B}$ be the complement of $B$ in the set $\{+, -\}$ and define $(\tilde{X}, \tilde{U})$ by $\tilde{X} \triangleq X^{B}$ and $\tilde{U} \triangleq (B, Y^{\bar{B}}, \mathbf{U})$. Then
$$ s_\lambda(\tilde{X}, \varrho \mid \tilde{U}) = \tfrac{1}{2}\, s_\lambda(X^{-}, \varrho \mid Y^{+}, \mathbf{U}) + \tfrac{1}{2}\, s_\lambda(X^{+}, \varrho \mid Y^{-}, \mathbf{U}) \le V_\lambda(\varrho) + 2\epsilon, $$
where the inequality follows by assumption (24a).
Furthermore, $\mathbb{E}[\tilde{X}^2] = \mathbb{E}[X^2]$ and, by (54),
$$ \sigma^2_{\tilde{X} \mid \tilde{U} V} \le \tfrac{1}{2}\, \sigma^2_{X^{+} \mid \mathbf{U} V} + \tfrac{1}{2}\, \sigma^2_{X^{-} \mid \mathbf{U} V} \le D. $$
Next
$$ h(\tilde{Y} \mid \tilde{U}) = h\big( Y^{B} \mid B, Y^{\bar{B}}, \mathbf{U} \big) = \tfrac{1}{2}\, h(Y^{+} \mid Y^{-} \mathbf{U}) + \tfrac{1}{2}\, h(Y^{-} \mid Y^{+} \mathbf{U}), \qquad h(\tilde{X} \mid \tilde{U}) = h\big( X^{B} \mid B, Y^{\bar{B}}, \mathbf{U} \big) = \tfrac{1}{2}\, h(X^{+} \mid Y^{-} \mathbf{U}) + \tfrac{1}{2}\, h(X^{-} \mid Y^{+} \mathbf{U}). $$
Thus
$$ \begin{aligned} h(\tilde{Y} \mid \tilde{U}) - h(\tilde{X} \mid \tilde{U}) &= \tfrac{1}{2}\big[ h(Y^{+} \mid Y^{-} \mathbf{U}) + h(Y^{-} \mid Y^{+} \mathbf{U}) \big] - \tfrac{1}{2}\big[ h(X^{+} \mid Y^{-} \mathbf{U}) + h(X^{-} \mid Y^{+} \mathbf{U}) \big] \\ &= \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(Y^{-} \mid \mathbf{U}) + h(Y^{-} \mid Y^{+} \mathbf{U}) \big] - \tfrac{1}{2}\big[ h(X^{+} \mid Y^{-} \mathbf{U}) + h(X^{-} \mid Y^{+} \mathbf{U}) \big] \\ &= \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] - \tfrac{1}{2}\, I(Y^{-}; Y^{+} \mid \mathbf{U}) + \tfrac{1}{2}\big[ h(X^{+} \mid \mathbf{U}) + h(X^{-} \mid X^{+} \mathbf{U}) \big] - \tfrac{1}{2}\big[ h(X^{+} \mid Y^{-} \mathbf{U}) + h(X^{-} \mid Y^{+} \mathbf{U}) \big] \\ &\overset{(a)}{=} \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] - \tfrac{1}{2}\, I(Y^{-}; Y^{+} \mid \mathbf{U}) + \tfrac{1}{2}\, I(X^{+}; Y^{-} \mid \mathbf{U}) + \tfrac{1}{2}\big[ h(X^{-} \mid X^{+} Y^{+} \mathbf{U}) - h(X^{-} \mid Y^{+} \mathbf{U}) \big] \\ &\overset{(b)}{=} \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] + \tfrac{1}{2}\big[ h(Y^{-} \mid Y^{+} \mathbf{U}) - h(Y^{-} \mid X^{+} Y^{+} \mathbf{U}) \big] - \tfrac{1}{2}\, I(X^{-}; X^{+} \mid Y^{+} \mathbf{U}) \\ &= \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] + \tfrac{1}{2}\, I(Y^{-}; X^{+} \mid Y^{+} \mathbf{U}) - \tfrac{1}{2}\, I(X^{-}; X^{+} \mid Y^{+} \mathbf{U}) \\ &\overset{(c)}{=} \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] + \tfrac{1}{2}\, I(Y^{-}; X^{+} \mid Y^{+} \mathbf{U}) - \tfrac{1}{2}\, I(X^{-} Y^{-}; X^{+} \mid Y^{+} \mathbf{U}) \\ &= \tfrac{1}{2}\big[ h(Y^{+} Y^{-} \mid \mathbf{U}) - h(X^{+} X^{-} \mid \mathbf{U}) \big] - \tfrac{1}{2}\, I(X^{-}; X^{+} \mid Y^{-} Y^{+} \mathbf{U}) \\ &= h(Y \mid U) - h(X \mid U) - \tfrac{1}{2}\, I(X^{-}; X^{+} \mid Y^{-} Y^{+} \mathbf{U}). \end{aligned} \qquad \text{(73)} $$
Here,
(a) follows since $X^{-} - (X^{+}, \mathbf{U}) - Y^{+}$ forms a Markov chain,
(b) follows since $Y^{-} - (X^{+}, \mathbf{U}) - Y^{+}$ forms a Markov chain, and
(c) follows since $Y^{-} - (X^{-}, Y^{+}, \mathbf{U}) - X^{+}$ forms a Markov chain.
The combination of identity (73) with inequality (24b) proves (26).
Since $\tilde{X} \in \mathbb{R}$ while $\tilde{U} \in \{+, -\} \times \mathbb{R} \times \mathcal{U} \times \mathcal{U}$, a dimensionality-reduction argument as in Remark 2 establishes the existence of a law $P_{XU}$, where $U$ has finite support, such that (25) and (26) are fulfilled; hence, $(U, X)$ is in the feasible set.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Wyner, A.D.; Ziv, J. The rate-distortion function for source coding with side information at the receiver. IEEE Trans. Inform. Theory 1976, IT-22, 1–11.
  2. Steinberg, Y.; Merhav, N. On successive refinement for the Wyner-Ziv problem. IEEE Trans. Inform. Theory 2004, 50, 1636–1654.
  3. Bross, S.I.; Weissman, T. On successive refinement for the Wyner-Ziv problem with partially cooperating decoders. In Proceedings of the 2008 IEEE International Symposium on Information Theory (ISIT), Toronto, ON, Canada, 6–11 July 2008.
  4. Weissman, T.; Gamal, A.E. Source coding with limited-look-ahead side information at the decoder. IEEE Trans. Inform. Theory 2006, 52, 5218–5239.
  5. Maor, A.; Merhav, N. On successive refinement with causal side information at the decoders. IEEE Trans. Inform. Theory 2008, 54, 332–343.
  6. Bross, S.I. Scalable source coding with causal side information and a causal helper. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020.
  7. Geng, Y.; Gohari, A.; Nair, C.; Yu, Y. The capacity region of classes of product broadcast channels. arXiv 2011, arXiv:1105.5438.
  8. Geng, Y.; Nair, C. The capacity region of the two-receiver Gaussian vector broadcast channel with private and common messages. IEEE Trans. Inform. Theory 2014, IT-60, 2087–2104.
  9. Courtade, T.A. A strong entropy power inequality. IEEE Trans. Inform. Theory 2018, IT-64, 2173–2192.
  10. Bernstein, S.N. On a property characteristic of the normal law. Trudy Leningrad. Polytech. Inst. 1941, 3, 21–22.
  11. Bryc, W. The Normal Distribution: Characterizations with Applications; Springer: New York, NY, USA, 2012; Volume 100.
  12. Choudhuri, C.; Kim, Y.-H.; Mitra, U. Causal state communication. IEEE Trans. Inform. Theory 2013, 59, 3709–3719.
  13. Weissman, T.; Merhav, N. On causal source codes with side information. IEEE Trans. Inform. Theory 2005, 51, 4003–4013.
  14. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley and Sons: New York, NY, USA, 1991.
  15. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic: New York, NY, USA, 1981.
  16. Gamal, A.E.; Kim, Y.H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2012.
Figure 1. The causal listening-helper model.
