Article

Variable-Length Coding with Zero and Non-Zero Privacy Leakage †

Division of Information Science and Engineering, KTH Royal Institute of Technology, 100-44 Stockholm, Sweden
* Author to whom correspondence should be addressed.
† This work was presented in part at the 2023 IEEE Workshop on Information Forensics and Security (WIFS), Nuremberg, Germany, 4–7 December 2023.
Entropy 2025, 27(2), 124; https://doi.org/10.3390/e27020124
Submission received: 13 December 2024 / Revised: 3 January 2025 / Accepted: 7 January 2025 / Published: 24 January 2025
(This article belongs to the Special Issue Information-Theoretic Security and Privacy)

Abstract: A private compression design problem is studied, where an encoder observes useful data Y, wishes to compress them using a variable-length code, and communicates them through an unsecured channel. Since Y is correlated with the private attribute X, the encoder uses a private compression mechanism to design an encoded message C and sends it over the channel. An adversary is assumed to have access to the output of the encoder, i.e., C, and tries to estimate X. Furthermore, it is assumed that both the encoder and decoder have access to a shared secret key W. In this work, the design goal is to encode the message C with the minimum possible average length that satisfies certain privacy constraints. We consider two scenarios: 1. zero privacy leakage, i.e., perfect privacy (secrecy); 2. non-zero privacy leakage, i.e., a non-perfect privacy constraint. Considering the perfect privacy scenario, we first study two different privacy mechanism design problems and find upper bounds on the entropy of the optimizers by solving a linear program. We use the obtained optimizers to design C. We strengthen the existing bounds in two cases: 1. |X| ≥ |Y|; 2. the joint distribution of (X, Y) belongs to a specific class. In particular, considering the second case, we use two-part construction coding to achieve the upper bounds. Furthermore, in a numerical example, we study the obtained bounds and show that they can improve existing results. Finally, we strengthen the obtained bounds using the minimum entropy coupling concept and a greedy entropy-based algorithm. Considering the non-perfect privacy scenario, we find upper and lower bounds on the average length of the encoded message using different privacy metrics and study them in special cases. For achievability, we use two-part construction coding and extended versions of the functional representation lemma. Lastly, in an example, we show that the bounds can be asymptotically tight.

1. Introduction

In this work, the random variable (RV) Y denotes the useful data and is correlated with the private data denoted by RV X. An encoder wishes to compress Y and communicates it to a user over an unsecured channel. The encoded message is denoted by the RV C . As shown in Figure 1, it is assumed that an adversary has access to the encoded message C , and wants to extract information about X. Moreover, it is assumed that the encoder and decoder have access to a shared secret key denoted by the RV W with size M. The goal is to design an encoded message C , which compresses Y, using a variable-length code with the minimum possible average length that satisfies certain privacy constraints. We utilize techniques used in privacy mechanism and compression design problems and combine them to build such a code C . In this paper, we introduce an approach and apply it to a lossless compression problem.
In this paper, we consider two scenarios, where the privacy leakage is either zero or a bounded leakage is allowed. We refer to these as the perfect and non-perfect privacy scenarios. In the perfect privacy scenario, considering two different cases, we extend previously existing results [1,2]. In the non-perfect privacy scenario, we extend the results of [1,2] by generalizing the perfect privacy constraint and allowing non-zero leakage. We use different privacy leakage constraints, e.g., requiring the mutual information between X and C to equal ϵ, a strong privacy constraint, and a bounded per-letter privacy criterion.

1.1. Related Works

Recently, the compression and privacy mechanism design problems have received increased attention [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]. Specifically, in [3], the notion of perfect secrecy was introduced by Shannon, where the public data and private data are statistically independent. Equivocation as a measure of information leakage for information theoretic security was used in [4,5,6]. A rate-distortion approach to information theoretic secrecy was studied in [11]. Maximal leakage was introduced in [12] and used in [13] for the Shannon cipher system. Furthermore, bounds on the privacy–utility trade-off have been derived. The fundamental limits of the privacy–utility trade-off when measuring privacy leakage using estimation-theoretic guarantees were studied in [7]. A privacy mechanism design with total variation as a privacy measure was studied in [8]. A source coding problem with secrecy was studied in [10]. The concept of a privacy funnel was introduced in [14], where the privacy–utility trade-off was studied considering the log-loss as privacy measure and a distortion measure for utility. In both [6,10], privacy–utility trade-offs considering equivocation and expected distortion as measures of privacy and utility were studied.
In [15], the problem of privacy–utility trade-off considering mutual information as measures of both privacy and utility was studied. It was shown that, under perfect privacy, the privacy mechanism design problem can be obtained using a linear program. This was generalized in [25] considering a privacy–utility trade-off with a rate constraint. Moreover, in [15], it was shown that non-zero information about useful data Y can be revealed if the kernel (leakage matrix) between private and useful data is not invertible. In [16], the work of [15] was extended by relaxing the perfect privacy assumption, allowing some small bounded privacy leakage. Specifically, privacy mechanisms with a per-letter (point-wise) privacy criterion considering an invertible kernel were designed, allowing a small leakage. This result was extended to a non-invertible leakage matrix in [17].
In [1], an approach to partial secrecy called secrecy by design was introduced and applied to two information processing problems: privacy mechanism design and lossless compression. For the privacy design problem, bounds on the privacy–utility trade-off were derived using the functional representation lemma. These results were obtained under a perfect privacy (secrecy) assumption. In [18], the privacy problems considered in [1] were extended by relaxing the perfect secrecy constraint and allowing some leakages. In [20], a privacy–utility trade-off with two different per-letter (point-wise) privacy constraints was studied. The fundamental limits of private data disclosure were studied in [26], where the goal was to minimize the leakage under the utility constraints with non-specific tasks, and it was shown that, under the assumption that the private data are an element-wise deterministic function of useful data, the main problem can be reduced to multiple privacy funnel (PF) problems. Furthermore, the exact solution to each problem was obtained. The concept of lift was studied in [27], which represents the likelihood ratio between the posterior and prior beliefs concerning sensitive features within a dataset. Further privacy leakage measures, such as the differential privacy introduced in [28], ( ϵ , δ ) -differential privacy [29], local differential privacy [30,31,32,33], maximal leakage [12], and lift [27], have been used in the literature.
Moreover, in [1], the problems of fixed-length and variable-length compression were studied, and upper and lower bounds on the average length of an encoded message were derived. These results were obtained under the assumption that the private data were independent of the encoded message, which corresponds to perfect privacy. The minimum entropy coupling concept and a greedy entropy-based algorithm were studied in [34,35,36]. More specifically, in [36], an information spectrum converse was derived and used to find a lower bound on the minimum entropy problem.

1.2. Our Contributions

In the perfect privacy scenario, we consider the problem in [1], where for the problem of lossless data compression, strong information theoretic guarantees are provided and fundamental limits are characterized when the private data are a deterministic function of the useful data. In this paper, we improve the bounds obtained in [1].
To this end, we combine the privacy design techniques used in [18], which are based on extended versions of the functional representation lemma (FRL) and strong functional representation lemma (SFRL), with the lossless data compression design in [1]. We find lower and upper bounds on the average length of the encoded message C and study them in different scenarios. Considering a specific set of joint distributions for (X, Y), we propose an algorithm with low complexity to find bounds on the optimizer of the privacy problems in [18] with zero leakage. We use the obtained results to design C, which achieves tighter bounds compared to [1]. Furthermore, when |X| ≥ |Y|, we propose a new code that improves on the bounds in [1]. We propose a new design and strengthen the obtained bounds using the minimum entropy coupling concept and a greedy entropy-based algorithm. Finally, in a numerical example, we study the bounds and compare them with [1].
In the non-perfect privacy scenario, our problem is closely related to [1], where we generalize the variable-length lossless compression problem considered in [1] by removing the assumption that X and C are independent, i.e., I ( X ; C ( Y , W ) ) = 0 , and therefore allowing a small leakage.
We find the lower and upper bounds on the average length of the encoded message C and study them in different scenarios. For achievability, we use two-part construction coding. In an example, we show that the obtained bounds can be asymptotically tight. Furthermore, in the case of perfect privacy, we improve the existing bounds of [1] for the case where X = (X_1, X_2) and X_2 is a deterministic function of X_1. The conference versions of this paper can be found in [22,23].
Our contribution can be summarized as follows:
(i) In Section 2, we define private variable-length coding problems with zero and non-zero leakage. We propose a method to deal with the privacy concerns and present privacy–utility trade-offs considering the zero and non-zero leakage scenarios.
(ii) In Section 3, we provide a brief overview of the privacy problems in the literature, two essential lemmas, and the minimum entropy coupling concept.
(iii) In Section 4, we find the upper and lower bounds on the trade-offs proposed in Section 2 considering the zero and non-zero leakage scenarios. In both scenarios, to find upper bounds, we use the two-part construction coding introduced in [1], which is based on extended versions of the FRL and the SFRL. We obtain a simple private compression mechanism design that can be optimal in specific scenarios. Moreover, we discuss and compare the obtained bounds, with each other and with the literature, and investigate their tightness. The results regarding the tightness of the bounds are extended using the concept of the minimum information coupling problem in the zero leakage scenario.
(iv) In Section 5, we present an application in which our method can be applied. Specifically, we consider a cache-aided network and discuss how our results in this paper can be used.
(v) In Section 6, we present an experiment considering the MNIST dataset and we compare our method with previous results in the literature.
Finally, the paper is concluded in Section 7.

2. System Model and Problem Formulation

Let P X Y denote the joint distribution of discrete random variables X and Y defined on alphabets X and Y . We assume that the cardinality | X | is finite and | Y | is finite or countably infinite. We represent P X Y by a matrix defined on R | X | × | Y | and marginal distributions of X and Y by vectors P X and P Y defined on R | X | and R | Y | given by the row and column sums of P X Y . The relation between X and Y is given by the leakage matrix P X | Y defined on R | X | × | Y | . The shared secret key is denoted by the discrete RV W defined on { 1 , , M } and is assumed to be accessible by both the encoder and decoder. Furthermore, we assume that W is uniformly distributed and is independent of X and Y. A prefix-free code with variable-length and shared secret key of size M is a pair of mappings:
(encoder) C : X × Y × {1, …, M} → {0, 1}*,   (decoder) D : {0, 1}* × {1, …, M} → Y.
The output of the encoder C ( Y , W ) describes the encoded message. The variable-length code ( C , D ) is lossless if
P ( D ( C ( X , Y , W ) , W ) = Y ) = 1 .

2.1. Non-Zero Privacy Leakage (Non-Perfect Privacy)

Similarly to [1], we define ϵ -private, strongly ϵ -private, and point-wise ϵ -private codes. The code ( C , D ) is ϵ-private if
I ( C ( X , Y , W ) ; X ) = ϵ .
Moreover, the code ( C , D ) is bounded ϵ-private if
I(C(X, Y, W); X) ≤ ϵ.
Let ξ be the support of C(X, W, Y). For any c ∈ ξ, let L(c) be the length of the codeword. The code (C, D) is of (α, M)-variable-length if
E(L(C(X, Y, w))) ≤ α, ∀ w ∈ {1, …, M}.
Finally, let us define the sets H_ϵ(α, M) and H_ϵ^b(α, M) as follows: H_ϵ(α, M) ≜ {(C, D): (C, D) is ϵ-private and of (α, M)-variable-length}, and H_ϵ^b(α, M) ≜ {(C, D): (C, D) is bounded ϵ-private and of (α, M)-variable-length}. The private compression design problems can then be stated as follows:
L(P_XY, M, ϵ) = inf_{(C, D): (C, D) ∈ H_ϵ(α, M)} α,
L_b(P_XY, M, ϵ) = inf_{(C, D): (C, D) ∈ H_ϵ^b(α, M)} α.
Clearly, we have L_b(P_XY, M, ϵ) ≤ L(P_XY, M, ϵ) and L_b(P_XY, M, ϵ) ≤ L(P_XY, M, 0).
Remark 1.
Considering ϵ = 0 , both (5) and (6) lead to the privacy–compression rate trade-off studied in [1]. In this paper, we generalize this trade-off by considering a non-zero ϵ .

2.2. Zero Privacy Leakage (Perfect Privacy)

We define a perfectly-private code. The code ( C , D ) is perfectly-private if
I ( C ( X , Y , W ) ; X ) = 0 .
Let ξ be the support of C(X, W, Y), and for any c ∈ ξ, let L(c) be the length of the codeword. The code (C, D) is (α, M)-variable-length if (4) holds. Finally, let us define the set H(α, M) as follows:
H(α, M) ≜ {(C, D): (C, D) is perfectly-private and (α, M)-variable-length}. The private compression design problem can be stated as follows:
L_0(P_XY, M) = inf_{(C, D): (C, D) ∈ H(α, M)} α.
Furthermore, let us recall the privacy mechanism design problems considered in [19] with zero leakage as follows:
g_0(P_XY) ≜ max_{P_{U|Y}: X − Y − U, I(U;X) = 0} I(Y;U),
h_0(P_XY) ≜ max_{P_{U|Y,X}: I(U;X) = 0} I(Y;U).
Finally, we define a set of joint distributions P̂_XY as follows:
P̂_XY ≜ {P_XY: g_0(P_XY) = h_0(P_XY)}.
Remark 2.
The problem (8) was studied in [1]. Here, we improve the bounds in [1] in two different cases and obtain more bounds using the concept of the minimum information coupling problem.

3. Preliminaries and Overview of the Privacy Problems

In this section, we provide a brief overview of a few related privacy problems. We then provide essential lemmas and definitions corresponding to the extended functional representation lemma, the extended strong functional representation lemma, and the minimum information coupling problem. Mutual information has been widely used as a privacy leakage measure [1,2,14,15,18,21,26]. Specifically, in [21], the following problem was addressed
g_ϵ(P_XY) ≜ sup_{P_{U|Y}: X − Y − U, I(U;X) ≤ ϵ} I(Y;U),
where in ([21], Lemma 1), lower and upper bounds on g_ϵ(P_XY) were derived. Considering perfect privacy, i.e., ϵ = 0, g_0(P_XY) can be obtained by solving a linear program ([15], Theorem 1). This implies that the optimal mapping is the solution to a linear program. Furthermore, it has been shown that, to find the optimal privacy-preserving mapping, it is sufficient to consider U such that |U| ≤ null(P_X|Y) + 1. However, solving the linear program suggested by [15] can become challenging when the size of Y or X increases. Necessary and sufficient conditions for achieving non-zero utility when the private data are hidden, i.e., g_0(P_XY) > 0, were derived in [37,38]. It was shown that g_0(P_XY) > 0 if and only if the rows of P_X|Y are linearly dependent. This result was demonstrated and generalized in [15], e.g., see ([15], Proposition 1). Moreover, considering an observable private data scenario, necessary and sufficient conditions for attaining non-zero utility, i.e., h_0(P_XY) > 0, were derived in ([1], Theorem 5), where
h_ϵ(P_XY) ≜ sup_{P_{U|Y,X}: I(U;X) ≤ ϵ} I(Y;U).
It has been shown that h 0 ( P X Y ) > 0 if and only if Y is not a deterministic function of X, i.e., H ( Y | X ) > 0 .
Next, we present a brief overview of the linear program proposed by [15] that attains g_0(P_XY). To do so, let β be a vector in R^{|Y|}. It is shown that β ∈ Null(P_X|Y) if and only if β ∈ Null(M), where M ∈ R^{|X| × |Y|} is constructed as follows. Let V be the matrix of right singular vectors of P_X|Y, i.e., P_X|Y = U Σ V^T and V = [v_1, v_2, …, v_{|Y|}]; then M is defined as
M ≜ [v_1, v_2, …, v_{rank(P_X|Y)}]^T.
Furthermore, if β ∈ Null(P_X|Y), then 1^T β = 0. As shown in [15], for every u ∈ U, if the Markov chain X − Y − U holds and I(X;U) = 0, then the vector P_{Y|U=u} belongs to the following convex polytope S:
S = {y ∈ R^{|Y|} : M y = M P_Y, y ≥ 0}.
Using this, we have
X − Y − U and I(X;U) = 0 ⟺ P_{Y|U=u} ∈ S, ∀u,
which results in the following equivalency
min_{P_{U|Y}: X − Y − U, I(X;U) = 0} H(Y|U) = min_{P_U, P_{Y|U=u} ∈ S, ∀u ∈ U: Σ_u P_U(u) P_{Y|U=u} = P_Y} H(Y|U),
where P_U, defined on R^{|U|}, is the marginal distribution of U. Finally, let P_{Y|U=u}, u ∈ U, be the minimizer of H(Y|U) over the set {P_{Y|U=u} ∈ S, u ∈ U : Σ_u P_U(u) P_{Y|U=u} = P_Y}; then, for every u ∈ U, P_{Y|U=u} must belong to the extreme points of S. Furthermore, it is sufficient to consider null(P_X|Y) + 1 extreme points of S. Using the previous result, the solution to the minimization in (15) can be obtained in two phases: in phase one, the extreme points of the set S are identified, while in phase two, proper weights over these extreme points are obtained to minimize the objective function. As argued in [15], the extreme points of S are the basic feasible solutions of S. Basic feasible solutions of S can be found in the following manner. Let Ω be the set of indices which correspond to rank(P_X|Y) linearly independent columns of M, i.e., |Ω| = rank(P_X|Y) and Ω ⊆ {1, …, |Y|}. Let M_Ω ∈ R^{rank(P_X|Y) × rank(P_X|Y)} be the sub-matrix of M with columns indexed by the set Ω. Then, if all elements of the vector M_Ω^{-1} M P_Y are non-negative, the vector V_Ω ∈ R^{|Y|}, which is defined in the following, is a basic feasible solution of S. Assume that Ω = {ω_1, …, ω_{rank(P_X|Y)}}, where ω_i ∈ {1, …, |Y|} and all elements are arranged in increasing order. The ω_i-th element of V_Ω is defined as the i-th element of M_Ω^{-1} M P_Y, i.e., for 1 ≤ i ≤ rank(P_X|Y), we have
V_Ω(ω_i) = [M_Ω^{-1} M P_Y](i).
The other elements of V Ω are set to zero. Thus, by using the same arguments as in [15], we obtain the following corollary.
Corollary 1.
Let S* denote the set of extreme points of S. If all elements of the vector M_Ω^{-1} M P_Y are non-negative, then V_Ω ∈ S* ⊆ S.
In the next proposition, we show two properties of each vector inside S .
Proposition 1.
Let Ω ⊆ {1, …, |Y|} with |Ω| = rank(P_X|Y). For every such Ω, we have 1^T M_Ω^{-1} M P_Y = 1.
Proof. 
Consider the set S_1 = {y ∈ R^{|Y|} : M y = M P_Y}. The elements of any y ∈ S_1 sum to one: since M(y − P_Y) = 0 implies y − P_Y ∈ Null(P_X|Y), we obtain 1^T(y − P_Y) = 0, i.e., 1^T y = 1. The basic solutions of S_1 are the vectors W_Ω defined as follows: let Ω = {ω_1, …, ω_{rank(P_X|Y)}}, where ω_i ∈ {1, …, |Y|}; then W_Ω(ω_i) = [M_Ω^{-1} M P_Y](i), and the other elements of W_Ω are zero. Thus, the sum of all elements of M_Ω^{-1} M P_Y equals one, since each element of S_1 sums to one.  □
After finding all extreme points of the set S, for the second phase, we proceed as follows. Assume that all extreme points of the set S are denoted by α_1, α_2, …, α_K, which were obtained in the first phase. Then, using the equivalency in (15), we have
g_0(P_XY) = H(Y) − min_{β ∈ R^K, β ≥ 0, [α_1 α_2 ⋯ α_K] β = P_Y} [H(α_1), H(α_2), …, H(α_K)] β,
where H ( α i ) is the entropy of the distribution vector α i .
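To make the two-phase procedure above concrete, the following Python sketch enumerates the basic feasible solutions of S and then solves the weighting linear program of the second phase. It is only an illustrative implementation with our own naming: it uses P_X|Y directly in place of M, which is equivalent here since the two matrices share the same null space, and the test input is the distribution of Example 2 below. Note that the LP may admit several optimal weightings β, so the entropy of the returned optimizer can differ from the particular solution reported in Example 2.

import itertools
import numpy as np
from scipy.optimize import linprog

def entropy(p):
    """Shannon entropy in bits of a pmf vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-(p * np.log2(p)).sum())

def g0_via_linear_program(P_XgY, P_Y):
    """Two-phase solution of g_0(P_XY) following [15]:
    phase 1 enumerates extreme points of S = {y >= 0 : P_X|Y y = P_X},
    phase 2 finds non-negative weights beta with sum_k beta_k alpha_k = P_Y
    minimizing sum_k beta_k H(alpha_k)."""
    P_XgY = np.asarray(P_XgY, dtype=float)
    P_Y = np.asarray(P_Y, dtype=float)
    P_X = P_XgY @ P_Y                      # marginal of X
    r = np.linalg.matrix_rank(P_XgY)
    q = P_XgY.shape[1]

    # Phase 1: basic feasible solutions (candidate extreme points of S).
    extreme = []
    for cols in itertools.combinations(range(q), r):
        sub = P_XgY[:, cols]
        if np.linalg.matrix_rank(sub) < r:
            continue
        sol, _, *_ = np.linalg.lstsq(sub, P_X, rcond=None)
        if np.linalg.norm(sub @ sol - P_X) > 1e-9 or np.any(sol < -1e-9):
            continue
        v = np.zeros(q)
        v[list(cols)] = np.clip(sol, 0.0, None)
        if not any(np.allclose(v, e) for e in extreme):
            extreme.append(v)

    # Phase 2: LP over the weights beta placed on the extreme points.
    A_eq = np.column_stack(extreme)        # columns are alpha_1, ..., alpha_K
    c = np.array([entropy(a) for a in extreme])
    lp = linprog(c, A_eq=A_eq, b_eq=P_Y, bounds=(0, None), method="highs")
    beta = lp.x
    g0 = entropy(P_Y) - lp.fun             # g_0 = H(Y) - min H(Y|U)
    HU = entropy(beta[beta > 1e-9])        # entropy of the returned optimizer U
    return g0, HU

if __name__ == "__main__":
    # Test input: the joint distribution of Example 2 below (used only for illustration).
    P_XgY = [[1, 1, 1, 0, 0, 0],
             [0, 0, 0, 1, 1, 1]]
    P_Y = [1/8, 2/8, 3/8, 1/8, 1/16, 1/16]
    g0, HU = g0_via_linear_program(P_XgY, P_Y)
    print(f"g_0(P_XY) = {g0:.4f} bits, H(U) of the returned optimizer = {HU:.4f} bits")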
In this work, considering the zero leakage scenario, to design the code, we use the properties of the privacy problems defined in (12) and (13).
Next, we recall two important lemmas from [18].
Lemma 1.
(Extended Functional Representation Lemma (EFRL)): For a pair of RVs (X, Y) distributed according to P_XY supported on alphabets X and Y, respectively, where |X| is finite and |Y| is finite or countably infinite, and any 0 ≤ ϵ < I(X;Y), there exists a RV U defined on U such that the leakage between X and U is equal to ϵ, i.e., we have
I ( U ; X ) = ϵ ,
Y is a deterministic function of ( U , X ) , i.e., we have
H ( Y | U , X ) = 0 ,
and |U| ≤ [|X|(|Y| − 1) + 1][|X| + 1].
Lemma 2.
(Extended Strong Functional Representation Lemma (ESFRL)): For a pair of RVs (X, Y) distributed according to P_XY supported on alphabets X and Y, respectively, where |X| is finite and |Y| is finite or countably infinite with I(X;Y) < ∞, and any 0 ≤ ϵ < I(X;Y), there exists a RV U defined on U such that the leakage between X and U is equal to ϵ, i.e., we have
I ( U ; X ) = ϵ ,
Y is a deterministic function of ( U , X ) , i.e., we have
H ( Y | U , X ) = 0 ,
I ( X ; U | Y ) can be upper bounded as follows:
I(X;U|Y) ≤ α H(X|Y) + (1 − α) log(I(X;Y) + 1) + 4,
and |U| ≤ [|X|(|Y| − 1) + 2][|X| + 1], where α = ϵ/H(X).
The main idea in constructing U, which satisfies the conditions in Lemmas 1 and 2, is to add a randomized response to the output of the FRL and SFRL, respectively. The output refers to a random variable which is produced by the FRL or the SFRL and satisfies the corresponding constraints. In this work, considering the non-zero leakage scenario, we utilize the techniques used in Lemmas 1 and 2 to design the code. In [39], randomization techniques similar to those in Lemmas 1 and 2 were used to find fair and private representations of a dataset, leading to lower bounds. Furthermore, this randomization technique has also been applied in the design of differentially private mechanisms [40,41,42,43]. A similar randomization technique was used in [22,23] to address private compression design problems. Next, we recall the important results regarding the upper and lower bounds on the minimum entropy coupling obtained in [34,35,36]. First, we recall the definition of the minimum entropy coupling problem using ([36], Def. 2).
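The leakage effect of this randomized response can be checked numerically. The following sketch (toy distributions and helper names are our own) takes an auxiliary variable Ubar independent of X, reveals X alongside Ubar with probability α = ϵ/H(X) and an erasure symbol otherwise, and verifies that U = (Ubar, Z) satisfies I(U;X) = αH(X) = ϵ. It only illustrates how the randomization pins the leakage at ϵ; it is not the full EFRL construction of Lemma 1, where Ubar is the FRL output and therefore also guarantees H(Y|U,X) = 0.

import numpy as np

def mutual_information(P_AB):
    """I(A;B) in bits from a joint pmf matrix P_AB."""
    P_AB = np.asarray(P_AB, dtype=float)
    P_A = P_AB.sum(axis=1, keepdims=True)
    P_B = P_AB.sum(axis=0, keepdims=True)
    mask = P_AB > 0
    return float((P_AB[mask] * np.log2(P_AB[mask] / (P_A @ P_B)[mask])).sum())

# Toy private data X and an auxiliary Ubar independent of X (hypothetical example).
P_X = np.array([0.5, 0.3, 0.2])
P_Ubar = np.array([0.6, 0.4])
H_X = -(P_X * np.log2(P_X)).sum()
eps = 0.4                       # target leakage, must satisfy 0 <= eps <= H(X)
alpha = eps / H_X

# U = (Ubar, Z) with Z = X w.p. alpha and Z = erasure symbol w.p. 1 - alpha.
# Joint pmf over (X, Ubar, Z); Z takes values {0, 1, 2, erasure} -> indices 0..3.
P = np.zeros((3, 2, 4))
for x in range(3):
    for u in range(2):
        P[x, u, x] += P_X[x] * P_Ubar[u] * alpha        # Z reveals X
        P[x, u, 3] += P_X[x] * P_Ubar[u] * (1 - alpha)  # Z is the erasure symbol

# Leakage I(U;X) with U = (Ubar, Z): flatten the (Ubar, Z) pair into one index.
P_X_U = P.reshape(3, 8)
print(f"I(U;X) = {mutual_information(P_X_U):.4f} bits, target eps = {eps:.4f}")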
Definition 1.
(minimum entropy coupling) For a set of PMFs {P_1, …, P_M}, the minimum Shannon entropy coupling is
H(P_1, …, P_M) ≜ inf_{P_{Y_1,…,Y_M}: Y_i ∼ P_i, ∀i} H(Y_1, …, Y_M),
where the infimum is taken over all joint distributions P_{Y_1,…,Y_M} whose marginal distribution P_{Y_i} equals P_i for all i ∈ {1, …, M}.
Similarly to [36], for a given joint distribution P_XY, let the minimum entropy of a functional representation of (X, Y) be defined as
H(P_XY) ≜ inf_{U: H(Y|X,U) = 0, I(X;U) = 0} H(U).
Remark 3.
As shown in ([36], Lemma 1), the minimum entropy functional representation and the minimum entropy coupling are related functions. In more detail, H ( P X Y ) equals the minimum entropy coupling of the set of PMFs { P Y | X = x 1 , , P Y | X = x n } , where X = { x 1 , , x n } .
Let G_S be the output of the greedy entropy-based algorithm proposed in ([34], Section 3), i.e., H(P_XY) ≤ H(G_S). More specifically, the corresponding algorithm aims to solve (19) but does not achieve the optimal solution in general. Next, we recall a result obtained in [35], which shows that G_S is optimal within log e/e ≈ 0.53 bits when |X| = 2 and within (1 + log e)/2 ≈ 1.22 bits when |X| > 2. Let U achieve the optimal solution of (19), i.e., H(U) = H(P_XY).
Theorem 1
([35], Th. 3.4, Th. 4.1, Th. 4.2). Let (X, Y) ∼ P_XY and have finite alphabets. When X is binary, we have
H(Profile) ≤ H(U) ≤ H(G_S) ≤ H(Profile) + log e/e ≈ H(Profile) + 0.53.
Moreover, for |X| > 2, we have
H(Profile) ≤ H(U) ≤ H(G_S) ≤ H(Profile) + (1 + log e)/2 ≈ H(Profile) + 1.22.
Here, Profile corresponds to the profile method proposed in ([35], Section 3).
Since finding the optimal solution for (19) is challenging, Theorem 1 asserts that G S can be a good approximation for U that achieves (19). Next, we present the results on lower bounds on H ( P X Y ) obtained in a parallel work [36]. The lower bounds were derived using information spectrum and majorization concepts.
Theorem 2
([36], Corollary 2, Th. 2). Let (X, Y) ∼ P_XY and have finite alphabets. Let α = 1 in ([36], Corollary 2, Th. 2). We have
H(∧_{x∈X} P_{Y|x}) ≤ H(Q) ≤ H(U),
where ∧ corresponds to the greatest lower bound with respect to majorization and Q is defined in ([36], Lemma 3).
Remark 4.
In contrast with [35], the lower bounds in [36] were obtained using Rényi entropy in (19). In this work, we consider Shannon entropy, which is a special case of Rényi entropy.
Remark 5.
As argued in ([36], Remark 1), for α = 1 the (largest) lower bounds obtained in [35,36] match. Thus, using Theorem 1, for binary X, we have
H(Q) ≤ H(U) ≤ H(G_S) ≤ H(Q) + log e/e,
and for | X | > 2 ,
H(Q) ≤ H(U) ≤ H(G_S) ≤ H(Q) + (1 + log e)/2.
Moreover, in some cases, the lower bound H ( Q ) is tight, e.g., see ([36], Example 2).
As discussed in [36], Q can be obtained by a greedy construction. To do so, let Q = (q_1, q_2, …) with q_1 ≥ q_2 ≥ …, where q_i = P(Q = q_i). Let P_Y|X be a matrix with columns P_{Y|X=x}, where each column is a conditional distribution vector, and re-order each column so that its entries are in descending order. Let q_1 = min_{x∈X} {max_{y∈Y} P_{Y|X}(y|x)}; in other words, we choose the smallest number in the first row of the matrix P_Y|X. Next, we subtract q_1 from the first row, reorder each column, and update the matrix. We then choose the smallest number from the first row of the updated matrix and denote it by q_2. We continue this procedure until the summation of the q_i reaches one. For an example, see ([36], Example 1).
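The greedy construction of Q takes only a few lines of code. The sketch below (function names are our own) implements the procedure just described; as a check, for the conditional distributions of Example 3 in Section 4.3 it returns P_Q = [1/2, 1/4, 1/6, 1/12] and H(Q) ≈ 1.73 bits.

import numpy as np

def greedy_Q(P_YgX, tol=1e-12):
    """Greedy construction of Q from [36]: the columns of P_YgX are the
    conditional pmfs P_{Y|X=x}; at each step take the smallest of the
    per-column maxima, record it, and subtract it from each column's maximum."""
    cols = [np.sort(np.asarray(c, dtype=float))[::-1]
            for c in np.asarray(P_YgX, dtype=float).T]
    q = []
    while sum(q) < 1.0 - tol:
        qi = min(c[0] for c in cols)              # smallest entry of the (sorted) first row
        q.append(qi)
        for c in cols:                            # subtract it from each column's largest entry ...
            c[0] -= qi
        cols = [np.sort(c)[::-1] for c in cols]   # ... and re-sort every column
    return np.array(q)

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-(p * np.log2(p)).sum())

if __name__ == "__main__":
    # P_{Y|X} of Example 3 below (columns indexed by x); used only for illustration.
    P_YgX = np.array([[1/6, 0],
                      [1/3, 0],
                      [1/2, 0],
                      [0, 1/2],
                      [0, 1/4],
                      [0, 1/4]])
    Q = greedy_Q(P_YgX)
    print("P_Q =", np.round(Q, 4), " H(Q) =", round(entropy(Q), 4), "bits")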
Finally, we recall the two-part construction coding used in [1]. Consider the scenario illustrated in Figure 1, where an encoder wishes to compress Y, using a variable-length code with the minimum possible average length that satisfies a zero leakage constraint, and communicates it to a user over an unsecured channel; an adversary has access to the encoded message C and wants to extract information about X. The encoder and decoder have access to a shared secret key, denoted by the discrete RV W defined on {1, …, M}, which is uniformly distributed and independent of X and Y.
Theorem 3
([1], Theorem 8). Let the pair of RVs (X, Y) be distributed according to P_XY and the shared secret key size be |X|, i.e., M = |X|. Then, we have
L_0(P_XY, |X|) ≤ Σ_{x∈X} H(Y|X=x) + 1 + log(|X|),
and if |Y| is finite, we have
L_0(P_XY, |X|) ≤ log(|X|(|Y| − 1) + 1) + log(|X|).
Finally, if X is a deterministic function of Y, we have
L_0(P_XY, |X|) ≤ log(|Y| − |X| + 1) + log(|X|).
Proof. 
Let W be the shared secret key with key size M = |X|, which is uniformly distributed over {1, …, M} = {1, …, |X|} and independent of (X, Y). As shown in Figure 2, first, the private data X are encoded using the shared secret key. Thus, we have
X̃ = X + W mod |X|.
Next, we show that X̃ has a uniform distribution over {1, …, |X|} and I(X; X̃) = 0. We have
H(X̃|X) = H(X + W|X) = H(W|X) = H(W) = log(|X|).
Furthermore, H(X̃|X) ≤ H(X̃), and combining this with (28), we obtain H(X̃|X) = H(X̃) = log(|X|). For encoding X̃, we use log(|X|) bits. Let C_1 denote the encoded X̃. Next, by using the construction in ([1], Lemma 2), let U be the output of the FRL. Thus, we have
(29) I(U;X) = 0,  (30) H(Y|U,X) = 0,  (31) H(U) ≤ Σ_x H(Y|X=x) + 1.
We encode U using any lossless code which uses at most H(U) + 1 bits and let C_2 be the encoded message. We send C = (C_1, C_2) over the channel. Thus, (25) can be obtained. Note that, as shown in Figure 3, on the decoder side, we first decode X using W, by adding |X| − W to X̃ (modulo |X|); then, Y is decoded using the fact that U satisfies H(Y|U,X) = 0. Next, we show that if W is independent of (X, U), we obtain I(C;X) = 0. We have
I(C;X) = I(C_1, C_2; X) = I(U, X̃; X) = I(U;X) + I(X̃;X|U) =^{(a)} I(X̃;X|U) = H(X̃|U) − H(X̃|X,U) = H(X̃|U) − H(X + W|X,U) =^{(b)} H(X̃|U) − H(W) =^{(c)} H(X̃) − H(W) = log(|X|) − log(|X|) = 0,
where (a) follows from I(U;X) = 0; (b) follows, since W is independent of (X, U); and (c) from the independence of U and X̃. The latter follows, since we have
0 ≤ I(X̃;X|U) = H(X̃|U) − H(W) =^{(i)} H(X̃|U) − H(X̃) ≤ 0.
Thus, X̃ and U are independent. Step (i) above follows from the fact that W and X̃ are uniformly distributed over {1, …, |X|}, i.e., H(W) = H(X̃). In summary, if we choose W independently of (X, U), the leakage to the adversary is zero.  □
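The first part of this construction is easy to verify numerically: with W uniform and independent of X, the padded symbol X̃ = X + W mod |X| is uniformly distributed and reveals nothing about X. The sketch below (toy distribution and helper names are our own) checks this; the FRL-based second part C_2 is not constructed here.

import numpy as np

def mutual_information(P_AB):
    """I(A;B) in bits from a joint pmf matrix."""
    P_AB = np.asarray(P_AB, dtype=float)
    P_A = P_AB.sum(axis=1, keepdims=True)
    P_B = P_AB.sum(axis=0, keepdims=True)
    mask = P_AB > 0
    return float((P_AB[mask] * np.log2(P_AB[mask] / (P_A @ P_B)[mask])).sum())

# Toy private data X on {0, ..., K-1} (0-indexed instead of {1, ..., |X|}).
P_X = np.array([0.7, 0.2, 0.1])
K = len(P_X)

# One-time pad: Xtilde = (X + W) mod K, with W uniform on {0, ..., K-1} and independent of X.
P_X_Xtilde = np.zeros((K, K))
for x in range(K):
    for w in range(K):
        P_X_Xtilde[x, (x + w) % K] += P_X[x] * (1.0 / K)

print("P_Xtilde =", P_X_Xtilde.sum(axis=0))                      # uniform: [1/3, 1/3, 1/3]
print("I(X; Xtilde) =", round(mutual_information(P_X_Xtilde), 12), "bits")   # 0

# Decoder side: knowing W, X is recovered as (Xtilde - W) mod K, i.e., by adding K - W.
x, w = 2, 1
xtilde = (x + w) % K
assert (xtilde + (K - w)) % K == x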

4. Main Results

Here, we find upper and lower bounds on (5), (6) and (8), and study them in different scenarios.

4.1. Zero Leakage Results

In this section, we derive upper and lower bounds on L_0(P_XY, M) defined in (8). We first study the properties of P̂_XY and provide a sufficient condition on P_XY so that it belongs to the corresponding set. We then find upper bounds on the entropy of the optimizers of g_0(P_XY) and h_0(P_XY). The obtained bounds help us to find upper bounds on L_0(P_XY, M). In other words, for P_XY ∈ P̂_XY, we use the solutions of g_0(P_XY) and h_0(P_XY) to design C; however, in [1], the design of the code was based on the functional representation lemma. Finally, we provide lower bounds on L_0(P_XY, M), study the bounds in a numerical example, and compare them with [1].
In the next lemma, let C ( X , Y ) denote the common information between X and Y, where the common information corresponds to the Wyner [44] or Gács-Körner [45] notions of common information.
Lemma 3.
If C(X, Y) = I(X;Y), then P_XY ∈ P̂_XY.
Proof. 
The proof is provided in ([18], Proposition 6) with ϵ = 0 and follows similar arguments as the proof of ([46], Theorem 2).  □
Corollary 2.
If X is a deterministic function of Y, then P_XY ∈ P̂_XY, since, in this case, we have C(X, Y) = I(X;Y). Furthermore, this result was also shown in ([18], Proposition 6).
In the next result, we provide the properties of the optimizers for g 0 ( P X Y ) and h 0 ( P X Y ) .
Lemma 4.
If g 0 ( P X Y ) = h 0 ( P X Y ) , then
g 0 ( P X Y ) = h 0 ( P X Y ) = H ( Y | X ) ,
and both g 0 ( P X Y ) and h 0 ( P X Y ) have the same set of optimizers. Furthermore, for any optimizer U , we have
(33) H ( Y | U , X ) = 0 , (34) I ( X ; U | Y ) = 0 , (35) I ( X ; U ) = 0 .
Finally, if U satisfies (33)–(35), then it is an optimizer for both g 0 ( P X Y ) and h 0 ( P X Y ) .
Proof. 
All the statements can be shown using ([1], Theorem 7) and the key equation as follows:
I(U;Y) = I(X;U) + H(Y|X) − I(X;U|Y) − H(Y|X,U).
 □
Remark 6.
The optimization problems in (9) and (10) do not have unique optimizers. For instance, let X = f ( Y ) , by using the constructions used in [1,47], we can attain the optimum value.
Next, we show that if P_XY ∈ P̂_XY, without loss of optimality, we can assume that |U| ≤ null(P_X|Y) + 1, where null(P_X|Y) corresponds to the dimension of the null space of P_X|Y.
Lemma 5.
For any P X Y , we have
g_0(P_XY) = max_{P_{U|Y}: X − Y − U, I(X;U) = 0, |U| ≤ null(P_X|Y) + 1} I(Y;U).
Furthermore, let P_XY ∈ P̂_XY. Then,
h_0(P_XY) = g_0(P_XY) (38) = max_{P_{U|Y}: X − Y − U, I(X;U) = 0, |U| ≤ null(P_X|Y) + 1} I(Y;U) (39) = max_{P_{U|Y,X}: I(X;U) = 0, |U| ≤ null(P_X|Y) + 1} I(Y;U).
Proof. 
The proof of (37) is provided in ([15], Theorem 1). It only remains to show the equality (38) = (39). Let U be an optimizer of (38). Using (37), it is also an optimizer of g_0(P_XY) and hence h_0(P_XY). Thus, (38) = (39). In other words, for P_XY ∈ P̂_XY, we can assume |U| ≤ null(P_X|Y) + 1 in both problems g_0(P_XY) and h_0(P_XY).  □
Before stating the next result, let us define a set U 1 ( P X Y ) and a function K ( P X Y ) as follows
(40) U_1(P_XY) ≜ {U: U satisfies (33), (34), (35)},  (41) K(P_XY) ≜ min_{U ∈ U_1(P_XY)} H(U).
Note that U_1(P_XY) can be empty for some joint distributions P_XY; however, by Lemma 4, when P_XY ∈ P̂_XY it is the set that contains all the optimizers satisfying g_0(P_XY) = h_0(P_XY). Furthermore, the function K(P_XY) gives the minimum entropy among all such optimizers.
Lemma 6.
Let P_XY ∈ P̂_XY. We have
K(P_XY) ≤ log(null(P_X|Y) + 1).
Proof. 
The proof follows from Lemma 5. Let U be any optimizer of g_0(P_XY) that satisfies |U| ≤ null(P_X|Y) + 1. We have K(P_XY) ≤ H(U) ≤ log(null(P_X|Y) + 1).  □
Remark 7.
Lemma 6 asserts that when P_XY ∈ P̂_XY, there exists a U that satisfies (33)–(35) and
H(U) ≤ log(null(P_X|Y) + 1).
Remark 8.
The upper bound derived in Lemma 6 can be significantly smaller than the upper bound obtained in ([1], Lemma 2). This helps us to find codes that have a smaller average length compared to [1]. Note that ([1], Lemma 2) asserts that there exists a U which satisfies (33) and (35), and
|U| ≤ |X|(|Y| − 1) + 1,
which results in
H(U) ≤ log(|X|(|Y| − 1) + 1).
In the next example, we compare (44) and (43).
Example 1.
Let Y be a deterministic function of X, i.e., Y = f(X), which yields P_XY ∈ P̂_XY and |Y| ≤ |X|. In this case, null(P_X|Y) ≤ |X| − 1. We have
null(P_X|Y) ≤ |X| − 1 < |X|(|Y| − 1),
which results in
( 43 ) < ( 44 ) .
Furthermore, as |X| and |Y| grow, the gap between (43) and (44) becomes larger. To see another example, let X and Y be represented by (X′, V) and (Y′, V), where X′ and Y′ are conditionally independent given V; then, by using [48], we have I(X;Y) = C(X;Y), resulting in P_XY ∈ P̂_XY, where C(X;Y) corresponds to the common information between X and Y. In this case, we have null(P_X|Y) ≤ max{|X|, |Y|} − 1. For |X| > 2, we have
null(P_X|Y) ≤ max{|X|, |Y|} − 1 < |X|(|Y| − 1),
which results in
( 43 ) < ( 44 ) .
Intuitively, when |X| and |Y| are large, (43) can be significantly less than (44), since null(P_X|Y) ≤ max{|X|, |Y|} − 1 and (44) involves the product of |X| and |Y|.
Before stating the next result, we define
(45) A_XY ≜ [P_{y_j} − P_{y_j|x_i}]_{i,j} ∈ R^{t×q}, i.e., the entry of A_XY in row i and column j equals P_{y_j} − P_{y_j|x_i},  (46) b_XY ≜ (H(Y|x_1) − H(Y|X), …, H(Y|x_t) − H(Y|X))^T ∈ R^t,  a ≜ (a_1, …, a_q)^T ∈ R^q,
where t = | X | and q = | Y | .
Theorem 4.
Let P_XY ∈ P̂_XY. For any U ∈ U_1, we have
(47) H(Y|X) + min_{a: A_XY a = b_XY, a ≥ 0} Σ_{i=1}^q P_{y_i} a_i ≤ K(P_XY) (48) ≤ H(U) ≤ H(Y|X) + max_{a: A_XY a = b_XY, a ≥ 0} Σ_{i=1}^q P_{y_i} a_i.
The upper bound can be strengthened and we have
(49) K(P_XY) ≤ H(Y|X) + max_{a: A_XY a = b_XY, a ≥ 0, Σ_{i=1}^q P_{y_i} a_i ≤ log(null(P_X|Y)+1) − H(Y|X)} Σ_{i=1}^q P_{y_i} a_i (50) ≤ log(null(P_X|Y) + 1).
Moreover, when rank(A_XY) = |Y| and Y ≠ f(X), we have |U_1| = 1, and for the unique optimizer we have
K(P_XY) = H(Y|X) + Σ_i P_{y_i} a_i,
where A_XY a = b_XY, a ≥ 0, and Σ_{i=1}^q P_{y_i} a_i ≤ log(null(P_X|Y) + 1) − H(Y|X).
Proof. 
The proof is provided in Appendix A.  □
Remark 9.
As we outlined earlier, it is shown in [15] that g_0(P_XY) can be obtained by a linear program, which is presented in Section 3. As we can see in Section 3, each extreme point α_i is a vector of size |Y| in which at most rank(P_X|Y) elements are non-zero. We recall that β corresponds to the weighting elements in the marginal distribution of U, i.e., β = P_U. The number of extreme points is at most (|Y| choose rank(P_X|Y)). Hence, if we consider the linear equations [α_1 α_2 ⋯ α_K] β = P_Y, the size of the matrix in the system of linear equations, i.e., [α_1 α_2 ⋯ α_K], is at most |Y| × (|Y| choose rank(P_X|Y)), and we have at most (|Y| choose rank(P_X|Y)) variables in the system. By solving the linear program proposed in [15], we can find the exact value of K(P_XY) and the conditional distribution P_{U|YX} that attains it. The complexity of the linear program in [15] can grow faster than exponentially with respect to |Y|; however, the complexity of our proposed method grows linearly with |Y|. The latter follows since, in the system of linear equations of our proposed method, A_XY a = b_XY, the size of the matrix is |X| × |Y| and we do not need to find all possible extreme points of the set S presented in Section 3. Thus, our proposed upper bound has less complexity compared to the solution in [15]. Furthermore, in the special case where rank(A_XY) = |Y|, we can find the exact value of K(P_XY) using our method. One necessary condition for rank(A_XY) = |Y| is to have |X| ≥ |Y| + 1, since the rows of A_XY are linearly dependent.
Remark 10.
The optimizer of g 0 ( P X Y ) does not help us to build C , since the constraint H ( Y | U , X ) = 0 does not hold in general; however, the optimizer of h 0 ( P X Y ) satisfies it. Hence, we consider the cases where the equality g 0 ( P X Y ) = h 0 ( P X Y ) holds. In this case, H ( Y | U , X ) = 0 and we have tighter bounds on H ( U ) compared to [1].
Remark 11.
The upper bound in (48) holds for any U ∈ U_1; however, the upper bound in (49) asserts that there exists a U ∈ U_1 such that the bound holds. The lower bound in (47) can be used to find a lower bound on L_0(P_XY, M). Similarly to Remark 9, the linear program attaining the lower bound in (47) has less complexity than [15].
Next, we obtain upper and lower bounds on L 0 ( P X Y , M ) . See Figure 4 and Figure 5.
Theorem 5.
For the pair of RVs (X, Y), let P_XY ∈ P̂_XY and the shared secret key size be |X|, i.e., M = |X|. Furthermore, let q = |Y| and β = log(null(P_X|Y) + 1). Then, we obtain
(52) L_0(P_XY, |X|) ≤ K(P_XY) + 1 + log(|X|) (53) ≤ H(Y|X) + max_{a: A_XY a = b_XY, a ≥ 0, Σ_{i=1}^q P_{y_i} a_i ≤ β − H(Y|X)} Σ_{i=1}^q P_{y_i} a_i + 1 + log(|X|) (54) ≤ β + 1 + log(|X|).
For any P X Y , we have
(55) L_0(P_XY, |X|) ≤ min_{P_{U|Y,X}: I(U;X) = 0, H(Y|X,U) = 0} H(U) + 1 + log(|X|) (56) ≤ 1 + min(Σ_x H(Y|X=x), log(|X|(|Y| − 1) + 1) − 1) + log(|X|),
and if X = f ( Y ) ,
L_0(P_XY, |X|) ≤ log(|Y| − |X| + 1) + log(|X|).
Finally, for any P_XY with |Y| ≤ |X|, we have
L_0(P_XY, |Y|) ≤ log|Y|.
Proof. 
All upper bounds can be obtained using two-part construction coding [1] and the complete proof is provided in Appendix A.  □
Next, we obtain lower bounds on L 0 ( P X Y , M ) .
Theorem 6.
For any pair of RVs (X, Y) distributed according to P_XY and shared key size M ≥ 1, we have
L_0(P_XY, M) ≥ max_x H(Y|X=x),
if X = f(Y),
L_0(P_XY, |X|) ≥ log(|X|),
and the code C does not exist when M < |X| and X = f(Y). Furthermore, considering P_XY ∈ P̂_XY, if we assume the received code C satisfies I(X;C) = 0, H(Y|X,C) = 0, and X − Y − C, we have
L_0(P_XY, M) ≥ H(Y|X) + min_{a: A_XY a = b_XY, a ≥ 0} Σ_{i=1}^q P_{y_i} a_i.
Proof. 
The proofs of (59) and (60) are provided in ([1], Theorem 9) and (61) follows from Theorem 4.  □
Remark 12.
The constraint X − Y − C in the converse bound (61) can be motivated by considering the cases where the encoder has no direct access to the private data X. Furthermore, the constraint H(Y|X,C) = 0 is stronger compared to the H(Y|X,C,W) = 0 that is used to prove (59). Specifically, the constraint needed to prove (59) is H(Y|C,W) = 0, which results in H(Y|X,C,W) = 0.

4.2. Comparison

In this part, we study the bounds obtained in Theorem 5 and compare them with the existing results in [1].
• Case 1: |Y| ≤ |X|
Clearly, for any P_XY with |Y| ≤ |X|, the upper bound obtained in (58) improves the bounds in (56) and (55), where (56) was derived in [1]. The latter follows, since in this case we have log(|Y|) ≤ log(|X|).
• Case 2: P_XY ∈ P̂_XY
In this part, let X = f(Y), which results in P_XY ∈ P̂_XY; see Corollary 2. The upper bound attained by [1] is log(|Y| − |X| + 1) + log(|X|). Furthermore, we can assume that all elements of the probability vector P_X are non-zero; otherwise, we can remove them. In this case, we have
null(P_X|Y) = |Y| − |X|.
This follows, since each column in P_X|Y contains exactly one non-zero element, which equals 1. Moreover, all rows of P_X|Y are linearly independent, since each row contains a non-zero element that the other rows do not have. Thus, rank(P_X|Y) = |X|, which results in null(P_X|Y) = |Y| − |X|. Thus, the upper bound in (54) can be modified to obtain the same bound as (57). Furthermore, the upper bounds (53) and (52) attain smaller values than (57), i.e., they can improve the bounds in [1]. Next, in a numerical example, we show that the upper bounds (53) and (52) improve (57).
Example 2.
Let P_X|Y = [1 1 1 0 0 0; 0 0 0 1 1 1] and P_Y = [1/8, 2/8, 3/8, 1/8, 1/16, 1/16]. Clearly, in this case, X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution as P_{Y|u_1} = [0.75, 0, 0, 0.25, 0, 0], P_{Y|u_2} = [0, 0.75, 0, 0.25, 0, 0], P_{Y|u_3} = [0, 0, 0.75, 0, 0.25, 0], P_{Y|u_4} = [0, 0, 0.75, 0, 0, 0.25], and P_U = [1/6, 1/3, 1/4, 1/4], which results in H(U) = 1.9591 bits. We have K(P_XY) + 1 ≤ 2.9591 and ⌈log(|Y| − |X| + 1)⌉ = ⌈log 5⌉ = 3. Hence, the upper bound (52) is strictly lower than (57), i.e., the U that achieves g_0(P_XY) has less entropy than the RV U constructed in ([1], Lemma 1).
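The numbers in Example 2 are straightforward to recompute; the short script below (our own helper) evaluates H(U), null(P_X|Y), and the two quantities being compared.

import numpy as np

P_XgY = np.array([[1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 1]], dtype=float)
P_U = np.array([1/6, 1/3, 1/4, 1/4])

H_U = -(P_U * np.log2(P_U)).sum()
null_dim = P_XgY.shape[1] - np.linalg.matrix_rank(P_XgY)   # null(P_X|Y) = |Y| - rank

print(f"H(U) = {H_U:.4f} bits")                              # 1.9591
print(f"upper bound (52): K(P_XY) + 1 <= {H_U + 1:.4f}")     # 2.9591
print(f"null(P_X|Y) = {null_dim}, ceil(log(|Y|-|X|+1)) = {int(np.ceil(np.log2(6 - 2 + 1)))}")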

4.3. Improving the Bounds Using Greedy Entropy-Based Algorithm

In this part, we use the greedy entropy-based algorithm to generalize the bounds obtained in Theorem 5. In more detail, we use the output of the greedy entropy-based algorithm in Theorem 1 instead of the solution to g_0(P_XY) = h_0(P_XY). In the next theorem, Q = (q_1, q_2, …), with q_1 ≥ q_2 ≥ … and q_i = P(Q = q_i), is obtained by the greedy construction described at the end of Section 3.
Theorem 7.
Let the RVs (X, Y) be distributed according to P_XY supported on alphabets X and Y, where |X| and |Y| are finite, and let the shared secret key size be |X|, i.e., M = |X|. If |X| = 2, we have
L_0(P_XY, 2) ≤ H(Q) + log e/e + 2,
where Q is defined in ([36], Lemma 3). When | X | > 2 , we have
L_0(P_XY, |X|) ≤ H(Q) + (1 + log e)/2 + 1 + log(|X|).
Proof. 
The proof is similar to Theorem 5 and the main difference is to use the minimum entropy output of (19) instead of the solution to g 0 ( P X Y ) = h 0 ( P X Y ) that is used in two-part construction coding. Similarly to Theorem 5, we use two-part code construction to achieve the upper bounds. As shown in Figure 6, we first encode the private data X using one-time pad coding ([2], Lemma 1), which uses log ( | X | ) bits. Next, we produce U based on the greedy entropy-based algorithm proposed in [34], which solves the minimum entropy problem in (19). Thus, we have
(65) H ( Y | X , U ) = 0 , (66) I ( U ; X ) = 0 ,
and
H(U) ≤ H(Q) + log e/e,
and for |X| > 2,
H(U) ≤ H(Q) + (1 + log e)/2.
Thus, we obtain (63) and (64). Moreover, for the leakage constraint, we note that the randomness of one-time-pad coding is independent of X and the output of the greedy entropy-based algorithm U.
On the user side, we can decode Y using (65).  □
Remark 13.
We emphasize that (63) and (64) can improve (52), since they achieve the minimum in (56) within a gap, i.e., they achieve min_{P_{U|Y,X}: I(U;X) = 0, H(Y|X,U) = 0} H(U) up to a constant gap that depends on |X|. Furthermore, in contrast with the bounds in Theorem 5, (63) and (64) can be achieved for any joint distribution P_XY. However, in the following numerical example, we show that the bounds in Theorem 5 can be tighter than those in Theorem 7.
Example 3.
Let P_X|Y = [1 1 1 0 0 0; 0 0 0 1 1 1] and P_Y = [1/8, 2/8, 3/8, 1/8, 1/16, 1/16]. Clearly, in this case, X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution as P_{Y|u_1} = [0.75, 0, 0, 0.25, 0, 0], P_{Y|u_2} = [0, 0.75, 0, 0.25, 0, 0], P_{Y|u_3} = [0, 0, 0.75, 0, 0.25, 0], P_{Y|u_4} = [0, 0, 0.75, 0, 0, 0.25], and P_U = [1/6, 1/3, 1/4, 1/4], which results in H(U) = 1.9591 bits. We have H(U) = K(P_XY) = 1.9591. Moreover, we have
P_Y|X = [1/6 0; 1/3 0; 1/2 0; 0 1/2; 0 1/4; 0 1/4].
Using the greedy search algorithm, we have P_Q = [1/2, 1/4, 1/6, 1/12]; hence, H(Q) = 1.7296. Thus,
H(Q) + log e/e = 2.2596 ≥ K(P_XY) = 1.9591.
In the next example, we show that the bounds in Theorem 7 can be tighter than the bounds obtained in Theorem 5.
Example 4.
Let P_X|Y = [1 1 0 0 0 0; 0 0 1 1 1 1] and P_Y = [1/8, 2/8, 3/8, 1/8, 0.0925, 0.0325]. Similarly to the previous example, X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution as P_U = [0.185, 0.065, 0.315, 0.25, 0.185], which results in H(U) = 2.182 bits. We have H(U) = K(P_XY) = 2.182. Moreover, we have
P_Y|X = [1/3 0; 2/3 0; 0 0.6; 0 0.2; 0 0.148; 0 0.052].
Using the greedy search algorithm, we have P_Q = [0.6, 0.2, 0.1333, 0.052, 0.0147]; hence, H(Q) = 1.6053. Thus,
H(Q) + log e/e = 2.136 ≤ K(P_XY) = 2.182.

4.4. Non-Zero Leakage Results

In this section, we derive upper and lower bounds on L ( P X Y , M , ϵ ) and L b ( P X Y , M , ϵ ) , defined in (5) and (6). Next, we study the bounds in different scenarios; moreover, we provide an example that shows that the bounds can be tight. The following lemma helps us to find lower bounds on the entropy of an RV that satisfies I ( U ; X ) = ϵ and H ( Y | U , X ) = 0 considering the shared key.
Lemma 7.
Let the pair (X, Y) be jointly distributed and the RV W be independent of X and Y. If the RV U satisfies I(U;X) = ϵ and H(Y|U,X,W) = 0, then we have
H(U) ≥ max{L_1(ϵ, P_XY), L_2(ϵ, P_XY), L_3(ϵ, P_XY)},
where
(70) L_1(ϵ, P_XY) = H(Y|X),  (71) L_2(ϵ, P_XY) = min_{x∈X} H(Y|X=x) + ϵ,  (72) L_3(ϵ, P_XY) = H(Y|X) − H(X|Y) + ϵ,
and min_{x∈X} H(Y|X=x) denotes the minimum over the non-zero conditional entropies H(Y|X=x).
Proof. 
The proof is provided in Appendix B.  □
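The three lower bounds of Lemma 7 are simple functionals of P_XY and ϵ. The sketch below (helper names and the toy joint distribution are our own) evaluates L_1, L_2, and L_3 for a given joint pmf.

import numpy as np

def lower_bounds(P_XY, eps):
    """L1 = H(Y|X), L2 = min over x of the non-zero H(Y|X=x) plus eps,
    L3 = H(Y|X) - H(X|Y) + eps (all in bits)."""
    P_XY = np.asarray(P_XY, dtype=float)
    P_X = P_XY.sum(axis=1)
    P_Y = P_XY.sum(axis=0)

    def cond_entropy(P_joint, P_marg, axis):
        # H(A|B), where B is the marginal along `axis` of the joint pmf.
        H = 0.0
        for i, pb in enumerate(P_marg):
            p = (P_joint[i, :] if axis == 0 else P_joint[:, i]) / pb
            p = p[p > 0]
            H += pb * -(p * np.log2(p)).sum()
        return H

    H_Y_given_X = cond_entropy(P_XY, P_X, axis=0)
    H_X_given_Y = cond_entropy(P_XY, P_Y, axis=1)
    per_x = []
    for i, px in enumerate(P_X):
        p = P_XY[i, :] / px
        p = p[p > 0]
        h = -(p * np.log2(p)).sum()
        if h > 0:
            per_x.append(h)
    return H_Y_given_X, min(per_x) + eps, H_Y_given_X - H_X_given_Y + eps

# Toy joint pmf (rows: x, columns: y); the values are an assumption for illustration.
P_XY = np.array([[0.2, 0.2, 0.1],
                 [0.05, 0.05, 0.4]])
print(lower_bounds(P_XY, eps=0.1))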
In the next two theorems, we provide upper and lower bounds on L ( P X Y , M , ϵ ) . The next theorem is a generalization of ([1], Theorem 8) for correlated C and X.
Theorem 8.
Let 0 ≤ ϵ ≤ H(X), the pair of RVs (X, Y) be distributed according to P_XY, and the shared secret key size be |X|, i.e., M = |X|. Then, we have
L(P_XY, |X|, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + ϵ + h(α) + 1 + log(|X|),
where α = ϵ/H(X), and if |Y| is finite, we have
L(P_XY, |X|, ϵ) ≤ log([|X|(|Y| − 1) + 1][|X| + 1]) + log(|X|).
Finally, if X is a deterministic function of Y, we have
L(P_XY, |X|, ϵ) ≤ log((|Y| − |X| + 1)(|X| + 1)) + log(|X|).
Proof. 
The proof is based on ([19], Lemma 5) and the two-part construction coding and is provided in Appendix B.  □
In the next theorem, we provide lower bounds on L ( P X Y , M , ϵ ) .
Theorem 9.
Let 0 ≤ ϵ ≤ H(X) and the pair of RVs (X, Y) be distributed according to P_XY supported on alphabets X and Y, where |X| is finite and |Y| is finite or countably infinite. For any shared secret key size M ≥ 1, we have
L(P_XY, M, ϵ) ≥ max{L_1(ϵ, P_XY), L_2(ϵ, P_XY), L_3(ϵ, P_XY)},
where L_1(ϵ, P_XY), L_2(ϵ, P_XY), and L_3(ϵ, P_XY) are defined in (70), (71), and (72). Furthermore, if X is a deterministic function of Y, then we have
L(P_XY, M, ϵ) ≥ log(1/max_x P_X(x)).
Proof. 
The proof is based on Lemma 7 and is provided in Appendix B.  □
Next, we provide an example where the upper and lower bounds obtained in Theorems 8 and 9 are studied.
Example 5.
Let Y = (Y_1, …, Y_N) be an i.i.d. Bernoulli(0.5) sequence and X = Σ_{i=1}^N Y_i. Clearly, X is a function of Y, |X| = N + 1, and |Y| = 2^N. Using ([1], (72)), we have
h_0(P_XY) =^{(a)} H(Y|X) = Σ_{i=1}^{N} (1/2)^N (N choose i) log(N choose i),
where (a) follows from ([1], (70)). Let the shared key size be N + 1; by using Theorems 8 and 9, we have
ϵ + Σ_{i=1}^{N} (1/2)^N (N choose i) log(N choose i) ≤ L(P_XY, N + 1, ϵ) ≤ log((N + 2)(2^N − N)) + log(N + 1),
where (75) is used for the upper bound and L 3 ( ϵ , P X Y ) is used for the lower bound. Using ([1], (73)), we have
lim_{N→∞} h_0(P_XY)/N = lim_{N→∞} H(Y|X)/N = h(0.5) = 1,
where h ( · ) is the binary entropy function. Now, by using (79) and (80), we obtain
lim_{N→∞} L(P_XY, N + 1, ϵ)/N = 1.
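The convergence in (81) can also be observed numerically. The following sketch (our own script; the choice ϵ = 0.5 is arbitrary) evaluates the normalized lower and upper bounds of (79) for increasing N.

import numpy as np
from scipy.special import comb

def bounds_example5(N, eps=0.5):
    """Lower bound eps + H(Y|X) and upper bound of (79), in bits, for
    Y an i.i.d. Bernoulli(0.5)^N sequence and X the number of ones in Y."""
    p_i = comb(N, np.arange(N + 1)) * 0.5 ** N                           # P(X = i)
    H_Y_given_X = float((p_i * np.log2(np.maximum(comb(N, np.arange(N + 1)), 1))).sum())
    lower = eps + H_Y_given_X
    upper = np.log2((N + 2) * (2.0 ** N - N)) + np.log2(N + 1)
    return lower, upper

for N in (10, 20, 40, 80):
    lo, up = bounds_example5(N)
    print(f"N = {N:3d}:  lower/N = {lo / N:.3f},  upper/N = {up / N:.3f}")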
  • Bounds for Bounded Leakage Constraint
Here, we assume that X = (X_1, X_2), where X = X_1 × X_2, 1 < |X_1| ≤ |X_2|, and |X_1||X_2| = |X|. In the next theorem, we provide upper bounds for L_b(P_XY, M, ϵ). We show that, when the leakage is more than a threshold, we are able to communicate the message over the channel using a shared key size less than |X|, and the receiver can decode the message without any loss.
Theorem 10.
Let ϵ ≥ H(X_1), the RVs (X_1, X_2, Y) be distributed according to P_{X_1 X_2 Y}, and the shared secret key size be |X_2|, i.e., M = |X_2|. Then, we have
L_b(P_XY, |X_2|, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + H(X_1) + 2 + log(|X_2|);
if |Y| is finite, then
L_b(P_XY, |X_2|, ϵ) ≤ log(|X|(|Y| − 1) + 1) + log(|X_2|) + 2 + H(X_1);
and if X = (X_1, X_2) is a deterministic function of Y, we obtain
L_b(P_XY, |X_2|, ϵ) ≤ log(|Y| − |X| + 1) + log(|X_2|) + 2 + H(X_1).
Furthermore, let ϵ ≥ H(X_2) and the shared secret key size be |X_1|. We have
L_b(P_XY, |X_1|, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + H(X_2) + 2 + log(|X_1|).
Proof. 
The proof is similar to the proof of Theorem 8 and is provided in Appendix B.  □
Remark 14.
As argued in ([1], Theorem 9), when the leakage is zero and X is a deterministic function of Y, if the shared key size is less than the size of X, i.e., M < |X|, lossless variable-length codes do not exist. However, as shown in Theorem 10, when the leakage is more than a threshold, such codes exist for key sizes less than |X|.
Remark 15.
The upper bounds (82) and (85) can be less than the upper bound found in ([1], Theorem 9), which is obtained under the perfect privacy assumption. For instance, if H(X_1) + 2 ≤ log(|X_1|), then (82) is less than ([1], (95)). The latter follows, since we have H(X_1) + 1 + ⌈log(|X_2|)⌉ ≤ log(|X_1|) − 1 + ⌈log(|X|) − log(|X_1|)⌉ ≤ ⌈log(|X_1|)⌉ + ⌈log(|X|) − log(|X_1|)⌉ − 1 ≤ ⌈log(|X|)⌉, where the last inequality follows from ⌈a⌉ + ⌈b⌉ ≤ ⌈a + b⌉ + 1.
In the next theorem, we find lower bounds for L b ( P X Y , M , ϵ ) .
Theorem 11.
Let 0 ≤ ϵ ≤ H(X) and the RVs (X_1, X_2, Y) be distributed according to P_{X_1 X_2 Y}. For any shared secret key size M ≥ 1, we have
L_b(P_XY, M, ϵ) ≥ H(Y|X) − H(X|Y).
Furthermore, if X is a deterministic function of Y, then we have
L_b(P_XY, M, ϵ) ≥ log(1/max_x P_X(x)).
Proof. 
The proof is provided in Appendix B.  □
  • A General Approach for X
Next, by using the same construction as in Theorem 10, we find upper bounds on L_b(P_XY, M, ϵ) for any joint distribution P_XY. We first present a simple observation, which we call the separation technique.
Observation 1.
(Separation technique) Any discrete RV X supported on X = { 1 , , | X | } with | X | > 2 can be represented by two RVs ( X 1 , X 2 ) , where 1 < | X 1 | < | X | and 1 < | X 2 | < | X | .
If |X| is not a prime number, then we can represent X by (X_1, X_2), where X = X_1 × X_2, 1 < |X_1| ≤ |X_2|, |X_1||X_2| = |X|, and P_X(x) = P_{X_1,X_2}(x_1, x_2). Furthermore, if |X| is a prime number, then we can represent X by (X_1, X_2), where |X_1||X_2| = |X| + 1 and P_{X_1,X_2}(|X_1|, |X_2|) = 0. Let S_X be the set of all possible representations of X, where X = (X_1, X_2). For a fixed ϵ, we define S_{X_1}^ϵ ≜ {(X_1, X_2): (X_1, X_2) ∈ S_X, H(X_1) ≤ ϵ} and S_{X_2}^ϵ ≜ {(X_1, X_2): (X_1, X_2) ∈ S_X, H(X_2) ≤ ϵ}.
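As a simple illustration of the separation technique, the sketch below (our own helpers) relabels X ∈ {0, …, |X| − 1} as a pair (X_1, X_2) via mixed-radix indexing, scans the factorizations |X_1||X_2| = |X|, and reports the one minimizing H(X_1) + log|X_2|, as in Theorem 12 below. It only considers these canonical splits for one fixed labeling of X, whereas S_X contains all representations, and it omits the padding step used when |X| is prime.

import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def best_canonical_separation(P_X):
    """Scan factorizations |X1|*|X2| = |X| with x <-> (x // |X2|, x % |X2|)
    and return the split minimizing H(X1) + log|X2| (cf. Theorem 12)."""
    P_X = np.asarray(P_X, dtype=float)
    n = len(P_X)
    best = None
    for n1 in range(2, n):
        if n % n1:
            continue
        n2 = n // n1
        if n2 < 2:
            continue
        P_X1 = P_X.reshape(n1, n2).sum(axis=1)      # marginal of X1 under this relabeling
        score = entropy(P_X1) + np.log2(n2)
        if best is None or score < best[0]:
            best = (score, n1, n2, entropy(P_X1))
    return best

# Test pmf on |X| = 12 symbols (resembling Example 6 below); an assumption for illustration.
P_X = np.array([0.005] * 10 + [0.475] * 2)
score, n1, n2, H_X1 = best_canonical_separation(P_X)
print(f"|X1| = {n1}, |X2| = {n2}, H(X1) = {H_X1:.4f}, H(X1) + log|X2| = {score:.4f}")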
Theorem 12.
For any ϵ ≥ 0, pair of RVs (X, Y) distributed according to P_XY, and shared key size M ≥ α, if S_{X_1}^ϵ ≠ ∅, we have
L_b(P_XY, M, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + 2 + min_{(X_1,X_2) ∈ S_{X_1}^ϵ} [H(X_1) + log(|X_2|)],
where α = |arg min_{X_2: (X_1,X_2) ∈ S_{X_1}^ϵ} H(X_1) + log(|X_2|)| < |X| and |S| denotes the size of the RV S. If S_{X_2}^ϵ ≠ ∅, for a shared key size M ≥ β, we have
L_b(P_XY, M, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + 2 + min_{(X_1,X_2) ∈ S_{X_2}^ϵ} [H(X_2) + log(|X_1|)],
where β = |arg min_{X_1: (X_1,X_2) ∈ S_{X_2}^ϵ} H(X_2) + log(|X_1|)| < |X|.
Proof. 
To prove (88), let (X_1, X_2) ∈ S_{X_1}^ϵ. Then, by using Theorem 10, we have
L_b(P_XY, |X_2|, ϵ) ≤ Σ_{x∈X} H(Y|X=x) + H(X_1) + 2 + log(|X_2|).
Since (90) holds for any (X_1, X_2) ∈ S_{X_1}^ϵ, we can take the minimum over (X_1, X_2), which results in (88). Furthermore, the key size is chosen as |arg min_{X_2: (X_1,X_2) ∈ S_{X_1}^ϵ} H(X_1) + log(|X_2|)| < |X|, and note that the first term Σ_{x∈X} H(Y|X=x) remains the same, since it is a function of P_{Y|X_1 X_2}, i.e., the distributions involving the pair (X_1, X_2) do not change, since P_{X_1,X_2}(x_1, x_2) = P_X(x). Using (85) and following the same approach, (89) is obtained.  □
Remark 16.
The upper bounds (88) and (89) generalize (82) and (85). The upper bounds (82) and (85) are obtained for a fixed (X_1, X_2). Thus, by taking the minimum over (X_1, X_2) ∈ S_{X_1}^ϵ and (X_1, X_2) ∈ S_{X_2}^ϵ, we obtain tighter upper bounds.
Remark 17.
The same lower bounds as obtained in Theorem 11 can be used here, since the separation technique does not improve the bounds in (86) and (87).
Remark 18.
Using Example 5, we can see that the bounds obtained in Theorems 10–13 can be asymptotically tight.
Next, we provide an example to clarify the construction in Theorem 12 and compare the bounds with the perfect privacy case.
Example 6.
Let the RV X ∈ {1, ..., 12} with distribution P_X(x) = 0.005, x ∈ {1, ..., 10}, and P_X(x) = 0.475, x ∈ {11, 12}, be correlated with Y, where Y|X = x is a BSC(0.5). To calculate the bound in Theorem 12, one possible separation of X is (X_1, X_2), where the distributions of X_1 and X_2 are P_{X_1}(x_1) = 0.01, x_1 ∈ {1, ..., 5}, P_{X_1}(x_1) = 0.95, x_1 = 6, and P_{X_2}(x_2) = 0.5, x_2 ∈ {1, 2}. In this case, H(X_1) = 0.4025 and log(|X_2|) = 1. Using the bound (82), for all ϵ ≥ 0.4025 and key size M > 1, we have
L_b(P_XY, M, ϵ) ≤ ∑_{x ∈ X} H(Y|X = x) + H(X_1) + 2 + ⌈log(|X_2|)⌉ = 15.4025 bits.
By using the bound ([1], (95)), for the key size M ≥ 12, we have
L_b(P_XY, M, 0) ≤ 17 bits.
Note that, since |Y| = 2, we can also use the following bound instead of (88):
L_b(P_XY, M, ϵ) ≤ ⌈log(|X|(|Y| − 1) + 1)⌉ + 2 + min_{(X_1, X_2) ∈ S_{X_1}^ϵ} [H(X_1) + ⌈log(|X_2|)⌉].
By checking all possible separations, the minimum of H(X_1) + ⌈log(|X_2|)⌉ occurs at |X_1| = 6 and |X_2| = 2, where X_1 and X_2 have the distributions mentioned before. Thus, for all ϵ ≥ 0.4025 and key size M > 1, we have L_b(P_XY, M, ϵ) ≤ 7.4025 bits. By using the bound ([1], (96)), for the key size M ≥ 12, we have L_b(P_XY, M, 0) ≤ 9 bits. We can see that, by allowing a small amount of leakage and using a shorter secret key, we are able to send shorter codewords.
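The numbers in this example can be checked with the short script below (our own verification; the expressions used for ([1], (95)) and ([1], (96)) are inferred from the values reported here, so they should be read as assumptions rather than as the exact statements in [1]).

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Example 6: |X| = 12, |Y| = 2, separation with |X_1| = 6, |X_2| = 2.
p_x1 = np.array([0.01] * 5 + [0.95])
card_X, card_X2, card_Y = 12, 2, 2
sum_HY_given_X = card_X * 1.0           # Y|X = x is a BSC(0.5): H(Y|X = x) = 1 bit

bound_82 = sum_HY_given_X + entropy(p_x1) + 2 + np.ceil(np.log2(card_X2))
bound_95 = sum_HY_given_X + 1 + np.ceil(np.log2(card_X))           # ([1], (95)), inferred form
bound_88 = np.ceil(np.log2(card_X * (card_Y - 1) + 1)) + 2 \
           + entropy(p_x1) + np.ceil(np.log2(card_X2))             # cardinality-based form of (88)
bound_96 = np.ceil(np.log2(card_X * (card_Y - 1) + 1)) + 1 \
           + np.ceil(np.log2(card_X))                              # ([1], (96)), inferred form
print(bound_82, bound_95)   # ~15.4025 vs 17 bits
print(bound_88, bound_96)   # ~7.4025 vs 9 bits
```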
In the following, we study the bounds obtained in Theorem 12 considering a perfect privacy scenario and we show that the previous bounds can be improved by using the separation technique.
  • Perfect Privacy ( ϵ = 0 )
In this part, we let ϵ = 0 and consider the problem as in [1]. We show that when X = ( X 1 , X 2 ) , where X 2 is a deterministic function of X 1 , the upper bounds on L b ( P X Y , M , 0 ) can be improved by using a shorter shared key size, i.e., M < | X | . In the next theorem, we provide upper bounds on L b ( P X Y , M , 0 ) .
Theorem 13.
Let X = ( X 1 , X 2 ) , where X 2 = f ( X 1 ) , and the shared key size be | X 1 | . We have
L_b(P_XY, |X_1|, 0) ≤ ∑_{x_1 ∈ X_1} H(Y|X_1 = x_1) + 1 + ⌈log(|X_1|)⌉,
if | Y | is finite, then
L_b(P_XY, |X_1|, 0) ≤ ⌈log(|X|(|Y| − 1) + 1)⌉ + ⌈log(|X_1|)⌉ + 1,
and if X = (X_1, X_2) is a deterministic function of Y, we obtain
L_b(P_XY, |X_1|, 0) ≤ ⌈log(|Y| − |X| + 1)⌉ + ⌈log(|X_1|)⌉ + 1.
Proof. 
The proof is provided in Appendix B.  □
Remark 19.
Clearly, when X = (X_1, X_2) with X_2 = f(X_1), the upper bounds (91), (92), and (93) improve the bounds in ([1], Theorem 8).
  • Special Case: X = ( X 1 , X 2 ) and X 2 = f ( X 1 )
In this part, we use approaches similar to those used in Theorems 12 and 13 to find upper bounds. We encode X_1 using one-time-pad coding and use the same U as in Theorem 13. As shown in Theorem 13, X_1 is sufficient to decode Y, since X_2 = f(X_1); however, this scheme does not exploit the fact that we are allowed to leak information about X. Similarly to Theorem 13, we obtain
L_b(P_XY, |X_1|, ϵ) ≤ ∑_{x ∈ X} H(Y|X = x) + 1 + ⌈log(|X_1|)⌉ = ∑_{x_1 ∈ X_1} H(Y|X_1 = x_1) + 1 + ⌈log(|X_1|)⌉.
Next, we use the separation technique as in Theorem 12. For any ϵ ≥ 0, if a separation X_1 = (X_1', X_1'') exists such that H(X_1') ≤ ϵ, we obtain
L_b(P_XY, |X_1''|, ϵ) ≤ ∑_{x_1 ∈ X_1} H(Y|X_1 = x_1) + H(X_1') + 2 + ⌈log(|X_1''|)⌉.
Furthermore, we can take the minimum over all possible separations to improve the upper bound. Let S_{X_1'}^ϵ denote the set of all separations (X_1', X_1'') of X_1 with H(X_1') ≤ ϵ. We have
L_b(P_XY, α, ϵ) ≤ ∑_{x_1 ∈ X_1} H(Y|X_1 = x_1) + 2 + min_{(X_1', X_1'') ∈ S_{X_1'}^ϵ} [H(X_1') + ⌈log(|X_1''|)⌉],
where α = |arg min_{X_1'': (X_1', X_1'') ∈ S_{X_1'}^ϵ} H(X_1') + ⌈log(|X_1''|)⌉| < |X_1|.
We can see that the upper bound on L_b(P_XY, α, ϵ) can be less than the bound in (91). For instance, let H(X_1') + 2 ≤ log(|X_1'|). Using the same arguments as in Remark 15, we have H(X_1') + 1 + ⌈log(|X_1''|)⌉ ≤ ⌈log(|X_1|)⌉. Hence, if the allowed leakage exceeds a certain threshold, we can improve the upper bound and send shorter codewords compared to the perfect privacy case. Moreover, this allows us to use a shorter shared secret key.

5. Applications: Cache-Aided Networks

In this part, we present an application where the results of this paper can be used. The complete study of the application can be found in [49]. To this end, consider the scenario illustrated in Figure 7. A server has access to a database consisting of N files Y_1, ..., Y_N, where each file, of size F bits, is sampled from the joint distribution P_{X Y_1 ⋯ Y_N}. The random variable X denotes the private latent variable, and we assume that the server knows the realization of X as well. The server is connected to K users over a shared link, where user i has access to a local cache memory of size MF bits. Furthermore, we assume that the server and the users have access to a shared secret key denoted by W, of size T. The system works in two phases: the placement phase and the delivery phase [50]. In the placement phase, the server fills the local caches using the database. Let Z_k denote the content of the local cache memory of user k, k ∈ [K] ≜ {1, ..., K}, after the placement phase. In the delivery phase, the users first send their demands to the server, where d_k ∈ [N] denotes the demand of user k. The server then sends a response, denoted by C, over the shared link to satisfy all the demands simultaneously. We assume that an adversary has access to the shared link as well and uses C to extract information about X; however, the adversary does not have access to the local cache contents or the secret key. Since the files in the database are all correlated with the private latent variable X, the coded caching and delivery techniques introduced in [50] do not satisfy the privacy requirement. The goal of the cache-aided private delivery problem is to find a response C with the minimum possible average length that satisfies a certain privacy constraint and the zero-error decodability constraint of the users. Here, we consider the worst-case demand combination d = (d_1, ..., d_K) to construct C, and the average length is taken over the randomness in the database. In this study, we first consider a perfect privacy constraint, i.e., we require C to be independent of X. Let Ŷ_{d_k} denote the decoded message of user k using W, C, and Z_k. User k should be able to recover Y_{d_k} reliably, i.e., P{Ŷ_{d_k} ≠ Y_{d_k}} = 0, k ∈ [K]. We remark that this is a cache-aided variable-length compression problem with a privacy constraint. Hence, to solve this problem and design the code C, we utilized techniques from privacy mechanisms, data compression, and cache design and coded delivery problems, and combined them to build such a code. In particular, we used the data compression techniques employed in this paper and the caching design techniques in [50]. Finally, we extended the results to correlated X and C, i.e., when non-zero leakage is allowed; to do so, we used the results of the non-perfect privacy part of this paper. For more details, see [49].

6. Experiment

In this section, we provide an experiment to evaluate the bounds obtained in Theorem 4. To do so, we used the MNIST dataset, which contains 60,000 images illustrating handwritten digits from 0 to 9. Let the RV S (information source) denote the images in the dataset, i.e., | S | = 60,000. Let Y represent the useful data that is a feature extracted from the dataset. In this example, the useful data are defined as a “quantized histogram”. To determine the quantized histogram of the dataset, we first converted the images into black and white by quantizing them. The histogram of any image in the dataset is then computed as the ratio of the number of white pixels to the total number of pixels. Consequently, the histogram of any image can be represented by a value within the interval [ 0 , 1 ] . For quantizing the histogram of any image, we used the following intervals: Interval 1 = [ 0 , 0.1 ) , Interval 2 = [ 0.1 , 0.15 ) , Interval 3 = [ 0.15 , 0.2 ) , Interval 4 = [ 0.2 , 0.25 ) , Interval 5 = [ 0.25 , 0.3 ) , Interval 6 = [ 0.3 , 0.35 ) and Interval 7 = [ 0.35 , 1 ] . Each image in the dataset was therefore labeled with a number between 1 and 7, indicating the interval to which its histogram belongs. Let Y denote the interval sample S belongs to. Clearly, the labels of the intervals are deterministic functions of the information source S. For instance, in Figure 8, the number of white pixels is 130, hence the quantized histogram is equal to 0.1658 , which corresponds to the third interval.
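As an illustration of the labeling step described above, the following sketch (our own; the binarization threshold of 128 is an assumption, since the paper does not specify how the images were quantized) maps a 28 × 28 grayscale image to its quantized-histogram label.

```python
import numpy as np

def histogram_label(image, threshold=128):
    """Return the quantized-histogram label (1..7) of a 28 x 28 grayscale image."""
    bw = np.asarray(image) >= threshold            # quantize to black and white
    hist = bw.sum() / bw.size                      # ratio of white pixels, in [0, 1]
    edges = [0.1, 0.15, 0.2, 0.25, 0.3, 0.35]      # interval boundaries from the text
    return int(np.digitize(hist, edges)) + 1       # Interval 1 = [0, 0.1), ..., Interval 7 = [0.35, 1]

# Mirroring Figure 8: 130 white pixels out of 28 * 28 = 784 gives a histogram
# of about 0.1658, which falls into the third interval.
img = np.zeros((28, 28))
img.flat[:130] = 255
print(histogram_label(img))   # 3
```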
Let the RV Z denote the label of each image, i.e., a number from 0 to 9 indicating the digit which the image represents, so Z ∈ {0, 1, ..., 9}. Clearly, Z is a deterministic function of S, i.e., Z = f(S). The private attribute X is defined as the presence of digit 0, which is a binary RV, i.e., X ∈ {0, 1} with X = 1 if Z = 0 and X = 0 otherwise. Here, we use an empirical distribution for Z, and it can be seen that X is a deterministic function of Z, which yields the Markov chain X − Z − Y. Using this Markov chain, the kernel P_{X|Y} can be obtained as P_{X|Z} P_{Z|Y}, where
P_{X|Z} =
1 0 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 ,
and P_{Z|Y} can be found from the dataset. We used the empirical distribution to find the kernel P_{Y|Z}, e.g., P_{Y=1|Z=1} is the number of images illustrating digit 1 whose quantized histogram belongs to the first interval (label 1), divided by the total number of images of digit 1; combining P_{Y|Z} with the empirical distribution of Z then gives P_{Z|Y}. By multiplying P_{X|Z} by P_{Z|Y} derived in (94), we obtain P_{X|Y} as in (95).
P_{Z|Y} =
0.0064 0.0441 0.1445 0.3070 0.4678 0.5968 0.7500
0.4922 0.0497 0.0037 0.0004 0 0 0
0.0262 0.0854 0.1440 0.1599 0.1044 0.1129 0
0.0416 0.1017 0.1339 0.1265 0.0922 0.0323 0
0.0876 0.1330 0.0796 0.0332 0.0111 0 0
0.0712 0.1058 0.0907 0.0676 0.0478 0.0161 0
0.0479 0.1086 0.1181 0.0973 0.0711 0.0645 0
0.1330 0.1447 0.0635 0.0201 0.0078 0 0
0.0162 0.0872 0.1409 0.1523 0.1700 0.1774 0.2500
0.0778 0.1396 0.0812 0.0357 0.0278 0 0 , (94)
P_{X|Y} =
0.0064 0.0441 0.1445 0.3070 0.4678 0.5968 0.7500
0.9936 0.9559 0.8555 0.6930 0.5322 0.4032 0.2500 . (95)
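As a sanity check (our own, not part of the original experiment), the snippet below verifies that multiplying P_{X|Z} by the kernel in (94) reproduces (95) up to the reported rounding, using the Markov chain X − Z − Y.

```python
import numpy as np

# P(X = x | Z = z): X = 1 indicates digit 0, so the row for X = 1 picks out Z = 0.
P_X_given_Z = np.zeros((2, 10))
P_X_given_Z[0, 0] = 1.0        # X = 1 when Z = 0
P_X_given_Z[1, 1:] = 1.0       # X = 0 when Z in {1, ..., 9}

# P(Z = z | Y = y) as reported in (94): rows z = 0..9, columns y = 1..7.
P_Z_given_Y = np.array([
    [0.0064, 0.0441, 0.1445, 0.3070, 0.4678, 0.5968, 0.7500],
    [0.4922, 0.0497, 0.0037, 0.0004, 0.0,    0.0,    0.0   ],
    [0.0262, 0.0854, 0.1440, 0.1599, 0.1044, 0.1129, 0.0   ],
    [0.0416, 0.1017, 0.1339, 0.1265, 0.0922, 0.0323, 0.0   ],
    [0.0876, 0.1330, 0.0796, 0.0332, 0.0111, 0.0,    0.0   ],
    [0.0712, 0.1058, 0.0907, 0.0676, 0.0478, 0.0161, 0.0   ],
    [0.0479, 0.1086, 0.1181, 0.0973, 0.0711, 0.0645, 0.0   ],
    [0.1330, 0.1447, 0.0635, 0.0201, 0.0078, 0.0,    0.0   ],
    [0.0162, 0.0872, 0.1409, 0.1523, 0.1700, 0.1774, 0.2500],
    [0.0778, 0.1396, 0.0812, 0.0357, 0.0278, 0.0,    0.0   ],
])

assert np.allclose(P_Z_given_Y.sum(axis=0), 1.0, atol=5e-3)   # columns are pmfs (up to rounding)
P_X_given_Y = P_X_given_Z @ P_Z_given_Y                       # 2 x 7 kernel of the Markov chain
print(np.round(P_X_given_Y, 4))                               # matches (95) up to rounding
```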
Moreover, the distribution of the useful data Y is given in Table 1.
The goal was to compress Y and send it to a user, and we considered a perfect privacy case where the privacy leakage was zero. Since X is a deterministic function of Y, we have h_0(P_XY) = g_0(P_XY) and P_XY ∈ P̂_XY. Hence, using the linear program in [15], we found a U that achieves g_0(P_XY), and K(P_XY) was obtained. Using Theorem 4, the average length of C was 2.88 bits; however, using the method in [1], 4.16 bits were required to send the message C over the channel.

7. Conclusions

We have studied a compression problem with a privacy constraint in two scenarios, where the information delivered over the channel is either independent of X or correlated with X subject to a bounded leakage constraint. Considering the perfect privacy constraint, we proposed new upper bounds using a two-part construction coding that benefits from the solution of g_0(P_XY) = h_0(P_XY) to encode the second part of the code; for the first part of the code, we hid the private data using one-time-pad coding. Furthermore, in the case of |Y| ≤ |X|, we proposed a new achievable scheme. We have shown that the new upper bounds can improve the existing ones. Moreover, we strengthened the bounds using the greedy entropy-based algorithm. Considering the second scenario, we obtained upper and lower bounds and showed that the upper bounds can be asymptotically tight. To achieve the upper bounds, we used two-part construction coding. Furthermore, in the case of perfect privacy, we showed that, by using the separation technique, the previous upper bounds can be improved.

Author Contributions

Methodology, A.Z. and M.S.; Validation, M.S.; Formal analysis, A.Z.; Investigation, A.Z.; Writing–original draft A.Z.; Writing–review & editing A.Z. and M.S.; Supervision, M.S.; Funding acquisition, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

Swedish Research Council (VR) under grant 2019-03606.

Data Availability Statement

The data presented in this study are openly available in https://doi.org/10.1109/WIFS58808.2023.10374604.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Proof of Theorem 4

Let U ∈ U_1. In the following, we expand H(Y, U|X = x) using the chain rule:
H ( Y , U | X = x ) = H ( U | X = x ) + H ( Y | U , X = x ) = ( a ) H ( U ) ,
where, in (a), we used
I(X; U) = 0 ⟹ H(U) = H(U|X) = H(U|X = x),
and
H(Y|X, U) = 0 ⟹ H(Y|U, X = x) = 0.
Furthermore,
H(Y, U|X = x) = H(Y|X = x) + H(U|Y, X = x) = H(Y|X = x) + ∑_y P_{y|x} H(U|Y = y, X = x) =^{(b)} H(Y|X = x) + ∑_y P_{y|x} H(U|Y = y),
where (b) follows from the Markov chain X − Y − U. Combining (A1) and (A2), for any x ∈ X, we have
H(U) = H(Y|X = x) + ∑_y P_{y|x} H(U|Y = y).
Furthermore, by using Lemma 4, U is an optimizer of both g 0 ( P X Y ) and h 0 ( P X Y ) . Thus,
H(U) = H(Y|X) + H(U|Y) = H(Y|X) + ∑_y P_y H(U|Y = y).
Combining (A4) and (A3), we have
H(Y|X) + ∑_y P_y H(U|Y = y) = H(Y|X = x) + ∑_y P_{y|x} H(U|Y = y).
Thus, for any x ∈ X,
∑_y (P_{y|x} − P_y) H(U|Y = y) = H(Y|X) − H(Y|X = x).
For any optimizer U ∈ U_1, (A6) holds; hence, taking the maximum over the H(U|Y = y) and letting a_i = H(U|Y = y_i), we obtain
H(U) ≤ max H(U) = H(Y|X) + max ∑_i P_{y_i} a_i
subject to
A_XY a = b_XY,  a ≥ 0.
To prove (49), note that, since we consider all optimizers of g_0(P_XY), there exists at least one U that satisfies |U| ≤ null(P_{X|Y}) + 1, which results in H(U) ≤ log(null(P_{X|Y}) + 1). Since (A6) must hold for any optimizer, this U satisfies the inequality in (49). Moreover, (50) follows from Lemma 6 and the constraint we added to have H(U) ≤ log(null(P_{X|Y}) + 1). To prove the last statement, note that when Y = f(X), using the key equation in (36), the upper bound on h_0(P_XY) equals zero, which results in h_0(P_XY) = g_0(P_XY) = 0. Furthermore, as shown in ([1], Theorem 5), when Y is not a deterministic function of X, we have h_0(P_XY) > 0; thus, the system of linear equations in (A6) has at least one solution. Moreover, since rank(A_XY) = |Y|, the solution is unique, as the number of independent equations is at least the number of variables. Finally, to prove the lower bound in (47), note that (A6) holds for any U that satisfies (33), (34), and (35); the bound follows by taking the minimum of H(U) = H(Y|X) + ∑_i P_{y_i} H(U|Y = y_i) subject to (A6), and the inequality holds in particular for the U that achieves the minimum entropy.

Appendix A.2. Proof of Theorem 5

Let W be the shared secret key with key size M = | X | , which is uniformly distributed over { 1 , , M } = { 1 , , | X | } and independent of ( X , Y ) . As shown in Figure 4, first, the private data X are encoded using the shared secret key ([2], Lemma 1). Thus, we have
X ˜ = X + W mod | X | .
Next, we show that X ˜ has a uniform distribution over { 1 , , | X | } and I ( X ; X ˜ ) = 0 . We have
H ( X ˜ | X ) = H ( X + W | X ) = H ( W | X ) = H ( W ) = log ( | X | ) .
Furthermore, H(X̃|X) ≤ H(X̃), and combining this with (A9), we obtain H(X̃|X) = H(X̃) = log(|X|). For encoding X̃, we use ⌈log(|X|)⌉ bits. Let C_1 denote the encoded X̃. Next, by using P_XY ∈ P̂_XY, let U ∈ U_1 be an optimizer of h_0(P_XY) = g_0(P_XY) that achieves K(P_XY), i.e., has the minimum entropy. We encode U using any lossless code, which uses at most H(U) + 1 = K(P_XY) + 1 bits, and let C_2 be the encoded U. We send C = (C_1, C_2) over the channel. Thus, (52) can be obtained. Note that, as shown in Figure 5, on the decoder side, we first decode X using W by adding |X| − W to X̃ modulo |X|; then, Y is decoded using the fact that U satisfies H(Y|U, X) = 0. Next, we show that if W is independent of (X, U), we obtain I(C; X) = 0. We have
I(C; X) = I(C_1, C_2; X) = I(U, X̃; X) = I(U; X) + I(X̃; X|U) =^{(a)} I(X̃; X|U) = H(X̃|U) − H(X̃|X, U) = H(X̃|U) − H(X + W|X, U) =^{(b)} H(X̃|U) − H(W) =^{(c)} H(X̃) − H(W) = log(|X|) − log(|X|) = 0,
where (a) follows from I ( U ; X ) = 0 ; (b) follows, since W is independent of ( X , U ) ; and (c) from the independence of U and X ˜ . The latter follows, since we have
0 ≤ I(X̃; X|U) = H(X̃|U) − H(W) =^{(i)} H(X̃|U) − H(X̃) ≤ 0.
Thus, X ˜ and U are independent. Step (i) above follows from the fact that W and X ˜ are uniformly distributed over { 1 , , | X | } , i.e., H ( W ) = H ( X ˜ ) . As a summary, if we choose W independent of ( X , U ) , the leakage to the adversary is zero. Thus, (52) is achievable. The upper bounds (53) and (54) can be obtained using the same coding and chain of inequalities stated in Theorem 4. The upper bounds in (56) and (57) were derived in ([1], Theorem 8). The upper bound in (55) can be obtained using the same coding. Note that the inequality between the upper bounds follows from the fact that the RV U that achieves the upper bound in (56) does not necessarily minimize the entropy H ( U ) ; however, it satisfies the constraints I ( X ; U ) = 0 and H ( Y | X , U ) = 0 . Finally, to achieve (58), let the shared key W be independent of ( X , Y ) and have a uniform distribution. We construct Y ˜ using one-time pad coding. We have
Y ˜ = Y + W mod | Y | .
Then, Ỹ is encoded using any lossless code, which uses at most ⌈log(|Y|)⌉ bits. The latter follows since Ỹ has a uniform distribution. It only remains to show that I(Ỹ; X) = 0. We have
I(Ỹ; X, Y) = I(Y + W; X, Y) = H(Y + W) − H(Y + W|X, Y) = log(|Y|) − H(W|X, Y) = log(|Y|) − H(W) = log(|Y|) − log(|Y|) = 0.
Hence, I ( Y ˜ ; X ) = 0 . Furthermore, on the decoder side, we can decode Y using W.
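The one-time-pad step used in this proof is easy to sketch in code (our own illustration; symbols are indexed from 0 instead of {1, ..., |X|} for convenience): the padded symbol is uniformly distributed regardless of the source distribution, and it can be inverted only with the shared key.

```python
import numpy as np

def otp_encode(x, w, K):
    """One-time pad over an alphabet of size K: x_tilde = x + w mod K."""
    return (x + w) % K

def otp_decode(x_tilde, w, K):
    """Invert the pad; equivalently, add K - w modulo K, as described above."""
    return (x_tilde - w) % K

K = 5
rng = np.random.default_rng(1)
x = rng.choice(K, size=100_000, p=[0.5, 0.2, 0.15, 0.1, 0.05])   # arbitrary source pmf
w = rng.integers(K, size=x.size)                                 # uniform key, independent of x
x_tilde = otp_encode(x, w, K)

assert np.all(otp_decode(x_tilde, w, K) == x)                    # lossless with the shared key
print(np.bincount(x_tilde, minlength=K) / x.size)                # ~uniform: x_tilde reveals nothing about x
```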

Appendix B

Appendix B.1. Proof of Lemma 7

First, we obtain L_3(ϵ, P_XY). Using the key equation ([19], (7)) with the assignments X ← X and Y ← (Y, W), for any RVs X, Y, W, and U, we have
I(U; Y, W) = I(U; X) + H(Y, W|X) − H(Y, W|X, U) − I(X; U|Y, W).
Using (A10), if U satisfies I ( U ; X ) = ϵ and H ( Y | U , X , W ) = 0 and W is independent of ( X , Y ) , we have
H(U) ≥ I(U; Y, W) = I(U; X) + H(Y|X) + H(W|X, Y) − H(W|X, U) − H(Y|X, U, W) − H(X|Y, W) + H(X|W, Y, U) =^{(a)} ϵ + H(Y|X) + H(W) − H(W|X, U) − H(X|Y) + H(X|Y, U, W) ≥^{(b)} ϵ + H(Y|X) − H(X|Y),
where in (a) we used I(X; U) = ϵ, the independence of W and (X, Y), and H(Y|X, U, W) = 0. Step (b) follows from the non-negativity of H(W) − H(W|X, U) = I(W; X, U) and of H(X|Y, U, W).
To prove L 2 ( ϵ , P X Y ) , note that we have
H(U) ≥ min_x H(U|X = x) + ϵ,
since
H(U) − min_x H(U|X = x) ≥ H(U) − ∑_x P_X(x) H(U|X = x) = ϵ.
Furthermore,
H(U|X = x) ≥ H(U|W, X = x)   (A12)
=^{(a)} H(U|W, X = x) + H(Y|U, W, X = x) = H(U, Y|W, X = x) = H(Y|W, X = x) + H(U|Y, W, X = x)
≥^{(b)} H(Y|W, X = x)   (A13)
= H(Y|X = x),   (A14)
where (a) uses the fact that H(Y|U, W, X = x) = 0, since H(Y|U, W, X) = 0, and (b) follows from the non-negativity of H(U|Y, W, X = x); the last equality holds since W is independent of (X, Y). Taking the minimum over x on both sides, we obtain min_x H(U|X = x) ≥ min_x H(Y|X = x). Combining this with (A11), we obtain L_2(ϵ, P_XY). To obtain L_1(ϵ, P_XY), we take the average over x on both sides of (A12)–(A14). This gives us
H(U) ≥ H(U|X) ≥ H(Y|X),
which completes the proof.
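For reference, the three lower bounds implied by the derivations above can be evaluated numerically with the following sketch (our own; the closed forms L_1 = H(Y|X), L_2 = min_x H(Y|X = x) + ϵ, and L_3 = ϵ + H(Y|X) − H(X|Y) are read off from the steps of this proof, and the toy joint pmf is arbitrary).

```python
import numpy as np

def cond_entropy(P, axis):
    """Entropy of the variable along `axis` conditioned on the other variable of a 2D joint pmf."""
    P_cond_on = P.sum(axis=axis, keepdims=True)
    ratio = np.divide(P, P_cond_on, out=np.zeros_like(P), where=P_cond_on > 0)
    return float(-np.sum(np.where(P > 0, P * np.log2(ratio), 0.0)))

def lower_bounds(P_XY, eps):
    """Evaluate L1, L2, L3 for a joint pmf with rows indexed by x and columns by y."""
    H_Y_given_X = cond_entropy(P_XY, axis=1)
    H_X_given_Y = cond_entropy(P_XY, axis=0)
    P_Y_given_x = P_XY / P_XY.sum(axis=1, keepdims=True)
    H_Y_given_x = -np.sum(np.where(P_Y_given_x > 0,
                                   P_Y_given_x * np.log2(P_Y_given_x), 0.0), axis=1)
    return H_Y_given_X, H_Y_given_x.min() + eps, eps + H_Y_given_X - H_X_given_Y

P_XY = np.array([[0.20, 0.15, 0.05],
                 [0.10, 0.30, 0.20]])
print(lower_bounds(P_XY, eps=0.1))   # (L1, L2, L3) in bits
```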

Appendix B.2. Proof of Theorem 8

To prove (73), we use the two-part construction coding scheme as used in [1]. As shown in Figure A1, first the private data X are encoded by one-time-pad coding ([2], Lemma 1), which uses log ( | X | ) bits. Next, we produce U based on EFRL ([19], Lemma 3) and encode it using any traditional code which uses at most H ( U ) + 1 bits. By using the upper bound on H ( U ) found in ([19], Lemma 5), we obtain
L(P_XY, |X|, ϵ) ≤ ∑_{x ∈ X} H(Y|X = x) + ϵ + h(α) + 1 + ⌈log(|X|)⌉.
For proving (74) and (75), we use the same coding and note that, when |Y| is finite, we use the bound |U| ≤ (|X|(|Y| − 1) + 1)(|X| + 1), which results in (74). Furthermore, when X is a deterministic function of Y, we can use the bound |U| ≤ (|Y| − |X| + 1)(|X| + 1), which leads to (75). Moreover, for the leakage constraint, we note that the randomness of the one-time-pad coding is independent of X and of the output of the EFRL. Let Z be the compressed output of the EFRL and X̃ be the output of the one-time-pad coding. Then, we have
I(X; X̃, Z) = I(X; X̃) + I(X; Z|X̃) = I(X; Z) = ϵ,
where we used the independence between X̃ and (Z, X). Moreover, as shown in Figure A2, on the receiver side, we first decode X using the shared key W; then, by using the fact that U is produced by the EFRL, we can decode Y based on X and U.
Figure A1. We use a two-part construction coding strategy to send codewords over the channels. As discussed in Theorem 8, on the receiver side using C and the shared key, we decode X, and using X, the shared key, and C , we decode Y. The leakage from X to C equals ϵ .
Figure A2. On the receiver side, we first decode X using the shared key W, then by using the fact that U is produced by EFRL, we have H ( Y | X , U ) = 0 . Thus, we can decode Y based on X and U.

Appendix B.3. Proof of Theorem 9

Let Ũ = C(Y, W) be the output of the encoder. Since the code is ϵ-private, it satisfies I(Ũ; X) = ϵ. Furthermore, due to recoverability, the useful data can be recovered from the code, the shared secret key, and the private data, so Ũ also satisfies H(Y|Ũ, X, W) = 0. Thus, by using Lemma 7, we have
H(Ũ) ≥ max{L_1(ϵ, P_XY), L_2(ϵ, P_XY), L_3(ϵ, P_XY)},
which results in (76). The proof of (77) is similar to ([1], Theorem 9). For any ũ ∈ ξ, we have
P_Ũ(ũ) = ∑_x P_X(x) P_{Ũ|X}(ũ|x) ≤ max_x P_X(x) ∑_x P_{Ũ|X}(ũ|x) = max_x P_X(x) ∑_x ∑_w P_{Ũ|W X}(ũ|w, x) P_W(w) = max_x P_X(x) ∑_w P_W(w) ∑_x P_{Ũ|W X}(ũ|w, x) ≤^{(a)} max_x P_X(x) ∑_w P_W(w) = max_x P_X(x),
where (a) follows from the lossless condition, since, as argued in ([1], Theorem 9), for a lossless code P U ˜ | W X ( u ˜ | w , x ) > 0 for at most one x X . Hence, we have
H(Ũ) ≥ log(1 / max_x P_X(x)).

Appendix B.4. Proof of Theorem 10

To prove (82), X_2 is encoded by one-time-pad coding ([2], Lemma 1), which uses ⌈log(|X_2|)⌉ bits. Let the output of the one-time-pad coding be X̃_2. Next, we encode X_1 using any traditional code, which uses at most H(X_1) + 1 bits. We then produce U based on the FRL ([1], Lemma 1) with the assignments X ← (X_1, X_2) and Y ← Y, and encode it using any traditional code, which uses at most H(U) + 1 bits. We send the encoded X_1, X̃_2, and U over the channel. Since U satisfies I(U; X_1, X_2) = 0 and H(Y|U, X_1, X_2) = 0, by using the upper bound on H(U) found in ([1], Lemma 2), we obtain
H(U) ≤ ∑_x H(Y|X = x).
Finally, we obtain
L_b(P_XY, |X_2|, ϵ) ≤ ∑_{x ∈ X} H(Y|X = x) + H(X_1) + 2 + ⌈log(|X_2|)⌉.
Note that, on the receiver side, using the shared key and X ˜ 2 we can decode X 2 , and by using X 1 , X 2 and U we can decode Y without loss. For the leakage constraint, we note that the one-time-pad coding can be used such that X ˜ 2 is independent of ( X 1 , X 2 , U ) . Thus, we have
I(X_1, X̃_2, U; X_1, X_2) = H(X_1) + H(X̃_2) + H(U) − H(X_1|X_1, X_2) − H(X̃_2|X_1, X_2) − H(U|X_1, X_2, X̃_2) =^{(a)} H(X_1) ≤ ϵ,
where (a) follows from the independence of X̃_2 and (U, X_1, X_2), and the independence of U and (X_1, X_2). The proofs of (83) and (84) are based on ([1], (31)) and ([1], (41)). The proof of (85) is similar to that of (82): we use one-time-pad coding to encode X_1 instead of X_2. In this case, we send the compressed X_2, X̃_1, and U over the channel, where X̃_1 is the output of the one-time-pad coding. We then have
L_b(P_XY, |X_1|, ϵ) ≤ ∑_{x ∈ X} H(Y|X = x) + H(X_2) + 2 + ⌈log(|X_1|)⌉.
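The leakage computation above can also be verified numerically. The sketch below (our own, with arbitrary toy alphabet sizes and pmfs) checks that I(X_1, X̃_2, U; X_1, X_2) = H(X_1) whenever U is independent of (X_1, X_2) and the one-time-pad output X̃_2 is independent of (X_1, X_2, U).

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
P_x1x2 = rng.random((3, 2)); P_x1x2 /= P_x1x2.sum()   # joint pmf of (X_1, X_2)
P_u = rng.random(4); P_u /= P_u.sum()                 # U independent of (X_1, X_2)
P_xt2 = np.full(2, 0.5)                               # one-time-pad output: uniform and independent

# Joint pmf over (X_1, X_2, U, X_tilde_2); axes ordered as x1, x2, u, xt2.
P = P_x1x2[:, :, None, None] * P_u[None, None, :, None] * P_xt2[None, None, None, :]

H_all     = entropy(P)                  # H(X_1, X_2, U, X_tilde_2)
H_private = entropy(P_x1x2)             # H(X_1, X_2)
H_sent    = entropy(P.sum(axis=1))      # H(X_1, U, X_tilde_2): marginalize out X_2
leakage   = H_sent + H_private - H_all  # I(X_1, X_tilde_2, U ; X_1, X_2)
print(np.isclose(leakage, entropy(P_x1x2.sum(axis=1))))   # equals H(X_1): True
```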

Appendix B.5. Proof of Theorem 11

Let U = C(Y, W) be the output of the encoder. The code is bounded ϵ-private; thus, I(U; X_1, X_2) ≤ ϵ. Due to recoverability (given the code, the shared secret key, and the private data, we can recover the useful data), U also satisfies H(Y|W, U, X_1, X_2) = 0. Using (A10), we have
H(U) ≥ I(U; Y, W) = I(U; X) + H(Y|X) + H(W|X, Y) − H(W|X, U) − H(Y|X, U, W) − H(X|Y, W) + H(X|W, Y, U) ≥ H(Y|X) + H(W) − H(W|X, U) − H(X|Y) + H(X|Y, U, W) ≥ H(Y|X) − H(X|Y),
where we used I(U; X) ≥ 0, H(X|Y, U, W) ≥ 0, and the independence of W and (X, Y). The proof of (87) is similar to that of (77).

Appendix B.6. Proof of Theorem 13

To prove (91), let U be produced based on the FRL ([1], Lemma 1) with the assignments X ← (X_1, X_2) and Y ← Y, and encode it using any traditional code, which uses at most H(U) + 1 bits. Thus, I(U; X_1, X_2) = 0 and H(Y|X_1, X_2, U) = 0. Since X_2 = f(X_1), we have
H(Y|X_1, X_2, U) = H(Y|X_1, U) = 0.
Using ([1], Lemma 2), we have
H(U) ≤ ∑_x H(Y|X = x).
We encode X_1 by one-time-pad coding, which uses ⌈log(|X_1|)⌉ bits. Let the output of the one-time-pad coding be X̃_1. We send the compressed X̃_1 and U over the channel. Using the upper bound on H(U) and X̃_1, we have
L_b(P_XY, |X_1|, 0) ≤ ∑_{x ∈ X} H(Y|X = x) + 1 + ⌈log(|X_1|)⌉.
On the receiver side, using the shared key, first X_1 is decoded, and we then use (A15) to decode Y from X_1 and U. For the leakage constraint, the independence between X̃_1 and X_1 implies I(X_1, X_2; X̃_1) = 0, where the latter follows since X_2 = f(X_1). Furthermore, we choose X̃_1 to be independent of U. Thus,
I ( X ˜ 1 , U ; X 1 , X 2 ) = I ( X ˜ 1 , U ; X 1 ) = I ( X ˜ 1 ; X 1 ) = 0 .
The bounds (92) and (93) can be obtained using ([1], (31)) and ([1], (41)).

References

  1. Shkel, Y.Y.; Blum, R.S.; Poor, H.V. Secrecy by Design With Applications to Privacy and Compression. IEEE Trans. Inf. Theory 2021, 67, 824–843. [Google Scholar] [CrossRef]
  2. Shkel, Y.Y.; Poor, H.V. A Compression Perspective on Secrecy Measures. IEEE J. Sel. Areas Inf. Theory 2021, 2, 163–176. [Google Scholar] [CrossRef]
  3. Shannon, C.E. Communication theory of secrecy systems. Bell Syst. Tech. J. 1949, 28, 656–715. [Google Scholar] [CrossRef]
  4. Gündüz, D.; Erkip, E.; Poor, H.V. Source coding under secrecy constraints. In Securing Wireless Communications at the Physical Layer; Springer: Boston, MA, USA, 2009; pp. 173–199. [Google Scholar]
  5. Schaefer, R.F.; Boche, H.; Khisti, A.; Poor, H.V. Information Theoretic Security and Privacy of Information Systems; Cambridge University Press: Cambridge, UK, 2017. [Google Scholar]
  6. Sankar, L.; Rajagopalan, S.R.; Poor, H.V. Utility-privacy tradeoffs in databases: An information-theoretic approach. IEEE Trans. Inf. Forensics Secur. 2013, 8, 838–852. [Google Scholar] [CrossRef]
  7. Wang, H.; Vo, L.; Calmon, F.P.; Médard, M.; Duffy, K.R.; Varia, M. Privacy with Estimation Guarantees. IEEE Trans. Inf. Theory 2019, 65, 8025–8042. [Google Scholar] [CrossRef]
  8. Rassouli, B.; Gündüz, D. Optimal Utility-Privacy Trade-Off With Total Variation Distance as a Privacy Measure. IEEE Trans. Inf. Forensics Secur. 2020, 15, 594–603. [Google Scholar] [CrossRef]
  9. Rassouli, B.; Rosas, F.E.; Gündüz, D. Data Disclosure under Perfect Sample Privacy. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2012–2025. [Google Scholar] [CrossRef]
  10. Yamamoto, H. A source coding problem for sources with additional outputs to keep secret from the receiver or wiretappers (corresp.). IEEE Trans. Inf. Theory 1983, 29, 918–923. [Google Scholar] [CrossRef]
  11. Yamamoto, H. A rate-distortion problem for a communication system with a secondary decoder to be hindered. IEEE Trans. Inf. Theory 1988, 34, 835–842. [Google Scholar] [CrossRef]
  12. Issa, I.; Kamath, S.; Wagner, A.B. An operational measure of information leakage. In Proceedings of the 2016 Annual Conference on Information Science and Systems, Princeton, NJ, USA, 16–18 March 2016; pp. 234–239. [Google Scholar] [CrossRef]
  13. Issa, I.; Kamath, S.; Wagner, A.B. Maximal leakage minimization for the Shannon cipher system. In Proceedings of the 2016 IEEE International Symposium on Information Theory, Barcelona, Spain, 10–15 July 2016; pp. 520–524. [Google Scholar]
  14. Makhdoumi, A.; Salamatian, S.; Fawaz, N.; Médard, M. From the information bottleneck to the privacy funnel. In Proceedings of the 2014 IEEE Information Theory Workshop, Hobart, Australia, 2–5 November 2014; pp. 501–505. [Google Scholar]
  15. Rassouli, B.; Gündüz, D. On Perfect Privacy. IEEE J. Sel. Areas Inf. Theory 2021, 2, 177–191. [Google Scholar] [CrossRef]
  16. Zamani, A.; Oechtering, T.J.; Skoglund, M. A Design Framework for Strongly χ2-Private Data Disclosure. IEEE Trans. Inf. Forensics Secur. 2021, 16, 2312–2325. [Google Scholar] [CrossRef]
  17. Zamani, A.; Oechtering, T.J.; Skoglund, M. Data Disclosure With Non-Zero Leakage and Non-Invertible Leakage Matrix. IEEE Trans. Inf. Forensics Secur. 2022, 17, 165–179. [Google Scholar] [CrossRef]
  18. Zamani, A.; Oechtering, T.J.; Skoglund, M. On the Privacy-Utility Trade-Off With and Without Direct Access to the Private Data. IEEE Trans. Inf. Theory 2024, 70, 2177–2200. [Google Scholar] [CrossRef]
  19. Zamani, A.; Oechtering, T.J.; Skoglund, M. Bounds for Privacy-Utility Trade-off with Non-zero Leakage. In Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT), Espoo, Finland, 26 June–1 July 2022; pp. 620–625. [Google Scholar] [CrossRef]
  20. Zamani, A.; Oechtering, T.J.; Skoglund, M. Bounds for Privacy-Utility Trade-off with Per-letter Privacy Constraints and Non-zero Leakage. In Proceedings of the 2022 IEEE Information Theory Workshop (ITW), Mumbai, India, 1–9 November 2022; pp. 13–18. [Google Scholar] [CrossRef]
  21. Asoodeh, S.; Diaz, M.; Alajaji, F.; Linder, T. Information extraction under privacy constraints. Information 2016, 7, 15. [Google Scholar] [CrossRef]
  22. Zamani, A.; Oechtering, T.J.; Gündüz, D.; Skoglund, M. Private Variable-Length Coding with Zero Leakage. In Proceedings of the 2023 IEEE International Workshop on Information Forensics and Security (WIFS), Nurnberg, Germany, 4–7 December 2023; pp. 1–6. [Google Scholar] [CrossRef]
  23. Zamani, A.; Oechtering, T.J.; Skoglund, M. Private Variable-Length Coding with Non-Zero Leakage. In Proceedings of the 2023 IEEE International Workshop on Information Forensics and Security (WIFS), Nurnberg, Germany, 4–7 December 2023; pp. 1–6. [Google Scholar]
  24. Wang, S.; Lin, Z.; Fanti, G. Statistic Maximal Leakage. In Proceedings of the 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 7–12 July 2024; pp. 2742–2747. [Google Scholar]
  25. Sreekumar, S.; Gündüz, D. Optimal Privacy-Utility Trade-off under a Rate Constraint. In Proceedings of the 2019 IEEE International Symposium on Information Theory, Paris, France, 7–12 July 2019; pp. 2159–2163. [Google Scholar] [CrossRef]
  26. Liu, T.Y.; Wang, I.H. Robust Privatization With Multiple Tasks and the Optimal Privacy-Utility Tradeoff. IEEE Trans. Inf. Theory 2024, 70, 8164–8179. [Google Scholar] [CrossRef]
  27. Zarrabian, M.A.; Ding, N.; Sadeghi, P. On the Lift, Related Privacy Measures, and Applications to Privacy–Utility Trade-Offs. Entropy 2023, 25, 679. [Google Scholar] [CrossRef]
  28. Dwork, C.; McSherry, F.; Nissim, K.; Smith, A. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Theory of Cryptography Conference, TCC 2006, New York, NY, USA, 4–7 March 2006; pp. 265–284. [Google Scholar]
  29. Dwork, C.; Kenthapadi, K.; McSherry, F.; Mironov, I.; Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Proceedings of the Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, 28 May–1 June 2006; pp. 486–503. [Google Scholar]
  30. Kairouz, P.; Oh, S.; Viswanath, P. The composition theorem for differential privacy. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 1376–1385. [Google Scholar]
  31. Cuff, P.; Yu, L. Differential privacy as a mutual information constraint. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 43–54. [Google Scholar]
  32. Barthe, G.; Olmedo, F. Beyond differential privacy: Composition theorems and relational logic for f-divergences between probabilistic programs. In Proceedings of the International Colloquium on Automata, Languages, and Programming, Riga, Latvia, 8–12 July 2013; pp. 49–60. [Google Scholar]
  33. Feldman, V.; Mironov, I.; Talwar, K.; Thakurta, A. Privacy amplification by iteration. In Proceedings of the 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), Paris, France, 7–9 October 2018; pp. 521–532. [Google Scholar]
  34. Kocaoglu, M.; Dimakis, A.; Vishwanath, S.; Hassibi, B. Entropic causal inference. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
  35. Compton, S.; Katz, D.; Qi, B.; Greenewald, K.; Kocaoglu, M. Minimum-entropy coupling approximation guarantees beyond the majorization barrier. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Valencia, Spain, 25–27 April 2023; pp. 10445–10469. [Google Scholar]
  36. Shkel, Y.Y.; Yadav, A.K. Information spectrum converse for minimum entropy couplings and functional representations. In Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, 25–30 June 2023; pp. 66–71. [Google Scholar]
  37. Calmon, F.P.; Makhdoumi, A.; Medard, M.; Varia, M.; Christiansen, M.; Duffy, K.R. Principal Inertia Components and Applications. IEEE Trans. Inf. Theory 2017, 63, 5011–5038. [Google Scholar] [CrossRef]
  38. Berger, T.; Yeung, R.W. Multiterminal source encoding with encoder breakdown. IEEE Trans. Inf. Theory 1989, 35, 237–244. [Google Scholar] [CrossRef]
  39. Zamani, A.; Rodríguez-Gálvez, B.; Skoglund, M. On information theoretic fairness: Compressed representations with perfect demographic parity. arXiv 2024, arXiv:2408.13168. [Google Scholar]
  40. Erlingsson, Ú.; Pihur, V.; Korolova, A. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, 3–7 November 2014; pp. 1054–1067. [Google Scholar]
  41. Asoodeh, S.; Liao, J.; Calmon, F.P.; Kosut, O.; Sankar, L. Three variants of differential privacy: Lossless conversion and applications. IEEE J. Sel. Areas Inf. Theory 2021, 2, 208–222. [Google Scholar] [CrossRef]
  42. Alghamdi, W.; Gomez, J.F.; Asoodeh, S.; Calmon, F.; Kosut, O.; Sankar, L. The saddle-point method in differential privacy. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 508–528. [Google Scholar]
  43. Dwork, C.; Rothblum, G.N.; Vadhan, S. Boosting and differential privacy. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, Las Vegas, NV, USA, 23–26 October 2010; pp. 51–60. [Google Scholar]
  44. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387. [Google Scholar] [CrossRef]
  45. Gács, P.; Körner, J. Common information is far less than mutual information. Probl. Control Inf. Theory 1973, 2, 149–162. [Google Scholar]
  46. Basciftci, Y.O.; Wang, Y.; Ishwar, P. On privacy-utility tradeoffs for constrained data release mechanisms. In Proceedings of the 2016 Information Theory and Applications Workshop, La Jolla, CA, USA, 31 January–5 February 2016; pp. 1–6. [Google Scholar] [CrossRef]
  47. Li, C.T.; El Gamal, A. Strong Functional Representation Lemma and Applications to Coding Theorems. IEEE Trans. Inf. Theory 2018, 64, 6967–6978. [Google Scholar] [CrossRef]
  48. Wyner, A. The common information of two dependent random variables. IEEE Trans. Inf. Theory 1975, 21, 163–179. [Google Scholar] [CrossRef]
  49. Zamani, A.; Oechtering, T.J.; Gündüz, D.; Skoglund, M. Cache-Aided Private Variable-Length Coding with Zero and Non-Zero Leakage. In Proceedings of the 2023 21st International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), Singapore, 24–27 August 2023; pp. 247–254. [Google Scholar] [CrossRef]
  50. Maddah-Ali, M.A.; Niesen, U. Fundamental limits of caching. IEEE Trans. Inf. Theory 2014, 60, 2856–2867. [Google Scholar] [CrossRef]
Figure 1. In this work, an encoder wants to compress Y, which is correlated with X under certain privacy leakage constraints and send it over a channel where an eavesdropper has access to the output of the encoder.
Figure 2. Two-part construction coding: A strategy to send codewords over the channels. We hide the information of X using one-time-pad coding, and we then use the FRL to produce U.
Figure 3. At the receiver side, we first decode X using the shared key W, then by using the fact that U satisfies H ( Y | X , U ) = 0 , we can decode Y based on X and U.
Figure 4. In this work, we use two-part construction coding to send codewords over the channels. The encoder takes X, Y, and W as inputs. In the first step, it hides the information of X using one-time-pad coding by adding the secret key W to X to produce X ˜ , then, X ˜ is encoded to C 1 using any lossless code. In the second step, the encoder uses the solution of g 0 ( P X Y ) = h 0 ( P X Y ) to construct U, which is independent of X. Moreover, U satisfies H ( Y | U , X ) = 0 , which ensures that Y can be recovered from X and U. Then, U is encoded to C 2 using any lossless code. Finally, C = ( C 1 , C 2 ) is sent over the channel.
Figure 5. On the receiver side, we first decode X using the shared key W, then by using H ( Y | X , U ) = 0 , we can decode Y based on X and U.
Figure 6. Encoder design: illustration of the achievability scheme of Theorem 7. Two-part code construction is used to produce the response of the server, C . The server sends C over the channel, which is independent of X.
Figure 7. A server wants to send a response over a shared link to satisfy the users' demands, but since the database is correlated with the private data, the existing schemes are not applicable. In the delivery phase, we hide the information about X using one-time-pad coding and send the rest of the response using the functional representation lemma (FRL).
Figure 8. Quantized image of digit 3 with resolution 28 × 28 .
Table 1. Marginal distribution of the useful data Y.
y: 1, 2, 3, 4, 5, 6, 7
Pr(Y = y): 0.1851, 0.4069, 0.2981, 0.0936, 0.0151, 0.0010, 0.0002
