3. Preliminaries and Overview of the Privacy Problems
In this section, we provide a brief overview of a few related privacy problems. We then provide essential lemmas and definitions corresponding to the extended functional representation lemma, the extended strong functional representation lemma, and the minimum entropy coupling problem. Mutual information has been widely used as a privacy leakage measure [1,2,14,15,18,21,26]. Specifically, in [21], the following problem was addressed, where in ([21], Lemma 1) lower and upper bounds on it were derived. Considering perfect privacy, the optimal value can be obtained by solving a linear program ([15], Theorem 1). This implies that the optimal mapping is the solution to a linear program. Furthermore, it has been shown that, to find the optimal privacy-preserving mapping, it is sufficient to consider U with a bounded alphabet size. However, solving the linear program suggested by [15] can become challenging when the alphabet sizes increase. Necessary and sufficient conditions for achieving non-zero utility when the private data are hidden were derived in [37,38]. It was shown that non-zero utility is achievable if and only if the rows of the kernel are linearly dependent. This result was demonstrated and generalized in [15], e.g., see ([15], Proposition 1). Moreover, considering an observable private data scenario, necessary and sufficient conditions for attaining non-zero utility were derived in ([1], Theorem 5). It has been shown that non-zero utility is achievable in this case if and only if Y is not a deterministic function of X.
Next, we present a brief overview of the linear program proposed by [15] that attains the optimal value. To do so, consider vectors of dimension equal to the alphabet size of Y. It is shown that perfect privacy holds if and only if such a vector satisfies a system of linear equations specified by a matrix M, which is constructed as follows: let V be the matrix of right eigenvectors of the kernel; then M is defined accordingly.
Furthermore, if the stated condition holds, the corresponding property follows. As shown in [15], for every u, if the Markov chain holds and the leakage is zero, then the corresponding conditional distribution vector belongs to a convex polytope. Using this, we obtain an equivalent formulation in which the marginal distribution of U acts as a weighting. Finally, considering the minimizer over this set, the conditional distribution vectors for all u must belong to the extreme points of the polytope. Furthermore, it is sufficient to consider a bounded number of extreme points. Using the previous result, the solution to the minimization in (15) can be obtained in two phases: in phase one, the extreme points of the set are identified, while in phase two, proper weights over these extreme points are obtained to minimize the objective function. As argued in [15], the extreme points of the set are the basic feasible solutions of the corresponding system of linear equations, which can be found in the following manner. Let Ω be a set of indices which correspond to linearly independent columns of M, and form the sub-matrix of M with columns indexed by the set Ω. Then, if all elements of the vector obtained by solving the system restricted to these columns are non-negative, the vector defined in the following is a basic feasible solution: assuming the indices in Ω are arranged in increasing order, the element indexed by the i-th element of Ω is set to the i-th element of the solution vector, and the other elements are set to zero. Thus, by using the same arguments as in [15], we obtain the following corollary.
Corollary 1. Let be a set that contains the extreme points of . If all elements of the vector are non-negative, then .
In the next proposition, we show two properties of each vector inside .
Proposition 1. Let . For every Ω, we have .
Proof. Consider the set . Any element in it has elements that sum to one; since we have , we obtain . The basic solutions are defined as follows: Let , where ; then, and the other elements are zero. Thus, the sum over all elements equals one, since each element has to sum to one. □
After finding all extreme points of the set, we proceed to the second phase as follows. Assume that all extreme points of the set, obtained in the first phase, are enumerated. Then, using the equivalency in (15), the problem reduces to choosing proper weights over these extreme points, where the objective involves the entropy of each extreme-point distribution vector.
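To make the two-phase procedure concrete, the following minimal sketch (not the authors' implementation) enumerates basic feasible solutions by checking subsets of linearly independent columns and then assigns weights to the resulting extreme points with a small linear program. The constraint matrix `A`, the right-hand side `b`, the target marginal, and the per-extreme-point entropy costs are hypothetical placeholders standing in for the quantities defined in [15].

```python
# Sketch of the two-phase procedure described above; A, b, target and the
# entropy costs are placeholders for the quantities characterizing the
# perfect-privacy polytope in [15].
import itertools
import numpy as np
from scipy.optimize import linprog

def basic_feasible_solutions(A, b, tol=1e-9):
    """Phase one: enumerate basic feasible solutions of {x >= 0 : A x = b}."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    points = []
    for cols in itertools.combinations(range(n), r):
        sub = A[:, cols]
        if np.linalg.matrix_rank(sub) < r:
            continue  # columns are linearly dependent, skip this index set
        x_sub, *_ = np.linalg.lstsq(sub, b, rcond=None)
        if np.all(x_sub >= -tol) and np.allclose(sub @ x_sub, b, atol=1e-7):
            x = np.zeros(n)
            x[list(cols)] = np.clip(x_sub, 0.0, None)
            points.append(x)
    return points

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def optimal_weights(extreme_points, target):
    """Phase two: weights over the extreme points that reproduce the target
    marginal while minimizing the weighted sum of entropies (a linear program)."""
    P = np.stack(extreme_points, axis=1)          # columns are extreme points
    c = np.array([entropy(p) for p in extreme_points])
    A_eq = np.vstack([P, np.ones((1, P.shape[1]))])
    b_eq = np.concatenate([target, [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * P.shape[1])
    return res.x if res.success else None
```

Note that the subset enumeration in phase one is exponential in the number of columns, which is consistent with the complexity discussion given later in Remark 9.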
In this work, considering the zero-leakage scenario, we use the properties of the privacy problems defined in (12) and (13) to design the code.
Next, we recall two important lemmas from [18].
Lemma 1. (Extended Functional Representation Lemma (EFRL)): For a pair of RVs (X, Y) distributed according to a given joint distribution, supported on alphabets where the alphabet of X is finite and the alphabet of Y is finite or countably infinite, and any ε, there exists a RV U such that the leakage between X and U is equal to ε, i.e., the mutual information between U and X equals ε; Y is a deterministic function of (U, X), i.e., Y can be recovered from the pair (U, X); and the stated bound on the entropy holds.

Lemma 2. (Extended Strong Functional Representation Lemma (ESFRL)): For a pair of RVs (X, Y) distributed according to a given joint distribution, supported on alphabets where the alphabet of X is finite and the alphabet of Y is finite or countably infinite with finite mutual information, and any ε, there exists a RV U such that the leakage between X and U is equal to ε; Y is a deterministic function of (U, X); and the stated upper bound holds, where the involved constant is as specified in [18].

The main idea in constructing U, which satisfies the conditions in Lemmas 1 and 2, is to add a randomized response to the output of the FRL and SFRL, respectively. The output refers to a random variable which is produced by the FRL or the SFRL and satisfies the corresponding constraints. In this work, considering the non-zero leakage scenario, we utilize the techniques used in Lemmas 1 and 2 to design the code. The randomization technique was used for finding fair and private representations of a dataset in [39], where randomization techniques similar to those in Lemmas 1 and 2 lead to lower bounds. Furthermore, this randomization technique has also been applied in the design of differentially private mechanisms [40,41,42,43]. A similar randomization technique was used in [22,23] to address private compression design problems.
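As a rough illustration of this randomization idea, the following sketch (our own simplification, not the exact construction or constants of [18]) appends the private value to the FRL output with a tunable probability; the helper `sample_frl_output` is a hypothetical placeholder for the FRL construction.

```python
# Minimal sketch of the randomized-response idea: keep the FRL output u_bar and
# append the private value x with probability alpha (a dummy symbol otherwise).
# The appended part leaks roughly alpha * H(X) bits, so alpha tunes the leakage,
# while Y stays decodable from (U, X) because u_bar is always available.
import numpy as np

rng = np.random.default_rng(0)

def sample_frl_output(x):
    """Hypothetical placeholder for the FRL output: independent of X, and such
    that Y is a deterministic function of (u_bar, X)."""
    return int(rng.integers(0, 4))

def extended_frl_output(x, alpha):
    u_bar = sample_frl_output(x)
    extra = x if rng.random() < alpha else None   # None acts as the dummy symbol
    return (u_bar, extra)

# Example: alpha = 0 recovers the zero-leakage FRL output.
print([extended_frl_output(x=1, alpha=0.1) for _ in range(5)])
```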
Next, we recall the important results regarding the upper and lower bounds on the minimum entropy coupling as obtained in [34,35,36]. First, we recall the definition of the minimum entropy coupling problem using ([36], Def. 2).
Definition 1. (minimum entropy coupling) For a set of PMFs, the minimum Shannon entropy coupling is the infimum of the joint entropy, where the infimum is taken over all joint distributions whose marginal distributions equal the given PMFs.

Similarly to [36], for a given joint distribution, let the minimum entropy of a functional representation be defined accordingly.

Remark 3. As shown in ([36], Lemma 1), the minimum entropy functional representation and the minimum entropy coupling are related functions. In more detail, the former equals the minimum entropy coupling of the set of conditional PMFs of Y given each realization of X.

Consider the output of the greedy entropy-based algorithm which was proposed in ([34], Section 3). More specifically, the corresponding algorithm aims to solve (19) but does not achieve the optimal solution in general. Next, we recall a result obtained in [35], which showed that the greedy output is optimal within a constant number of bits, with different constants for binary X and for larger alphabets of X. Finally, let an optimizer achieve the optimal solution of (19).
Theorem 1 ([35], Th. 3.4, Th. 4.1, Th. 4.2). Let X and Y have finite alphabets. When X is binary, the stated bound holds; moreover, for larger alphabets of X, the corresponding bound holds. Here, the additional quantity corresponds to the profile method proposed in ([35], Section 3).

Since finding the optimal solution of (19) is challenging, Theorem 1 asserts that the greedy output can be a good approximation of the optimizer that achieves (19). Next, we present the results on lower bounds obtained in a parallel work [36]. The lower bounds were derived using information spectrum and majorization concepts.

Theorem 2 ([36], Corollary 2, Th. 2). Let X and Y have finite alphabets, and let the quantities be as defined in ([36], Corollary 2, Th. 2). Then the stated lower bounds hold, where ∧ corresponds to the greatest lower bound with respect to majorization and is defined in ([36], Lemma 3).

Remark 4. In contrast with [35], the lower bounds in [36] were obtained using Rényi entropy in (19). In this work, we consider Shannon entropy, which is a special case of Rényi entropy.

Remark 5. As argued in ([36], Remark 1), the (largest) lower bounds obtained in [35,36] match. Thus, using Theorem 1, the corresponding bounds hold for binary X and for larger alphabets of X. Moreover, in some cases, the lower bound is tight, e.g., see ([36], Example 2).
As discussed in [36], the greatest lower bound used in Theorem 2 can be obtained by a greedy construction. To do so, form a matrix whose columns are the conditional distribution vectors, and re-order each column so that its entries are in descending order. Choose the smallest number in the first row of the matrix; next, subtract it from the first row, reorder each column, and update the matrix. We then choose the smallest number from the first row of the updated matrix and record it as the next element. We continue this procedure until the summation of the recorded elements reaches one. To see an example, refer to ([36], Example 1).
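The following short sketch transcribes the greedy procedure just described, with our own (hypothetical) variable names; it assumes each column is a probability vector.

```python
# Greedy construction described above: columns are the conditional distribution
# vectors, kept in descending order; at every step the smallest entry of the
# first row is recorded and subtracted from that row, until the recorded
# entries sum to one. Assumes each input column sums to one.
import numpy as np

def greedy_construction(cond_dists, tol=1e-12):
    cols = [np.sort(np.asarray(p, dtype=float))[::-1] for p in cond_dists]
    out = []
    while sum(out) < 1.0 - tol:
        z = min(c[0] for c in cols)            # smallest entry of the first row
        out.append(z)
        cols = [np.sort(np.concatenate(([c[0] - z], c[1:])))[::-1] for c in cols]
    return np.array(out)

# Example with two conditional distributions; the output entries sum to one.
q = greedy_construction([[0.5, 0.3, 0.2], [0.6, 0.4, 0.0]])
print(q, q.sum())
```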
Finally, we recall the two-part construction coding which was used in [1]. To do so, let us consider the scenario illustrated in Figure 1, where an encoder wishes to compress Y and communicate it to a user over an unsecured channel. The encoded message is described by an RV. As shown in Figure 1, it is assumed that an adversary has access to the encoded message and wants to extract information about X. Moreover, it is assumed that the encoder and decoder have access to a shared secret key denoted by the RV W with size M. The goal is to design an encoded message which compresses Y, using a variable-length code with the minimum possible average length, that satisfies a zero-leakage constraint. The shared secret key is denoted by the discrete RV W, defined on an alphabet of size M, and is assumed to be accessible by both the encoder and decoder. Furthermore, we assume that W is uniformly distributed and is independent of X and Y.
Theorem 3 ([1], Theorem 8). Let the pair of RVs (X, Y) be distributed according to a given joint distribution and let the shared secret key size be M. Then, the stated upper bound on the minimum average length holds; a refined bound holds when the relevant quantity is finite, and a further bound holds if X is a deterministic function of Y.
Proof. Let W be the shared secret key with the given key size, which is uniformly distributed over its alphabet and independent of (X, Y). As shown in Figure 2, first, the private data X are encoded using the shared secret key; this yields the one-time-pad encoded part. Next, we show that this part has a uniform distribution and satisfies the required independence; combining this with (28), we obtain the desired property. For encoding this part, we use a number of bits determined by the alphabet size of X, and we let the corresponding codeword denote its encoded version. Next, by using the construction in ([1], Lemma 2), let U be the output of the FRL. We encode U using any lossless code which uses at most the stated number of bits, and we send both encoded parts over the channel. Thus, (25) can be obtained. Note that, as shown in Figure 3, on the decoder side, we first decode X using W, by removing the key from the first encoded part; then, Y is decoded using the fact that U satisfies the property of the FRL. Next, we show that if W is independent of (X, Y), we obtain zero leakage. In the corresponding chain of equalities, (a) follows from the construction of the first part; (b) follows since W is independent of (X, Y); and (c) follows from the independence of the two encoded parts. The latter holds since, as shown in step (i), W and the one-time-pad output are uniformly distributed. As a summary, if we choose W independently of (X, Y), the leakage to the adversary is zero. □
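The following toy sketch illustrates the two-part construction under our own simplifying assumptions: the private part is hidden with a one-time pad based on the uniform key W, and the second part stands in for the FRL output; `sample_frl_output` and `reconstruct_y` are hypothetical placeholders rather than the construction of [1].

```python
# Toy two-part code: part 1 is the one-time-pad encoding of X (uniform and
# independent of (X, Y)); part 2 carries a stand-in for the FRL output U,
# from which Y is recovered together with X. Both alphabets have size 4 here.
import numpy as np

rng = np.random.default_rng(1)
ALPHABET = 4

def encode(x, y, w):
    part1 = (x + w) % ALPHABET           # one-time pad on the private data
    u = sample_frl_output(x, y)          # second part (placeholder for FRL output)
    return part1, u

def decode(part1, u, w):
    x_hat = (part1 - w) % ALPHABET       # remove the key to recover X
    y_hat = reconstruct_y(u, x_hat)      # Y is a deterministic function of (U, X)
    return x_hat, y_hat

# Hypothetical placeholders standing in for the FRL construction of [1].
def sample_frl_output(x, y):
    return (y - x) % ALPHABET            # toy choice: makes Y recoverable from (U, X)

def reconstruct_y(u, x):
    return (u + x) % ALPHABET

w = int(rng.integers(0, ALPHABET))       # uniform shared secret key
print(decode(*encode(x=2, y=3, w=w), w=w))  # prints (2, 3)
```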
4. Main Results
Here, we find upper and lower bounds on (5), (6) and (8), and study them in different scenarios.
4.1. Zero Leakage Results
In this section, we derive upper and lower bounds on the quantity defined in (8). We first study its properties and provide a sufficient condition under which a candidate variable belongs to the corresponding set. We then find upper bounds on the entropy of the optimizers of the two privacy problems. The obtained bounds help us to find upper bounds on the quantity in (8). In other words, we use the solutions of the two privacy problems to design the code; however, in [1], the design of the code was based on the functional representation lemma. Finally, we provide lower bounds, study the bounds in a numerical example, and compare them with [1].
In the next lemma, let the common information between X and Y correspond to either the Wyner [44] or the Gács-Körner [45] notion of common information.
Lemma 3. If , then .
Proof. The proof is provided in ([18], Proposition 6), with the corresponding choice of parameter, and follows similar arguments as the proof of ([46], Theorem 2). □
Corollary 2. If X is a deterministic function of Y, then , since, in this case, we have . Furthermore, this result was also shown in ([18], Proposition 6). In the next result, we provide the properties of the optimizers for and .
Lemma 4. If the stated condition holds, then the optimal values coincide and both problems have the same set of optimizers. Furthermore, any optimizer satisfies the stated properties. Finally, if U satisfies (33)–(35), then it is an optimizer for both problems.

Proof. All the statements can be shown using ([1], Theorem 7) and the corresponding key equation. □
Remark 6. The optimization problems in (9) and (10) do not have unique optimizers. For instance, by using the constructions used in [1,47], we can attain the optimum value. Next, we show that, without loss of optimality, we can bound the alphabet size of U in terms of the dimension of the null space of the kernel.
Lemma 5. For any , we haveFurthermore, let . Then, Proof. The proof of (
37) is provided in ([
15], Theorem 1). It only remains to show the equality
. Let
be an optimizer of (38). Using (
37), it is also an optimizer of
and hence
. Thus,
. In other words, for
, we can assume
in both problems
and
. □
Before stating the next result, let us define a set and a function as follows. Note that the set can be empty for some joint distributions; however, using Lemma 4, when the corresponding condition holds, it is the set which contains all the optimizers satisfying the stated constraint. Furthermore, the function finds the minimum entropy among all optimizers satisfying this constraint.
Lemma 6. Let . We have

Proof. The proof follows from Lemma 5. Let be any optimizer of that satisfies . We have . □
Remark 7. Lemma 6 asserts that when the stated condition holds, there exists a U that satisfies (33)–(35) together with the corresponding entropy bound.

Remark 8. The upper bound derived in Lemma 6 can be significantly smaller than the upper bound obtained in ([1], Lemma 2). This helps us to find codes that have a smaller average length compared to [1]. Note that ([1], Lemma 2) asserts that there exists a U which satisfies (33) and (35) together with a different entropy bound, which results in the bound in (44). In the next example, we compare (44) and (43).
Example 1. Let Y be a deterministic function of X, which yields the corresponding simplifications. In this case, we can compare the two bounds. Furthermore, as the alphabet sizes grow, the gap between (33) and (34) becomes larger. To see another example, let X and Y be represented by pairs of components, where the components are conditionally independent given V; then, by using [48], we obtain an expression involving the common information between X and Y, and the corresponding comparison follows. Intuitively, when the alphabet sizes are large, (43) can be significantly less than (44), since (44) involves the product of the corresponding quantities.
where
and
.
Theorem 4. Let the stated assumptions hold. For any admissible U, the stated lower and upper bounds hold. The upper bound can be strengthened. Moreover, under the additional conditions, the optimizer is unique and admits the stated characterization.

Remark 9. As we outlined earlier, it is shown in [15] that the optimal value can be obtained by a linear program, which is presented in Section 3. As we can see in Section 3, each extreme point is a vector in which at most a bounded number of elements are non-zero. We recall that β corresponds to the weighting elements in the marginal distribution of U. The number of extreme points can be very large. Hence, if we consider the corresponding linear equations, the size of the matrix in the system of linear equations and the number of variables in the system are bounded accordingly. By solving the linear program proposed in [15], we can find the exact value of the optimum and the conditional distribution that attains it. The complexity of the linear program in [15] can grow faster than exponentially with respect to the alphabet size; however, the complexity of our proposed method grows linearly with it. The latter follows since, by considering the system of linear equations in our proposed method, the matrix is much smaller and we do not need to find all possible extreme points of the set presented in Section 3. Thus, our proposed upper bound has less complexity compared to the solution in [15]. Furthermore, in a special case, we can find the exact value using our method. One necessary condition is related to the fact that the summation of the rows of the corresponding matrix equals zero.

Remark 10. The optimizer of the first problem does not help us to build the code, since the required constraint does not hold in general; however, the optimizer of the second problem satisfies it. Hence, we consider the cases where the equality holds. In this case, we obtain tighter bounds compared to [1].

Remark 11. The first upper bound in Theorem 4 holds for any admissible U; however, the strengthened upper bound asserts that there exists a U for which the bound holds. The lower bound in (47) can be used to find a lower bound on the average length. Similarly to Remark 9, the linear program attaining the lower bound in (47) has less complexity than that of [15].

Next, we obtain upper and lower bounds on the average length; see Figure 4 and Figure 5.
Theorem 5. For the pair of RVs (X, Y), let the stated assumptions hold and the shared secret key size be as given. Then, we obtain the first upper bound; for any admissible U, we have the second bound, a refined bound holds under the additional condition, and a final bound holds for any U with the stated property.

Proof. All upper bounds can be obtained using two-part construction coding [1], and the complete proof is provided in Appendix A. □
Next, we obtain lower bounds on .
Theorem 6. For any pair of RVs (X, Y) distributed according to a given joint distribution and shared key size, the stated lower bound holds; a refined bound holds under the additional condition, and the code does not exist under the stated degenerate conditions. Furthermore, if we assume the received code satisfies the stated constraints, we have the last bound.

Proof. The proofs of (59) and (60) are provided in ([1], Theorem 9), and (61) follows from Theorem 4. □
Remark 12. The constraint in the converse bound (61) can be motivated by considering the cases where the encoder has no direct access to the private data X. Furthermore, this constraint is stronger than the one used to prove (59). Specifically, the proof of (59) only requires the weaker constraint stated there.

4.2. Comparison
In this part, we study the bounds obtained in Theorem 5 and compare them with the existing results in [1]. Clearly, for any U satisfying the stated condition, the upper bound obtained in (58) improves the bounds in (56) and (55), where (56) was derived in [1]. The latter follows since, in this case, the corresponding inequality holds.
In this part, let X be a deterministic function of Y, which results in the corresponding property; see Corollary 1. The upper bound attained by [1] is as stated. Furthermore, we can assume that all elements in the probability vector are non-zero; otherwise, we can remove them. In this case, the stated identity holds. This follows since each column of the kernel contains exactly one non-zero element that equals 1. Moreover, all rows of the kernel are linearly independent, since each row contains a non-zero element that the other rows do not have. Thus, the kernel has full row rank, which results in the corresponding simplification. Thus, the upper bound in (54) can be modified to obtain the same bound as (57). Furthermore, the upper bounds (53) and (52) attain smaller values compared to (57), i.e., they can improve the bounds in [1]. Next, in a numerical example, we show that the upper bounds (53) and (52) improve (57).
Example 2. Consider a joint distribution under which X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution which results in the stated number of bits. Hence, the upper bound (52) is strictly lower than (57), i.e., the RV that achieves (52) has less entropy than the RV U constructed in ([1], Lemma 1).

4.3. Improving the Bounds Using Greedy Entropy-Based Algorithm
In this part, we use the greedy entropy-based algorithm to generalize the bounds obtained in Theorem 5. In more detail, we use the output of the greedy entropy-based algorithm in Theorem 1 instead of the solution to . In the next theorem, let with , where . Let be a matrix with columns , and we re-order each column so that each column has a descending order. Let , i.e., choose the smallest number in the first row of the matrix , and then we subtract from the first row and reorder each column and update the matrix. We then choose the smallest number from the first row of the updated matrix and represent it by . We continue this procedure until the summation of reaches one.
Theorem 7. Let the RVs (X, Y) be distributed according to a joint distribution supported on finite alphabets, and let the shared secret key size be as stated. Then the stated upper bounds hold, where the involved quantity is defined in ([36], Lemma 3); a further bound holds in the stated special case.

Proof. The proof is similar to Theorem 5, and the main difference is to use the minimum entropy output of (19) instead of the solution used in the two-part construction coding. Similarly to Theorem 5, we use the two-part code construction to achieve the upper bounds. As shown in Figure 6, we first encode the private data X using one-time pad coding ([2], Lemma 1), which uses a number of bits determined by the alphabet size of X. Next, we produce U based on the greedy entropy-based algorithm proposed in [34], which addresses the minimum entropy problem in (19). Thus, we obtain (63) and (64). Moreover, for the leakage constraint, we note that the randomness of the one-time-pad coding is independent of X and of the output of the greedy entropy-based algorithm U. On the user side, we can decode Y using (65). □
Remark 13. We emphasize that (63) and (64) can improve (52), since they achieve the minimum in (56) within a gap, i.e., up to a constant gap that depends on the stated quantity. Furthermore, in contrast with the bounds in Theorem 5, (63) and (64) can be achieved for any joint distribution. However, in the following numerical example, we show that the bounds in Theorem 5 can be tighter than those in Theorem 7.

Example 3. Consider a joint distribution under which X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution which results in the stated number of bits. Using the greedy search algorithm, we obtain a larger value; thus, in this case, the bound of Theorem 5 is tighter.

In the next example, we show that the bounds in Theorem 7 can be tighter than the bounds obtained in Theorem 5.

Example 4. Similarly to the previous example, X is a deterministic function of Y. Using the linear program proposed in [15], we obtain a solution which results in the stated number of bits. Using the greedy search algorithm, we obtain a smaller value; thus, in this case, the bound of Theorem 7 is tighter.

4.4. Non-Zero Leakage Results
In this section, we derive upper and lower bounds on the quantities defined in (5) and (6). Next, we study the bounds in different scenarios; moreover, we provide an example that shows that the bounds can be tight. The following lemma helps us to find lower bounds on the entropy of an RV that satisfies the decodability and leakage constraints, considering the shared key.
Lemma 7. Let the pair (X, Y) be jointly distributed and the RV W be independent of X and Y. Then, if the RV U satisfies the stated constraints, we have the stated lower bound, where the involved quantity is the minimum conditional entropy which is non-zero.

In the next two theorems, we provide upper and lower bounds on the quantity of interest. The next theorem is a generalization of ([1], Theorem 8) to correlated U and X.
Theorem 8. Let the pair of RVs (X, Y) be distributed according to a given joint distribution, and let the shared secret key size be as stated. Then, the stated upper bound holds; a refined bound holds if the relevant quantity is finite, and a further bound holds if X is a deterministic function of Y.

Proof. The proof is based on ([19], Lemma 5) and the two-part construction coding, and is provided in Appendix B. □
In the next theorem, we provide lower bounds on .
Theorem 9. Let the pair of RVs (X, Y) be distributed according to a joint distribution supported on alphabets where the alphabet of X is finite and the alphabet of Y is finite or countably infinite. For any shared secret key size, the stated lower bound holds, where the involved quantities are defined in (70), (71) and (72). Furthermore, if X is a deterministic function of Y, then a corresponding bound holds.

Proof. The proof is based on Lemma 7 and is provided in Appendix B. □
Next, we provide an example where the upper and lower bounds obtained in Theorems 8 and 9 are studied.
Example 5. Let Y be an i.i.d. Bernoulli sequence and let X be defined accordingly. Clearly, X is a function of Y. Using ([1], (72)), we obtain the stated expression, where (a) follows from ([1], (70)). Let the shared key size be as stated; by using Theorems 8 and 9, we obtain the corresponding bounds, where (75) is used for the upper bound and the corresponding expression for the lower bound. Using ([1], (73)), we obtain an expression involving the binary entropy function. Now, by using (79) and (80), we obtain the stated comparison under the stated assumptions on the parameters.

In the next theorem, we provide upper bounds for the non-zero leakage case. We show that when the leakage is more than a threshold, we are able to communicate the message over the channel using a shared key of size less than the alphabet size of X, and the receiver can decode the message without any loss.
Theorem 10. Let the RVs (X, Y) be distributed according to a given joint distribution and let the shared secret key size be as stated. Then, the stated upper bound holds; if the relevant quantity is finite, a refined bound holds, and if X is a deterministic function of Y, a further bound is obtained. Furthermore, for the stated leakage level and shared secret key size, the corresponding bound holds.

Proof. The proof is similar to the proof of Theorem 8 and is provided in Appendix B. □
Remark 14. As argued in ([1], Theorem 9), when the leakage is zero and X is a deterministic function of Y, if the shared key size is less than the size of X, lossless variable-length codes do not exist. However, as shown in Theorem 10, when the leakage is more than a threshold, such codes exist for a key size less than the size of X.

Remark 15. The upper bounds (82) and (85) can be less than the upper bound found in ([1], Theorem 9), which is obtained under the perfect privacy assumption. For instance, if the leakage is large enough, then (82) is less than ([1], (95)). The latter follows from the stated chain of inequalities.

In the next theorem, we find lower bounds for the non-zero leakage case.
Theorem 11. Let the RVs (X, Y) be distributed according to a given joint distribution. For any shared secret key size, the stated lower bound holds; furthermore, if X is a deterministic function of Y, a corresponding bound holds.

Next, by using the same construction as in Theorem 10, we find upper bounds for any joint distribution. We first present a simple observation, which we call the separation technique.
Observation 1. (Separation technique) Any discrete RV X supported on a finite alphabet can be represented by two RVs (X1, X2). If the alphabet size of X is not a prime number, then we can represent X by a pair whose alphabet sizes correspond to a non-trivial factorization of the alphabet size of X. Furthermore, if the alphabet size of X is a prime number, then we can represent X by a pair with appropriately chosen alphabet sizes. Consider all possible representations of X; for a fixed representation, we define the quantities used in the next theorem.
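As an illustration, the following sketch (our own rendering) separates a value from a composite-size alphabet into two components via a mixed-radix split; the prime-size case described above is not covered here.

```python
# Separation technique for a composite alphabet size: when |X| = a * b, the value x
# is mapped to (x1, x2) = (x // b, x mod b), so X1 has alphabet size a, X2 has size b,
# and X is recovered as x = x1 * b + x2.
def separate(x, a, b):
    assert 0 <= x < a * b
    return x // b, x % b

def combine(x1, x2, b):
    return x1 * b + x2

# Example: an alphabet of size 6 split into components of sizes 2 and 3.
for x in range(6):
    x1, x2 = separate(x, a=2, b=3)
    assert combine(x1, x2, b=3) == x
print("separation of size-6 alphabet into 2 x 3 verified")
```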
Theorem 12. For any pair of RVs (X, Y) distributed according to a given joint distribution and shared key size as stated, if the leakage exceeds the stated threshold, the first upper bound holds, where the involved quantity denotes the size of the RV S. Under the alternative condition, for the corresponding shared key size, the second upper bound holds.

Proof. To prove (88), fix a representation of X. Then, by using Theorem 10, we obtain (90). Since (90) holds for any representation, we can take the minimum over all representations, which results in (88). Furthermore, the key size is chosen accordingly, and note that the first term remains the same, since it is a function of the pair (X, Y), i.e., the distributions including the pair (X, Y) do not change under the separation. Using (85) and following the same approach, (89) is obtained. □
Remark 16. The upper bounds (88) and (89) generalize (82) and (85). The upper bounds (82) and (85) are obtained for a fixed representation. Thus, by taking the minimum over the possible representations and the corresponding quantities, we obtain tighter upper bounds.

Remark 17. The same lower bounds as obtained in Theorem 11 can be used here, since the separation technique does not improve the bounds in (86) and (87).

Remark 18. Using Example 5, we can see that the bounds obtained in Theorems 10–13 can be asymptotically tight.
Next, we provide an example to clarify the construction in Theorem 12 and compare the bounds with the perfect privacy case.
Example 6. Let the RV X have the stated distribution and be correlated with Y. To calculate the bound in Theorem 12, one possible separation of X is into two components with the stated distributions. Using the bound (82), for the stated leakage levels and key size, we obtain the corresponding value. By using the bound ([1], (95)), for the corresponding key size, we obtain a larger value. Note that we can also use the stated alternative bound instead of (88). By checking all possible separations, the minimum occurs at the separation described above, where the components have the distributions mentioned before. Thus, for the stated leakage levels and key size, we obtain the resulting bound. By using the bound ([1], (96)), for the corresponding key size, we obtain the comparison value. We can see that, by allowing a small amount of leakage with a shorter secret key size, we are able to send shorter codewords.

In the following, we study the bounds obtained in Theorem 12 considering a perfect privacy scenario, and we show that the previous bounds can be improved by using the separation technique.
In this part, we let the leakage be zero and consider the problem as in [1]. We show that when X can be separated such that one component is a deterministic function of Y, the upper bounds can be improved by using a shorter shared key size. In the next theorem, we provide upper bounds for this case.
Theorem 13. Let X be separated as stated and the shared key size be chosen accordingly. Then the stated upper bound holds; if the relevant quantity is finite, a refined bound holds, and if the corresponding component is a deterministic function of Y, a further bound is obtained.

Remark 19. Clearly, under the stated condition, the upper bounds (91), (92) and (93) improve the bounds in ([1], Theorem 8).

In this part, we use similar approaches as used in Theorems 12 and 13 to find upper bounds. We encode one component using one-time-pad coding and use the same U as in Theorem 13. As shown in Theorem 13, this is sufficient to decode Y, given the stated deterministic relation; however, in this scheme, we do not exploit the fact that we are allowed to leak information about X. Similarly to Theorem 13, we obtain the corresponding bound. Next, we use the separation technique as used in Theorem 12. For any admissible leakage, if a separation exists such that the stated condition holds, we obtain an improved bound. Furthermore, we can take the minimum over all possible separations to improve the upper bound further.
We can see that the resulting upper bound can be less than the bound in (91). For instance, under the stated condition and using the same arguments as in Remark 15, the comparison follows. Hence, if the leakage is more than a certain threshold, this can help us to improve the upper bound and send shorter codewords compared to the perfect privacy case. Moreover, it helps us to use a shorter shared secret key.
5. Applications: Cache-Aided Networks
In this part, we present an application where the results of this paper can be used. The complete study of the application can be found in [49]. To do so, let us consider the scenario illustrated in Figure 7. A server has access to a database consisting of N files, where each file, of size F bits, is sampled from a joint distribution together with the random variable X, which denotes the private latent variable. We assume that the server knows the realization of the private variable X as well. The server is connected to K users over a shared link, where user i has access to a local cache memory of a given size in bits. Furthermore, we assume that the server and the users have access to a shared secret key denoted by W, of size T. The system works in two phases: the placement and delivery phases, respectively [50]. In the placement phase, the server fills the local caches using the database; the content of the local cache memory of each user is fixed in this phase. In the delivery phase, the users first send their demands to the server. The server then sends a response over the shared link to satisfy all the demands simultaneously. We assume that an adversary has access to the shared link as well and uses the response to extract information about X. However, the adversary does not have access to the local cache contents or the secret key. Since the files in the database are all correlated with the private latent variable X, the coded caching and delivery techniques introduced in [50] do not satisfy the privacy requirement. The goal of the cache-aided private delivery problem is to find a response with the minimum possible average length that satisfies a certain privacy constraint and the zero-error decodability constraint of the users. Here, we consider the worst-case demand combination when constructing the response, and the expectation is taken over the randomness in the database. In this study, we first consider a perfect privacy constraint, i.e., we require the response to be independent of X. Each user should be able to recover its demanded file reliably using the response, the shared key W, and its cache content. We remark that we have a cache-aided variable-length compression problem with a privacy constraint. Hence, to solve this problem and design the code, we utilized techniques used in privacy mechanisms, data compression, and cache design and coded delivery problems, and combined them to build such a code. In particular, we used the data compression techniques employed in this paper and the caching design techniques in [50]. Finally, we extended the results to correlated X and response, i.e., when non-zero leakage is allowed. To do so, we used the results of this paper for correlated X and disclosed data, i.e., the non-perfect privacy part. For more details, see [49].
6. Experiment
In this section, we provide an experiment to evaluate the bounds obtained in Theorem 4. To do so, we used the MNIST dataset, which contains 60,000 images illustrating handwritten digits from 0 to 9. Let the RV S (information source) denote the images in the dataset, so that its alphabet has size 60,000. Let Y represent the useful data, which is a feature extracted from the dataset. In this example, the useful data are defined as a "quantized histogram". To determine the quantized histogram of the dataset, we first converted the images into black and white by quantizing them. The histogram of any image in the dataset is then computed as the ratio of the number of white pixels to the total number of pixels. Consequently, the histogram of any image can be represented by a value within the interval [0, 1]. For quantizing the histogram of any image, we used seven consecutive intervals, Interval 1 through Interval 7. Each image in the dataset was therefore labeled with a number between 1 and 7, indicating the interval to which its histogram belongs. Let Y denote the interval that sample S belongs to. Clearly, the labels of the intervals are deterministic functions of the information source S. For instance, in Figure 8, the number of white pixels is 130; hence, the quantized histogram equals 130 divided by the total number of pixels, which corresponds to the third interval.
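The following minimal sketch illustrates this feature extraction; the binarization threshold and the interval edges are placeholders, since the exact values used in the experiment are not restated here.

```python
# Quantized-histogram feature: binarize the image, compute the ratio of white
# pixels, and map it to one of seven interval labels. Threshold and interval
# edges are assumed values, not the exact ones used in the experiment.
import numpy as np

INTERVAL_EDGES = np.linspace(0.0, 1.0, 8)   # placeholder: 7 equal-width intervals

def quantized_histogram_label(image, threshold=128):
    """Map a grayscale image to its quantized-histogram label in {1, ..., 7}."""
    bw = (np.asarray(image) >= threshold)                 # convert to black and white
    ratio = bw.sum() / bw.size                            # white pixels / total pixels
    return int(np.digitize(ratio, INTERVAL_EDGES[1:-1])) + 1

# Example on a random 28 x 28 "image".
rng = np.random.default_rng(0)
print(quantized_histogram_label(rng.integers(0, 256, size=(28, 28))))
```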
Let the RV Z denote the label of each image, i.e., a number from 0 to 9 indicating the digit which the image represents. Clearly, Z is a deterministic function of S. The private attribute X is defined as the presence of digit 0, which is a binary RV. Here, we use an empirical distribution for Z, and it can be seen that X is a deterministic function of Z, which yields the corresponding Markov chain. Using this Markov chain, the required kernel can be obtained by combining the kernels relating X to Z and Z to Y, which can be found from the dataset. We used an empirical distribution to find the kernel relating Z and Y; e.g., one entry is the number of images illustrating digit 1 whose quantized histograms belong to the first interval (label 1), divided by the total number of images with digit 1. By using the kernel derived in (94) and multiplying it by the corresponding marginal distribution, we obtain the distribution in (95).
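The empirical estimation described above can be sketched as follows; the row/column conventions and the toy data are our own assumptions rather than the exact computation performed in the experiment.

```python
# Empirical kernel estimation by counting, following the digit-1 example in the
# text: given per-image digit labels z in {0,...,9} and quantized-histogram
# labels y in {1,...,7}, estimate the conditional distribution of Y given Z.
# Assumes every digit appears at least once in the data.
import numpy as np

def empirical_kernel_y_given_z(digits, labels, num_z=10, num_y=7):
    counts = np.zeros((num_y, num_z))
    for z, y in zip(digits, labels):
        counts[y - 1, z] += 1
    return counts / counts.sum(axis=0, keepdims=True)    # each column sums to one

def presence_of_digit_zero(digits):
    """Binary private attribute X: 1 if the image shows digit 0, else 0."""
    return (np.asarray(digits) == 0).astype(int)

# Toy usage with hypothetical data in place of MNIST.
rng = np.random.default_rng(0)
digits = rng.integers(0, 10, size=1000)
labels = rng.integers(1, 8, size=1000)
P_y_given_z = empirical_kernel_y_given_z(digits, labels)
print(P_y_given_z.shape, P_y_given_z.sum(axis=0))        # (7, 10), columns sum to 1
```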
Moreover, the distribution of the task Y is given as in Table 1.
The goal was to compress Y and send it to a user, and we considered a perfect privacy case where the privacy leakage was zero. Since X is a deterministic function of Y, the corresponding simplifications hold. Hence, using the linear program in [15], we found a U that achieved the optimum, and its entropy was obtained. Using Theorem 4, the average length of the code was 2.88 bits; however, using the method in [1], 4.16 bits were required to send the message over the channel.