Article

On the Lift, Related Privacy Measures, and Applications to Privacy–Utility Trade-Offs †

1 College of Engineering, Computing and Cybernetics, Australian National University, Canberra, ACT 2601, Australia
2 School of Computing and Information Systems, University of Melbourne, Parkville, VIC 3010, Australia
3 School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2600, Australia
* Authors to whom correspondence should be addressed.
Preliminary results of this work have been published in part at the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, 22–27 May 2022, and IEEE Information Theory Workshop, Mumbai, India, 1–2 November 2022.
Entropy 2023, 25(4), 679; https://doi.org/10.3390/e25040679
Submission received: 20 February 2023 / Revised: 6 April 2023 / Accepted: 14 April 2023 / Published: 18 April 2023
(This article belongs to the Special Issue Information-Theoretic Privacy in Retrieval, Computing, and Learning)

Abstract

This paper investigates lift, the likelihood ratio between the posterior and prior belief about sensitive features in a dataset. Maximum and minimum lifts over sensitive features quantify the adversary's knowledge gain and should be bounded to protect privacy. We demonstrate that max- and min-lifts have a distinct range of values and probability of appearance in the dataset, referred to as lift asymmetry. We propose asymmetric local information privacy (ALIP) as a privacy notion compatible with lift asymmetry, where different bounds can be applied to the min- and max-lifts. We use ALIP in the watchdog and optimal random response (ORR) mechanisms, the main methods to achieve lift-based privacy. It is shown that ALIP enhances utility in these methods compared to existing local information privacy, which imposes the same (symmetric) bounds on both max- and min-lifts. We propose subset merging for the watchdog mechanism to improve data utility and subset random response for the ORR to reduce complexity. We then investigate the related lift-based measures, including the $\ell_1$-norm, $\chi^2$-privacy criterion, and $\alpha$-lift. We reveal that they can only restrict the max-lift, resulting in significant min-lift leakage. To overcome this problem, we propose corresponding lift-inverse measures to restrict the min-lift. We apply these lift-based and lift-inverse measures in the watchdog mechanism. We show that they can be considered as relaxations of ALIP, where a higher utility can be achieved by bounding only average max- and min-lifts.

1. Introduction

With the recent emergence of big data, generating, sharing, and analysing data have proliferated through advances in communication systems and machine learning methods. While sharing datasets is essential to achieve social and economic benefits, it may leak private information, which has raised great concern about preserving the privacy of individuals. The main approach to protecting privacy is perturbing the data via a privacy mechanism. Consider some raw data denoted by a random variable X and some sensitive features denoted by S, which are correlated via a joint distribution $P_{SX} \neq P_S \times P_X$. A privacy mechanism (characterised by the transition probability $P_{Y|X}$) is applied to publish Y as a sanitised version of X to protect S.
The design of a privacy mechanism depends on the privacy measure. Differential privacy (DP) [1,2,3] is a widely used notion of privacy. DP restricts the chance of revealing an individual's presence in a dataset from the outcome of an analysis over that dataset [4]. It ensures that neighbouring sensitive features s and s', which differ in only one entry, result in similar output probability distributions, by restricting the ratio between posterior beliefs $P_{Y|S}(y|s)/P_{Y|S}(y|s')$ below a threshold $e^{\varepsilon}$. The neighbourhood assumption is relaxed in local differential privacy (LDP) [5,6,7,8,9], where the ratio between posterior beliefs is restricted below $e^{\varepsilon}$ for any two sensitive features s and s', denoted $\varepsilon$-LDP. The quantity $\varepsilon$ is known as the privacy budget. DP and LDP are considered context-free privacy notions, i.e., they do not take into account the prior distribution $P_S$. In contrast, in information-theoretic (IT) privacy, also known as context-aware privacy [10,11], it is assumed that the distribution of the data, or an estimate of it, is available. Some of the dominant IT privacy measures are mutual information (MI) [10,12,13], maximal leakage [14,15,16], $\alpha$-leakage [17], and local information privacy (LIP) [11,18,19,20,21,22,23,24,25,26,27,28]. A challenge is that while data perturbation restricts privacy leakage, it necessarily reduces data resolution and the usefulness of the dataset. Therefore, a privacy mechanism is desired to deliver a satisfactory level of data utility. Depending on the application, data utility is quantified either by measures of similarity between X and Y, such as f-divergence [7] and MI [7,12], or by measures of dissimilarity and error, such as Hamming distortion [8,9] and mean square error [22], respectively. This tension between privacy and utility is known as the privacy–utility trade-off (PUT).
In this paper, we consider lift, a pivotal element in IT privacy measures, which is the likelihood ratio between the posterior belief $P_{S|Y}(s|y)$ and the prior belief $P_S(s)$ about sensitive features in a dataset:
$$l(s,y) = \frac{P_{S|Y}(s|y)}{P_S(s)} = \frac{P_{SY}(s,y)}{P_S(s)P_Y(y)}. \quad (1)$$
The logarithm of the lift, $i(s,y) = \log l(s,y)$, which we call the log-lift, is the information density [24]. For each y, the more $P_{S|Y}(s|y)$ differs from $P_S(s)$, the more knowledge the adversary gains about s [29]. Consequently, the min-lift and max-lift, denoted by $\min_s l(s,y)$ and $\max_s l(s,y)$, respectively, quantify the highest privacy leakage for each y. The role of the min-lift and max-lift in privacy breaches was identified in [29], based on which information privacy was introduced in [18]. Accordingly, the min-lift is associated with revealing which values of s become less probable after observing y, while the max-lift is associated with the values that become more probable. In addition, other operational meanings of the max-lift have recently been revealed in guessing frameworks [30] and quantitative information flow [31]. In LIP, the min-lift and max-lift are bounded below and above by thresholds $e^{-\varepsilon}$ and $e^{\varepsilon}$, respectively, to restrict the adversary's knowledge gain, denoted $\varepsilon$-LIP. The main privacy mechanisms to achieve $\varepsilon$-LIP are the watchdog mechanism [24,25] and the optimal random response (ORR) [28]. The watchdog mechanism bipartitions the alphabet of X into low-risk and high-risk symbols, and only the high-risk ones are randomised. It was proved in [25] that X-invariant randomisation (e.g., merging all high-risk symbols) minimises privacy leakage for the watchdog mechanism. ORR is an optimal mechanism for $\varepsilon$-LIP, which maximises MI as the utility measure.
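As a concrete illustration of these quantities (not part of the original paper; the helper name and toy distribution below are our own), the following Python/NumPy sketch computes the lift matrix, the log-lift, and the per-outcome min- and max-lifts from a joint distribution $P_{SY}$.

```python
import numpy as np

def lift_stats(P_SY):
    """Lift l(s,y), log-lift i(s,y), and per-y min/max lifts from a joint pmf P_SY (|S| x |Y|)."""
    P_S = P_SY.sum(axis=1, keepdims=True)   # prior P_S(s)
    P_Y = P_SY.sum(axis=0, keepdims=True)   # marginal P_Y(y)
    lift = P_SY / (P_S * P_Y)               # l(s,y) = P_SY(s,y) / (P_S(s) P_Y(y))
    log_lift = np.log(lift)                 # information density i(s,y)
    Psi = lift.min(axis=0)                  # min-lift for each y
    Lam = lift.max(axis=0)                  # max-lift for each y
    return lift, log_lift, Psi, Lam

# toy joint distribution with |S| = 2 and |Y| = 3
P_SY = np.array([[0.20, 0.10, 0.05],
                 [0.15, 0.25, 0.25]])
lift, log_lift, Psi, Lam = lift_stats(P_SY)
```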

Contributions

We investigate lift and its related privacy notions such as LIP. We demonstrate that the min-lift and max-lift have distinct values and probabilities of appearance in the dataset. More specifically, min-lifts have a broader range of values than max-lifts, while max-lifts have a higher likelihood $P_{SY}(s,y)$ of appearing in the dataset. We call this property lift asymmetry. However, $\varepsilon$-LIP allocates symmetric privacy budgets to $\min_s i(s,y)$ and $\max_s i(s,y)$ ($-\varepsilon$ and $\varepsilon$, respectively), which is incompatible with the lift asymmetry. Thus, we propose asymmetric LIP (ALIP) as a privacy notion amenable to the lift properties, where asymmetric privacy budgets can be allocated to $\min_s i(s,y)$ and $\max_s i(s,y)$, denoted by $\varepsilon_l$ and $\varepsilon_u$, respectively. We demonstrate that ALIP implies LDP (with budget $\varepsilon_l + \varepsilon_u$) and can result in better utility than LIP in the watchdog and ORR mechanisms. Utility increases by relaxing the bound on the min-lift, which has a lower probability of appearance in the dataset.
We propose two randomisation methods to overcome the low utility of the watchdog mechanism and the high complexity of the ORR mechanism. In the watchdog mechanism, X-invariant randomisation perturbs all high-risk symbols together and deteriorates data resolution and utility. On the other hand, ORR suffers from high complexity, which grows exponentially with the size of the dataset. To overcome these problems, we propose the subset merging and subset random response (SRR) perturbation methods, which form finer subsets of high-risk symbols and privatise each subset separately. Subset merging enhances utility in the watchdog mechanism by applying X-invariant randomisation to disjoint subsets of high-risk symbols. In addition, SRR relaxes the complexity of ORR for large datasets by applying random response solutions on disjoint subsets of high-risk symbols, which results in near-optimal utility.
Besides LIP, we also consider some recently proposed privacy measures, which we call lift-based measures, including the $\ell_1$-norm [32], strong $\chi^2$-privacy criterion [33], and $\alpha$-lift [34]. They have been proposed as privacy notions stronger than their corresponding average leakages: the total variation distance [35], $\chi^2$-divergence [36], and Sibson MI [16,34], respectively. We clarify that they only bound the max-lift leakage and can cause significant min-lift leakage. Therefore, we propose correspondingly modified versions of these measures to restrict the min-lift leakage, which we call lift-inverse measures. We apply lift-based and lift-inverse measures in the watchdog mechanism with subset merging randomisation to investigate their PUT. They result in higher utility than ALIP since they are functions of the average lift over sensitive features and, thus, can be considered as relaxations of the max- and min-lift.

2. Preliminaries

2.1. Notation

We use the following notation throughout the paper. Capital letters denote discrete random variables, corresponding capital calligraphic letters denote their finite supports, and lowercase letters denote their realisations. For example, a random variable X has the support $\mathcal{X}$, and its realisation is $x \in \mathcal{X}$. For random variables S and X, we use $P_{SX}$ to indicate their joint probability distribution, $P_{S|X}$ for the conditional distribution of S given X, and $P_S$ and $P_X$ for the marginal distributions. Bold capital and lowercase letters are used for matrices and vectors, respectively, and lowercase letters for the corresponding elements of the vectors, e.g., $\mathbf{v} = [v_1, v_2, \ldots, v_n]^T$. We also use $|\cdot|$ for the cardinality of a set, e.g., $|\mathcal{X}|$. We denote the natural logarithm by $\log$ and the set of integers $\{1, 2, \ldots, n\}$ by $[n]$. The indicator function is denoted by $\mathbb{1}\{f\}$, which is 1 when f is true and 0 otherwise.

2.2. System Model and Privacy Measures

Consider some useful data intended for sharing and denoted by a random variable X with alphabet $\mathcal{X}$. It is correlated with some sensitive features S with alphabet $\mathcal{S}$ through a discrete joint distribution $P_{SX}$. To protect the sensitive features, a privacy mechanism is applied to generate a sanitised version of X, denoted by Y with alphabet $\mathcal{Y}$. We assume $P_S$ and $P_X$ have full support, and $P_{Y|X,S}(y|x,s) = P_{Y|X}(y|x)$, which results in the Markov chain $S - X - Y$.
The main privacy measure is lift, given in (1) (since we assume $P_S$ and $P_Y$ have full support, $l(s,y)$ is finite). Lift and its logarithm, the log-lift, quantify the multiplicative information gain on each sensitive feature $s \in \mathcal{S}$ from accessing $y \in \mathcal{Y}$. There are two cases: $l(s,y) > 1 \Leftrightarrow P_{S|Y}(s|y) > P_S(s)$ indicates an increase of the belief about s after releasing y; $l(s,y) \le 1 \Leftrightarrow P_{S|Y}(s|y) \le P_S(s)$ means that releasing y decreases the belief. The more the posterior belief deviates from the prior belief, the more an adversary learns about s. Thus, for each $y \in \mathcal{Y}$, $\max_s l(s,y)$ and $\min_s l(s,y)$ determine the highest knowledge gain about the sensitive features, and they should be restricted to protect privacy. We use the following notation for these quantities:
$$\Psi(y) \triangleq \min_{s\in\mathcal{S}} l(s,y) \quad \text{and} \quad \Lambda(y) \triangleq \max_{s\in\mathcal{S}} l(s,y). \quad (2)$$
In Appendix A, we explain the operational meaning of $\Psi(y)$ and $\Lambda(y)$ in privacy breaches based on the work in [29].
The lift has been applied in local information privacy [24,25,28] to protect sensitive features, which is defined as follows.
Definition 1.
For $\varepsilon \in \mathbb{R}^+$, a privacy mechanism $\mathcal{M}: \mathcal{X} \to \mathcal{Y}$ is ε-local information private, or ε-LIP, with respect to S, if for all $y \in \mathcal{Y}$,
$$e^{-\varepsilon} \le \Psi(y) \quad \text{and} \quad \Lambda(y) \le e^{\varepsilon}. \quad (3)$$
Another instance-wise measure is local differential privacy [5,6,28], defined as follows.
Definition 2.
For $\varepsilon \in \mathbb{R}^+$, a privacy mechanism $\mathcal{M}: \mathcal{X} \to \mathcal{Y}$ is ε-local differentially private, or ε-LDP, with respect to S, if for all $s, s' \in \mathcal{S}$ and all $y \in \mathcal{Y}$,
$$\Gamma(y) = \sup_{s,s'\in\mathcal{S}} \frac{P_{Y|S}(y|s)}{P_{Y|S}(y|s')} = \frac{\Lambda(y)}{\Psi(y)} \le e^{\varepsilon}. \quad (4)$$

3. Asymmetric Local Information Privacy

According to (3), LIP restricts the decrement of $\log\Psi(y)$ and the increment of $\log\Lambda(y)$ by symmetric bounds. However, we demonstrate that these metrics have a distinct range of values and probabilities of appearance in the dataset, $P_{SY}(s,y)$. We plot the histogram of $\log\Psi(y)$ and $\log\Lambda(y)$ for $10^3$ randomly generated distributions in Figure 1, where $|\mathcal{X}| = 17$ and $|\mathcal{S}| = 5$. In this figure, the range of $\log\Psi(y)$ is $[-12, -0.06]$, much larger than the range of $\log\Lambda(y)$, $[0.02, 1.64]$. Moreover, the maximum probability of $\log\Psi(y)$ is much lower than the maximum probability of $\log\Lambda(y)$. We refer to these properties as lift asymmetry. Since high values of $|\log\Psi(y)|$ have a significantly lower probability than $\log\Lambda(y)$ (for example, in Figure 1, the probability of $|\log\Psi(y)| \ge 6$ is near zero), we can relax the min-lift privacy by allocating a higher privacy budget to it while applying a stricter bound to the max-lift. Thus, we propose asymmetric local information privacy (ALIP), where we consider different privacy budgets $\varepsilon_l$ and $\varepsilon_u$ for $|\log\Psi(y)|$ and $\log\Lambda(y)$, respectively.
This will result in the following notion of privacy, which is more compatible with the lift asymmetry property.
Definition 3.
For $\varepsilon_l, \varepsilon_u \in \mathbb{R}^+$, a privacy mechanism $\mathcal{M}: \mathcal{X} \to \mathcal{Y}$ is $(\varepsilon_l, \varepsilon_u)$-asymmetric local information private, or $(\varepsilon_l, \varepsilon_u)$-ALIP, with respect to S, if for all $y \in \mathcal{Y}$,
$$e^{-\varepsilon_l} \le \Psi(y) \quad \text{and} \quad \Lambda(y) \le e^{\varepsilon_u}. \quad (5)$$
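For illustration only (the function names and the push-through step below are our own assumptions, not the paper's code), the following sketch pushes a joint distribution $P_{SX}$ through a mechanism $P_{Y|X}$ and verifies the $(\varepsilon_l, \varepsilon_u)$-ALIP condition in (5).

```python
import numpy as np

def joint_SY(P_SX, M):
    """Joint pmf of (S, Y) under the Markov chain S - X - Y; M is the |X| x |Y| mechanism P_{Y|X}."""
    return P_SX @ M

def satisfies_alip(P_SY, eps_l, eps_u):
    """Check e^{-eps_l} <= Psi(y) and Lam(y) <= e^{eps_u} for every released symbol y, cf. (5)."""
    P_S = P_SY.sum(axis=1, keepdims=True)
    P_Y = P_SY.sum(axis=0, keepdims=True)
    lift = P_SY / (P_S * P_Y)
    return bool(lift.min() >= np.exp(-eps_l) and lift.max() <= np.exp(eps_u))
```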
The following proposition indicates how ( ε l , ε u ) -ALIP restricts average privacy leakage measures and LDP.
Proposition 1.
If $(\varepsilon_l, \varepsilon_u)$-ALIP is satisfied, then
 1. $I(S;Y) \le \varepsilon_u$;
 2. $T(S;Y) \le \frac{1}{2}(e^{\varepsilon_u} - 1)$ and $\chi^2(S;Y) \le (e^{\varepsilon_u} - 1)^2$;
 3. $I_\alpha^S(S;Y) \le \frac{\alpha}{\alpha-1}\varepsilon_u$ and $I_\alpha^A(S;Y) \le \frac{\alpha}{\alpha-1}\varepsilon_u$;
 4. $\varepsilon$-LDP is satisfied, where $\varepsilon = \varepsilon_l + \varepsilon_u$;
where $T(S;Y)$ is the total variation distance, $\chi^2(S;Y)$ is the $\chi^2$-divergence, $I_\alpha^S(S;Y)$ is the Sibson MI, and $I_\alpha^A(S;Y)$ is the Arimoto MI.
Proof. 
The proof is given in Appendix B.    □
Items 1–3 of Proposition 1 demonstrate that the average measures are bounded in terms of the max-lift privacy budget. In Section 3.1, we show that ALIP can enhance utility by relaxing the min-lift ($\varepsilon_l > \varepsilon_u$), where a smaller upper bound is allocated to the max-lift and, hence, to the average measures in Proposition 1. Item 4 shows the relationship between $(\varepsilon_l, \varepsilon_u)$-ALIP and $\varepsilon$-LDP. We introduce a variable $\lambda \in (0,1)$ for a convenient representation of this relationship: for an LDP privacy budget $\varepsilon$, if we set $\varepsilon_l = \lambda\varepsilon$ and $\varepsilon_u = (1-\lambda)\varepsilon$, we have $\varepsilon_l + \varepsilon_u = \varepsilon$. Thus, varying $\lambda$ gives rise to different $(\varepsilon_l, \varepsilon_u)$-ALIP scenarios within the same $\varepsilon$-LDP budget. If $\lambda < 0.5$, we relax the max-lift privacy; if $\lambda > 0.5$, we relax the min-lift privacy. When $\lambda = 0.5$, we have the symmetric case of $\frac{\varepsilon}{2}$-LIP, where $\varepsilon_l = \varepsilon_u = \frac{\varepsilon}{2}$.

3.1. ALIP Privacy–Utility Trade-Off

In this subsection, we propose a watchdog mechanism based on ALIP and LDP and an asymmetric ORR (AORR) mechanism for ALIP to perturb data and achieve privacy protection. We observe the PUT of ALIP and LDP, where the utility is measured by MI between X and Y, I ( X ; Y ) .

3.1.1. Watchdog Mechanism

The watchdog privacy mechanism bipartitions $\mathcal{X}$ into low-risk and high-risk subsets, denoted by $\mathcal{X}_L$ and $\mathcal{X}_H$, respectively, and only randomises the high-risk symbols. In the existing LIP, $\mathcal{X}_L$ and $\mathcal{X}_H$ are determined by symmetric bounds. We propose to use ALIP to obtain $\mathcal{X}_L$ and $\mathcal{X}_H$:
$$\mathcal{X}_L \triangleq \{x \in \mathcal{X} : e^{-\varepsilon_l} \le \Psi(x) \ \text{and} \ \Lambda(x) \le e^{\varepsilon_u}\} \quad \text{and} \quad \mathcal{X}_H = \mathcal{X} \setminus \mathcal{X}_L. \quad (6)$$
For LDP, $\mathcal{X}_L$ and $\mathcal{X}_H$ are given by
$$\mathcal{X}_L \triangleq \{x \in \mathcal{X} : \Gamma(x) \le e^{\varepsilon}\} \quad \text{and} \quad \mathcal{X}_H = \mathcal{X} \setminus \mathcal{X}_L. \quad (7)$$
After obtaining $\mathcal{X}_L$ and $\mathcal{X}_H$, the privacy mechanism will be
$$\mathcal{M} = \begin{cases} \mathbb{1}\{x = y\}, & x, y \in \mathcal{X}_L = \mathcal{Y}_L, \\ r(y|x), & x \in \mathcal{X}_H,\ y \in \mathcal{Y}_H, \\ 0, & \text{otherwise}, \end{cases} \quad (8)$$
where $\mathbb{1}\{x=y\}$ indicates the publication of low-risk symbols without alteration, and $r(y|x)$ is the randomisation applied to high-risk symbols, with $\sum_{y\in\mathcal{Y}_H} r(y|x) = 1$.
An instance of $r(y|x)$ is X-invariant randomisation, where $r(y|x) = R(y)$ for $x \in \mathcal{X}_H$, $y \in \mathcal{Y}_H$, and $\sum_{y\in\mathcal{Y}_H} R(y) = 1$. An example of $R(y)$ is the uniform randomisation $R(y) = \frac{1}{|\mathcal{Y}_H|}$, with the special case of complete merging, where $|\mathcal{Y}_H| = 1$ and all $x \in \mathcal{X}_H$ are mapped to one super symbol $y^* \in \mathcal{Y}_H$. It was proved in [25] for LIP that X-invariant randomisation minimises privacy leakage in $\mathcal{X}_H$. Accordingly, if we apply ALIP in the watchdog mechanism, the minimum leakages over $\mathcal{X}_H$ are
$$\bar{\varepsilon}_u := \max_{s\in\mathcal{S}} i(s, \mathcal{X}_H) = \max_{s\in\mathcal{S}} \log l(s, \mathcal{X}_H) = \max_{s\in\mathcal{S}} \log \frac{P(\mathcal{X}_H|s)}{P(\mathcal{X}_H)}, \quad (9)$$
$$\bar{\varepsilon}_l := -\min_{s\in\mathcal{S}} i(s, \mathcal{X}_H) = -\min_{s\in\mathcal{S}} \log l(s, \mathcal{X}_H) = -\min_{s\in\mathcal{S}} \log \frac{P(\mathcal{X}_H|s)}{P(\mathcal{X}_H)}, \quad (10)$$
where $P(\mathcal{X}_H|s) = \sum_{x\in\mathcal{X}_H} P_{X|S}(x|s)$ and $P(\mathcal{X}_H) = \sum_{x\in\mathcal{X}_H} P_X(x)$.
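The following sketch (our illustration, not the authors' implementation) forms $\mathcal{X}_L$ and $\mathcal{X}_H$ under the ALIP thresholds in (6) and evaluates the leakage of the completely merged high-risk set as in (9) and (10), with $\bar{\varepsilon}_l$ reported as a nonnegative budget.

```python
import numpy as np

def alip_partition(P_SX, eps_l, eps_u):
    """Indices of low-risk and high-risk symbols of X under the ALIP thresholds in (6)."""
    P_S = P_SX.sum(axis=1, keepdims=True)
    P_X = P_SX.sum(axis=0)
    lift_x = P_SX / (P_S * P_X)                        # l(s,x) before randomisation
    Psi_x, Lam_x = lift_x.min(axis=0), lift_x.max(axis=0)
    low = (Psi_x >= np.exp(-eps_l)) & (Lam_x <= np.exp(eps_u))
    return np.where(low)[0], np.where(~low)[0]         # X_L, X_H

def merged_leakage(P_SX, X_H):
    """Max- and min-lift leakage of the merged high-risk set, cf. (9)-(10)."""
    P_S = P_SX.sum(axis=1)
    P_XH_given_s = P_SX[:, X_H].sum(axis=1) / P_S      # P(X_H | s)
    P_XH = P_SX[:, X_H].sum()                          # P(X_H)
    log_lift = np.log(P_XH_given_s / P_XH)
    return log_lift.max(), -log_lift.min()             # (eps_bar_u, eps_bar_l)
```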
X-invariant randomisation is also applicable for LDP, and the following theorem shows that it minimises LDP privacy leakage in X H .
Theorem 1.
In the LDP watchdog mechanism where X L and X H are determined according to (7), X-invariant randomisation minimises privacy leakage in X H measured by Γ ( y ) in (4).
Proof. 
The proof is given in Appendix C.    □
In the watchdog mechanism with X-invariant randomisation, the resulting utility, measured by the MI between X and Y, is given by
$$I(X;Y) = H(X) - \sum_{x\in\mathcal{X}_H} P_X(x) \log \frac{P(\mathcal{X}_H)}{P_X(x)}. \quad (11)$$
In [25], it was verified that $I(X;Y)$ in (11) is monotonic in $\mathcal{X}_H$: if $\mathcal{X}_H \subset \mathcal{X}_H'$, then $I'(X;Y) < I(X;Y)$, where $I'(X;Y)$ is the resulting utility of $\mathcal{X}_H'$.
Proposition 2.
In the watchdog mechanism with X-invariant randomisation, for a given LDP privacy budget ε, $\lambda \in (0,1)$, and ALIP privacy budgets $\varepsilon_l = \lambda\varepsilon$, $\varepsilon_u = (1-\lambda)\varepsilon$, LDP results in higher utility and leakage than ALIP.
Proof. 
Denote the high-risk subset for LDP by $\mathcal{X}_H$ and for ALIP by $\mathcal{X}_H'$. We have
$$\mathcal{X}_H = \Big\{x\in\mathcal{X} : \frac{\Lambda(x)}{\Psi(x)} > e^{\varepsilon}\Big\} \quad \text{and} \quad \mathcal{X}_H' = \big\{x\in\mathcal{X} : \Lambda(x) > e^{(1-\lambda)\varepsilon} \ \text{or} \ \Psi(x) < e^{-\lambda\varepsilon}\big\}.$$
Based on the remark following (11), it is enough to prove that $\mathcal{X}_H \subseteq \mathcal{X}_H'$. If $x \in \mathcal{X}_H$, for any given $\lambda \in (0,1)$, there are only two possible cases: either $\Lambda(x) > e^{(1-\lambda)\varepsilon}$ or $\Lambda(x) \le e^{(1-\lambda)\varepsilon}$. If $\Lambda(x) > e^{(1-\lambda)\varepsilon}$, then $x \in \mathcal{X}_H'$, and our claim is true. If $\Lambda(x) \le e^{(1-\lambda)\varepsilon}$, we have $x \in \mathcal{X}_H \Rightarrow \frac{\Lambda(x)}{\Psi(x)} > e^{\varepsilon} \Rightarrow \Psi(x) < \Lambda(x) e^{-\varepsilon}$. We assumed that $\Lambda(x) \le e^{(1-\lambda)\varepsilon}$; therefore, $\Psi(x) < \Lambda(x) e^{-\varepsilon} \le e^{(1-\lambda)\varepsilon} e^{-\varepsilon} = e^{-\lambda\varepsilon}$. Since $\Psi(x) < e^{-\lambda\varepsilon}$, we have $x \in \mathcal{X}_H'$. □
This proposition shows the result of applying LDP and ALIP in the watchdog mechanism in terms of the privacy–utility trade-off. While both $\varepsilon$-LDP and $(\lambda\varepsilon, (1-\lambda)\varepsilon)$-ALIP imply the same LDP privacy budget, LDP results in fewer high-risk symbols than ALIP. This needs to be considered when the watchdog mechanism is applied to achieve LDP or ALIP privacy. Although having fewer high-risk symbols provides better utility, it may compromise privacy. In other words, when $\mathcal{X}_H \subset \mathcal{X}_H'$, the privacy leakage of the partition $\{\mathcal{X}_L, \mathcal{X}_H\}$ is greater than or equal to that of the partition $\{\mathcal{X}_L', \mathcal{X}_H'\}$. As a result, it is possible that $\varepsilon$-LDP cannot achieve the desired $(\lambda\varepsilon, (1-\lambda)\varepsilon)$-ALIP privacy level for a given λ.
The watchdog mechanism with X-invariant randomisation is a powerful method with low complexity that can be easily applied to instance-wise measures. However, it significantly degrades utility [25] because X-invariant randomisation obfuscates all high-risk symbols together to minimise privacy leakage, at the cost of deteriorating data resolution. In Section 4, we propose subset merging randomisation to enhance the utility of the watchdog mechanism.

3.1.2. Asymmetric Optimal Random Response (AORR)

ORR was proposed in [28] as a localised, instance-wise replacement of the privacy funnel [12]. It is the solution to the optimal utility problem subject to ε-LIP or ε-LDP constraints. For ALIP, we propose the asymmetric optimal random response (AORR), which is defined as
$$\max_{P_{X|Y}, P_Y} I(X;Y) \quad \text{s.t.} \quad S - X - Y, \quad e^{-\varepsilon_l} \le \Psi(y) \ \text{and} \ \Lambda(y) \le e^{\varepsilon_u}, \ \forall y \in \mathcal{Y}. \quad (12)$$
The privacy constraints in this optimisation problem form a closed, bounded, convex polytope [28]. It has been proved that the vertices of this polytope are the feasible candidates that maximise MI and satisfy the privacy constraints [7,28,37]. However, the number of vertices grows exponentially in the dimension of the polyhedron, which is $|\mathcal{X}|(|\mathcal{X}|-1)$ for LDP and $|\mathcal{X}|-1$ for LIP. This makes ORR computationally cumbersome for large $|\mathcal{X}|$. Accordingly, [28] suggests some approaches with lower complexity than ORR to avoid vertex enumeration for larger sizes of $\mathcal{X}$, but this comes at the cost of lower utility.

3.1.3. Numerical Results

Here, we demonstrate the privacy leakage and utility of AORR and the watchdog mechanism under ALIP. For the utility, we use the normalised MI (NMI)
$$\mathrm{NMI} = \frac{I(X;Y)}{H(X)} \in [0,1]. \quad (13)$$
The maximum possible utility is obtained when X is published without randomisation, where $Y = X$ and $I(X;Y) = H(X)$. Thus, $I(X;Y) \le H(X)$ and $\mathrm{NMI} \le 1$.
We present numerical results for both synthetic and real datasets using MATLAB. For synthetic data, we randomly generated $10^3$ distributions for the watchdog mechanism and 100 distributions for AORR, where $|\mathcal{X}| = 17$ and $|\mathcal{S}| = 5$. These distributions were generated by normalising the output of the rand function in MATLAB. For the real dataset, we used the Adult dataset [38] and set S = {relationship} and X = {Occupation}, where $|\mathcal{S}| = 5$ and $|\mathcal{X}| = 15$. In all scenarios, ε varies from 0.25 to 8, and we consider three cases of $(\varepsilon_l, \varepsilon_u)$-ALIP, where $\lambda \in \{0.35, 0.5, 0.65\}$, $\varepsilon_l = \lambda\varepsilon$, and $\varepsilon_u = (1-\lambda)\varepsilon$. The results of the watchdog mechanism are shown in Figure 2 and Figure 3 for synthetic and real data, respectively, while the AORR results are presented in Figure 4 and Figure 5. The figures display the NMI, $\log\max_y \Lambda(y)$ (max-lift leakage), and $|\log\min_y \Psi(y)|$ (min-lift leakage) versus the LDP privacy budget ε for real data, and the mean values of the same quantities for synthetic data.
In Figure 2a and Figure 3a, we observe that in the watchdog mechanism, LDP provides higher utility and leakage than ALIP for all values of ε and λ, which confirms Proposition 2. Figure 2a, Figure 3a, Figure 4a and Figure 5a demonstrate that the min-lift relaxation, $\lambda = 0.65$, enhances utility in the watchdog and AORR mechanisms for $\varepsilon > 1$. Note that in all figures, $\lambda = 0.5$ refers to $\frac{\varepsilon}{2}$-LIP. On the other hand, $\lambda = 0.35$ results in lower utility. Generally, any value of $\lambda < 0.5$ reduces utility since it strictly bounds the min-lift while relaxing the max-lift. As the min-lift has a wider range of values, achieving this strict bound enlarges the set $\mathcal{X}_H$ and requires randomising more symbols, which reduces utility. Another observation is that AORR achieves significantly higher utility than the watchdog mechanism. For instance, when $\lambda = 0.5$ and $\varepsilon = 2$, the watchdog mechanism results in a utility of 0.52 for synthetic data and 0.73 for the real data, while AORR attains a utility of 0.94 and 0.96 for the synthetic and real data, respectively. AORR finds the optimal utility, which, due to the PUT, necessarily results in the highest leakage subject to the privacy constraints. In contrast, the watchdog mechanism is a nonoptimal solution that minimises the leakage of high-risk symbols to provide strong privacy protection, which deteriorates utility. To address this drawback of the watchdog mechanism, we propose a subset randomisation method in the following section.

4. Subset Merging in Watchdog Mechanism

The watchdog mechanism with X-invariant randomisation is a low-complexity method that can be easily applied when the privacy measures are symbol-wise. X-invariant randomisation is the optimal privacy protection for the high-risk symbols that minimises privacy leakage in X H and necessarily results in the worst data resolution. Thus, in this section, we propose the subset merging algorithm to improve data resolution by randomising disjoint subsets of high-risk symbols and enhancing utility in the watchdog mechanism. In the following, we show that applying X-invariant randomisation to disjoint subsets of X H increases the utility.
Let $\mathcal{G}_{\mathcal{X}_H} = \{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_g\}$ be a partition of $\mathcal{X}_H$, where for every $i \in [g]$, $\mathcal{X}_i \subseteq \mathcal{X}_H$, $\mathcal{X}_i \cap \mathcal{X}_j = \emptyset$ for $i \ne j$, and $\mathcal{X}_H = \bigcup_{i=1}^{g} \mathcal{X}_i$. We randomise each subset $\mathcal{X}_i \in \mathcal{G}_{\mathcal{X}_H}$ by an X-invariant randomisation $R_{\mathcal{Y}_i}(y)$ for $x \in \mathcal{X}_i$ and $y \in \mathcal{Y}_i$, where $\sum_{y\in\mathcal{Y}_i} R_{\mathcal{Y}_i}(y) = 1$. The resulting MI between X and Y is
$$I(X;Y) = H(X) - \sum_{i=1}^{g} \sum_{x\in\mathcal{X}_i} P_X(x) \log \frac{P(\mathcal{X}_i)}{P_X(x)}. \quad (14)$$
Definition 4.
Assume two partitions, $\mathcal{G}_{\mathcal{X}_H} = \{\mathcal{X}_1, \ldots, \mathcal{X}_g\}$ and $\mathcal{G}'_{\mathcal{X}_H} = \{\mathcal{X}'_1, \ldots, \mathcal{X}'_{g'}\}$. We say that $\mathcal{G}_{\mathcal{X}_H}$ is a refinement of $\mathcal{G}'_{\mathcal{X}_H}$, or $\mathcal{G}'_{\mathcal{X}_H}$ is an aggregation of $\mathcal{G}_{\mathcal{X}_H}$, if for every $i \in [g']$, $\mathcal{X}'_i = \bigcup_{j\in J_i} \mathcal{X}_j$, where $J_i \subseteq [g]$, and $P(\mathcal{X}'_i) = \sum_{j\in J_i} P(\mathcal{X}_j)$ (this definition is inspired by [39] (Definition 10)).
If $\mathcal{G}_{\mathcal{X}_H}$ is a refinement of $\mathcal{G}'_{\mathcal{X}_H}$, then $I_{\mathcal{G}_{\mathcal{X}_H}}(X;Y) \ge I_{\mathcal{G}'_{\mathcal{X}_H}}(X;Y)$.
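As a small illustration (our own sketch with assumed names), the utility in (14) can be evaluated directly from $P_X$ and a partition of the high-risk symbols; complete merging corresponds to a partition with a single subset equal to $\mathcal{X}_H$, recovering (11).

```python
import numpy as np

def partition_utility(P_X, partition):
    """I(X;Y) in (14) when each high-risk subset X_i is randomised X-invariantly;
    low-risk symbols (absent from `partition`) are published unchanged and add no loss."""
    H_X = -np.sum(P_X * np.log(P_X))
    loss = 0.0
    for X_i in partition:                    # list of index arrays covering X_H
        p_i = P_X[X_i].sum()                 # P(X_i)
        loss += np.sum(P_X[X_i] * np.log(p_i / P_X[X_i]))
    return H_X - loss
```

Refining a partition (splitting a subset) can only decrease the loss term, which is the refinement property stated above.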
Obtaining the optimal G X H that maximises utility and satisfies privacy constraints is a combinatorial optimisation problem over all possible partitions of X H , which is cumbersome to solve. Therefore, we propose a heuristic method in the following.

4.1. Greedy Algorithm to Make Refined Subsets of High-Risk Symbols

In Algorithm 1, we propose a bottom-up algorithm that constitutes a partition of $\mathcal{X}_H$ by merging high-risk symbols into disjoint subsets. It works based on a leakage risk metric for each $x \in \mathcal{X}_H$: $\omega(x) = \Lambda(x) + \Psi(x)^{-1}$ for ALIP and $\omega(x) = \Gamma(x)$ for LDP. For LIP, $\omega(x) = \max\{\log\Lambda(x), |\log\Psi(x)|\}$. This metric is used to order the subsets by their privacy risk level. Accordingly, to constitute a subset $\mathcal{X}_i \subseteq \mathcal{X}_H$, Algorithm 1 bootstraps from the highest-risk symbol, $\mathcal{X}_i = \{\arg\max_{x\in\mathcal{X}_H} \omega(x)\}$ (line 5). Then, it merges into $\mathcal{X}_i$ a symbol $x^*$ that minimises $\omega(\mathcal{X}_i \cup x^*)$ (line 7), as long as the privacy constraints are not yet satisfied in $\mathcal{X}_i$ (line 6). The ALIP privacy constraints for a subset $\mathcal{X}_i$ are given by
$$e^{-\varepsilon_l} \le \Psi(\mathcal{X}_i) \quad \text{and} \quad \Lambda(\mathcal{X}_i) \le e^{\varepsilon_u},$$
where $\Psi(\mathcal{X}_i) = \min_{s\in\mathcal{S}} \frac{\sum_{x\in\mathcal{X}_i} P_{X|S}(x|s)}{\sum_{x\in\mathcal{X}_i} P_X(x)}$ and $\Lambda(\mathcal{X}_i) = \max_{s\in\mathcal{S}} \frac{\sum_{x\in\mathcal{X}_i} P_{X|S}(x|s)}{\sum_{x\in\mathcal{X}_i} P_X(x)}$. For the LDP constraint, we have $\Gamma(\mathcal{X}_i) = \frac{\Lambda(\mathcal{X}_i)}{\Psi(\mathcal{X}_i)} \le e^{\varepsilon}$. In Algorithm 1, we used the ALIP privacy constraints for the while-loop conditions in lines 4, 6, and 12. For LDP, the privacy constraint is changed to $\Gamma(\mathcal{X}_Q) > e^{\varepsilon}$, and the LDP version of $\omega(x)$ is applied. After the partition $\mathcal{G}_{\mathcal{X}_H}$ is constituted, the last subset $\mathcal{X}_g$ may not meet the privacy constraints. Therefore, the leakage of $\mathcal{X}_g$ is checked (line 12), and if there is a privacy breach, an agglomerated subset is formed by merging other subsets into $\mathcal{X}_g$ so as to minimise the subset risk $\omega(\mathcal{X}_g) = \Lambda(\mathcal{X}_g) + \Psi(\mathcal{X}_g)^{-1}$ (lines 13–14), until the privacy constraints are satisfied.
Algorithm 1: Subset merging in the watchdog mechanism.
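Algorithm 1 appears as a figure in the published article. The sketch below is our hedged Python reconstruction of the greedy procedure described above; the risk metric ($\Lambda + \Psi^{-1}$) and the tie-breaking are assumptions based on our reading of the text, not the authors' exact pseudocode.

```python
import numpy as np

def subset_lift(P_SX, subset):
    """Psi(X_i) and Lam(X_i): min/max over s of P(X_i|s) / P(X_i)."""
    P_S = P_SX.sum(axis=1)
    ratios = (P_SX[:, subset].sum(axis=1) / P_S) / P_SX[:, subset].sum()
    return ratios.min(), ratios.max()

def subset_ok(P_SX, subset, eps_l, eps_u):
    psi, lam = subset_lift(P_SX, subset)
    return psi >= np.exp(-eps_l) and lam <= np.exp(eps_u)

def greedy_subset_merging(P_SX, X_H, eps_l, eps_u):
    """Greedy subset merging: group high-risk symbols into disjoint subsets that each
    satisfy the per-subset ALIP constraints; aggregate leftovers if the last subset leaks."""
    def risk(subset):
        psi, lam = subset_lift(P_SX, subset)
        return lam + 1.0 / psi                          # assumed risk metric Lam + 1/Psi
    remaining = list(X_H)
    partition = []
    while remaining:
        seed = max(remaining, key=lambda x: risk([x]))  # start from the highest-risk symbol
        subset = [seed]
        remaining.remove(seed)
        while remaining and not subset_ok(P_SX, subset, eps_l, eps_u):
            best = min(remaining, key=lambda x: risk(subset + [x]))
            subset.append(best)                         # merge the risk-minimising symbol
            remaining.remove(best)
        partition.append(subset)
    while len(partition) > 1 and not subset_ok(P_SX, partition[-1], eps_l, eps_u):
        j = min(range(len(partition) - 1),              # aggregate another subset into the last one
                key=lambda i: risk(partition[-1] + partition[i]))
        partition[-1].extend(partition.pop(j))
    return partition
```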

4.2. Numerical Results

We show PUT for ALIP and LDP under subset merging randomisation in Figure 6 and Figure 7 for synthetic and real data, respectively, with the same setup for the watchdog mechanism in Section 3.1.3. Compared with the complete merging (Figure 2 and Figure 3), the utility was enhanced significantly for both LDP and ALIP in all scenarios under the same privacy constraint. For instance, consider the symmetric case λ = 0.5 when ε = 1 , and compare PUT between the subset and complete merging. Figure 6a and Figure 7a demonstrate a utility value of around 0.73 for the subset merging compared to the utilities of 0.17 and 0.28 for the complete merging in Figure 2a and Figure 3a, which are almost 320 % and 160 % utility enhancement. Moreover, as Figure 6b,c and Figure 7b,c illustrate, privacy constraints are satisfied in all cases.

5. Subset Random Response

In the previous section, we showed that subset merging significantly enhances utility in the watchdog mechanism. In this section, we propose a method to decrease the complexity of AORR for large datasets. We apply AORR to subsets of $\mathcal{X}_H$, which reduces the complexity for large alphabets so that the random response approach becomes applicable to an $\mathcal{X}$ that is typically an order of magnitude larger.
The AORR optimisation problem in (12) is equivalent to the following problem:
$$H(X) - \min_{P_{X|Y}, P_Y} H(X|Y) \quad \text{s.t.} \quad S - X - Y, \quad e^{-\varepsilon_l} \le \Psi(y) \ \text{and} \ \Lambda(y) \le e^{\varepsilon_u}, \ \forall y \in \mathcal{Y}. \quad (15)$$
To reduce the complexity of (15), we divide $\mathcal{X}$ into $\mathcal{X}_L$ and $\mathcal{X}_H$, similarly to the watchdog mechanism, and form a partition $\mathcal{G}_{\mathcal{X}_H} = \{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_g\}$ of $\mathcal{X}_H$. We randomise each subset $\mathcal{X}_i \in \mathcal{G}_{\mathcal{X}_H}$, $i \in [g]$, separately by a randomisation pair $(\mathbf{Q}_i, \mathbf{q}_i)$, where $\mathbf{Q}_i$ is a matrix in $\mathbb{R}^{|\mathcal{X}_i| \times |\mathcal{Y}_i|}$ and $\mathbf{q}_i$ is a vector in $\mathbb{R}^{|\mathcal{Y}_i|}$.
The elements of $\mathbf{Q}_i$ and $\mathbf{q}_i$ are given by
$$Q_i(x|y) = \Pr[X = x \,|\, Y = y], \quad x \in \mathcal{X}_i,\ y \in \mathcal{Y}_i, \quad (16)$$
$$q_i(y) = \Pr[Y = y], \quad y \in \mathcal{Y}_i. \quad (17)$$
For each $y \in \mathcal{Y}_i$, we have $\sum_{x\in\mathcal{X}_i} Q_i(x|y) = 1$. Consequently, $H(X|Y) = \sum_{i\in[g]} H_i(X|Y)$, where
$$H_i(X|Y) = -\sum_{y\in\mathcal{Y}_i} q_i(y) \sum_{x\in\mathcal{X}_i} Q_i(x|y) \log Q_i(x|y), \quad i \in [g]. \quad (18)$$
This setting turns (15) into g optimisation problems, one for each subset $\mathcal{X}_i \in \mathcal{G}_{\mathcal{X}_H}$, $i \in [g]$, as follows:
$$\min_{\mathbf{Q}_i, \mathbf{q}_i} H_i(X|Y) \quad (19)$$
$$\text{s.t.} \quad 0 \le q_i(y), \quad y \in \mathcal{Y}_i, \quad (20)$$
$$0 \le Q_i(x|y), \quad x \in \mathcal{X}_i,\ y \in \mathcal{Y}_i, \quad (21)$$
$$\sum_{x\in\mathcal{X}_i} Q_i(x|y) = 1, \quad y \in \mathcal{Y}_i, \quad (22)$$
$$\sum_{y\in\mathcal{Y}_i} Q_i(x|y)\, q_i(y) = P_X(x), \quad x \in \mathcal{X}_i, \quad (23)$$
$$e^{-\varepsilon_l} P_S(s) \le \sum_{x\in\mathcal{X}_i} P_{S|X}(s|x)\, Q_i(x|y) \le e^{\varepsilon_u} P_S(s), \quad s \in \mathcal{S},\ y \in \mathcal{Y}_i. \quad (24)$$
The columns of the randomisation matrix $\mathbf{Q}_i$, $i \in [g]$, can be expressed as members of a convex and bounded polytope $\Pi_i$, which is given by the following constraints:
$$\Pi_i = \Big\{ \mathbf{v} \in \mathbb{R}^{|\mathcal{X}_i|} : 0 \le v_k,\ k \in [|\mathcal{X}_i|];\ \sum_{k=1}^{|\mathcal{X}_i|} v_k = 1;\ e^{-\varepsilon_l} P_S(s) \le \sum_{k=1}^{|\mathcal{X}_i|} P_{S|X}(s|x_k)\, v_k \le e^{\varepsilon_u} P_S(s),\ s \in \mathcal{S} \Big\}, \quad (25)$$
where $x_k$ denotes the k-th element of $\mathcal{X}_i$.
For each $\mathcal{X}_i \in \mathcal{G}_{\mathcal{X}_H}$, $i \in [g]$, let $V_i = \{\mathbf{v}_1^i, \ldots, \mathbf{v}_M^i\}$ be the vertices of $\Pi_i$ in (25), $H_{\mathbf{v}_k^i}$ be the entropy of each $\mathbf{v}_k^i$ for $k \in [M]$, $\mathbf{P}_{X_i}$ be the probability vector of $x \in \mathcal{X}_i$, and $\boldsymbol{\beta}^i$ be the solution to the following optimisation:
$$\min_{\boldsymbol{\beta}^i \in \mathbb{R}^M} \sum_{k=1}^{M} H_{\mathbf{v}_k^i}\, \beta_k^i, \quad \text{s.t.} \quad \beta_k^i \ge 0,\ k \in [M], \quad \sum_{k=1}^{M} \mathbf{v}_k^i\, \beta_k^i = \mathbf{P}_{X_i}. \quad (26)$$
Then, $\mathcal{Y}_i$ and $(\mathbf{Q}_i, \mathbf{q}_i)$ are given by
$$\mathcal{Y}_i = \{y : \beta_y^i \neq 0\}; \quad (27)$$
$$q_i(y) = \beta_y^i \quad \text{and} \quad \mathbf{Q}_i(\cdot|y) = \mathbf{v}_y^i, \quad y \in \mathcal{Y}_i. \quad (28)$$
The $(\varepsilon_l, \varepsilon_u)$-ALIP protocol $\mathcal{M}: \mathcal{X} \to \mathcal{Y}$ is given by the pair $(P_{X|Y}, P_Y)$ as follows:
$$P_{X|Y} = \begin{cases} \mathbb{1}\{x = y\}, & x, y \in \mathcal{X}_L = \mathcal{Y}_L, \\ Q_i(x|y), & x \in \mathcal{X}_i,\ y \in \mathcal{Y}_i,\ i \in [g], \\ 0, & \text{otherwise}; \end{cases} \quad (29)$$
$$P_Y = \begin{cases} P_X(y), & y \in \mathcal{Y}_L, \\ q_i(y), & y \in \mathcal{Y}_i,\ i \in [g]. \end{cases} \quad (30)$$
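As an illustration of the per-subset step in (26)–(28), here is a minimal sketch under our own naming, assuming the vertices of $\Pi_i$ have already been enumerated by an external vertex-enumeration tool; it only shows the linear program, solved with SciPy.

```python
import numpy as np
from scipy.optimize import linprog

def solve_subset_rr(vertices, P_Xi):
    """Solve (26) for one subset: minimise sum_k H(v_k) beta_k s.t. beta >= 0 and
    sum_k v_k beta_k = P_Xi. `vertices` is an (M, |X_i|) array of vertices of Pi_i."""
    V = np.asarray(vertices, dtype=float)
    # entropy of every vertex (each vertex is a distribution over X_i)
    H_v = np.array([-np.sum(v[v > 0] * np.log(v[v > 0])) for v in V])
    res = linprog(c=H_v,                           # objective coefficients H(v_k)
                  A_eq=V.T, b_eq=P_Xi,             # sum_k v_k beta_k = P_{X_i}
                  bounds=[(0, None)] * len(V),     # beta_k >= 0
                  method="highs")
    beta = res.x
    support = np.flatnonzero(beta > 1e-12)         # Y_i = {y : beta_y != 0}, cf. (27)
    q_i = beta[support]                            # q_i(y) = beta_y, cf. (28)
    Q_i = V[support].T                             # columns Q_i(.|y) = v_y, cf. (28)
    return q_i, Q_i
```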

5.1. Algorithm for Subset Random Response

We propose Algorithm 2 to implement AORR on subsets of $\mathcal{X}_H$, which we call subset random response (SRR). In this algorithm, we first obtain a partition $\mathcal{G}_{\mathcal{X}_H} = \{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_g\}$ of $\mathcal{X}_H$ via Algorithm 1. Then, based on $\mathcal{G}_{\mathcal{X}_H}$, we make another partition $\mathcal{O}_{\mathcal{X}_H}$ and find the optimal random response for each subset $\mathcal{X}_i \in \mathcal{O}_{\mathcal{X}_H}$ (line 5). By obtaining the optimal random responses for all subsets, we obtain a pair $(\mathbf{Q}_i, \mathbf{q}_i)$ for each subset $\mathcal{X}_i$, and consequently $P_{X|Y}$ and $P_Y$ by (29) and (30) (lines 20–23). The while loop in lines 7–14 handles the particular cases where the polytope in (25) is empty for a subset $\mathcal{X}_i$. This may occur under strict privacy conditions where the privacy budget is very small: since we reduce the dimension of the original polytope in (15), the possibility that no feasible random response exists increases. In such cases, we form a union with other subsets until we obtain a nonempty polytope. If $|\mathcal{G}_{\mathcal{X}_H}| > 0$, we form a union with another subset in $\mathcal{G}_{\mathcal{X}_H}$ (line 9). If $|\mathcal{G}_{\mathcal{X}_H}| = 0$, then $\mathcal{X}_g$ is the last subset in $\mathcal{O}_{\mathcal{X}_H}$; therefore, we form a union with previously made subsets in $\mathcal{O}_{\mathcal{X}_H}$ and update the index g (line 12). The condition in lines 17–18 covers the cases where there is no feasible polytope even after taking the union of all subsets, which means that the problem in (15) cannot be solved. Whenever this occurs, we apply the subset merging mechanism.
The number of polytope vertices in AORR is $n \in O(\exp(|\mathcal{X}|-1))$, and the time complexity is $O(n)$. As $|\mathcal{X}|$ increases, the complexity of AORR grows exponentially in $|\mathcal{X}|-1$. In SRR, for each $\mathcal{X}_i \in \mathcal{G}_{\mathcal{X}_H}$, $i \in [g]$, the number of vertices is $n_i \in O(\exp(|\mathcal{X}_i|-1))$, and the complexity of SRR is $O(\max_i n_i)$.
Algorithm 2: Subset random response.

5.2. Numerical Results

Here, we compare the PUT of AORR with SRR in Algorithm 2 and subset merging in Algorithm 1. Figure 8 depicts mean values of utility, leakage, and time complexity for 100 randomly generated distributions where λ = 0.65 and simulation setup is the same as that in Section 3.1.3. The result of the Adult dataset is also shown in Figure 9.
Figure 8a–c and Figure 9 demonstrate that SRR results in better utility and higher leakage than subset merging, and its PUT is very close to that of AORR. Figure 8d illustrates the processing time of each mechanism for synthetic data, from which we observe that the complexity of AORR and SRR is much higher than that of subset merging. Running SRR is less complex than AORR for strict privacy constraints ($\varepsilon < 1$) and for $\varepsilon > 2.5$. While SRR shows higher complexity for some privacy budgets, $1 \le \varepsilon \le 2.5$, it has the advantage in high-dimensional systems. Figure 10 shows a PUT comparison between SRR and subset merging for synthetic data where $|\mathcal{X}| = 200$, $|\mathcal{S}| = 15$, $\varepsilon \in \{1, 1.25, \ldots, 8\}$, and $\lambda = 0.5$. This experiment shows that both SRR and subset merging are applicable to large datasets. SRR clearly provides better utility (Figure 10a) and higher leakage (Figure 10b,c), which still remains below the given budgets $\varepsilon_l$ and $\varepsilon_u$.

6. Lift-Based and Lift-Inverse Measures

In this section, we consider some recently proposed privacy measures that quantify the divergence between the posterior and prior beliefs on sensitive features, including the $\ell_1$-norm [32], strong $\chi^2$-privacy criterion [33], and $\alpha$-lift [34]. They have been proposed as stronger versions of their corresponding average measures, which are the total variation distance [35], $\chi^2$-divergence [40], and Sibson MI [16], respectively. We call them lift-based measures and define them in the following.
Definition 5.
For each $y \in \mathcal{Y}$, the lift-based privacy measures are defined as follows:
  • The $\ell_1$-lift is given by
    $$\Lambda_1(y) \triangleq \sum_{s\in\mathcal{S}} P_S(s)\, |l(s,y) - 1|, \quad (31)$$
    and the total variation distance is then
    $$T(S;Y) = \tfrac{1}{2}\, \mathbb{E}_Y[\Lambda_1(Y)].$$
  • The $\chi^2$-lift is given by
    $$\Lambda_{\chi^2}(y) \triangleq \sum_{s\in\mathcal{S}} P_S(s)\, \big(l(s,y) - 1\big)^2, \quad (32)$$
    and the $\chi^2$-divergence is then
    $$\chi^2(S;Y) = \mathbb{E}_Y[\Lambda_{\chi^2}(Y)].$$
  • The α-lift is given by
    $$\Lambda_\alpha^S(y) \triangleq \Big( \sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^\alpha \Big)^{1/\alpha}, \quad (33)$$
    and the Sibson MI is then
    $$I_\alpha^S(S;Y) = \frac{\alpha}{\alpha-1} \log \mathbb{E}_Y[\Lambda_\alpha^S(Y)].$$
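For reference (our own sketch with assumed names, not part of the paper), the three lift-based measures of Definition 5 can be computed directly from a joint distribution $P_{SY}$:

```python
import numpy as np

def lift_based_measures(P_SY, alpha=2.0):
    """Per-y ell_1-lift, chi^2-lift, and alpha-lift from Definition 5."""
    P_S = P_SY.sum(axis=1, keepdims=True)
    P_Y = P_SY.sum(axis=0, keepdims=True)
    lift = P_SY / (P_S * P_Y)
    Lam_1 = np.sum(P_S * np.abs(lift - 1.0), axis=0)                  # ell_1-lift
    Lam_chi2 = np.sum(P_S * (lift - 1.0) ** 2, axis=0)                # chi^2-lift
    Lam_alpha = np.sum(P_S * lift ** alpha, axis=0) ** (1.0 / alpha)  # alpha-lift
    return Lam_1, Lam_chi2, Lam_alpha
```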
Here, we reveal their relationship with ALIP.
Proposition 3.
If $(\varepsilon_l, \varepsilon_u)$-ALIP is satisfied, then
 1. $\max_{y\in\mathcal{Y}} \Lambda_1(y) \le e^{\varepsilon_u} - 1$;
 2. $\max_{y\in\mathcal{Y}} \Lambda_{\chi^2}(y) \le (e^{\varepsilon_u} - 1)^2$;
 3. $\max_{y\in\mathcal{Y}} \Lambda_\alpha^S(y) \le e^{\varepsilon_u}$.
The proof is given in Appendix D.
Proposition 4.
Let $s_y = \arg\max_{s\in\mathcal{S}} l(s,y)$ and $\bar{y} = \arg\max_{y\in\mathcal{Y}} \Lambda(y)$, then
 1. If $\max_{y\in\mathcal{Y}} \Lambda_1(y) \le \varepsilon$, then $\max_{y\in\mathcal{Y}} \Lambda(y) \le \varepsilon / P_S(s_{\bar{y}}) + 1$;
 2. If $\max_{y\in\mathcal{Y}} \Lambda_{\chi^2}(y) \le \varepsilon$, then $\max_{y\in\mathcal{Y}} \Lambda(y) \le \sqrt{\varepsilon / P_S(s_{\bar{y}})} + 1$;
 3. If $\max_{y\in\mathcal{Y}} \Lambda_\alpha^S(y) \le \varepsilon$, then $\max_{y\in\mathcal{Y}} \Lambda(y) \le \varepsilon / P_S(s_{\bar{y}})^{1/\alpha}$.
Proof. 
The proof is given in Appendix E. □
Proposition 3 shows that lift-based measures, similarly to their corresponding average leakages, are upper-bounded in terms of the max-lift bound. Proposition 4 indicates that bounding lift-based measures only restricts the max-lift leakage. Accordingly, if one applies only a lift-based measure to protect privacy, as in previous works [32,33,34], it may cause significant leakage on the min-lift. Therefore, in the following, we propose lift-inverse measures to bound the min-lift leakage.

6.1. Lift-Inverse Measures

In Propositions 3 and 4, we showed that lift-based measures only bound the max-lift. In this subsection, we present lift-inverse measures to restrict the min-lift leakage, $\min_{y\in\mathcal{Y}} \Psi(y) = \min_{y\in\mathcal{Y}} \min_{s\in\mathcal{S}} l(s,y)$.
Definition 6.
For the lift-based measures in (31) to (33), we replace $l(s,y)$ with $\frac{1}{l(s,y)}$ and call the resulting quantities lift-inverse measures.
  • The $\ell_1$-lift-inverse is given by
    $$\Psi_1(y) \triangleq \sum_{s\in\mathcal{S}} P_S(s) \left| \frac{1}{l(s,y)} - 1 \right|. \quad (34)$$
  • The $\chi^2$-lift-inverse is given by
    $$\Psi_{\chi^2}(y) \triangleq \sum_{s\in\mathcal{S}} P_S(s) \left( \frac{1}{l(s,y)} - 1 \right)^2. \quad (35)$$
  • The α-lift-inverse is given by
    $$\Psi_\alpha^S(y) \triangleq \Big( \sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^{-\alpha} \Big)^{1/\alpha}. \quad (36)$$
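Analogously to the earlier sketch for Definition 5 (again our own illustration, not the paper's code), the lift-inverse measures can be computed as:

```python
import numpy as np

def lift_inverse_measures(P_SY, alpha=2.0):
    """Per-y lift-inverse measures of Definition 6, obtained by replacing l(s,y) with 1/l(s,y)."""
    P_S = P_SY.sum(axis=1, keepdims=True)
    P_Y = P_SY.sum(axis=0, keepdims=True)
    inv_lift = (P_S * P_Y) / P_SY            # 1 / l(s,y); full support is assumed
    Psi_1 = np.sum(P_S * np.abs(inv_lift - 1.0), axis=0)
    Psi_chi2 = np.sum(P_S * (inv_lift - 1.0) ** 2, axis=0)
    Psi_alpha = np.sum(P_S * inv_lift ** alpha, axis=0) ** (1.0 / alpha)
    return Psi_1, Psi_chi2, Psi_alpha
```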
In the following propositions, we show the relationship between these measures and $(\varepsilon_l, \varepsilon_u)$-ALIP.
Proposition 5.
If $(\varepsilon_l, \varepsilon_u)$-ALIP is achieved, we have
 1. $\max_{y\in\mathcal{Y}} \Psi_1(y) \le e^{\varepsilon_l} - 1$;
 2. $\max_{y\in\mathcal{Y}} \Psi_{\chi^2}(y) \le (e^{\varepsilon_l} - 1)^2$;
 3. $\max_{y\in\mathcal{Y}} \Psi_\alpha^S(y) \le e^{\varepsilon_l}$.
Proof. 
The proof is provided in Appendix F. □
Proposition 6.
Let $s_y = \arg\min_{s} l(s,y)$ and $\underline{y} = \arg\min_{y} \Psi(y)$, then
 1. If $\max_{y\in\mathcal{Y}} \Psi_1(y) \le \varepsilon$, then $\min_{y\in\mathcal{Y}} \Psi(y) \ge \frac{P_S(s_{\underline{y}})}{\varepsilon + P_S(s_{\underline{y}})}$;
 2. If $\max_{y\in\mathcal{Y}} \Psi_{\chi^2}(y) \le \varepsilon$, then $\min_{y\in\mathcal{Y}} \Psi(y) \ge \frac{P_S(s_{\underline{y}})}{\sqrt{\varepsilon P_S(s_{\underline{y}})} + P_S(s_{\underline{y}})}$;
 3. If $\max_{y\in\mathcal{Y}} \Psi_\alpha^S(y) \le \varepsilon$, then $\min_{y\in\mathcal{Y}} \Psi(y) \ge \varepsilon^{-1} P_S(s_{\underline{y}})^{1/\alpha}$.
Proof. 
The proof is provided in Appendix G. □
Propositions 5 and 6 demonstrate that the aforementioned lift-inverse measures are associated with the min-lift, and bounding them can restrict the min-lift leakage. Since lift-based and lift-inverse measures quantify privacy leakage by a function of lift averaged over sensitive features, they can be regarded as more relaxed measures than the min- and max-lifts.

6.2. PUT and Numerical Results

Optimal randomisations for $\ell_1$-lift and $\chi^2$-lift privacy, which maximise MI as the utility measure, were proposed in [32,33], respectively. Note that ORR is not applicable to these measures since their privacy constraints are not convex. However, here, we apply the watchdog mechanism with subset merging randomisation to investigate the PUT for lift-based and lift-inverse measures. This application shows that the watchdog mechanism with X-invariant randomisation is a low-complexity method that can be applied to all the aforementioned measures. Moreover, our subset merging algorithm significantly enhances the utility, which becomes comparable to the optimal solutions.
To apply lift-based and lift-inverse measures in the subset merging algorithm, we replace $\Lambda(y)$ and $\Psi(y)$ in (6) and Algorithm 1 with the corresponding lift-based and lift-inverse measures in Definitions 5 and 6, respectively. For example, $\mathcal{X}_L$ and $\mathcal{X}_H$ for the α-lift are obtained as
$$\mathcal{X}_L \triangleq \{x \in \mathcal{X} : \Psi_\alpha^S(x) \le e^{\varepsilon_l} \ \text{and} \ \Lambda_\alpha^S(x) \le e^{\varepsilon_u}\} \quad \text{and} \quad \mathcal{X}_H = \mathcal{X} \setminus \mathcal{X}_L.$$
The privacy risk measure in Algorithm 1 is then given by $\omega(x) = \Psi_\alpha^S(x) + \Lambda_\alpha^S(x)$. We compare the PUT of lift-based and lift-inverse privacy with $(\varepsilon_l, \varepsilon_u)$-ALIP, where the lift-based and lift-inverse measures are bounded as follows:
  • $\ell_1$-privacy: $\Lambda_1(y) \le e^{\varepsilon_u} - 1$ and $\Psi_1(y) \le e^{\varepsilon_l} - 1$, $\forall y \in \mathcal{Y}$.
  • $\chi^2$-privacy: $\Lambda_{\chi^2}(y) \le (e^{\varepsilon_u} - 1)^2$ and $\Psi_{\chi^2}(y) \le (e^{\varepsilon_l} - 1)^2$, $\forall y \in \mathcal{Y}$.
  • α-lift privacy: $\Lambda_\alpha^S(y) \le e^{\varepsilon_u}$ and $\Psi_\alpha^S(y) \le e^{\varepsilon_l}$, $\forall y \in \mathcal{Y}$.
We apply the subset merging mechanism with the simulation setup of Section 3.1.3. Figure 11 and Figure 12 demonstrate the PUT of ALIP for $\lambda = 0.5$ and of $\ell_1$ and $\chi^2$ privacy for $\lambda \in \{0.5, 0.65\}$ for the synthetic and Adult datasets, respectively. When $\lambda = 0.5$, $\ell_1$ and $\chi^2$ privacy result in higher utility than ALIP for all values of ε, since lift-based and lift-inverse measures are relaxations of the max- and min-lift. To observe the effect of the asymmetric scenario, we depict $\ell_1$ and $\chi^2$ privacy for $\lambda = 0.65$. From Figure 11a and Figure 12a, we observe that the lift-inverse relaxation ($\lambda = 0.65$) enhances utility significantly for $\varepsilon > 1$, but worsens it for $\varepsilon < 1$. The reason is that when $\varepsilon < 1$, the lift-inverse privacy constraint in the asymmetric scenario is strict, which requires more symbols to be merged in each subset and causes larger subsets and utility degradation.
A comparison between α-lift privacy and ALIP for synthetic data is shown in Figure 13 for $\alpha \in \{2, 10, 100\}$. α-lift privacy is tunable: when $\alpha = \infty$, it is equivalent to ALIP, and when $\alpha < \infty$, it results in a relaxed scenario. We observe this tunable property in Figure 13, where $\alpha = 2$ has the highest utility. Moreover, as α increases, the PUT of α-lift privacy approaches that of ALIP.

7. Conclusions

In this paper, we studied lift, the likelihood ratio between posterior and prior belief about sensitive features in a dataset. We demonstrated the distinction between the min- and max-lifts in terms of data privacy concerns. We proposed ALIP as a generalised version of LIP to have a more compatible notion of privacy with lift asymmetry. ALIP can enhance utility in the watchdog and ORR mechanisms, two main approaches to achieve lift-based privacy. We proposed two subset randomisation methods to enhance the utility of the watchdog mechanism and reduce ORR complexity for large datasets. We also investigated the existing lift-based measures, showing that they could incur significant leakage on the min-lift. Thus, we proposed lift-inverse measures to restrict the min-lift leakage. Finally, we applied the watchdog mechanism to study the PUT of lift-based and lift-inverse measures. For future work, one can consider the applicable operational meaning of the min-lift and max-lift. Subset randomisation can be applied to decrease the complexity and enhance the utility of other privacy mechanisms. Moreover, optimal randomisation for α -lift is also unknown and could be considered.

Author Contributions

Conceptualization, N.D. and P.S.; Methodology, N.D. and P.S.; Validation, P.S.; Formal analysis, M.A.Z.; Writing—original draft, M.A.Z.; Writing—review & editing, N.D. and P.S.; Visualization, M.A.Z.; Supervision, N.D. and P.S.; Project administration, P.S.; Funding acquisition, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

The work of P. Sadeghi and M.A. Zarrabian is supported by the ARC Future Fellowship FT190100429 and partly by the Data61 CRP: IT-PPUB.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

For real-world data simulations we have used public Adult dataset from UCI machine learning repository https://archive-beta.ics.uci.edu/dataset/2/adult.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A

In this appendix, we present privacy breaches in terms of the min-lift and max-lift based on the work of Evfimievski et al. [29]. In that work, they proposed the following definition of a privacy breach and provided a detailed example [29] (Example 1).
Definition A1.
For $\rho_1, \rho_2 \in \mathbb{R}^+$ with $0 < \rho_1 < \rho_2 < 1$, we say that there is a straight or upward $\rho_1$-to-$\rho_2$ privacy breach with respect to a property $Q(S)$ if for some $y \in \mathcal{Y}$
$$\Pr[Q(S)] \le \rho_1, \quad \Pr[Q(S) \,|\, Y = y] > \rho_2.$$
We say that there is a downward $\rho_2$-to-$\rho_1$ privacy breach with respect to $Q(S)$ if for some $y \in \mathcal{Y}$
$$\Pr[Q(S)] \ge \rho_2, \quad \Pr[Q(S) \,|\, Y = y] < \rho_1.$$
Using the complement property $Q'(S) = \neg Q(S)$, we have
$$\Pr[Q'(S)] \le 1 - \rho_2, \quad \Pr[Q'(S) \,|\, Y = y] > 1 - \rho_1.$$
Therefore, a downward $\rho_2$-to-$\rho_1$ privacy breach implies an upward $(1-\rho_2)$-to-$(1-\rho_1)$ breach for the complement event $Q'(S)$. In the original definition, the constraint for $\Pr[Q(S)|Y=y]$ includes equality; here, we remove the equality constraint for consistency with the context. They proved that in the ε-LDP scenario, the sufficient condition to prevent both an upward $\rho_1$-to-$\rho_2$ and a downward $\rho_2$-to-$\rho_1$ privacy breach is
$$e^{\varepsilon} \le \frac{\rho_2}{\rho_1} \cdot \frac{1-\rho_1}{1-\rho_2}.$$
Here, we provide sufficient conditions to prevent upward $\rho_1$-to-$\rho_2$ and downward $\rho_2$-to-$\rho_1$ privacy breaches in terms of the min- and max-lift as follows.
Proposition A1.
To prevent upward and downward privacy breaches, it is sufficient to have
$$\frac{1-\rho_2}{1-\rho_1} \le \Psi(y) \quad \text{and} \quad \Lambda(y) \le \frac{\rho_2}{\rho_1}, \quad \forall y \in \mathcal{Y}.$$
Proof. 
We prove this for an upward privacy breach on a property $Q(S)$ and a downward privacy breach on its complement $\neg Q(S)$. For any property $Q(S)$ with $\Pr[Q(S)] \le \rho_1$, we have
$$\Pr[Q(S) \,|\, Y=y] = \sum_{s\in Q(S)} P_{S|Y}(s|y) = \sum_{s\in Q(S)} P_S(s)\, \frac{P_{Y|S}(y|s)}{P_Y(y)} \le \sum_{s\in Q(S)} P_S(s)\, \max_{s} \frac{P_{Y|S}(y|s)}{P_Y(y)} = \Lambda(y) \Pr[Q(S)].$$
Since $\Pr[Q(S)] \le \rho_1$, if $\Lambda(y) \le \frac{\rho_2}{\rho_1}$, then $\Pr[Q(S)|Y=y] \le \rho_2$, and there is no upward privacy breach for $Q(S)$. Similarly, for the downward privacy breach on $\neg Q(S)$, we have
$$\Pr[\neg Q(S) \,|\, Y=y] = \sum_{s\in \neg Q(S)} P_{S|Y}(s|y) = \sum_{s\in \neg Q(S)} P_S(s)\, \frac{P_{Y|S}(y|s)}{P_Y(y)} \ge \sum_{s\in \neg Q(S)} P_S(s)\, \min_{s} \frac{P_{Y|S}(y|s)}{P_Y(y)} = \Psi(y) \Pr[\neg Q(S)].$$
Since $\Pr[\neg Q(S)] \ge 1 - \rho_1$, if $\Psi(y) \ge \frac{1-\rho_2}{1-\rho_1}$, then $\Pr[\neg Q(S)|Y=y] \ge 1 - \rho_2$, and there is no downward privacy breach for $\neg Q(S)$. □

Appendix B

  • For MI, we have
    $$I(S;Y) = \mathbb{E}_{P_{SY}}[i(S,Y)] \le \mathbb{E}_{P_{SY}}[\varepsilon_u] = \varepsilon_u.$$
  • For the total variation distance, we have
    $$T(S;Y) = \frac{1}{2} \sum_{y\in\mathcal{Y}} P_Y(y) \sum_{s\in\mathcal{S}} P_S(s) |l(s,y) - 1| \le \frac{1}{2} \sum_{y\in\mathcal{Y}} P_Y(y) \sum_{s\in\mathcal{S}} P_S(s) |\Lambda(y) - 1| = \frac{1}{2} \sum_{y\in\mathcal{Y}} P_Y(y) |\Lambda(y) - 1| \le \frac{1}{2} \sum_{y\in\mathcal{Y}} P_Y(y) |e^{\varepsilon_u} - 1| = \frac{1}{2}(e^{\varepsilon_u} - 1).$$
  • For the $\chi^2$-divergence, we have
    $$\chi^2(S;Y) = \sum_{y\in\mathcal{Y}} P_Y(y) \sum_{s\in\mathcal{S}} P_S(s) (l(s,y) - 1)^2 \le \sum_{y\in\mathcal{Y}} P_Y(y) \sum_{s\in\mathcal{S}} P_S(s) (\Lambda(y) - 1)^2 = \sum_{y\in\mathcal{Y}} P_Y(y) (\Lambda(y) - 1)^2 \le \sum_{y\in\mathcal{Y}} P_Y(y) (e^{\varepsilon_u} - 1)^2 = (e^{\varepsilon_u} - 1)^2.$$
  • For the Sibson MI, we have
    $$I_\alpha^S(S;Y) = \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^\alpha \Big)^{1/\alpha} \le \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S(s)\, \Lambda(y)^\alpha \Big)^{1/\alpha} \le \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S(s)\, e^{\alpha\varepsilon_u} \Big)^{1/\alpha} = \frac{\alpha}{\alpha-1} \varepsilon_u.$$
  • For the Arimoto MI, we have
    $$I_\alpha^A(S;Y) = \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S^\alpha(s)\, l(s,y)^\alpha \Big)^{1/\alpha} \le \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S^\alpha(s)\, \Lambda(y)^\alpha \Big)^{1/\alpha} \le \frac{\alpha}{\alpha-1} \log \sum_{y\in\mathcal{Y}} P_Y(y) \Big( \sum_{s\in\mathcal{S}} P_S^\alpha(s)\, e^{\alpha\varepsilon_u} \Big)^{1/\alpha} = \frac{\alpha}{\alpha-1} \varepsilon_u,$$
    where $P_S^\alpha(s) = \frac{P_S(s)^\alpha}{\sum_{s\in\mathcal{S}} P_S(s)^\alpha}$.
  • For LDP, for all $y \in \mathcal{Y}$, we have
    $$\Gamma(y) = \sup_{s,s'\in\mathcal{S}} \frac{P_{Y|S}(y|s)}{P_{Y|S}(y|s')} = \frac{\max_{s\in\mathcal{S}} P_{Y|S}(y|s)}{\min_{s\in\mathcal{S}} P_{Y|S}(y|s)} = \frac{\max_{s\in\mathcal{S}} P_{Y|S}(y|s)/P_Y(y)}{\min_{s\in\mathcal{S}} P_{Y|S}(y|s)/P_Y(y)} = \frac{\Lambda(y)}{\Psi(y)} \le \frac{e^{\varepsilon_u}}{e^{-\varepsilon_l}} = e^{\varepsilon_l + \varepsilon_u}.$$

Appendix C

Here, we prove that X-invariant randomisation minimises the LDP privacy leakage in $\mathcal{X}_H$.
Proposition A2.
A randomisation $r(y|x)$, $x, y \in \mathcal{X}_H$, can attain $(\varepsilon, \mathcal{X}_H)$-LDP if and only if
$$\Gamma_{LDP}(\mathcal{X}_H) = \frac{\max_{s\in\mathcal{S}} P(\mathcal{X}_H|s)}{\min_{s\in\mathcal{S}} P(\mathcal{X}_H|s)} \le e^{\varepsilon}. \quad (A2)$$
Proof. 
Sufficient condition: consider an X-invariant randomisation where $r(y|x) = R(y)$, $x \in \mathcal{X}_H$, and $y \in \mathcal{Y}_H$. If (A2) holds, then for all $s, s' \in \mathcal{S}$, we have
$$\frac{P(\mathcal{X}_H|s)}{P(\mathcal{X}_H|s')} \le \frac{\max_{s\in\mathcal{S}} P(\mathcal{X}_H|s)}{\min_{s\in\mathcal{S}} P(\mathcal{X}_H|s)} \le e^{\varepsilon} \Rightarrow \frac{R(y)\, P(\mathcal{X}_H|s)}{R(y)\, P(\mathcal{X}_H|s')} \le e^{\varepsilon} \Rightarrow \frac{\sum_{x\in\mathcal{X}_H} r(y|x)\, P_{X|S}(x|s)}{\sum_{x\in\mathcal{X}_H} r(y|x)\, P_{X|S}(x|s')} \le e^{\varepsilon} \Rightarrow \frac{P_{Y|S}(y|s)}{P_{Y|S}(y|s')} \le e^{\varepsilon}.$$
For the necessary condition, note that for all $s, s' \in \mathcal{S}$ and $y \in \mathcal{Y}_H$, we have
$$P_{Y|S}(y|s) \le e^{\varepsilon} P_{Y|S}(y|s') \Rightarrow \sum_{x\in\mathcal{X}_H} r(y|x)\, P_{X|S}(x|s) \le e^{\varepsilon} \sum_{x\in\mathcal{X}_H} r(y|x)\, P_{X|S}(x|s'),$$
then by summing over all $y \in \mathcal{Y}_H$ on both sides, we obtain
$$\sum_{x\in\mathcal{X}_H} P_{X|S}(x|s) \underbrace{\sum_{y\in\mathcal{Y}_H} r(y|x)}_{=1} \le e^{\varepsilon} \sum_{x\in\mathcal{X}_H} P_{X|S}(x|s') \underbrace{\sum_{y\in\mathcal{Y}_H} r(y|x)}_{=1} \Rightarrow P(\mathcal{X}_H|s) \le e^{\varepsilon} P(\mathcal{X}_H|s') \Rightarrow \frac{P(\mathcal{X}_H|s)}{P(\mathcal{X}_H|s')} \le e^{\varepsilon}. \quad (A3)$$
Because (A3) holds for all $s, s' \in \mathcal{S}$, we have
$$\max_{s,s'\in\mathcal{S}} \frac{P(\mathcal{X}_H|s)}{P(\mathcal{X}_H|s')} = \frac{\max_{s\in\mathcal{S}} P(\mathcal{X}_H|s)}{\min_{s\in\mathcal{S}} P(\mathcal{X}_H|s)} \le e^{\varepsilon}. \quad \square$$

Appendix D

  • For the $\ell_1$-lift, we have
    $$\Lambda_1(y) = \sum_{s\in\mathcal{S}} P_S(s)\, |l(s,y) - 1| \le \sum_{s\in\mathcal{S}} P_S(s)\, |\Lambda(y) - 1| = \Lambda(y) - 1 \le e^{\varepsilon_u} - 1.$$
  • For the $\chi^2$-lift, we have
    $$\Lambda_{\chi^2}(y) = \sum_{s\in\mathcal{S}} P_S(s)\, (l(s,y) - 1)^2 \le \sum_{s\in\mathcal{S}} P_S(s)\, (\Lambda(y) - 1)^2 = (\Lambda(y) - 1)^2 \le (e^{\varepsilon_u} - 1)^2.$$
  • For the α-lift, we have
    $$\Lambda_\alpha^S(y) = \Big(\sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^\alpha\Big)^{1/\alpha} \le \Big(\sum_{s\in\mathcal{S}} P_S(s)\, \Lambda(y)^\alpha\Big)^{1/\alpha} = \Lambda(y) \le e^{\varepsilon_u}.$$

Appendix E

If $s_y = \arg\max_{s\in\mathcal{S}} l(s,y)$, then we have $\Lambda(y) = l(s_y, y)$. Recall that $\bar{y} = \arg\max_{y\in\mathcal{Y}} \Lambda(y)$.
  • When $\max_{y\in\mathcal{Y}} \Lambda_1(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Lambda_1(y) = \sum_{s\in\mathcal{S}} P_S(s)\, |l(s,y) - 1| \le \varepsilon \;\Rightarrow\; P_S(s_y)\, |l(s_y,y) - 1| = P_S(s_y)\,(\Lambda(y) - 1) \le \varepsilon,$$
    which results in
    $$\max_{y\in\mathcal{Y}} \Lambda(y) \le \frac{\varepsilon}{P_S(s_{\bar{y}})} + 1.$$
  • When $\max_{y\in\mathcal{Y}} \Lambda_{\chi^2}(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Lambda_{\chi^2}(y) = \sum_{s\in\mathcal{S}} P_S(s)\, (l(s,y) - 1)^2 \le \varepsilon \;\Rightarrow\; P_S(s_y)\, (l(s_y,y) - 1)^2 = P_S(s_y)\,(\Lambda(y) - 1)^2 \le \varepsilon,$$
    which results in
    $$\max_{y\in\mathcal{Y}} \Lambda(y) \le \sqrt{\frac{\varepsilon}{P_S(s_{\bar{y}})}} + 1.$$
  • When $\max_{y\in\mathcal{Y}} \Lambda_\alpha^S(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Lambda_\alpha^S(y) = \Big(\sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^\alpha\Big)^{1/\alpha} \le \varepsilon \;\Rightarrow\; P_S(s_y)\, l(s_y,y)^\alpha = P_S(s_y)\, \Lambda(y)^\alpha \le \varepsilon^\alpha,$$
    which results in
    $$\max_{y\in\mathcal{Y}} \Lambda(y) \le \frac{\varepsilon}{P_S(s_{\bar{y}})^{1/\alpha}}.$$

Appendix F

Since $(\varepsilon_l, \varepsilon_u)$-ALIP is satisfied, for all $y \in \mathcal{Y}$, we have $e^{-\varepsilon_u} \le \frac{1}{l(s,y)} \le e^{\varepsilon_l}$ and $\max_s \frac{1}{l(s,y)} = \frac{1}{\Psi(y)}$.
  • For the $\ell_1$-lift-inverse, we have
    $$\Psi_1(y) = \sum_{s\in\mathcal{S}} P_S(s) \left|\frac{1}{l(s,y)} - 1\right| \le \sum_{s\in\mathcal{S}} P_S(s) \left|\frac{1}{\Psi(y)} - 1\right| = \frac{1}{\Psi(y)} - 1 \le e^{\varepsilon_l} - 1.$$
  • For the $\chi^2$-lift-inverse, we have
    $$\Psi_{\chi^2}(y) = \sum_{s\in\mathcal{S}} P_S(s) \left(\frac{1}{l(s,y)} - 1\right)^2 \le \sum_{s\in\mathcal{S}} P_S(s) \left(\frac{1}{\Psi(y)} - 1\right)^2 = \left(\frac{1}{\Psi(y)} - 1\right)^2 \le (e^{\varepsilon_l} - 1)^2.$$
  • For the α-lift-inverse, we have
    $$\Psi_\alpha^S(y) = \Big(\sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^{-\alpha}\Big)^{1/\alpha} \le \Big(\sum_{s\in\mathcal{S}} P_S(s)\, \Psi(y)^{-\alpha}\Big)^{1/\alpha} = \frac{1}{\Psi(y)} \le e^{\varepsilon_l}.$$

Appendix G

If $s_y = \arg\min_s l(s,y)$, then we have $\Psi(y) = l(s_y, y)$. Recall that $\underline{y} = \arg\min_y \Psi(y)$.
  • When $\max_{y\in\mathcal{Y}} \Psi_1(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Psi_1(y) = \sum_{s\in\mathcal{S}} P_S(s) \left|\frac{1}{l(s,y)} - 1\right| \le \varepsilon \;\Rightarrow\; P_S(s_y) \left|\frac{1}{l(s_y,y)} - 1\right| = P_S(s_y)\left(\frac{1}{\Psi(y)} - 1\right) \le \varepsilon,$$
    which results in
    $$\min_{y\in\mathcal{Y}} \Psi(y) \ge \frac{P_S(s_{\underline{y}})}{\varepsilon + P_S(s_{\underline{y}})}.$$
  • When $\max_{y\in\mathcal{Y}} \Psi_{\chi^2}(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Psi_{\chi^2}(y) = \sum_{s\in\mathcal{S}} P_S(s) \left(\frac{1}{l(s,y)} - 1\right)^2 \le \varepsilon \;\Rightarrow\; P_S(s_y)\left(\frac{1}{l(s_y,y)} - 1\right)^2 = P_S(s_y)\left(\frac{1}{\Psi(y)} - 1\right)^2 \le \varepsilon,$$
    which results in
    $$\min_{y\in\mathcal{Y}} \Psi(y) \ge \frac{P_S(s_{\underline{y}})}{\sqrt{\varepsilon P_S(s_{\underline{y}})} + P_S(s_{\underline{y}})}.$$
  • When $\max_{y\in\mathcal{Y}} \Psi_\alpha^S(y) \le \varepsilon$, for all $y \in \mathcal{Y}$, we have
    $$\Psi_\alpha^S(y) = \Big(\sum_{s\in\mathcal{S}} P_S(s)\, l(s,y)^{-\alpha}\Big)^{1/\alpha} \le \varepsilon \;\Rightarrow\; P_S(s_y)\, l(s_y,y)^{-\alpha} = P_S(s_y)\, \Psi(y)^{-\alpha} \le \varepsilon^\alpha,$$
    which results in
    $$\min_{y\in\mathcal{Y}} \Psi(y) \ge \varepsilon^{-1} P_S(s_{\underline{y}})^{1/\alpha}.$$

References

  1. Dwork, C.; McSherry, F.; Nissim, K.; Smith, A. Calibrating Noise to Sensitivity in Private Data Analysis. In Theory of Cryptography; Halevi, S., Rabin, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 265–284. [Google Scholar]
  2. Dwork, C. Differential Privacy. In Proceedings of the 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006), Venice, Italy, 10–14 July 2006; Volume 4052, pp. 1–12. [Google Scholar]
  3. Dwork, C. Differential privacy. In Encyclopedia of Cryptography and Security; Springer: Boston, MA, USA, 2011; pp. 338–340. [Google Scholar] [CrossRef]
  4. Dwork, C.; Roth, A. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 2014, 9, 211–407. [Google Scholar] [CrossRef]
  5. Kasiviswanathan, S.P.; Lee, H.K.; Nissim, K.; Raskhodnikova, S.; Smith, A. What can we learn privately? SIAM J. Comput. 2011, 40, 793–826. [Google Scholar] [CrossRef]
  6. Duchi, J.C.; Jordan, M.I.; Wainwright, M.J. Local Privacy and Statistical Minimax Rates. In Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, Berkeley, CA, USA, 26–29 October 2013; pp. 429–438. [Google Scholar] [CrossRef]
  7. Kairouz, P.; Oh, S.; Viswanath, P. Extremal mechanisms for local differential privacy. Adv. Neural Inf. Process. Syst. 2014, 4, 2879–2887. [Google Scholar]
  8. Sarwate, A.D.; Sankar, L. A rate-disortion perspective on local differential privacy. In Proceedings of the 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 30 September–3 October 2014; pp. 903–908. [Google Scholar] [CrossRef]
Figure 1. Histogram of log Ψ(y) = min_s i(s, y) and log Λ(y) = max_s i(s, y) for 10³ randomly generated distributions, where |X| = 17 and |S| = 5.
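As a rough illustration of how histograms like those in Figure 1 can be produced, the sketch below draws random joint distributions P_SY, computes the log-lift i(s, y) = log[P_SY(s, y) / (P_S(s) P_Y(y))], and collects its per-y minimum and maximum. This is a minimal sketch, not the authors' code: the sampling of the random joints (normalised i.i.d. uniform weights) and the plotting details are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
S, Y, TRIALS = 5, 17, 1000                 # |S| = 5, |X| = 17, 10^3 random distributions

log_min, log_max = [], []
for _ in range(TRIALS):
    # Random joint P_SY (normalised i.i.d. uniform weights -- an assumed sampling scheme).
    P_sy = rng.random((S, Y))
    P_sy /= P_sy.sum()
    P_s = P_sy.sum(axis=1, keepdims=True)  # marginal P_S
    P_y = P_sy.sum(axis=0, keepdims=True)  # marginal P_Y
    i = np.log(P_sy / (P_s @ P_y))         # log-lift i(s, y)
    log_min.extend(i.min(axis=0))          # log Ψ(y) = min_s i(s, y)
    log_max.extend(i.max(axis=0))          # log Λ(y) = max_s i(s, y)

plt.hist(log_min, bins=60, alpha=0.6, label="log Ψ(y) (min-lift)")
plt.hist(log_max, bins=60, alpha=0.6, label="log Λ(y) (max-lift)")
plt.xlabel("log-lift")
plt.legend()
plt.show()
```

Plotted this way, the two histograms make the lift asymmetry discussed in the paper directly visible, since min- and max-lift occupy distinct ranges with different frequencies of appearance.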
Figure 2. Privacy–utility trade-off of the watchdog mechanism with complete merging randomisation for synthetic data under ε-LDP and (ε_l, ε_u)-ALIP, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
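For the (ε_l, ε_u)-ALIP curves in Figures 2–9, the LDP budget is split as ε_l = λε and ε_u = (1 − λ)ε. The snippet below is a minimal, hypothetical helper (not the paper's implementation) showing this budget split and the bound check e^(−ε_l) ≤ l(s, x) ≤ e^(ε_u) that flags the high-risk symbols a watchdog-style mechanism would then merge or randomise.

```python
import numpy as np

def alip_high_risk(P_sx, eps_l, eps_u):
    """Indices x whose lift l(s, x) falls outside [e^{-eps_l}, e^{eps_u}] for some s."""
    P_s = P_sx.sum(axis=1, keepdims=True)        # marginal P_S
    P_x = P_sx.sum(axis=0, keepdims=True)        # marginal P_X
    lift = P_sx / (P_s @ P_x)                    # l(s, x) = P(s|x) / P(s)
    too_high = lift.max(axis=0) > np.exp(eps_u)  # max-lift bound violated
    too_low = lift.min(axis=0) < np.exp(-eps_l)  # min-lift bound violated
    return np.flatnonzero(too_high | too_low)

# Budget split used in the experiments: eps_l = λ·ε, eps_u = (1 − λ)·ε.
eps, lam = 2.0, 0.65
P_sx = np.random.default_rng(1).dirichlet(np.ones(5 * 17)).reshape(5, 17)  # toy joint
high_risk = alip_high_risk(P_sx, eps_l=lam * eps, eps_u=(1 - lam) * eps)
```

Setting eps_l = eps_u recovers the symmetric LIP check, while an asymmetric split lets the two lift directions be bounded independently, which is the comparison the figures report.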
Figure 3. Privacy–utility trade-off of the watchdog mechanism with complete merging randomisation for the Adult dataset under ε-LDP and (ε_l, ε_u)-ALIP, where S = {relationship}, X = {occupation}, |X| = 15, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 4. Privacy–utility trade-off of AORR for synthetic data, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 5. Privacy–utility trade-off of AORR for the Adult dataset, where S = {relationship}, X = {occupation}, |X| = 15, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 6. Privacy–utility trade-off of subset merging randomisation under ε-LDP and (ε_l, ε_u)-ALIP, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 7. Privacy–utility trade-off of subset merging randomisation for the Adult dataset under ε-LDP and (ε_l, ε_u)-ALIP, where S = {relationship}, X = {occupation}, |X| = 15, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.35, 0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 8. Comparison of privacy–utility trade-off and time complexity between AORR, SRR (Algorithm 2), and subset merging (Algorithm 1) for synthetic data, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ = 0.65, ε_l = λε_LDP, and ε_u = (1 − λ)ε_LDP.
Figure 9. Comparison of privacy–utility trade-off between AORR, SRR (Algorithm 2), and subset merging (Algorithm 1) for the Adult dataset, where S = {relationship}, X = {occupation}, |X| = 15, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ = 0.65, ε_l = λε_LDP, and ε_u = (1 − λ)ε_LDP.
Figure 10. Comparison of privacy–utility trade-off and time complexity between SRR (Algorithm 2) and subset merging (Algorithm 1) for synthetic data, where |X| = 200, |S| = 15, ε_LDP ∈ {1, 1.25, 1.5, 1.75, …, 8}, and ε_l = ε_u = ε_LDP/2.
Figure 11. Comparison of privacy–utility trade-off between ALIP, ℓ1-privacy, and χ²-privacy for synthetic data, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 12. Comparison of privacy–utility trade-off between ALIP, ℓ1-privacy, and χ²-privacy for the Adult dataset, where S = {relationship}, X = {occupation}, |X| = 15, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, λ ∈ {0.5, 0.65}, ε_l = λε, and ε_u = (1 − λ)ε.
Figure 13. Comparison of privacy–utility trade-off between ALIP and α-lift privacy, where |X| = 17, |S| = 5, ε_LDP ∈ {0.25, 0.5, 0.75, …, 8}, ε_l = ε_u = ε_LDP/2, and α ∈ {2, 10, 100}.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
