Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives

Zhao, Shengnan; Zhao, Chuan; Huang, Yuchen; Song, Xiangfu; Xu, Qiuliang

doi:10.3390/cryptography8040058

Open AccessArticle

Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives^†

by

Shengnan Zhao

¹,

Chuan Zhao

^1,*,

Yuchen Huang

²

,

Xiangfu Song

³ and

Qiuliang Xu

⁴

¹

Quan Cheng Laboratory, Jinan 250103, China

²

College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China

³

College of Computing and Data Science, Nanyang Technological University, Singapore 639798, Singapore

⁴

School of Software, Shandong University, Jinan 250101, China

^*

Author to whom correspondence should be addressed.

^†

This paper is an extended version of our paper published in the Proceedings of the 16th International Conference on Wireless Algorithms, Systems, and Applications, Nanjing, China, 25–27 June 2021.

Cryptography 2024, 8(4), 58; https://doi.org/10.3390/cryptography8040058

Submission received: 25 November 2024 / Revised: 18 December 2024 / Accepted: 19 December 2024 / Published: 22 December 2024

Download

Browse Figures

Versions Notes

Abstract

:

Private Set Intersection (PSI) is a significant application of interest within Secure Multi-party Computation (MPC), even though we are still in the early stages of deploying MPC solutions to real-world problems. Threshold PSI (tPSI), a variant of PSI, allows two parties to determine the intersection of their respective sets only if the cardinality of the intersection is at least (or less than) a specified threshold t. In this paper, we propose a generic construction for two-party tPSI that extensively utilizes Oblivious Transfer (OT). Our approach is based on lightweight primitives and avoids costly public-key systems such as homomorphic encryption. We start by introducing the secret-sharing private membership test

{}^{ss}{PMT}

that is based on the secret-sharing private equality test

{}^{ss}{PEQT}

. The

{}^{ss}{PMT}

enables tPSI to be scaled for a wide range of practical applications, particularly benefiting parties with limited computational resources. Consequently, two distinct two-party tPSI protocols can be efficiently implemented: over-threshold PSI (

t^{\leq} PSI

) and under-threshold PSI

t^{>} PSI

. In addition, we propose a lightweight two-party tPSI with limited leakage and a generic precomputing OT suitable for phased implementation. Experimental performance demonstrates that our protocols are highly efficient and computationally friendly, thus paving the way for broader deployment of tPSI solutions.

Keywords:

secure multiparty computation; private set intersection (PSI); threshold PSI; oblivious transfer

1. Introduction

Originating from Yao’s millionaire’s problem [1] in 1982, secure Multi-party Computation (MPC) has witnessed notable achievements as a research area and in experimental performance. However, we remain in the early stages of implementing MPC solutions for real-world applications [2]. Researchers have shown growing interest in private set operations, which encompass intersection, union, counting, summation, and more general functions relating to intersections or unions. Private set operations enable two or more parties to perform calculations on their datasets without disclosing the actual data to one another. This mechanism ensures that sensitive information stays confidential, which is especially critical in sectors such as healthcare, finance, and personal data management.

Private Set Intersection (PSI) allows two parties, denoted by Alice and Bob with their respective sets X and Y, to compute the intersection

X \cap Y

without revealing any additional information about either set. As a specific research field within MPC, and driven by practical applications, PSI has been a topic of significant interest across various fields. Recently, extensive research papers have been published, focusing on various aspects of PSI, including efficiency [3,4,5], the functionality of union operations [6,7], the security of PSI in the two-party scenario [8], multiparty [9,10], and server-aided settings [11], as well as various adversary models [12]. A growing body of literature highlights the importance and practical applications of PSI in different scenarios, such as online ad conversion rates [13], privacy-preserving contact tracing [14], searchable encryption scheme [15] and vertical Federated Learning [16].

Threshold PSI (tPSI) enables the parties to learn the intersection only if the number of their common items is sufficiently large, i.e., a threshold t. PSI with Cardinality (

{PSI}^{CA}

) [17] is designed to limit the output to the size of the intersection, revealing nothing else about the sets involved. Thus, the construction of tPSI protocols typically invokes

{PSI}^{CA}

as an intermediary computation. We take over-threshold PSI protocol (

t^{\leq} PSI

) as an example; the protocol

t^{\leq} PSI

outputs the intersection only when the cardinality is greater than or equal to the threshold (i.e.,

t \leq

{PSI}^{CA}

). In this paper, we define that if the cardinality matches the threshold, the protocol still outputs the intersection. Apparently, tPSI reduces to normal PSI when

t = 0

, and it is equivalent to

{PSI}^{CA}

when

t = | X | = | Y | = n

. Intuitively,

{PSI}^{CA}

serves as a building block in the design of tPSI protocol. That is to say, to carry out tPSI in a secure way, one could simply invoke

{PSI}^{CA}

as a subprotocol first and then securely compare the output of

{PSI}^{CA}

with a specified threshold t. Finally, tPSI outputs the intersection by invoking any standard PSI protocol if

{PSI}^{CA}

is greater than (or less than) the threshold t. However, this straightforward method for implementing tPSI is often inefficient due to the independent nature of the two protocols

{PSI}^{CA}

and PSI. Moreover, to the best of our knowledge, most existing

{PSI}^{CA}

constructions rely heavily on public-key primitives. Without resorting to generic MPC methods, designing a lightweight tPSI protocol indeed presents a significant challenge, but it is not insurmountable.

1.1. Related Work

The studies in the field of tPSI have primarily focused on minimizing communication cost [18,19,20], often at the expense of computational efficiency due to expensive public-key operations such as homomorphic encryption. While effective in reducing network data transfer, they fail to address scenarios where participants have computational limitations, which is crucial for practical, lightweight implementations. Ghosh and Simkin [21] demonstrated that the communication complexity can be sublinear in the set size and showed a lower bound on the communication complexity of any protocol for the two-party case that is based on fully homomorphic encryption (FHE). Branco et al. [22] presented a multi-party cardinality testing approach based on threshold additively homomorphic encryption (AHE). This method allows for the verification of whether the intersection of input sets exceeds

(n - t)

, where n represents the size of each individual set and t denotes a specific threshold. Furthermore, they developed an l-party threshold PSI protocol with communication complexity

O (l t^{2})

. Badrinarayanan et al. [23] examined two functionalities tailored for the multi-party case. They rigorously proved that both functionalities require at least

Ω (n t)

bits of communication. This finding underscores the inherent communication complexity involved in such multi-party protocols. They also solved an open problem in [21] by building a two-party threshold PSI protocol with communication complexity

\tilde{O} (t)

from assumptions weaker than FHE. Recently, the communication optimization and new functionality for multi-party protocols have been proposed, expanding the research scope of tPSI. Based on AHE and Oblivious Transfer (OT), Ghosh and Simkin [24] proposed an l-party threshold PSI protocol with a communication complexity of

O (λ^{2} l t poly (\log t))

, where

λ

is the security parameter. Liu et al. [25] introduced a new notion called probabilistic tPSI, where the parties learn the intersection with probability proportional to the size of it. In [26], the protocol securely computes which element appears in at least t sets of lightweight parties. In addition, Mohanty et al. [27] proposed the first quantum secure tPSI protocol which has a direct application in privacy-preserving ride sharing.

Constructing tPSI based on lightweight techniques with better efficiency is non-trivial. This line of research is attractive because, in general, few cryptographic techniques achieve both computational and communicational efficiency when designing PSI and variants. Oblivious Transfer is currently one of the most popular methods for investigating how to construct an efficient PSI protocol [28,29,30] and its variants in different settings [31] and adversary models [32,33]. Compared to a naïve hashing solution, OT-based private set operations, leveraging the OT extension technique, balance communication and computation costs well and are currently one of the most popular methods. Batched OPRF (Oblivious Pseudorandom Function) and VOLE (Vector Oblivious Linear Evaluation) can be generated through the use of the OT extension mechanism [34]. Orrù et al. [35] described a random actively secure 1-out-of-n OT extension protocol and applied it as a building block to improve the performance of PSI. Zhao et al. [36] proposed lightweight tPSI based on OT, ensuring that the computational complexity is independent of the set size. Also in two-party computation scenarios, Hu et al. [37] studied tPSI using FHE within the realm of cloud computing, where the server has a larger dataset than the client. By carefully selecting and optimizing cryptographic primitives and by leveraging efficient algorithms and design patterns, it is possible to design a protocol that balances security, efficiency, and simplicity. This would meet the needs of applications where computational resources are limited, ensuring practical and secure implementations.

1.2. Our Contribution

We focus on a balanced two-party scenario where the size of two sets is equal, namely

| X | = | Y | = n

where the length of each element is l. In particular, the output should be received by the party that does not possess the threshold t, to which we refer as a receiver

R

.

To our knowledge, developing tPSI using almost exclusively lightweight cryptographic primitives is highly challenging, and this difficulty extends to more general protocols, including over-threshold PSI (

t^{\leq} PSI

) and under-threshold PSI (

t^{>} PSI

) in which the protocol outputs the intersection only when

t \leq

{PSI}^{CA}

and

t >

{PSI}^{CA}

, respectively. In this paper, we concentrate on two-party threshold PSI under the semi-honest model. The main contributions of this work are summarized as follows:

We propose ${}^{ss}{PMT}$ to compute secret-shared private membership test functionality $F_{PMT}^{ss, q}$ . In ${}^{ss}{PMT}$ , ${}^{ss}{PEQT}$ serves as a building block, which securely computes the secret-shared private equality test functionality $F_{PEQT}^{ss, q}$ .
We offer a generic tPSI framework where two kinds of tPSI could be efficiently implemented: over-threshold PSI ( $t^{\leq} PSI$ ) and under-threshold PSI ( $t^{>} PSI$ ). Furthermore, we propose a more lightweight tPSI with limited leakage. We prove the security of all protocols under the semi-honest model.
We introduce a generic precomputing OT method used to generate instances in tPSI. Experimental results on ${}^{ss}{PEQT}$ and tPSI show that our proposed protocols are highly efficient and computationally friendly.

In addition to our previous work in [36], we have two new contributions. First, we propose another new tPSI protocol named

t_{l l}^{\leq} PSI

, which is entirely based on lightweight cryptographic primitives. Unlike

t^{\leq} PSI

and

t^{>} PSI

, the

t_{l l}^{\leq} PSI

protocol eliminates the need for the comparison circuit for functionality

F_{≷ (y_{0}, x, y_{1})}

by employing our proposed radix complement method. Second, we introduce a generic precomputing OT method where, for any given n, the 1-out-of-n OT protocol primarily involves symmetric operations during online computation. Our work aims to provide a general framework for tPSI with lightweight cryptographic primitives, contributing to the ongoing discourse on enhancing the computational efficiency of such protocols.

2. Preliminaries

In this section, we introduce the notation, cryptographic primitives, and security definitions used in the protocols.

2.1. Notation

We denote the computational and statistical security parameters by

κ

and

μ

, respectively. A shared variable x is denoted by

{[[x]]}_{P}

, where the subscript

P \in {S, R}

indicates that which party,

S

or

R

, holds the share.

r \in_{R} Z_{q}

denotes that r is uniformly chosen at random from

Z_{q}

. We let

“ \oplus ”

denote mod-2 addition (also used to denote bitwise XOR operation without causing ambiguity);

x : = y

denote assigning the value of y to x. We write

X \overset{c}{\equiv} Y

if two distribution probability ensembles X and Y are computationally indistinguishable.

2.2. Oblivious Transfer

Oblivious transfer is rapidly emerging as a significant primitive in secure computation. We begin with the 1-out-of-2 OT, which involves a sender

S

that holds two messages

(m_{0}, m_{1})

and a receiver

R

that holds a choice bit

b \in {0, 1}

. The protocol guarantees that

R

only learns

m_{b}

but knows nothing about

m_{1 - b}

, while the sender does not gain any knowledge about the choice bit b.

In Figure 1, the functionality

F_{OT}^{n}

provides the receiver

R

one message from n messages sent by the sender

S

. We denote by

{OT}_{n}^{1}

a specific 1-out-of-n OT construction that securely computes that functionality

F_{OT}^{n}

. Unless otherwise stated, we use “OT” to refer to 1-out-of-2 OT and

{OT}_{n}^{1}

to denote 1-out-of-n OT protocol. Correlated OT (COT) [38,39] introduces special variants of OT when

n = 2

in

F_{OT}^{n}

. While exhibiting the same essential functionality as OT, COT allows the sender to choose the inputs

(x_{0}, x_{1})

under the constraint that their XOR equals

Δ

. This constraint is specifically defined by a correlation function

f_{Δ} (\cdot)

where

x_{1} = f_{Δ} (x_{0}) = x_{0} + Δ

.

2.3. Cuckoo Hashing

Cuckoo Hashing is a dictionary data structure designed to enhance performance by allowing an item to be stored in one of several possible locations. As detailed in [40], the fundamental structure of Cuckoo hashing employs multiple hash functions to allocate items into bins. Specifically, using m hash functions

h_{i}, \dots, h_{m}

, this method maps n items to b bins

B [1, \dots, b]

, where

b = (1 + ϵ) n

(

ϵ

is a constant and typically

ϵ = 1.2

as an experience parameter [3]) and each bin

B [i]

can contain at most one item.

Typically, there is a stash for items that cannot be placed in the bins, and two hash functions

h_{0}

and

h_{1}

(i.e.,

m = 2

) are randomly selected. When inserting an item x, the algorithm first checks if either of the bins

B [h_{0} (x)]

or

B [h_{1} (x)]

is empty. If an empty bin is available, the item x is placed there, and the algorithm proceeds to the next item. If both bins are occupied, x is inserted into bin

B [h_{i} (x)]

where

i \in_{R} {0, 1}

, evicting the item currently in

B [h_{i} (x)]

. The evicted item is then subjected to the same insertion process. This cycle continues until either an item is successfully placed without further evictions or a predetermined iteration threshold (indicating a maximum number of evictions) is reached. In cases where the threshold is surpassed, the last evicted item is stored in the stash. This method effectively balances space efficiency with quick item retrieval and insertion operations.

2.4. Secret Sharing

A secret sharing scheme [41] allows a secret s to be divided into several pieces held by different individuals. In this framework, a certain coalition of holders can recover the original secret value s, while any smaller coalition learns nothing about s. In our discussion, we focus on

(n, n)

secret sharing schemes, where all shares are both necessary and sufficient to reconstruct the secret. We utilize sharing in both

Z_{q}

(arithmetic sharing) and

Z_{2}

(boolean sharing) to implement it. Arithmetic sharing involves dividing the secret in a manner that relies on modular arithmetic over a finite field, while boolean sharing operates over binary representations.

Linear shares offer significant benefits by supporting parties to perform operations on the addition of two shared values in a non-interactive way. We take full advantage of the linearity of the secret sharing scheme in this work. We suppose there are two parties named

S

and

R

in a secret sharing scheme. The linearity means that, given arithmetic sharing

{[[x]]}_{P}

of x and

{[[y]]}_{P}

of y where

P \in {S, R}

,

z = x + y

is evaluated by

S

and

R

adding the shares they already hold. That is,

S

computes its output share as

{[[z]]}_{S} = {[[x]]}_{S} + {[[y]]}_{S}

, and

R

correspondingly computes its output share as

{[[z]]}_{R} = {[[x]]}_{R} + {[[y]]}_{R}

. As a consequence, two parties indeed share the secret value z without any additional interaction. This scheme implies the appropriate homomorphic property when the method of sharing is simple additive sharing over a group, such as

Z_{q}

. Apparently, this kind of linear operation could be executed correctly by several times if the group

Z_{q}

is sufficiently large. We would call the property linearity instead of additive homomorphism in a later section.

2.5. Security Definition

Goldreich [42] proposed the formal security definition of secure multiparty protocols by comparing two output distributions, which come from an ideal world and a real world, respectively. In this paper, we focus on the semi-honest model, and the formal security definition is proposed in the following.

We let

f = (f_{1}, f_{2})

denote a deterministic two-party functionality and

π

a two-party protocol for computing f. Given the security parameter

κ

, the input

{x}

from

P_{1}

and

{y}

from

P_{2}

, for

i \in {1, 2}

, the view of

P_{i}

during the execution of

π

is denoted as

{view}_{i}^{π} (x, y, κ) = (w, r_{i}, m_{i}^{1}, \dots, m_{i}^{t})

, where

w \in {x, y}

,

r_{i}

represents the randomness used by

P_{i}

, and

m_{i}^{j}

means the jth message received by

P_{i}

; the output of

P_{i}

is denoted by

{output}_{i}^{π} (x, y, κ)

, and the joint output of the two parties is

{output}^{π} (x, y, κ) = ({output}_{1}^{π} (x, y, κ), {output}_{2}^{π} (x, y, κ))

.

Definition 1.

We say that π securely computes f in semi-honest model if

The correctness holds:

${output}^{π} (x, y, κ) \overset{c}{\equiv} {f (x, y)}_{x, y, κ} .$
There exist probabilistic polynomial-time simulator ${Sim}_{1}$ and ${Sim}_{2}$ , such that

$\begin{array}{l} {{Sim}_{1} (1^{κ}, x, f_{1} (x, y))}_{x, y, κ} \overset{c}{\equiv} {{view}_{1}^{π} (x, y, κ)}_{x, y, κ}, \\ {{Sim}_{2} (1^{κ}, y, f_{2} (x, y))}_{x, y, κ} \overset{c}{\equiv} {{view}_{2}^{π} (x, y, κ)}_{x, y, κ} . \end{array}$

The construction of simulators

{Sim}_{1}

and

{Sim}_{2}

is crucial for demonstrating the security of the protocol under the chosen corruptions. For

i \in {1, 2}

,

{Sim}_{i}

represents the simulator for participant

P_{i}

that has been corrupted by an adversary. The simulator constructs a simulated view based on the inputs and outputs of the ideal functionality. Specifically, given input x and output

f_{1}

from functionality

f = (f_{1}, f_{2})

, simulator

{Sim}_{1}

needs to construct view

{{Sim}_{1} (1^{κ}, x, f_{1} (x, y))}

. The same is the case with simulator

{Sim}_{2}

and view

{{Sim}_{2} (1^{κ}, y, f_{2} (x, y))}

. By ensuring that the outputs of the simulators are indistinguishable from the actual views of the corrupt parties, we can argue that the security conditions of the protocol hold.

3. Overview of Our Construction

In this section, together with briefly introducing the main limitations of the current study, we offer two functionalities

F_{PEQT}^{ss, q}

and

F_{PMT}^{ss, q}

which compute a secret-shared private equality test

{}^{ss}{PEQT}

and a secret-shared private membership test

{}^{ss}{PMT}

, respectively. In addition, we intuitively present our constructions

{}^{ss}{PEQT}

and

{}^{ss}{PMT}

.

3.1. One-out-of-n Correlated OT

We first review COT from another point of view and then generalize the definition of COT by extending from

{OT}_{2}^{1}

to

{OT}_{n}^{1}

to construct other high-level building blocks for our final protocol. We suppose Alice and Bob, respectively, hold bit a and bit b. Alice chooses

r \in_{R} Z_{2}

first and then inputs messages

(m_{0}, m_{1})

under the correlation function

f_{Δ} (\cdot)

, where

m_{a} = - r + 1

and

m_{1 - a} = - r

. Bob inputs b as his choice bit and receives

m_{b}

from COT. Notably, it holds that

m_{1 - a} = - r = r

when

r \in Z_{2}

.

An important observation is that, by setting

m_{1 - a} = f_{Δ} (m_{a}) = {(m_{a})}^{- 1} \oplus 1

in COT, Alice and Bob secretly share the result of

a \oplus b = m_{a} \oplus m_{b}

. That is to say, we suppose

c = a \oplus b

; Alice always holds

{[[c]]}_{0} = r

and Bob learns

{[[c]]}_{1} = x_{b}

from COT defined above. It implies that Alice and Bob cooperatively finish a two-out-two ⊕-secret sharing on c. Essentially, the value of c indicates whether the two bits a and b are identical or not.

We let

{COT}_{n}^{1}

denote one-out-of-n Correlated OT. Alice holds

a \in Z_{2^{l}}

and Bob holds

b \in Z_{2^{l}}

. First of all, Alice now chooses

r \in_{R} Z_{p}

(instead of

r \in_{R} Z_{2}

), and then prepares

n = 2^{l}

input messages

(m_{0}, \dots, m_{n - 1})

under a correlation function

f_{Δ, a} (\cdot)

in the following way:

For $i = a$ , $m_{i} = - r + 1$ ;
For $i \in [n]$ and $i \neq a$ , $m_{i} = f_{Δ, a} (\cdot) = - r$ .

Finally, Bob inputs his b as the chosen string in

{OT}_{n}^{1}

and receives

m_{b}

as output. The objective of the correlation function

f_{Δ, a} (\cdot)

above is that, apart from the ath input

x_{a}

, each input message generated by the sender (i.e., Alice) in

{OT}_{n}^{1}

is always one same value satisfying

f_{Δ, i} (\cdot)

, which is similar to original COT protocol.

Notably, different correlation functions could be defined to restrict the sender’s inputs in

{COT}_{n}^{1}

, though we assign a special but sufficient correlation function used in our

{}^{ss}{PEQT}

construction. When the context is unambiguous, we abuse the notation and use COT to represent all variants of

f_{Δ, i} (\cdot)

in

{OT}_{2}^{1}

and

{OT}_{n}^{1}

. In Section 4, we introduce another tailor-made COT construction to obtain better performance, which is not omitted here to keep the content more consistent and coherent.

3.2. Secret-Shared Private Equality Test

We start by introducing the Private Equality Test (PEQT) functionality named

F_{PEQT}

(see Figure 2), which is used to determine whether two strings x and y are equal. In a specific PEQT protocol, a sender

S

with x as input interacts with a receiver

R

holding y as input. At the end of the protocol,

R

only learns bit z indicating the comparison result but nothing else, and

S

has no outputs. As shown in Figure 2,

R

learns

z = 1

if and only if

x = y

; otherwise,

z = 0

.

In Figure 3, we introduce the ideal definition of a secret-shared PEQT functionality

F_{PEQT}^{ss, q}

, which enables two parties

S

and

R

to learn whether two strings are equal in a secret-sharing way. Notably, Couteau [43] introduced a similar secure equality test protocol that computes shares over

Z_{2}

of the equality predicate. Here, we formally define a general secret-shared PEQT functionality where the outputs are shared over

Z_{q}

. In detail, we define a public parameter q in

F_{PEQT}^{ss, q}

to indicate the output type of

{[[z]]}_{P}

, i.e., Boolean sharing (when

q = 2

) or Arithmetic sharing. That is, the output z in

F_{PEQT}

is shared using additive secret sharing scheme to produce

{[[z]]}_{S} = r

and

{[[z]]}_{R} = r^{'}

. Intuitively, the only difference between

F_{PEQT}

and

F_{PEQT}^{ss, q}

lies in that the latter provides two correlated outputs to both

S

and

R

. It may appear to be indirect and unnecessary. However, this distinction becomes more conspicuous in the following.

In

F_{PEQT}^{ss, q}

, it either satisfies

{[[z]]}_{S} + {[[z]]}_{R} = 0

or

{[[z]]}_{S} + {[[z]]}_{R} = 1

because of the fact that

r^{'} \in {r^{- 1}, - r}

. In prior classical works [4,28,31,32], it only holds

{[[z]]}_{S} = {[[z]]}_{R} = r

when

x = y

, or provides

S

and

R

two different random shares

{[[z]]}_{S}, {[[z]]}_{R} \in_{R} Z_{2^{l}}

when

x \neq y

. In fact, by introducing Garbled Circuit (GC) or Homomorphic Encryption, it is an easy task to adjust these prior works to securely computing

F_{PEQT}^{ss, q}

, such that

z = {0, 1}

and

{[[z]]}_{S}, {[[z]]}_{R}, z \in Z_{q}

. However, without resorting to additional public-key operations, it is not trivial to design the protocol satisfying that the output

z = {0, 1}

in PEQT is additive secret shared by

S

and

R

. Therefore, we offer the formal functionality

F_{PEQT}^{ss, q}

in this paper, and we introduce a simple construction based on COT in the following.

We leverage the capability of COT to provide an efficient solution for this construction. That is,

F_{PEQT}^{ss, q}

could be instantiated directly by generalized COT defined above when x and y are short strings. We suppose

x, y \in Z_{2^{l}}

,

S

generates

r \in_{R} Z_{q}

and prepares

n = 2^{l}

messages

(m_{0}, \dots, m_{n - 1})

satisfying the following paprameters: For

i = x

, it holds

m_{i} = r^{- 1}

; For

i \in [n]

and

i \neq x

, it holds

m_{i} = - r

. Then,

R

acts as receiver with chosen string y in

{OT}_{n}^{1}

, and

S

acts as sender with input messages

(m_{0}, \dots, m_{n - 1})

prepared above. After

{OT}_{n}^{1}

,

R

obtains

m_{y}

and takes

r^{'} = m_{y}

as a share. It is trivial to verify that general correlated

{OT}_{n}^{1}

cooperatively executed by

S

and

R

securely compute

F_{PEQT}^{ss, q}

such that

{[[z]]}_{S} = r

and

{[[z]]}_{R} = r^{'}

.

This is an illustrative example that is instructive to build the intuition of

F_{PEQT}^{ss, q}

. When l is large, say,

l > 16

, it is almost impractical for

S

to prepare

n = 2^{l}

messages in

{OT}_{n}^{1}

protocol. To solve this problem, we offer a general construction to compute

F_{PEQT}^{ss, q}

to support any length of x and y and the actual decomposition of the protocol

{}^{ss}{PEQT}

in Section 4.

3.3. Secret-Shared Private Membership Test

In Figure 4, the ideal Private Membership Test (PMT) functionality

F_{PMT}

is as follows. The sender

S

has a set X; meanwhile, the receiver

R

has an element y. After

F_{PMT}

,

R

receives bit z indicating whether

y \in X

or not and

S

receives nothing.

Following the study relationship between functionalities

F_{PEQT}

and

F_{PEQT}^{ss, q}

in Figure 2 and Figure 3, we offer the ideal definition of a secret-shared PMT functionality

F_{PMT}^{ss, q}

in Figure 5. The research interest of designing protocols that securely compute functionality

F_{PMT}

is similar to

F_{PEQT}

and

F_{PEQT}^{ss, q}

, and now we need the output z in

F_{PMT}

to be secretly shared between

S

and

R

. The idea of [4] is that if

y \in X

, then both

S

and

R

learn the same random value. Otherwise, they learn different random values. Instead of sharing

z = {0, 1}

, it is

z = {0, Δ}

that is shared between

S

and

R

in [4], which is not what we want, though it could transform

Δ

to a bit by an additional conversion circuit.

Given functionality

F_{PEQT}^{ss, q}

, it is trivial to construct a secret-shared PMT (

{}^{ss}{PMT}

) protocol (see Figure 6) that securely realizes functionality

F_{PMT}^{ss, q}

. We make full use of the linearity property of the secret sharing scheme defined in

F_{PEQT}^{ss, q}

. Because the outputs of

F_{PEQT}^{ss, q}

are additively secret-shared between

S

and

R

, the outputs for

F_{PMT}^{ss, q}

can be obtained by summing up these shares without interaction. The correctness of

{}^{ss}{PMT}

is obvious:

{[[r]]}_{S} + {[[r]]}_{R} = 1

if and only if there exactly exists an element

x_{j}

satisfying

y = x_{j}

so that

r_{j} + r_{j}^{'} = 1

, which implies the fact that

y \in X

. Otherwise,

F_{PEQT}^{ss, q}

offers zero-shares to

S

and

R

in each invocation, and after all invocations and accumulations, it still holds that

{[[r]]}_{S} + {[[r]]}_{R} = 0

.

The security of the protocol in Figure 6 boils down to the protocol that computes functionality

F_{PEQT}^{ss, q}

. In the semi-honest model, the security of

{}^{ss}{PMT}

is easy to understand: the only new information obtained by

S

and

R

is secret shares, which leak no information about the parties’ input when invoking ideal functionality

F_{PEQT}^{ss, q}

.

4. Secrect-Shared Private Equality Test ( ${}^{ss}{PEQT}$ )

In this section, we offer concrete construction

{}^{ss}{PEQT}

that securely computes

F_{PEQT}^{ss, q}

in a more general case.

F_{PEQT}^{ss, q}

can not be instantiated directly by generalized COT when inputs are long strings. This can be circumvented by

F_{OT}^{n}

which transforms the comparison between two long input strings into two shorter ones.

The core idea of the

{}^{ss}{PEQT}

is to reduce the comparison procedure between two long input strings to two smaller arithmetic shares, which are obtained by sharing the Hamming Distance of the two long binary strings. Notation

d_{H} (x, y)

denotes the Hamming Distance between two binary strings (vectors) x and y, here

x, y \in {0, 1}^{l}

and

l = 2^{k}

. Apparently, we have

0 \leq d_{H} (x, y) \leq l

and the binary representation of

d_{H} (x, y)

would be k-bit long at most. So are the arithmetic sharing shares of

d_{H} (x, y)

, denoted by

{[[d_{H} (x, y)]]}_{1}

and

{[[d_{H} (x, y)]]}_{2}

. That is, instead of focusing on two long inputs x and y themselves, we now just compare two shares that are much shorter than x and y. In addition, the induction procedure between two shares could be invoked again until the length of shares of

d_{H} (x^{'}, y^{'})

is short enough. In fact, if we invoke such a reduction procedure just once, we can receive a log-order of magnitude.

Theorem 1.

The

{}^{ss}{PEQT}

protocol in Figure 7 securely computes functionality

F_{PEQT}^{ss, q}

(Figure 3) in semi-honest setting, as described in Definition 1, given functionality

F_{OT}^{n}

defined in Figure 1.

Proof.

We begin by proving correctness. Notably,

l^{'}

is the length of p in binary form. That is to say, it requires at most

l^{'}

binary bits to represent

x_{OUT}

or

y_{OUT}

. In Step 6, it is impossible that

x_{OUT}, y_{OUT} \in Z_{3}

, though it also holds that

⌈ \log 3 ⌉ = 2

. If so, we have

p_{n o w} = 3

and

l_{n o w}^{'} = ⌈ \log 3 ⌉ = 2

. We notice that

p_{n o w} : = l_{o l d}^{'} + 1

, and we obtain

l_{o l d}^{'} = 2

, which is the time to end the loop and execute Step 6 since

l_{o l d}^{'}

is not greater than two. In addition, due to the fact that

p_{o l d} > p_{n o w} > 2

, according to the constraints

l_{o l d}^{'} = 2 = ⌈ \log p_{o l d} ⌉

and

p_{o l d} > p_{n o w} = 3

, we draw a conclusion that there always exists

p_{o l d} = 4

. Therefore, it always holds that

x_{OUT}, y_{OUT} \in Z_{4}

in Step 6 when

l > 2

.

In summary, we have

{output}^{π} (x, y, κ) \overset{c}{\equiv} {f (x, y)}_{x, y, κ}

. It concludes the correctness of

{}^{ss}{PEQT}

.

In this context, the protocol

{}^{ss}{PEQT}

is denoted as

π

for brevity in the proof. According to the specifications of

{}^{ss}{PEQT}

, we define the functions as follows:

f_{1} (x, y) = r

and

f_{2} (x, y) = r^{'}

, where r and

r^{'}

represent the outputs computed by the respective parties based on their inputs x and y.

We now construct two simulators,

{Sim}_{S}

and

{Sim}_{R}

, designed for simulating the corrupt parties

S

and

R

, respectively. The goal of these simulators is to produce a transcript and a view that is computationally indistinguishable from those generated during a real execution of the protocol. More formally, we express this indistinguishability as follows:

\begin{array}{l} {{Sim}_{S} (1^{κ}, x, f_{1} (x, y))}_{x, y, κ} \overset{c}{\equiv} {{view}_{S}^{π} (x, y, κ)}_{x, y, κ}, \\ {{Sim}_{R} (1^{κ}, y, f_{2} (x, y))}_{x, y, κ} \overset{c}{\equiv} {{view}_{R}^{π} (x, y, κ)}_{x, y, κ} . \end{array}

The protocol

{}^{ss}{PEQT}

is denoted as symbol

π

for briefness in proof, and according to

{}^{ss}{PEQT}

, we have

f_{1} (x, y) = r

and

f_{2} (x, y) = r^{'}

.

Corrupt

S

. In protocol

π

,

S

is always working as a sender of

F_{OT}^{n}

in Step 2 and Step 7, where

S

prepares

r_{i} \in_{R} Z_{p}

for

i \in [l]

and

r \in_{R} Z_{q}

, respectively. This role is crucial as it ensures that

S

’s inputs are both random and independent, which is essential for the security of the protocol. Except for

S

’s input, all messages obtained by

S

essentially are generated locally, just reminding that

S

gets the output

f_{1} (x, y) = r

. That is to say, the main transcript in

{view}_{S}^{π} (x, y, κ)

originates from random messages set

R = {r_{i} | i \in [l], r_{i} \in_{R} Z_{p}}

. Therefore, it is trivial for

{Sim}_{S}

to generate a simulation of

{view}_{S}^{π} (x, y, κ)

.

Specifically,

{Sim}_{S}

simulates a simulation by choosing

\hat{R} = {{\hat{r}}_{i} | i \in [l], {\hat{r}}_{i} \in_{R} Z_{p}}

and preparing

M_{1} = {(m_{0}^{i}, m_{1}^{i}) | i \in [l], m_{0}^{i} = {\hat{r}}_{i} + x_{i}, m_{1}^{i} = {\hat{r}}_{i} + (x_{i} \oplus 1)}

in Step 2. Then,

{Sim}_{S}

could compute

{\hat{x}}_{OUT} = \sum_{i = 1}^{l} {\hat{r}}_{i} \mod p

. It ensures that the output maintains the necessary randomness and structure consistent with the actual execution of the protocol

π

. Here, we omit Step 6 and skip to Step 7 since simulator

{Sim}_{S}

can repeat the reduction process by following the operations described above. Then,

{Sim}_{S}

prepares messages set

M_{2} = {{\hat{m}}_{i} | i = 0, 1, 2, 3}

, where

{\hat{m}}_{{\hat{x}}_{OUT}} = {(f_{1} (x, y))}^{- 1} = r^{- 1}

and

{\hat{m}}_{j} = - f_{1} (x, y)

for

j = i + {\hat{x}}_{OUT} \mod 4

.

Finally, we let

{M_{1}, M_{2}}

be the output of

{Sim}_{S} (1^{κ}, x, f_{1} (x, y))

. This view guarantees that the simulator’s output effectively simulates the real execution. Therefore, we claim that the view generated by

{Sim}_{S}

and the view of corrupt

S

in a real protocol are computationally indistinguishable.

Corrupt

R

. To construct simulator

{Sim}_{R}

that simulates the view of the corrupt

R

in the real protocol execution, we start by analyzing

R

’s view

{view}_{R}^{π} (x, y, κ)

.

R

acts as a receiver of

F_{OT}^{n}

in Step 2 and Step 7, with input y and message set

{y_{OUT}}

obtained from previous steps. Therefore,

{view}_{R}^{π} (x, y, κ)

consists of

R

’s input y, output

f_{2} (x, y)

, and the random message set

{y_{OUT}}

. Given the ideal functionality

F_{OT}^{n}

, it is straightforward for

{Sim}_{R}

to construct the simulation of view

{view}_{R}^{π} (x, y, κ)

. For simplicity, we omit the repeated reduction process and present the simulation of Step 2 only once.

{Sim}_{R}

simulates the following views: (1) In Step 2, each time invoking the ideal functionality

F_{OT}^{n}

,

{Sim}_{R}

prepares

{\hat{m}}_{y_{i}} = m_{y_{i}}

and

{\hat{m}}_{1 - y_{i}} = s_{i}

where

s_{i} \in_{R} Z_{p}

, and then sends messages

M = {{\hat{m}}_{y_{i}}, {\hat{m}}_{1 - y_{i}}}

to

F_{OT}^{n}

.

{Sim}_{R}

receives a message set

M_{3} = {{\hat{m}}_{y_{i}}}

from

F_{OT}^{n}

first and then computes

{\hat{y}}_{OUT}

. (2) In Step 7, the simulator prepares four messages

M_{4} = {{\hat{m}}_{j} ∣ j = 0, 1, 2, 3}

, where

{\hat{m}}_{{\hat{y}}_{OUT}} = f_{2} (x, y)

and the others are random values. Then,

{Sim}_{R}

appends the output of

F_{OT}^{n}

to its output.

By combining the two steps described above in sequence, we establish that the output of

{Sim}_{R}

is computationally indistinguishable from that of the real execution. This means that we have

{{Sim}_{R} (1^{κ}, y, f_{2} (x, y))}_{x, y, κ} \overset{c}{\equiv} {{view}_{R}^{π} (x, y, κ)}_{x, y, κ} .

This completes the construction of two simulators,

{Sim}_{S}

and

{Sim}_{R}

. In summary, we prove protocol

{}^{ss}{PEQT}

secure under the semi-honest model. □

5. Generic Threshold PSI Protocols

Our tPSI protocols benefit from hashing the item of the input set to bins and executing

{}^{ss}{PMT}

protocol separately on each bin where functionality

F_{PEQT}^{ss, q}

features a useful linearity property. In Cuckoo hashing, the size of the bins

B [1, \dots, b]

depends on the number of hash functions as well as on the stash size. The higher the number of hash functions chosen, the more likely the process is to succeed. Therefore, we choose a stash-less cuckoo hashing used in our construction, where cuckoo hashing parameters guarantee that the stash does not exist with sufficiently low probability.

Based on the relationship between the threshold and the intersection cardinality, we first introduce two types of protocols sequentially, i.e.,

t^{\leq} PSI

and

t^{>} PSI

. Next, we consider lightweight and efficient methods for determining the threshold and the intersection cardinality, which leads us to introduce

t_{l l}^{\leq} PSI

.

5.1. Over-Threshold PSI Protocol: $t^{\leq} PSI$

We first introduce a threshold PSI protocol

t^{\leq} PSI

which outputs the intersection only when

t \leq

{PSI}^{CA}

. We provide over-threshold PSI functionality in Figure 8 and an ideal comparison functionality with three inputs in Figure 9.

The protocol outlined in Figure 10 involves the following steps: (1) both

S

and

R

map their respective set into bins before invoking the

{}^{ss}{PMT}

for each bin; (2)

S

and

R

aggregate the results from all bins and then compare the cardinality and threshold; (3)

R

outputs the intersection based on the comparison. The key insight of our protocol lies in fully utilizing the linearity property of the secret sharing scheme defined in

F_{PEQT}^{ss, q}

. Given that the outputs of

F_{PEQT}^{ss, q}

are additively secret-shared between

S

and

R

, the outputs for

F_{PMT}^{ss, q}

can be obtained by summing these shares without additional interaction. It is crucial to emphasize that

S

cannot straightforwardly disclose the shares from each

{}^{ss}{PMT}

to

R

, as our protocol mandates that

R

should only receive the intersection and proceed with the protocol output if the threshold condition is satisfied. Consequently,

S

must relay these shares to

R

without gaining insight into the comparison outcome. To maintain security, this process necessitates the implementation of a one-time OT.

It is trivial to securely compare the threshold t and intersection cardinality

{PSI}^{CA}

, though

{PSI}^{CA}

is secret-shared between two parties. We provide an ideal comparison functionality with three inputs in Figure 9 to simplify the

t^{\leq} PSI

protocol. In fact, different secure comparison implementations could be realized. For example, we leverage the capability of the garbled circuit to offer a lightweight solution for this task. It is very efficient to deploy a comparison circuit for short inputs, where

S

encodes

c_{0}

and t and

R

encodes

c_{1}

for corresponding input wires. However, we claim that our overall protocol in Figure 10 is not a circuit-based construction because different secure comparison methods could be taken to obtain value z for functionality

F_{≷ (y_{0}, x, y_{1})}

.

Theorem 2.

Protocol

t^{\leq} PSI

in Figure 10 securely computes the functionality in Figure 8 in a semi-honest model, given functionality

F_{OT}^{n}

in Figure 1,

F_{PMT}^{ss, q}

in Figure 5 and

F_{≷ (y_{0}, x, y_{1})}

in Figure 9.

Proof.

We omit the detailed security proof of the

t^{\leq} PSI

in Figure 10 and provide a sketch of the proof.

$S$ is corrupted. The semi-honest

S

receives nothing from the protocol, and the only messages received are random shares from

F_{PMT}^{ss, q}

. Therefore, it is trivial to perfectly construct simulator

{Sim}_{S}

.

$R$ is corrupted. The view of

R

consists of three kinds of messages: (1) outputs from

F_{PMT}^{ss, q}

, which are random values and independent of the inputs; (2) one-bit output z from

F_{≷ (y_{0}, x, y_{1})}

, which can be obtained from the

t^{\leq} PSI

protocol. That is,

z = 1

if

R

outputs intersection ∅, otherwise

z = 0

; (3) the concatenated shares via input message

I^{'}

in the OT phase, which are used to compute the intersection. Given the ideal functionality

F_{OT}^{n}

and the output of the protocol, i.e., intersection

{B_{R} [i]}

, simulator

{Sim}_{S}

needs to simulate the entire index set I.

{Sim}_{S}

computes secret shares at the corresponding positions indicating the element in the output that satisfies

r_{i} + {\hat{r}}_{i} = 1

, while the shares at other positions can be filled with random numbers. If the intersection is ∅, all shares are randomly generated by

{Sim}_{S}

. This selective computation ensures that the simulator accurately mimics the behavior of the real protocol. Hence, scripts can be perfectly simulated. □

5.2. Under-Threshold PSI Protocol: $t^{>} PSI$

The key to constructing the

t^{>} PSI

protocol, or rather its distinction from the

t^{\leq} PSI

protocol, lies in the assessment of the threshold and intersection cardinality. Similar to

t^{\leq} PSI

, we claim that it is trivial to execute

t^{>} PSI

without any extra communication or computation. Therefore, we only offer the detailed

t^{\leq} PSI

protocol corresponding to the functionality and omit the functionality of

t^{>} PSI

and its concrete construction. The only difference lies in Step 5a of Figure 10 where

S

acts as sender with pair-input

{0, I}

. The security proof for this protocol follows a similar approach and is not elaborated upon here.

5.3. Lightweight tPSI with Limited Leakage: $t_{l l}^{\leq} PSI$

Now we consider lightweight and efficient methods for determining the threshold and intersection cardinality. Given that we assume the threshold t is an input of

S

and is unknown to

R

, the cardinality testing approach does not necessarily depend on the secure comparison functionality in Figure 9. Here, we introduce

t_{l l}^{\leq} PSI

in Figure 11, a lightweight tPSI protocol designed with limited leakage. We begin by defining function

L

that describes the leakage in threshold PSI. Without loss of generality, we define the leakage function on the

S

side as

L

= | X \cap Y | - t

.

Notably, if

S

sends

t^{'} = c_{0} - t

to

R

that computes locally

δ = c_{0} - t + c_{1}

over the ring

Z_{q}

in Step 2 of Figure 11, the protocol does not correctly compare the threshold t with the cardinality of the intersection because

δ

must be non-negative in

Z_{q}

. This limitation reminds us of the need to prevent the wraparound when working over the integers with a certain modulus. To address this issue, a straightforward solution is to perform the calculation over another ring

Z_{q^{'}}

that can represent a negative value in some manner.

We refer to the radix complement that is used to execute arithmetic operations of addition and subtraction within a binary number system. This allows us to expand the representational meanings of elements in

Z_{q}

. We first represent all elements in

Z_{q}

as binary strings and then use the concept of complement to classify these binary strings into positive and negative numbers. In this representation, the most significant bit represents the sign of a number. Given the bit length of q is

λ

, we use mapping

Q (x) = \{\begin{matrix} x & if & 0 \leq x < 2^{λ - 1} - 1 \\ x - 2^{λ} & if & 2^{λ - 1} \leq x < 2^{λ} - 1 \end{matrix}

to recover the original value of the radix complement.

To prevent unexpected overflow when the cardinality of any two sets is large, we assume that parameter

λ

could satisfy constraint

2^{λ - 1} > n

instead of

2^{λ} > n

. In this approach,

S

computes

t^{'}

over the ring

Z_{2^{λ}}

, where the range of original numbers spans from

- 2^{λ - 1}

to

2^{λ - 1} - 1

.

R

then obtains

δ

in the complement representation. We let

MSB (x)

denote the most significant bit of x. In Step 2 of Figure 11,

R

can simply determine

σ = MSB (δ)

. Additionally,

R

computes

Q (δ)

to learn the original value of

δ

, which essentially implies

L

defined above.

Theorem 3.

The

t_{l l}^{\leq} PSI

protocol in Figure 11 securely computes the functionality in Figure 8 in a semi-honest model, given leak function

L

= | X \cap Y | - t

, functionality

F_{OT}^{n}

in Figure 1, and

F_{PMT}^{ss, q}

in Figure 5.

We omit the detailed security proof of Theorem 3. We start by providing a sketch of the proof when

R

is corrupted. Similar to the proof in Therorem 2, it is trivial to perfectly construct simulator

{Sim}_{S}

. For simulator

{Sim}_{R}

, the most critical aspect is to simulate the situation where

R

outputs intersection ∅. Specifically, given leak

L

, simulator

{Sim}_{S}

calculates

t = L - c_{1}

and sends it to

R

. Then,

{Sim}_{S}

simply follows the previous simulation process.

{Sim}_{S}

computes secret shares at the corresponding positions indicating

r_{i} + {\hat{r}}_{i} = 1

, while the shares at other positions are filled with random numbers. If the intersection is ∅, all shares are randomly generated.

6. Complexity Analysis and Comparisons

We provide comparisons to prior classical protocols in Table 1. It should be emphasized that our comparisons are confined to the literature providing two-party computation protocols. The construction of multi-party threshold computation protocols is explicitly excluded from this analysis. This exclusion is based on the observation that threshold definitions in multiparty scenarios exhibit significantly greater complexity and diversity. For instance, in multi-party threshold PSI in [23], parties learn the intersection only if the difference between their respective sets and the intersection does not exceed t, or if the discrepancy between the union of all their sets and the intersection is within t.

7. Implementation and Performance Evaluation

7.1. Generic Precomputing Oblivious Transfer

To accelerate computation time during the online phase in code implementation, as illustrated in Figure 12, we provide a generic precomputing OT that securely computes functionality

F_{OT}^{n}

. The idea of precomputing OT could be traced back to Beaver’s work [44] where a precomputing one-out-of-two OT construction is given. We apply this idea to a more general case.

The security of the precomputing method can be reduced to the security of the offline-phase OT, and we omit the proof details here. It is worth noting that

S

learns additional information

δ

, but it only reveals the difference in

R

’s choices between the two phases without disclosing the inputs of either phase, thus maintaining the security of the construction. Furthermore, through the precomputing technique, both parties in the online phase perform simple operations such as XOR or shift, significantly enhancing the lightweight nature of the entire threshold PSI protocol.

7.2. Experimental Performance

In this section, we test the performance of

{}^{ss}{PEQT}

and tPSI. The experiments were performed on Ubuntu version 20.04 equipped with 3.20 GHz AMD R7-6800H CPU and 16 GB RAM. Our tests refer to the implementation on Github: https://github.com/osu-crypto/libOTe, (accessed on 14 December 2024). We simulate the protocols on a single machine.

In Figure 13, the three subplots exhibit a consistent increase in processing time as the number of the length increases. For the subplot with an element length of

l = 128

, the sender has mostly stable data around 10 ms, while the receiver averages around 14 ms with slightly more fluctuation. The total reflects the combined time with fluctuations between 20 ms and 30 ms. For

l = 256

, the total averages about 45 ms, maintaining a stable trend but with higher values. For

l = 512

, the total average is 70 ms, showing notable fluctuations and higher values. Comparing across subplots, as the element length increases, so do the average values for sender, receiver, and total times, indicating that longer elements require more processing time. Smaller element lengths show less volatility and more stable trends, while longer elements exhibit more pronounced fluctuations. Overall, processing times for both sender and receiver, as well as the total time, are proportional to the element length, suggesting performance decreases with longer elements. This highlights the importance of carefully considering performance implications when handling longer elements.

Table 2 illustrates a pronounced rise in execution time as the length of elements escalates, accompanied by a parallel increase in communication volume. For a length of 128, the average execution time is approximately 300 ms. With a length of 256, the average time is around 600 ms, ranging from about 580 ms to 770 ms. When the length extends to 512, the average execution time is roughly 1300 ms. These findings indicate that as the length increases, there is a simultaneous increase in both the average execution time and its variability, providing crucial insights for performance optimization and prediction.

In Figure 14, we present the building blocks and time comparisons of similar protocols. For polynomial interpolation, we referenced open-source code available at https://github.com/openfheorg/openfhe-development, (accessed on 14 December 2024). For the HE algorithm, we utilized an open-source framework hosted on the server, which can be found at https://gitee.com/tianpu/Fast-Polynomial-Interpolation-and-Evaluation, (accessed on 14 December 2024). It is important to note that the timing for HE is provided for reference only, as local testing is likely to take more time. In Figure 14a, we set the set sizes to security parameter

κ = 256

and 512. Here, Poly-256 indicates the time required for interpolating the plaintext polynomial on the set

X = {x_{1}, \dots, x_{n}}

, where

n = 256

. The same applies to Poly-512, AHE-256, AHE-512, OT-256, and OT-512. Since the OT extension protocol requires a foundational instance of at least 256, the advantages of the OT protocol are not particularly pronounced in this scenario without assistance by the precomputing technique.

However, our lightweight protocols demonstrate increasingly significant time advantages as the size of the set grows, as shown in Figure 14b. The protocols presented in [19,20] do not provide open-sourced code, and the constructions in [21,24] focus on the theoretical optimization of communication complexity but lack a testing procedure. Therefore, we adopt the best conclusions from both types of schemes and provide a comparative analysis of the simulated times. The blue and black curves show a rapid increase, indicating that HE-based protocol becomes less efficient as the input grows. In contrast, the red curve from our construction shows minimal growth, suggesting greater efficiency. Therefore, the protocol derived from lightweight cryptographic primitives is more favorable for scenarios that prioritize speed and resource utilization.

8. Conclusions and Future Work

We propose generic threshold Private Set Intersection (tPSI) construction via Oblivious Transfer (OT) without resorting to the relatively expensive public-key operations. We believe

{}^{ss}{PMT}

is an important building block in tPSI applications. In future work, we will consider the problem of lightweight design in multi-party tPSI scenarios. Given the significant communication burden on participants posed by OT-extension-based protocols, our subsequent research will focus primarily on balancing the computational and communication loads of the protocol, as well as designing multi-party protocols under different security models.

Author Contributions

Conceptualization, S.Z.; methodology, S.Z. and X.S.; software, Y.H.; validation, S.Z., C.Z. and X.S.; formal analysis, C.Z.; investigation, S.Z.; resources, Y.H.; data curation, Y.H. and Q.X; writing—original draft preparation, S.Z. and Y.H.; writing—review and editing, C.Z. and X.S.; supervision, C.Z. and Q.X.; project administration, S.Z.; funding acquisition, S.Z., C.Z. and Q.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (62472252, 62172258), TaiShan Scholars Program (tsqn202211280), Shandong Provincial Natural Science Foundation (ZR2024QF131, ZR2023LZH014, ZR2022ZD01), Department of Science and Technology of Shandong Province (SYS202201), and Quan Cheng Laboratory (QCLZD202302).

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982), Chicago, IL, USA, 3–5 November 1982; pp. 160–164. [Google Scholar]
Zhao, C.; Zhao, S.; Zhao, M.; Chen, Z.; Gao, C.Z.; Li, H.; Tan, Y. Secure multi-party computation: Theory, practice and applications. Inf. Sci. 2019, 476, 357–372. [Google Scholar] [CrossRef]
Kolesnikov, V.; Kumaresan, R.; Rosulek, M.; Trieu, N. Efficient batched oblivious PRF with applications to private set intersection. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 818–829. [Google Scholar]
Pinkas, B.; Schneider, T.; Tkachenko, O.; Yanai, A. Efficient circuit-based psi with linear communication. In Advances in Cryptology–EUROCRYPT 2019: 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Darmstadt, Germany 19–23 May 2019; Springer: Berlin, Germany, 2019; pp. 122–153. [Google Scholar]
Pinkas, B.; Rosulek, M.; Trieu, N.; Yanai, A. SpOT-light: Lightweight private set intersection from sparse OT extension. In Proceedings of the Advances in Cryptology–CRYPTO 2019: 39th Annual International Cryptology Conference, Santa Barbara, CA, USA, 18–22 August 2019; Proceedings, Part III 39. Springer: Berlin, Germany, 2019; pp. 401–431. [Google Scholar]
Kolesnikov, V.; Rosulek, M.; Trieu, N.; Wang, X. Scalable private set union from symmetric-key techniques. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security; Springer: Cham, Switzerland, 2019; pp. 636–666. [Google Scholar]
Zhang, C.; Chen, Y.; Liu, W.; Zhang, M.; Lin, D. Linear private set union from {Multi-Query} reverse private membership test. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 337–354. [Google Scholar]
Le, P.H.; Ranellucci, S.; Gordon, S.D. Two-party private set intersection with an untrusted third party. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 2403–2420. [Google Scholar]
Kolesnikov, V.; Matania, N.; Pinkas, B.; Rosulek, M.; Trieu, N. Practical multi-party private set intersection from symmetric-key techniques. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1257–1272. [Google Scholar]
Wei, L.; Liu, J.; Zhang, L.; Wang, Q.; Zhang, W.; Qian, X. Efficient multi-party private set intersection protocols for large participants and small sets. Comput. Stand. Interfaces 2024, 87, 103764. [Google Scholar] [CrossRef]
Chongchitmate, W.; Ishai, Y.; Lu, S.; Ostrovsky, R. Psi from ring-ole. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Los Angeles, CA, USA, 7–11 November 2022; pp. 531–545. [Google Scholar]
Lv, S.; Wei, Y.; Jia, J.; Li, X.; Li, T.; Liu, Z.; Chen, X.; Guo, L. New Approach for Efficient Malicious Multiparty Private Set Intersection. Inf. Sci. 2024, 678, 120995. [Google Scholar] [CrossRef]
Ion, M.; Kreuter, B.; Nergiz, E.; Patel, S.; Saxena, S.; Seth, K.; Shanahan, D.; Yung, M. Private intersection-sum protocol with applications to attributing aggregate ad conversions. Cryptology ePrint Archive 2017, 2017, 738. [Google Scholar]
Duong, T.; Phan, D.H.; Trieu, N. Catalic: Delegated PSI cardinality with applications to contact tracing. In Proceedings of the Advances in Cryptology–ASIACRYPT 2020: 26th International Conference on the Theory and Application of Cryptology and Information Security, Daejeon, Republic of Korea, 7–11 December 2020; Proceedings, Part III 26; Springer: Cham, Switzerland, 2020; pp. 870–899. [Google Scholar]
Kong, X.; Chen, L.; Zhu, Y.; Mu, Y. Efficient Public-key Searchable Encryption Scheme from PSI with Scalable Proxy Servers. IEEE Trans. Serv. Comput. 2024, 1–14. [Google Scholar] [CrossRef]
Yang, Y.; Chen, X.; Pan, Y.; Shen, J.; Cao, Z.; Dong, X.; Li, X.; Sun, J.; Yang, G.; Deng, R. OpenVFL: A Vertical Federated Learning Framework with Stronger Privacy-Preserving. IEEE Trans. Inf. Forensics Secur. 2024, 19, 9670–9681. [Google Scholar] [CrossRef]
Yang, Y.; Yang, Y.; Chen, X.; Dong, X.; Cao, Z.; Shen, J. DMPSI: Efficient Scalable Delegated Multiparty PSI and PSI-CA with Oblivious PRF. IEEE Trans. Serv. Comput. 2024, 17, 497–508. [Google Scholar] [CrossRef]
Bradley, T.; Faber, S.; Tsudik, G. Bounded size-hiding private set intersection. In Proceedings of the International Conference on Security and Cryptography for Networks; Springer: Berlin, Germany, 2016; pp. 449–467. [Google Scholar]
Zhao, Y.; Chow, S.S. Are you the one to share? secret transfer with access structure. Proc. Priv. Enhanc. Technol. 2017, 2017, 149–169. [Google Scholar] [CrossRef]
Zhao, Y.; Chow, S.S. Can you find the one for me? In Proceedings of the 2018 Workshop on Privacy in the Electronic Society, Toronto, ON, Canada, 15–19 October 2018; pp. 54–65. [Google Scholar]
Ghosh, S.; Simkin, M. The communication complexity of threshold private set intersection. In Proceedings of the Annual International Cryptology Conference, Santa Barbara, CA, USA, 18–22 August 2019; pp. 3–29. [Google Scholar]
Branco, P.; Döttling, N.; Pu, S. Multiparty cardinality testing for threshold private intersection. In Proceedings of the IACR International Conference on Public-Key Cryptography, Vitual Event, 10–13 May 2021; pp. 32–60. [Google Scholar]
Badrinarayanan, S.; Miao, P.; Rindal, P. Multi-Party Threshold Private Set Intersection with Sublinear Communication. IACR Cryptol. ePrint Arch. 2020, 2020, 600. [Google Scholar]
Ghosh, S.; Simkin, M. Threshold private set intersection with better communication complexity. In Proceedings of the IACR International Conference on Public-Key Cryptography; Springer: Cham, Switzerland, 2023; pp. 251–272. [Google Scholar]
Liu, F.H.; Zhang, E.; Qin, L. Efficient Multiparty Probabilistic Threshold Private Set Intersection. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Copenhagen, Denmark, 26–30 November 2023; pp. 2188–2201. [Google Scholar]
Ma, L.; Wang, H.; Niu, Z.; Li, Z.; Wu, L.; Wei, X.; Su, Y. Over-threshold multi-party private set operation protocols for lightweight clients. Comput. Stand. Interfaces 2024, 88, 103781. [Google Scholar] [CrossRef]
Mohanty, T.; Srivastava, V.; Debnath, S.K.; Das, A.K.; Sikdar, B. Quantum secure threshold private set intersection protocol for IoT-Enabled privacy preserving ride-sharing application. IEEE Internet Things J. 2023, 11, 1761–1772. [Google Scholar] [CrossRef]
Pinkas, B.; Schneider, T.; Zohner, M. Faster private set intersection based on {OT} extension. In Proceedings of the 23rd {USENIX} Security Symposium ({USENIX} Security 14), San Diego, CA, USA, 20–22 August 2014; pp. 797–812. [Google Scholar]
Chase, M.; Miao, P. Private set intersection in the internet setting from lightweight oblivious PRF. In Advances in Cryptology–CRYPTO 2020: 40th Annual International Cryptology Conference, CRYPTO 2020, Santa Barbara, CA, USA, 17–21 August 2020; Proceedings, Part III 40; Springer: Cham, Switerland, 2020; pp. 34–63. [Google Scholar]
Song, X.; Gai, M.; Zhao, S.; Jiang, H. Privacy-Preserving Statistics Protocol for Set-Based Computation (in Chinese). J. Comput. Res. Dev. 2020, 57, 2221. [Google Scholar]
Ciampi, M.; Orlandi, C. Combining private set-intersection with secure two-party computation. In Proceedings of the International Conference on Security and Cryptography for Networks; Springer: Cham, Switerland, 2018; pp. 464–482. [Google Scholar]
Karakoç, F.; Küpçü, A. Linear Complexity Private Set Intersection for Secure Two-Party Protocols. Cryptology ePrint Archive, Report 2020/864. In International Conference on Cryptology and Network Security; Springer: Cham, Switzerland, 2020. [Google Scholar]
Miao, P.; Patel, S.; Raykova, M.; Seth, K.; Yung, M. Two-sided malicious security for private intersection-sum with cardinality. In Proceedings of the Annual International Cryptology Conference; Springer: Cham, Switzerland, 2020; pp. 3–33. [Google Scholar]
Kolesnikov, V.; Kumaresan, R. Improved OT extension for transferring short secrets. In Proceedings of the Annual Cryptology Conference; Springer: Cham, Switzerland, 2013; pp. 54–70. [Google Scholar]
Orrù, M.; Orsini, E.; Scholl, P. Actively secure 1-out-of-N OT extension with application to private set intersection. In Proceedings of the Topics in Cryptology–CT-RSA 2017: The Cryptographers’ Track at the RSA Conference 2017, San Francisco, CA, USA, 14–17 February 2017; Proceedings. Springer: Cham, Switzerland, 2017; pp. 381–396. [Google Scholar]
Zhao, S.; Ma, M.; Song, X.; Jiang, H.; Yan, Y.; Xu, Q. Lightweight threshold private set intersection via oblivious transfer. In Proceedings of the Wireless Algorithms, Systems, and Applications: 16th International Conference, WASA 2021, Nanjing, China, 25–27 June 2021; Proceedings, Part III 16. Springer: Cham, Switzerland, 2021; pp. 108–116. [Google Scholar]
Hu, J.; Zhao, Y.; Tan, B.H.M.; Aung, K.M.M.; Wang, H. Enabling Threshold Functionality for Private Set Intersection Protocols in Cloud Computing. IEEE Trans. Inf. Forensics Secur. 2024, 19, 6184–6196. [Google Scholar] [CrossRef]
Asharov, G.; Lindell, Y.; Schneider, T.; Zohner, M. More efficient oblivious transfer and extensions for faster secure computation. In Proceedings of the 2013 ACM SIGSAC conference on Computer & Communications Security, Berlin, Germnay, 4–8 November 2013; pp. 535–548. [Google Scholar]
Yang, K.; Weng, C.; Lan, X.; Zhang, J.; Wang, X. Ferret: Fast Extension for coRRElated oT with small communication. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, 9–13 November 2020; pp. 1607–1626. [Google Scholar]
Pagh, R.; Rodler, F.F. Cuckoo hashing. J. Algorithms 2004, 51, 122–144. [Google Scholar] [CrossRef]
Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [Google Scholar] [CrossRef]
Goldreich, O. Foundations of Cryptography: Volume 2, Basic Applications; Cambridge University Press: Cambridge, UK, 2009. [Google Scholar]
Couteau, G. New protocols for secure equality test and comparison. In Proceedings of the International Conference on Applied Cryptography and Network Security; Springer: Cham, Switzerland, 2018; pp. 303–320. [Google Scholar]
Beaver, D. Precomputing oblivious transfer. In Proceedings of the Annual International Cryptology Conference; Springer: Cham, Switzerland, 1995; pp. 97–109. [Google Scholar]

Figure 1. One-out-of-n Oblivious Transfer Functionality

F_{OT}^{n}

.

Figure 1. One-out-of-n Oblivious Transfer Functionality

F_{OT}^{n}

.

Figure 2. Private Equality Test

F_{PEQT}

.

Figure 2. Private Equality Test

F_{PEQT}

.

Figure 3. Secret-shared Private Equality Test

F_{PEQT}^{ss, q}

.

Figure 3. Secret-shared Private Equality Test

F_{PEQT}^{ss, q}

.

Figure 4. Private Membership Test

F_{PMT}

.

Figure 4. Private Membership Test

F_{PMT}

.

Figure 5. Secret-shared Private Membership Test

F_{PMT}^{ss, q}

.

Figure 5. Secret-shared Private Membership Test

F_{PMT}^{ss, q}

.

Figure 6. Secret-shared Private Membership Test Protocol

{}^{ss}{PMT}

.

Figure 6. Secret-shared Private Membership Test Protocol

{}^{ss}{PMT}

.

Figure 7. Our

{}^{ss}{PEQT}

Construction Computing

F_{PEQT}^{ss, q}

.

Figure 7. Our

{}^{ss}{PEQT}

Construction Computing

F_{PEQT}^{ss, q}

.

Figure 8. Threshold Private Set Intersection tPSI.

Figure 9. Secure Comparison Functionality

F_{≷ (y_{0}, x, y_{1})}

.

Figure 9. Secure Comparison Functionality

F_{≷ (y_{0}, x, y_{1})}

.

Figure 10. Threshold PSI

t^{\leq} PSI

.

Figure 10. Threshold PSI

t^{\leq} PSI

.

Figure 11. Threshold PSI protocol with limited leakage (

t_{l l}^{\leq} PSI

).

Figure 11. Threshold PSI protocol with limited leakage (

t_{l l}^{\leq} PSI

).

Figure 12. Generic Precomputing

{OT}_{n}^{1}

Method.

Figure 12. Generic Precomputing

{OT}_{n}^{1}

Method.

Figure 13. Running Time of

{}^{ss}{PEQT}

.

Figure 13. Running Time of

{}^{ss}{PEQT}

.

Figure 14. Comparisons of Running Time.

Table 1. Threshold PSI Protocol Comparisons.

Protocol	Public-Key Computation	Communication	Leakage	Functionality	Main Tools
[19]	$O (n^{2})$	$O (l n + κ n)$	$\| X \cap Y \|$	$t^{\leq} PSI$	HE, OT
[20]	$O (n)$	$O (l n + κ n)$	-	$t^{\leq} PSI$ & $t^{>} PSI$	HE
[21]	$O (n l)$	$\tilde{O} (t^{2})$	-	$t^{\leq} PSI$	FHE
[23]	$O (n l)$	$\tilde{O} (t)$	-	$t^{\leq} PSI$	AHE
[36]	$O (κ + l)$	$O (l n \log n)$	-	$t^{\leq} PSI$ & $t^{>} PSI$	OT, GC
[24]	$O (n l)$	$O (κ^{2} t poly (\log t))$	-	$t^{\leq} PSI$	AHE
Ours *	$O (κ + l)$	$O (l n \log n)$	$\| X \cap Y \| - t$	$t^{\leq} PSI$ , $t^{>} PSI$ & $t_{l l}^{\leq} PSI$	OT

*: We analyze all the three protocols in Section 5. Note:

κ

denotes the computational security parameter.

\tilde{O} (\cdot)

hides

poly

factors. The complexity in [19,20] refers to

{PSI}^{CA}

phase. The threshold t in [21,23,24] is used to measure the number of different items instead of common ones.

Table 2. tPSI running time.

Length	Comm. (MB)	Running Time (ms)
Length	Comm. (MB)	1	2	3	4	5	6	7	8	9	10
$l = 128$	2.816	321.92	340.736	388.48	317.184	297.088	296.96	334.72	337.664	397.312	374.272
$l = 256$	5.932	628.75	665.5	758.75	619.5	580.25	580	653.75	659.5	776	731
$l = 512$	12.264	1257.2	1331	1517.5	1239	1160.7	1160	1307.1	1319	1552.3	1462

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhao, S.; Zhao, C.; Huang, Y.; Song, X.; Xu, Q. Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives. Cryptography 2024, 8, 58. https://doi.org/10.3390/cryptography8040058

AMA Style

Zhao S, Zhao C, Huang Y, Song X, Xu Q. Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives. Cryptography. 2024; 8(4):58. https://doi.org/10.3390/cryptography8040058

Chicago/Turabian Style

Zhao, Shengnan, Chuan Zhao, Yuchen Huang, Xiangfu Song, and Qiuliang Xu. 2024. "Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives" Cryptography 8, no. 4: 58. https://doi.org/10.3390/cryptography8040058

APA Style

Zhao, S., Zhao, C., Huang, Y., Song, X., & Xu, Q. (2024). Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives. Cryptography, 8(4), 58. https://doi.org/10.3390/cryptography8040058

Article Menu

Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives^†

Abstract

1. Introduction

1.1. Related Work

1.2. Our Contribution

2. Preliminaries

2.1. Notation

2.2. Oblivious Transfer

2.3. Cuckoo Hashing

2.4. Secret Sharing

2.5. Security Definition

3. Overview of Our Construction

3.1. One-out-of-n Correlated OT

3.2. Secret-Shared Private Equality Test

3.3. Secret-Shared Private Membership Test

4. Secrect-Shared Private Equality Test ( ${}^{ss}{PEQT}$ )

5. Generic Threshold PSI Protocols

5.1. Over-Threshold PSI Protocol: $t^{\leq} PSI$

5.2. Under-Threshold PSI Protocol: $t^{>} PSI$

5.3. Lightweight tPSI with Limited Leakage: $t_{l l}^{\leq} PSI$

6. Complexity Analysis and Comparisons

7. Implementation and Performance Evaluation

7.1. Generic Precomputing Oblivious Transfer

7.2. Experimental Performance

8. Conclusions and Future Work

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives †

Abstract

1. Introduction

1.1. Related Work

1.2. Our Contribution

2. Preliminaries

2.1. Notation

2.2. Oblivious Transfer

2.3. Cuckoo Hashing

2.4. Secret Sharing

2.5. Security Definition

3. Overview of Our Construction

3.1. One-out-of-n Correlated OT

3.2. Secret-Shared Private Equality Test

3.3. Secret-Shared Private Membership Test

4. Secrect-Shared Private Equality Test ( PEQT ss )

5. Generic Threshold PSI Protocols

5.1. Over-Threshold PSI Protocol: t ≤ PSI

5.2. Under-Threshold PSI Protocol: t > PSI

5.3. Lightweight tPSI with Limited Leakage: t l l ≤ PSI

6. Complexity Analysis and Comparisons

7. Implementation and Performance Evaluation

7.1. Generic Precomputing Oblivious Transfer

7.2. Experimental Performance

8. Conclusions and Future Work

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Two-Party Threshold Private Set Intersection Protocols from Lightweight Cryptographic Primitives^†

4. Secrect-Shared Private Equality Test ( ${}^{ss}{PEQT}$ )

5.1. Over-Threshold PSI Protocol: $t^{\leq} PSI$

5.2. Under-Threshold PSI Protocol: $t^{>} PSI$

5.3. Lightweight tPSI with Limited Leakage: $t_{l l}^{\leq} PSI$