Efficient Lp Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection

Ryu, Dong-Hyeon; Jeon, Seong-Yun; Hong, Junho; Lee, Mun-Kyu

doi:10.3390/s23084169

Open AccessCommunication

Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection

¹

Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang 37673, Republic of Korea

²

Department of Computer Engineering, Inha University, Incheon 22212, Republic of Korea

³

Department of Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA

^*

Author to whom correspondence should be addressed.

^†

Most of this work was done when D.-H. Ryu was in Department of Computer Engineering, Inha University, Incheon 22212, Republic of Korea.

Sensors 2023, 23(8), 4169; https://doi.org/10.3390/s23084169

Submission received: 16 February 2023 / Revised: 13 April 2023 / Accepted: 18 April 2023 / Published: 21 April 2023

(This article belongs to the Collection Cryptography and Security in IoT and Sensor Networks)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

In Internet of Things (IoT) systems in which a large number of IoT devices are connected to each other and to third-party servers, it is crucial to verify whether each device operates appropriately. Although anomaly detection can help with this verification, individual devices cannot afford this process because of resource constraints. Therefore, it is reasonable to outsource anomaly detection to servers; however, sharing device state information with outside servers may raise privacy concerns. In this paper, we propose a method to compute the

L_{p}

distance privately for even

p > 2

using inner product functional encryption and we use this method to compute an advanced metric, namely p-powered error, for anomaly detection in a privacy-preserving manner. We demonstrate implementations on both a desktop computer and Raspberry Pi device to confirm the feasibility of our method. The experimental results demonstrate that the proposed method is sufficiently efficient for use in real-world IoT devices. Finally, we suggest two possible applications of the proposed computation method for

L_{p}

distance for privacy-preserving anomaly detection, namely smart building management and remote device diagnosis.

Keywords:

functional encryption; anomaly detection; mean p-powered error

1. Introduction

Anomaly detection is defined as the identification of abnormal data, events, or phenomena that deviate significantly from common observations and cannot be described using any well-known notion of normal behavior [1]. Since its first appearance as an intrusion detection system in 1987 [2], anomaly detection has evolved in several fields, including statistics, medicine, manufacturing processes, machine learning, cyber security, and computer vision.

In particular, in Internet of Things (IoT) ecosystems, where a large number of IoT devices are connected and communicate with each other and third-party servers, it is crucial to verify whether each device operates appropriately [3]. IoT devices may malfunction because of initial faults during the manufacturing process, software bugs, mechanical failures of parts during operation, extreme changes in the surrounding environment, and malicious attacks by attackers. In real-world scenarios, an IoT ecosystem comprises of servers and several IoT devices, and it is reasonable to assume that servers are responsible for detecting such anomalous states in IoT devices because most IoT devices do not have sufficient computational resources to detect failures and recover from abnormal states. In this setup, all the normal-state information of each device, such as power consumption, network connections, and temperature, should be maintained in safe storage, and servers may periodically check whether the current state of each device is significantly different from the stored normal state. Mean absolute error and mean squared error are commonly considered to evaluate the difference (and similarity) between two state vectors. However, in 2018, Shalyga et al. [4] showed that the mean p-powered error is more effective in predicting short-term anomalies when p is greater than two on the SWaT dataset [5]. Here, the mean p-powered error is defined as

\frac{1}{n} \sum_{i = 1}^{n} {| x_{i} - y_{i} |}^{p}

, where

x = (x_{1}, \dots, x_{n})

and

y = (x_{1}, \dots, x_{n})

are the normal state vector and the current state vector, respectively, with dimension n. According to the experiments in [4], the best choice for p is

p = 6

.

Meanwhile, privacy concerns regarding IoT devices have arisen over the past decades as IoT ecosystems rapidly expand to industries, cities, houses, and personal wearable devices [6,7,8]. Sensor data are highly detailed, precise, and personal. Furthermore, the combination of data from multiple sensors may allow attackers to infer additional information more accurately, which is not possible with a single sensor [9]. In 2017, Nesa et al. showed that the occupancy of a room can be determined by fusing data, such as humidity, temperature, illumination level, and CO₂ [10] with very high accuracy (up to 99.09%). This prediction becomes more precise as the number of fusion parameters increases. Therefore, privacy protection must be considered in IoT ecosystems. A recent controversy at Carnegie Mellon University over super-sensing IoT devices called Mites demonstrated potentially serious privacy concerns with IoT-based smart building management systems [11].

To protect the privacy of IoT devices, all the information collected from IoT devices must be encrypted before being transmitted to the anomaly detection server. In addition, it is desirable that an anomaly detection process should be conducted on the encrypted data instead of the original data. Furthermore, the encryption operations must be sufficiently lightweight to be run on IoT devices.

To satisfy the aforementioned requirements, we consider functional encryption (FE); we provide a formal definition of FE in the following section. FE is a special type of encryption algorithm that satisfies the following properties: let m be a plaintext and E(m) be a ciphertext of m. Given E(m), an individual with additional information, that is, a secret key bound to a function f, can only obtain

f (m)

, whereas they cannot learn the original plaintext m. Inner product encryption (IPE) is a special form of FE whose function is the inner product, and both the ciphertext and secret key are associated with vectors. Given a secret key for the coefficient vector

y

and ciphertext for the plaintext vector

x

, we can obtain only the inner product values

〈 x, y 〉

through decryption but no further information about

x

. We say that there is a function-hiding property when information about

y

is protected, as well as that about

x

. In 2018, Kim et al. [12] proposed a practical function-hiding IPE (FHIPE) scheme. In [12] and several related works, such as [13], FHIPE was used to compute vector-related metrics, such as

L_{1}

distance and

L_{2}

distance, in a privacy-preserving manner. These metrics have been used for biometric authentication, nearest neighbor (NN) searches on encrypted data, and secure linear regression. However, the general

L_{p}

distance (

p > 2

), which is necessary for evaluating the mean p-powered error for advanced anomaly detection [4], has not been considered in previous studies.

In this study, we generalize the

L_{p}

distance over the FHIPE scheme as a new distance metric for privacy-preserving anomaly detection, where

p > 2

denotes an even number. To be precise, our problem statement is as follows: Given a ciphertext E

(x)

for a vector

x = (x_{1}, \dots, x_{n})

and a secret key bound to a vector

y = (x_{1}, \dots, x_{n})

, compute the

L_{p}

distance,

\sum_{i = 1}^{n} {| x_{i} - y_{i} |}^{p}

for even p, using the FHIPE scheme (the mean p-powered error computation needs an additional operation, division by n. For simplicity, we omit this throughout the paper, as its computation is trivial.). Our experimental results indicate that the computation of this new metric in the ciphertext domain is efficient and applicable to highly resource-constrained IoT devices. We also suggest two applications using our method for privacy-preserving anomaly detection, i.e., smart building management and remote device diagnosis.

1.1. Related Works

Several applications, including biometric authentication, location-based services, data mining, and anomaly detection, encode data into vectors and then measure the distance or similarity between the vectors [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]. The vectors used in these applications may contain sensitive information. Therefore, extensive research has been conducted to measure the distance between two vectors while maintaining data privacy. Wang et al. [17] and Im, Jeon, and Lee [16] have proposed methods that encrypt a vector representing a facial image using homomorphic encryption (HE) and calculated the Euclidean distance between the encrypted vectors without decrypting them. Kang et al. [30] have shown that HE can be used to outsource a service that remotely detects whether a component of an industrial machine has diverted from its normal operating condition, without disclosing the private information of the data owner. Several previous studies used FE instead of HE. Kim et al. [12] proposed a biometric authentication method based on Hamming distance and nearest neighbor search based on

L_{2}

distance using FHIPE. Zhou and Ren [14] proposed a biometric authentication method, called Passbio, that supports both Hamming distance and Euclidean distance. Their method does not reveal the actual distance, but only indicates whether the distance is within a predefined threshold. Kwon and Lee [31] commented on two vulnerabilities of Passbio and proposed a countermeasure against these vulnerabilities. Lee et al. [15] proposed an efficient biometric authentication method specialized for Hamming distance. Li and Jung [18] used HE and FE to encrypt geometric locations. The aforementioned applications use the Euclidean distance to evaluate the similarity between two vectors. In other applications, distance metrics, such as the Hamming distance, cosine similarity, and Mahalanobis distance, have also been used [29]. For example, the data-mining applications proposed in [20,21,22,32] calculated the cosine similarity in a privacy-preserving manner using HE, secure multiparty computation, and oblivious transfer. In studies on anomaly detection, HE and differential privacy were used to construct privacy-preserving anomaly detection systems. Alabdulatif et al. [23,24] proposed cloud-based anomaly detection systems for IoT using HE. Mehnaz and Bertino [25] also proposed a cloud-based anomaly detection system using HE. They adopted the edge computing approach in one of the stages comprising anomaly detection and used additive HE for secure aggregation of individual values from edge devices. Lyu et al. [26] proposed a privacy-preserving collaborative anomaly detection method with differential privacy. However, to the best of our knowledge, this study is the first work on general

L_{p}

distance calculations in a privacy-preserving manner.

1.2. Contributions

We propose a method to compute the $L_{p}$ distance for an even number $p > 2$ over FHIPE. We use this $L_{p}$ distance to compute the mean p-powered error for anomaly detection.
To demonstrate the feasibility for IoT ecosystems, we implement the proposed method in C++ and conduct experiments on a prototype system composed of a desktop computer (as a server) and the Raspberry Pi (as an IoT device) system. The experimental result shows that the proposed method is sufficiently efficient to be applied to IoT ecosystems in terms of execution times and memory usage.
We present two possible applications of the proposed $L_{p}$ distance computation method for privacy-preserving anomaly detection, i.e., smart building management and remote device diagnosis. Accordingly, we suggest a protocol involving multiple IoT devices and two servers.

In the remainder of this paper, Section 2 describes the preliminaries of the proposed work. Section 3 explains the proposed method on how to compute the

L_{p}

distance over FHIPE for an even number p. Section 4 provides the performance analysis. Section 5 describes a general framework for privacy-preserving anomaly detection for IoT systems. Section 6 discusses the limitation of the proposed method and presents a future research direction. Conclusions are given in Section 7.

2. Preliminaries

2.1. Barreto–Naehrig Curve (BN Curve)

If computations in the extension

F_{r^{k}}

of the prime field

F_{r}

are feasible, an elliptical curve defined over

F_{r}

is known as a pairing-friendly curve [33]. Specifically, the elliptical curve over

F_{r}

should be non-supersingular and contain a subgroup whose embedding degree k is not considerably large. Barreto and Naehrig proposed a method for constructing pairing-friendly elliptical curves with an embedding degree

k = 12

[33]. This is can be expressed as:

E (F_{r}) : y^{2} = x^{3} + b

for nonzero b. For

t \neq 0

, they parameterized the order q of the elliptical curve group and the characteristic r as follows:

q = 36 t^{4} + 36 t^{3} + 18 t^{2} + 6 t + 1,

(1)

r = 36 t^{4} + 36 t^{3} + 24 t^{2} + 6 t + 1 .

(2)

2.2. Cryptographic Pairing

Let

G_{1}

and

G_{2}

be additive groups of prime order q and

g_{1}

and

g_{2}

be generators of

G_{1}

and

G_{2}

, respectively. Subsequently, a mapping function is defined as

e : G_{1} \times G_{2} \to G_{T}

, which is used to map the elements of

G_{1}

and

G_{2}

onto

G_{T}

, which is a multiplicative group of prime order q.

0_{G_{1}}

,

0_{G_{2}}

, and

1_{G_{T}}

denote the identity elements of

G_{1}

,

G_{2}

, and

G_{T}

, respectively. We define the map e as a cryptographic pairing when the tuple (q,

G_{1}

,

G_{2}

,

G_{T}

, e) satisfies the following properties [34]:

The map e and group operations in $G_{1}$ , $G_{2}$ , and $G_{T}$ are efficiently computable.
The map e is bilinear for all $g_{1} \in G_{1}$ , $g_{2} \in G_{2}$ and a, b∈ $Z_{q}$ . That is,

$e ([a] g_{1}, [b] g_{2}) = e {(g_{1}, g_{2})}^{a b} .$

(3)
The map e is nondegenerate for $g_{1} \neq 0_{G_{1}}$ , $g_{2} \neq 0_{G_{2}}$ . That is,

$e (g_{1}, g_{2}) \neq 1_{G_{T}} .$

(4)

Generally, cryptographic pairings that satisfy these three properties consist of group operations on elliptical curves and finite fields. Specifically,

G_{1}

and

G_{2}

are asymmetric, that is,

G_{1} \neq G_{2}

, and they are elliptical curve subgroups of prime order q.

G_{T}

is a subgroup of the finite field

F_{r^{k}}

for the embedding degree k, and its order is q. In this study, we use a cryptographic pairing on the BN curve, and a bilinear environment is defined as a tuple

(q, G_{1}, G_{2}, G_{T}, g_{1}, g_{2}, e)

[34]. For notational convenience, we also define a pairing-product operation. Given

P = (P_{1}, P_{2}, \dots, P_{n}) \in G_{1}^{n}

and

Q = (Q_{1}, Q_{2}, \dots, Q_{n}) \in G_{2}^{n}

, the product of the pairings for the two vectors

P

and

Q

is defined as follows:

e_{p r o d} (P, Q) = Π_{i = 1}^{n} e (P_{i}, Q_{i}) .

(5)

2.3. FHIPE

In this study, we used the FHIPE scheme

\prod_{I P E}

proposed by Kim et al. [12].

\prod_{I P E}

is defined as

(I P E . S e t u p, I P E . K e y G e n, I P E . E n c r y p t, I P E . D e c r y p t)

. The dimensions n of vectors

x

and

y

are given as parameters, and S is a subset of

Z_{q}

, which specifies the possible range of the inner product.

$I P E . S e t u p (1^{λ}, S)$
1.
Select a bilinear environment $(q, G_{1}, G_{2}, G_{T}, g_{1}, g_{2}, e)$ according to the security parameter $λ$ .
2.
Choose a matrix $B \leftarrow {GL}_{n} (Z_{q})$ , where ${GL}_{n} (Z_{q})$ refers to a group of $n \times n$ square matrices whose elements belong to the finite field $Z_{q}$ and an inverse matrix exists.
3.
Compute $B^{★} \leftarrow \det (B) \cdot {(B^{- 1})}^{⊤}$ .
4.
Output the public parameter $p p = (q, G_{1}, G_{2}, G_{T}, e, S)$ and master secret key $m s k = (p p, g_{1}, g_{2}, B, B^{★})$ .
$I P E . K e y G e n (m s k, y)$
1.
Choose a uniformly random element $α \overset{R}{\leftarrow} Z_{q}$ .
2.
Using the master key $m s k$ and vector $y \in Z_{q}^{n}$ , output the secret key $s k = (K_{1}, K_{2}) = ([α \cdot \det (B)] g_{1}, [α \cdot y \cdot B] g_{1})$ such that $K_{2} \in G_{1}^{n}$ .
$I P E . E n c r y p t (m s k, x)$
2.
Choose a uniformly random element $β \overset{R}{\leftarrow} Z_{q}$ .
2.
Using the master key $m s k$ and vector $x \in Z_{q}^{n}$ , output the ciphertext $c t = (C_{1}, C_{2}) = ([β] g_{2}, [β \cdot x \cdot B^{★}] g_{2})$ such that $C_{2} \in G_{2}^{n}$ .
$I P E . D e c r y p t (p p, s k, c t)$
1.
Using the public parameter $p p$ , secret $s k = (K_{1}, K_{2})$ , and ciphertext $c t = (C_{1}, C_{2})$ , compute $D_{1} = e (K_{1}, C_{1})$ and $D_{2} = e_{p r o d} (K_{2}, C_{2})$ .
2.
Find a solution $z \in S$ for $D_{1}^{z} = D_{2}$ . If z exists, it is the inner product of $x$ and $y$ , denoted as $z = 〈 x, y 〉$ . Output z if it exists; otherwise, output ⊥, which indicates that no solution exists.

2.4. $L_{1}$ and $L_{2}$ Distances over FHIPE

Kim et al. [12] proposed an encoding method based on the FHIPE scheme to compute the

L_{1}

distance (Hamming distance) between two binary vectors. Given two binary vectors

x, y \in {0, 1}^{n}

, vectors

x

and

y

are encoded as

x^{'}

and

y^{'}

\in {- 1, 1}^{n}

by changing the zeroes in

x

and

y

to

- 1

. Subsequently, the Hamming distance between

x

and

y

is

d (x, y) = (n - 〈 x^{'}, y^{'} 〉) / 2

.

Jeon and Lee [13] used the square of the

L_{2}

distance (squared Euclidean distance) as a metric of similarity in a face authentication system. They designed three operations, namely,

E n c o d e X

,

E n c o d e Y

, and

E u c l i d

, using the encoding method proposed by Kim et al. [12]:

$E n c o d e X (m s k, x)$
1.
Construct an $(n + 2)$ -dimensional vector $x^{'} = {(∥ x ∥}^{2}, - 2 x_{1}, - 2 x_{2}, \dots, - 2 x_{n}, 1)$ from $x = (x_{1}, \dots, x_{n})$ .
2.
Output $c t = I P E . E n c r y p t (m s k, x^{'})$ .
$E n c o d e Y (m s k, y)$
1.
Construct an $(n + 2)$ -dimensional vector $y^{'} = (1, y_{1}, y_{2}, \dots, y_{n} {, ∥ y ∥}^{2})$ from $y = (y_{1}, \dots, y_{n})$ .
2.
Output $s k = I P E . K e y G e n (m s k, y^{'})$ .
$E u c l i d (p p, s k, c t)$
1.
Calculate $z = I P E . D e c r y p t (p p, s k, c t)$ .
2.
Output z. z satisfies $z = 〈 x^{'}, y^{'} 〉 = {(∥ x ∥}^{2} - 2 x_{1} y_{1} - 2 x_{2} y_{2} - \dots - 2 x_{n} y_{n} + {∥ y ∥}^{2} {) = (∥ x ∥}^{2} - 2 〈 x, y 〉 + {∥ y ∥}^{2} {) = ∥ x - y ∥}^{2} .$

3. Proposed Method

We present a method for computing the

L_{p}

distance over FHIPE for an even number p. Technically, we compute the p-powered

L_{p}

distance. However, computing the

L_{p}

distance as the p-th root from the p-powered

L_{p}

distance is trivial.

Given two vectors

x \in Z_{q}^{n}

and

y \in Z_{q}^{n}

, the p-powered

L_{p}

distance can be calculated as follows:

L_{p}^{p} (x, y) = {∥ x - y ∥}_{p}^{p} = \sum_{i = 1}^{n} {|x_{i} - y_{i}|}^{p} = \sum_{i = 1}^{n} {(x_{i} - y_{i})}^{p},

(6)

where the last equality comes from the fact that p is even. For the two n-dimensional vectors

u

and

v

, the ⊙ operator denotes element-wise multiplication, i.e.,

u ⊙ v = (u_{1} v_{1}, \dots, u_{n} v_{n})

, and

v^{(t)}

denotes the t-th Hadamard power of

v

for any

t \geq 0

, i.e.,

v^{(0)} = (1, \dots, 1)

and

v^{(t)} = v^{(t - 1)} ⊙ v

for

t \geq 1

[35]. Note that

L_{p}^{p} (x, y)

in (6) is expressed as a sum of p-th powers of binomial

(x_{i} - y_{i})

. Therefore, we can apply the binomial theorem,

{(x_{i} - y_{i})}^{p} = \sum_{j = 0}^{p} (\binom{p}{j}) {x_{i}}^{j} {(- y_{i})}^{p - j} = \sum_{j = 0}^{p} (\binom{p}{j}) {(- 1)}^{p - j} {x_{i}}^{j} y_{i}^{p - j}

, to each term

{(x_{i} - y_{i})}^{p}

. Consequently,

L_{p}^{p} (x, y) = \sum_{i = 1}^{n} \sum_{j = 0}^{p} (\binom{p}{j}) {(- 1)}^{p - j} {x_{i}}^{j} y_{i}^{p - j}

, and we obtain:

L_{p}^{p} (x, y) = \sum_{i = 0}^{p} (\binom{p}{i}) {(- 1)}^{p - i} 〈 x^{(i)}, y^{(p - i)} 〉 .

(7)

We encode

x

and

y

to

x^{'} \in Z_{q}^{(p - 1) n + 2}

and

y^{'} \in Z_{q}^{(p - 1) n + 2}

over a ring

Z_{q}

as follows:

x^{'} = (α_{0} {∥ x ∥}_{p}^{p}, α_{1} x^{(p - 1)}, α_{2} x^{(p - 2)}, \dots, α_{p - 1} x^{(1)}, α_{p}),

(8)

y^{'} = (1, y^{(1)}, y^{(2)}, \dots, y^{(p - 2)}, y^{(p - 1)} {, ∥ y ∥}_{p}^{p}),

(9)

where

α_{i} = (\binom{p}{i}) {(- 1)}^{i}

for

i = 0, \dots, p

. Thus, we observe that:

L_{p}^{p} (x, y) = 〈 x^{'}, y^{'} 〉 .

(10)

Accordingly, a generalized form of the p-powered

L_{p}

distance computation protocol over FHIPE is defined as

\prod_{P D I S T}

=

(S e t u p

,

E n c o d e X

,

E n c o d e Y

,

G e t D i s t a n c e)

, which is obtained from using

\prod_{I P E}

for a security parameter

λ \in N

, degree

p \in 2 N

, the dimension

n \in N

of the input vectors, and the range S of the inner product. These four operations are defined as follows:

$S e t u p (1^{λ}, S)$
1.
Select a bilinear environment $(q, G_{1}, G_{2}, G_{T}, g_{1}, g_{2}, e)$ according to the security parameter $λ$ .
2.
Choose a matrix $B^{'} \leftarrow {GL}_{l} (Z_{q})$ for $l = (p - 1) n + 2$ , where ${GL}_{l} (Z_{q})$ is a group of $l \times l$ square matrices whose elements belong to the finite field $Z_{q}$ and an inverse matrix exists.
3.
Compute ${B^{'}}^{★} \leftarrow \det (B^{'}) \cdot {(B^{'} - 1)}^{⊤}$ .
4.
Output the public parameter $p p^{'} = (G_{1}, G_{2}, G_{T}, q, e, S, p)$ and master secret key $m s k^{'} = (p p^{'}, g_{1}, g_{2}, B^{'}, {B^{'}}^{★})$ .
$E n c o d e X (m s k^{'}, x)$
1.
Construct a vector $x^{'} \in Z_{q}^{l}$ as described previously (8).
2.
Output $c t^{'} = I P E . E n c r y p t (m s k^{'}, x^{'})$ .
$E n c o d e Y (m s k^{'}, y)$
1.
Construct a vector $y^{'} \in Z_{q}^{l}$ as described previously (9).
2.
Output $s k^{'} = I P E . K e y G e n (m s k^{'}, y^{'})$ .
$G e t D i s t a n c e (p p^{'}, s k^{'}, c t^{'})$
1.
Calculate $z = I P E . D e c r y p t (p p, s k^{'}, c t^{'}) \in S$ .
2.
Output z. z satisfies $z = 〈 x^{'}, y^{'} 〉 = L_{p}^{p} (x, y)$ .

4. Performance Analysis

To verify the feasibility of the proposed method, we implemented

\prod_{P D I S T}

in C++ using the FHIPE library developed in a previous study [13] (our algorithm and code are available at https://github.com/inhaislab/FHIPE-Lp-Distance (accessed on 19 April 2023)). As shown in Table 1, we also used NTL 11.3.2 [36] for integer, vector, and matrix arithmetic, MCL 1.51 [37] for pairing-based cryptography, and GMP 6.1.2 [38] for big number arithmetic. We applied

\prod_{P D I S T}

to our testing programs on desktop PC and Raspberry Pi. The programs were run on a desktop computer with 16 GB main memory, an AMD Ryzen5 3600 6-Core processor @ 3.60 GHz, and Ubuntu 20.04 (using Windows 11 Pro WSL2), and on a Raspberry Pi 2 model B with 1 GB RAM and a quad-core ARM Cortex-A7 processor @ 900 MHz. MCL 1.51 [37] provides various parameter sets for bilinear environments. Among these, we selected SNARK1, which is one of the most frequently used parameter sets for Ethereum.

We measured the execution times for the four operations of

\prod_{P D I S T}

with

n = {8, 16, 32, 64, 128}

and

p = {2, 4, 6, 8, 10}

. For each combination, 100 measurements were averaged. Figure 1 shows the experimental results on the desktop PC. The exact numerical data are given in the Appendix A (See Table A1, Table A2, Table A3 and Table A4.) Based on the results, it can be deduced that

S e t u p

and

G e t D i s t a n c e

are the dominant operations. In particular, the execution time for

G e t D i s t a n c e

increases exponentially as p increases. For example, it took 0.01 s, 0.07 s, 0.56 s, 6.07 s, 57.3 s for

p = 2, 4, 6, 8, 10

, respectively, when n is 64. Notably, the main operation in

G e t D i s t a n c e

is the discrete logarithm (DL) computation. The complexity of solving a DL problem is

O (\sqrt{S})

when a well-known general method is used, that is, the baby-step giant-step method. It must be noted that S, which is the range of the solution of the DL problem, satisfies

S = n \cdot m^{p}

, where m is the maximum difference between the two vector elements

x_{i}

and

y_{i}

. This explains the exponential growth in execution time with an increase in p. We also observe that the time increases slowly as n increases because the execution time is proportional to

\sqrt{n}

.

S e t u p

also consumes a significant amount of time, for example, 6.65 s on average when n = 64 and p = 10, because it involves complex matrix operations, such as inverse computation. However,

E n c o d e X

and

E n c o d e Y

consume negligible time compared with

S e t u p

and

G e t D i s t a n c e

. For example, they only took 24.77 ms and 21.80 ms, respectively, when n = 64. This is because they involve relatively lower cost operations, such as matrix-vector multiplication and elliptical-curve point multiplication.

We also measured the execution time on the Raspberry Pi.

S e t u p

and

G e t D i s t a n c e

were not considered in this experiment because these operations are expected to run not on IoT devices but on high-end servers. Thus, the implementation on an IoT device, that is, Raspberry Pi, requires only

E n c o d e X

and

E n c o d e Y

. Figure 2 shows the experimental results for these operations. The exact numerical data are provided in the Appendix A. (See Table A5 and Table A6.) The overall tendencies of

E n c o d e X

and

E n c o d e Y

on the Raspberry Pi were similar to those of the previous experiment on the desktop PC. Between the two operations,

E n c o d e Y

required slightly shorter execution times than

E n c o d e X

on both the desktop and Raspberry Pi, up to approximately 33% on the desktop and approximately 50% on the Raspberry Pi. Notably, even a combination of the largest parameters, that is,

(n, p) = (128, 10)

, requires only a few seconds for execution. To be precise, the execution times were 3.82 s and 2.58 s for

E n c o d e X

and

E n c o d e Y

, respectively. Under typical settings with

p = 6

[4], the time required was less than a second.

Finally, we compared the proposed method with previous privacy-preserving anomaly detection methods. In particular, we focused on the distance metric to identify anomaly. Table 2 demonstrates that all the previous privacy-preserving anomaly detection methods were based on either Euclidean distance (

L_{2}

distance) or absolute error (

L_{1}

distance). Even though a customized nonlinear metric based on Gaussian distribution was used in [25], it was computed on plaintext. Therefore, it cannot be directly compared with the proposed method. By examining the distance metric as a quantitative property, we can see that the proposed method is the first anomaly detection method that performs distance computation over ciphertext using the

L_{p}

distance metric for

p > 2

.

5. Applications

In this section, we propose a general framework for privacy-preserving anomaly detection in IoT systems. We present two examples of this framework: smart building management and remote device diagnosis. Our basic assumption is that each IoT device has several normal states defined by its specifications. Each of these states can be represented as a latent vector, that is, a feature vector, by applying a neural network model as a feature extractor. The normal-state vectors are stored, and anomaly detection is performed by comparing the current state with the normal states. An anomaly is defined as an event in which the current feature vectors are not sufficiently close to any of the stored normal feature vectors in terms of the

L_{p}

distance metric. We used

\prod_{P D I S T}

as a building block for privacy-preserving anomaly detection.

As shown in Figure 3,the framework is composed of two servers and several IoT devices.

S e r v e r_{s e t u p}

denotes the set up server. We assume that the normal-state vectors

x_{1}, \dots, x_{M}

are provided to

S e r v e r_{s e t u p}

as a specification of each target device. In the device registration stage,

S e r v e r_{s e t u p}

performs

S e t u p

for that device, encrypts

x_{1}, \dots, x_{M}

into

c t_{1}^{'}, \dots, c t_{M}^{'}

using

E n c o d e X

with the corresponding

m s k^{'}

of the device, and transfers the results to an anomaly detection server

S e r v e r_{d e t e c t}

for registration.

S e r v e r_{s e t u p}

stores

m s k^{'}

for each device, and

S e r v e r_{d e t e c t}

stores

c t_{1}^{'}, \dots, c t_{M}^{'}

for each device. When a device must be analyzed for an anomaly, it obtains temporal access to

m s k^{'}

in

S e r v e r_{s e t u p}

after it passes proper authentication. It encrypts the state vector

y

into

s k^{'}

using

E n c o d e Y

and sends the results to

S e r v e r_{d e t e c t}

with a request for an anomaly detection service. Upon receiving this request,

S e r v e r_{d e t e c t}

determines whether the device is currently in a normal state based on an NN search using the

L_{p}

distance. Thus,

S e r v e r_{d e t e c t}

computes

G e t D i s t a n c e (p p^{'}, s k^{'}, c t_{i}^{'})

for

i = 1

to M for this device and checks whether at least one of these distances is smaller than a predefined threshold. Otherwise, an anomaly is asserted.

To verify the feasibility of the above-mentioned application framework, we conducted additional experiments on the Secure Water Treatment (SWaT) dataset [5]. SWaT is a time series anomaly dataset provided by iTrust of Singapore University of Technology and Design (SUTD). The data in SWaT were collected using 24 sensors and 27 actuators on the water treatment testbed over 11 days through six steps: water storing, pre-treatment and chemical dosing, fine filtration, dechlorination, removing inorganic impurities, and water storing for distribution. Therefore, the SWaT dataset can be interpreted as a 51-dimensional time series dataset. Each time series is composed of numerical sensor and actuator values measured every second. We can model the water treatment application considered in the SWaT dataset using the components in Figure 3 as follows: the

S e r v e r_{s e t u p}

and

S e r v e r_{d e t e c t}

are the third party participants that are in charge of detecting anomalies in the entire water treatment system, and each IoT device in Figure 3 represents a water treatment plant. In each individual water treatment plant, a feature vector is created through a neural network-based feature extractor. Feature extraction is essentially a dimensionality reduction process. For this purpose, we used an encoder with three hidden fully connected layers of dimensions

4 n

,

2 n

, and n, respectively. This encoder can be viewed as the encoding layers of an autoencoder. Using an autoencoder is a well-established approach in the literature in time series anomaly detection [39]. We considered a hyperparameter n to represent the size of the final latent vectors, i.e., feature vectors. We tested

n = 8, 16

, and 32. This n corresponds to the vector size, n, in the experiment in Section 4. Among the 51 sensors and actuators, we removed six unstable data columns as in [39]. To represent the short-term temporal behavior of each sensor and actuator, we considered 60 consecutive measurements, i.e., 60 s for each sensor or actuator. Consequently, an input for the feature extractor consists of 45 time series with each series containing 60 measurements. In other words, a single input is a 60 × 45 matrix and the output is an n-dimensional feature vector. The feature vector is encrypted through the

E n c o d e X

and

E n c o d e Y

operations for registration and anomaly detection, respectively.

For our experiment, we used the same experimental setup as that in Section 4 (see Table 1). The feature extractor was implemented in C++ to extract feature vectors. Then, each IoT device conducted feature extraction followed by

E n c o d e X

(or

E n c o d e Y

). The experimental results measured at the anomaly detection stage, i.e., the time for feature extraction plus

E n c o d e Y

is presented in Figure 4. According to the results, the time required for feature extraction was roughly on the same order as that for

E n c o d e Y

. Note that the total computation time on Raspberry Pi was only 0.6396 s even for the combination of largest hyperparameters, that is,

p = 10

and

n = 32

.

5.1. Smart Building Management

The first application scenario involves smart building management. As already discussed in Section 1, IoT-based smart building management systems can cause serious privacy issues if the collected data are not properly protected [11]. We assume that

S e r v e r_{s e t u p}

belongs to the facility management department in a campus comprising several buildings, and

S e r v e r_{d e t e c t}

belongs to another department. Managers should be responsible for managing the IoT devices of their department, which is accomplished by observing the measured status.

An IoT device must be registered in advance using the following procedure. The device manager requests that the facility management department register the IoT device. The department then issues a unique device id

d e v i c e_{i}

and conducts

S e t u p

for the device to create its

m s k_{i}^{'}

. Referring to the device specifications,

S e r v e r_{s e t u p}

obtains the normal-state vectors

x_{1}, \dots, \begin{matrix} x_{M} \end{matrix}

and encrypts them into

c t_{i 1}^{'}, \dots, c t_{i M}^{'}

using

E n c o d e X

with

m s k_{i}^{'}

. The manager then stores

c t_{i 1}^{'}, \dots, c t_{i M}^{'}

in the department database for

S e r v e r_{d e t e c t}

.

Subsequently, anomaly detection is performed as follows: a manager who wants to investigate a suspicious

d e v i c e_{i}

in their department requests temporal access from

m s k_{i}

to

S e r v e r_{s e t u p}

. After authenticating this request,

S e r v e r_{s e t u p}

sends (

m s k_{i}^{'}

,

s e s s i o n I d

) to

d e v i c e_{i}

and (

d e v i c e_{i}

,

s e s s i o n I d

) to

S e r v e r_{d e t e c t}

. Device

d e v i c e_{i}

executes

E n c o d e Y

using

m s k_{i}^{'}

with the current feature vector

y_{i}

to output

s k_{i}^{'}

. Subsequently,

d e v i c e_{i}

sends (

d e v i c e_{i}

,

s e s s i o n I d

,

s k_{i}^{'}

) to

S e r v e r_{d e t e c t}

in the department.

S e r v e r_{d e t e c t}

validates the session and solves an encrypted NN search problem to obtain the minimum mean p-powered error by calculating

G e t D i s t a n c e (p p^{'}, s k_{i}^{'}, c t_{i j}^{'})

for

j = 1, \dots, M

. If the error is below a certain threshold, the device is considered healthy.

5.2. Remote Device Diagnosis

The second application scenario is remote device diagnosis. We assume that an IoT appliance manufacturing company provides self-diagnosis services to its clients and that the company knows the normal states of its devices before release. The company configures

S e r v e r_{s e t u p}

and

S e r v e r_{d e t e c t}

such that all

m s k^{'}

s for the devices are stored in

S e r v e r_{s e t u p}

and not shared with

S e r v e r_{d e t e c t}

and ensures that all the encrypted normal-state vectors of each device are stored in

S e r v e r_{d e t e c t}

.

When a user of

d e v i c e_{i}

wants to know whether their IoT appliance is working well, the user allows the device to initiate an anomaly detection protocol in a manner similar to that described in Section 5.1. The device pulls

m s k_{i}^{'}

from

S e r v e r_{s e t u p}

and uses

m s k_{i}^{'}

to encrypt the current state in

c t_{i}^{'}

and sends it to

S e r v e r_{d e t e c t}

. Finally, the device displays the result received from

S e r v e r_{d e t e c t}

to the user.

5.3. Possible Attacks and Mitigation on Our Systems

In this subsection, we address the possible attacks against each component of our system.

Attack to devices: an attacker may attempt to extract either the state vector or $m s k^{'}$ by observing the memory of a device while encryption is performed with $m s k^{'}$ . However, this type of physical threat can be mitigated using a a trusted execution environment.
Network attack: attackers may attempt to sniff the communication between a device and servers or steal the normal-state vectors stored in the database. As the vectors are encrypted by $E n c o d e X$ and $E n c o d e Y$ , attackers learn no information about the vectors, even when they obtain the encrypted vectors. Attackers may attempt to perform replay attacks and man-in-the-middle attacks on the communication between a device and servers to manipulate data. These attacks can be prevented by marking with a timestamp and authenticating with a message authentication code or digital signature of the server on every request and response.
Attack to servers: if $S e r v e r_{s e t u p}$ is compromised, $m s k^{'}$ s for all devices may be leaked. Therefore, we assume that $S e r v e r_{s e t u p}$ is protected with a proper mechanism. We also assume that $S e r v e r_{s e t u p}$ is trustworthy. In other words, it does not attempt to recover the device information with $m s k^{'}$ . Under this assumption, the only concern involves $S e r v e r_{d e t e c t}$ , which may want to recover any useful information from its database of encrypted feature vectors. (This also includes the case where $S e r v e r_{d e t e c t}$ is compromised from outside attackers.) However, this is prevented by the security property of FHIPE. For this, however, $S e r v e r_{s e t u p}$ and $S e r v e r_{d e t e c t}$ must be strictly separated to ensure that they do not share $m s k^{'} s$ with each other.

6. Limitation

The proposed method works only when the degree p is even. In (6), the

L_{p}

distance is composed of terms that are p-powers of

| x_{i} - y_{i} |

. Since it is not possible to compute an absolute value using FHIPE, we replaced

| x_{i} - y_{i} |^{p}

with

{(x_{i} - y_{i})}^{p}

. This is only possible when p is even. To be precise,

| x_{i} - y_{i} |^{p} = (| x_{i} - y_{i} {|^{2})}^{p / 2} = {({(x_{i} - y_{i})}^{2})}^{p / 2} = {(x_{i} - y_{i})}^{p}

. However when p is odd, this equality does not hold. To resolve this issue, we may consider a naive method for odd p as follows: given a maximum difference m between two vector elements

x_{i} \in Z_{q}

and

y_{i} \in Z_{q}

, we can compute

x_{i} - y_{i}

by computing the inner product of

(x_{i}, - 1)

and

(1, y_{i})

using FHIPE. If

x_{i} - y_{i} \leq m

, then

| x_{i} - y_{i} | = x_{i} - y_{i}

. Otherwise, if

x_{i} - y_{i} > m

, this implies that

y_{i}

is actually greater than

x_{i}

and

x_{i} - y_{i} = q - m^{'}

for

y_{i} - x_{i} = m^{'} \leq m

. Therefore, the actual absolute value can be computed as

q - (x_{i} - y_{i}) = m^{'}

. However, the above approach requires FHIPE Setup, KeyGen, Encrypt, and Decrypt for each absolute value term

| x_{i} - y_{i} |

separately. This may raise both performance and security issues. Therefore, a more advanced solution is required for odd p, which will be a future research direction.

7. Conclusions

In this study, we developed a method to compute the

L_{p}

distance privately using FHIPE for even

p > 2

. The proposed method is the first anomaly detection method that performs distance computation over ciphertext using the

L_{p}

distance metric for

p > 2

. Our experimental results demonstrate that the proposed method is sufficiently efficient for use in real-world IoT devices. For example,

E n c o d e X

and

E n c o d e Y

require only 787 ms and 430 ms, respectively, for a typical setting with

p = 6

and

n = 64

on a Raspberry Pi IoT device according to the experimental results in Section 4. Based on additional experiments using the SWaT dataset in Section 5, the time required to extract a 32-dimensional feature vector from a 60 × 45 input matrix using a three-layer fully connected neural network is 0.2604 s. If a typical setting is considered with

p = 6

[4], the time for

E n c o d e Y

on the Raspberry Pi side is 0.1944 s and the time for

G e t D i s t a n c e

on the server side is 0.4222 s. Therefore, the total computation time for anomaly detection is

(0.2604 + 0.1944 + 0.4222 M)

s when there are M possibilities for normal states. Assuming that M is not very large, e.g.,

M < 10

, the overall anomaly detection operation can be completed in a few seconds in a typical setting with

p = 6

. Therefore, the

L_{p}

distance can be used to compute the mean p-powered error for anomaly detection without revealing the actual state of IoT devices. Even though the anomaly detection server computes the distance between encoded state vectors of an IoT device, it never sees the actual values of these vectors. Thus, the proposed method is expected to be useful for outsourcing anomaly detection in a privacy-preserving manner. We suggest two practical scenarios for privacy-preserving anomaly detection: a smart building management system and remote device diagnosis.

Author Contributions

Conceptualization, M.-K.L.; software, D.-H.R. and S.-Y.J.; writing—original draft, D.-H.R.; writing—review and editing, S.-Y.J., J.H. and M.-K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported in part by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (No. 20214000000820) and in part by the Inha University Research Grant (2022).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available at https://github.com/inhaislab/FHIPE-Lp-Distance (accessed on 19 April 2023).

Acknowledgments

We thank Hee-Yong Kwon for his help in preparing time series data for our experiments.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FE	Functional encryption
IPE	Inner product encryption
FHIPE	Function-hiding inner product encryption
DL	Discrete logarithm
NN	Nearest neighbor

Appendix A

In this appendix, we provide the full numerical results of the experiments explained in Section 4.

Table A1. Execution time (ms) of

P D I S T . S e t u p

operation on desktop PC.

Table A1. Execution time (ms) of

P D I S T . S e t u p

operation on desktop PC.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	0.2829	1.3004	4.0788	9.1245	17.5512
16	0.7050	6.2276	23.2380	57.6781	116.7271
32	2.7922	38.1033	157.1521	409.0067	847.1027
64	13.4316	262.8206	1154.1302	3169.3995	6654.9266
128	84.3335	1996.4130	9102.0401	24,568.7575	51,736.9127

Table A2. Execution time (ms) of

P D I S T . E n c o d e X

operation on desktop PC.

Table A2. Execution time (ms) of

P D I S T . E n c o d e X

operation on desktop PC.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	0.5360	1.0178	1.4379	1.9993	2.3923
16	0.6928	1.5539	2.4909	3.3429	4.2303
32	1.0629	2.7375	4.5277	6.6077	9.3979
64	1.9716	5.4832	10.7796	17.2424	24.8129
128	3.6427	13.8724	28.8228	57.3657	100.5351

Table A3. Execution time (ms) of

P D I S T . E n c o d e Y

operation on desktop PC.

Table A3. Execution time (ms) of

P D I S T . E n c o d e Y

operation on desktop PC.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	0.4172	0.6683	0.9487	1.3669	1.6381
16	0.4990	1.0367	1.7212	2.2604	3.1142
32	0.7689	2.1281	3.5350	5.6405	8.2282
64	1.3513	4.3957	9.2588	15.2354	21.8299
128	2.7581	12.0748	25.7857	51.5276	84.5382

Table A4. Execution time (ms) of

P D I S T . G e t D i s t a n c e

operation on desktop PC.

Table A4. Execution time (ms) of

P D I S T . G e t D i s t a n c e

operation on desktop PC.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	3.9364	25.9514	209.0308	1943.1150	21,997.1054
16	5.5135	33.2946	273.7545	3005.3176	28,656.2952
32	7.1317	51.0461	422.1917	3934.5410	44,462.6137
64	11.1817	67.7086	555.1111	6066.4581	57,326.4031
128	15.8210	105.7101	856.1863	7930.5029	89,076.5944

Table A5. Execution time (sec) of

P D I S T . E n c o d e X

operation on Raspberry Pi.

Table A5. Execution time (sec) of

P D I S T . E n c o d e X

operation on Raspberry Pi.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	0.0334	0.0681	0.1046	0.1382	0.1752
16	0.0515	0.1202	0.1906	0.2638	0.3386
32	0.0855	0.2281	0.3780	0.5351	0.6965
64	0.1560	0.4553	0.7879	1.1443	1.5455
128	0.3026	0.9714	1.7483	2.6981	3.8167

Table A6. Execution time (sec) of

P D I S T . E n c o d e Y

operation on Raspberry Pi.

Table A6. Execution time (sec) of

P D I S T . E n c o d e Y

operation on Raspberry Pi.

n	$p =$ 2	$p =$ 4	$p =$ 6	$p =$ 8	$p =$ 10
8	0.0162	0.0332	0.0512	0.0687	0.0866
16	0.0249	0.0596	0.0951	0.1337	0.1732
32	0.0419	0.1143	0.1944	0.2809	0.3765
64	0.0770	0.2376	0.4298	0.6494	0.9114
128	0.1534	0.5407	1.0486	1.7215	2.5846

References

Chandola, V.; Banerjee, A.; Kumar, V. Anomaly Detection: A Survey. ACM Comput. Surv. 2009, 41, 1–58. [Google Scholar] [CrossRef]
Denning, D. An Intrusion-Detection Model. IEEE Trans. Softw. Eng. 1987, SE-13, 222–232. [Google Scholar] [CrossRef]
Shafiq, M.; Tian, Z.; Bashir, A.K.; Du, X.; Guizani, M. CorrAUC: A malicious bot-IoT traffic detection method in IoT network using machine-learning techniques. IEEE Internet Things J. 2020, 8, 3242–3254. [Google Scholar] [CrossRef]
Shalyga, D.; Filonov, P.; Lavrentyev, A. Anomaly Detection for Water Treatment System based on Neural Network with Automatic Architecture Optimization. arXiv 2018, arXiv:1807.07282. [Google Scholar]
iTrust lab of Singapore University of Technology and Design (SUTD). Available online: https://itrust.sutd.edu.sg/itrust-labs-home/itrust-labs_swat (accessed on 16 January 2023).
Meteriz, Ü.; Y𝜄ld𝜄ran, N.F.; Kim, J.; Mohaisen, D. Understanding the Potential Risks of Sharing Elevation Information on Fitness Applications. In Proceedings of the 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS), Singapore, 29 November–1 December 2020; pp. 464–473. [Google Scholar]
Tabassum, M.; Kosiński, T.; Frik, A.; Malkin, N.; Wijesekera, P.; Egelman, S.; Lipford, H.R. Investigating Users’ Preferences and Expectations for Always-Listening Voice Assistants. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2020, 3, 1–23. [Google Scholar] [CrossRef]
Ma, T.; Kim, S.D.; Wang, J.; Zhao, Y. Privacy Preserving in Ubiquitous Computing: Challenges & Issues. In Proceedings of the 2008 IEEE International Conference on e-Business Engineering, Xi’an, China, 22–24 October 2008; pp. 297–301. [Google Scholar]
Hall, D.; Llinas, J. An introduction to multisensor data fusion. Proc. IEEE 1997, 85, 6–23. [Google Scholar] [CrossRef]
Nesa, N.; Banerjee, I. IoT-Based Sensor Data Fusion for Occupancy Sensing Using Dempster–Shafer Evidence Theory for Smart Buildings. IEEE Internet Things J. 2017, 4, 1563–1570. [Google Scholar] [CrossRef]
Guo, E.; Ryan-Mosley, T. Inside the Bitter Campus Privacy Battle over Smart Building Sensors, MIT Technology Review, 3 April 2023. Available online: https://www.technologyreview.com/2023/04/03/1070665/cmu-university-privacy-battle-smart-building-sensors-mites/?truid=&utm_source=the_download&utm_medium=email&utm_campaign=the_download.unpaid.engagement&utm_term=&utm_content=04-03-2023&mc_cid=bcaff9641c&mc_eid=fa860b1d8f (accessed on 19 April 2023).
Kim, S.; Lewi, K.; Mandal, A.; Montgomery, H.; Roy, A.; Wu, D.J. Function-Hiding Inner Product Encryption Is Practical. In Proceedings of the Security and Cryptography for Networks, Amalfi, Italy, 5–7 September 2018; Catalano, D., De Prisco, R., Eds.; Springer: Cham, Switzerland, 2018; pp. 544–562. [Google Scholar]
Jeon, S.Y.; Lee, M.K. Acceleration of Inner-Pairing Product Operation for Secure Biometric Verification. Sensors 2021, 21, 2859. [Google Scholar] [CrossRef]
Zhou, K.; Ren, J. PassBio: Privacy-Preserving User-Centric Biometric Authentication. IEEE Trans. Inf. Forensics Secur. 2018, 13, 3050–3063. [Google Scholar] [CrossRef]
Lee, J.; Kim, D.; Kim, D.; Song, Y.; Shin, J.; Cheon, J.H. Instant Privacy-Preserving Biometric Authentication for Hamming Distance; Cryptology ePrint Archive, Report 2018/1214; IACR. 2018. Available online: https://eprint.iacr.org/2018/1214 (accessed on 21 April 2023).
Im, J.H.; Jeon, S.Y.; Lee, M.K. Practical Privacy-Preserving Face Authentication for Smartphones Secure Against Malicious Clients. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2386–2401. [Google Scholar] [CrossRef]
Wang, X.; Xue, H.; Liu, X.; Pei, Q. A Privacy-Preserving Edge Computation-Based Face Verification System for User Authentication. IEEE Access 2019, 7, 14186–14197. [Google Scholar] [CrossRef]
Li, X.Y.; Jung, T. Search me if you can: Privacy-preserving location query service. In Proceedings of the 2013 Proceedings IEEE INFOCOM, Turin, Italy, 14–19 April 2013; pp. 2760–2768. [Google Scholar]
Wei, F.; Vijayakumar, P.; Kumar, N.; Zhang, R.; Cheng, Q. Privacy-Preserving Implicit Authentication Protocol Using Cosine Similarity for Internet of Things. IEEE Internet Things J. 2021, 8, 5599–5606. [Google Scholar] [CrossRef]
Murugesan, M.; Jiang, W.; Clifton, C.; Si, L.; Vaidya, J. Efficient privacy-preserving similar document detection. VLDB J. 2010, 19, 457–475. [Google Scholar] [CrossRef]
Gheid, Z.; Challal, Y. An efficient and privacy-preserving similarity evaluation for big data analytics. In Proceedings of the 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC), Limassol, Cyprus, 7–10 December 2015; pp. 281–289. [Google Scholar]
Li, D.; Chen, C.; Lv, Q.; Shang, L.; Zhao, Y.; Lu, T.; Gu, N. An algorithm for efficient privacy-preserving item-based collaborative filtering. Future Gener. Comput. Syst. 2016, 55, 311–320. [Google Scholar] [CrossRef]
Alabdulatif, A.; Kumarage, H.; Khalil, I.; Yi, X. Privacy-preserving anomaly detection in cloud with lightweight homomorphic encryption. J. Comput. Syst. Sci. 2017, 90, 28–45. [Google Scholar] [CrossRef]
Alabdulatif, A.; Khalil, I.; Kumarage, H.; Zomaya, A.Y.; Yi, X. Privacy-preserving anomaly detection in the cloud for quality assured decision-making in smart cities. J. Parallel Distrib. Comput. 2019, 127, 209–223. [Google Scholar] [CrossRef]
Mehnaz, S.; Bertino, E. Privacy-preserving Real-time Anomaly Detection Using Edge Computing. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 469–480. [Google Scholar]
Lyu, L.; Law, Y.W.; Erfani, S.M.; Leckie, C.; Palaniswami, M. An improved scheme for privacy-preserving collaborative anomaly detection. In Proceedings of the 2016 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), Sydney, NSW, Australia, 14–18 March 2016; pp. 1–6. [Google Scholar]
Mukherjee, S.; Chen, Z.; Gangopadhyay, A. A privacy-preserving technique for Euclidean distance-based mining algorithms using Fourier-related transforms. VLDB J. 2006, 15, 293–315. [Google Scholar] [CrossRef]
Kikuchi, H.; Nagai, K.; Ogata, W.; Nishigaki, M. Privacy-preserving similarity evaluation and application to remote biometrics authentication. Soft Comput. 2010, 14, 529–536. [Google Scholar] [CrossRef]
Bringer, J.; Chabanne, H.; Favre, M.; Patey, A.; Schneider, T.; Zohner, M. GSHADE: Faster privacy-preserving distance computation and biometric identification. In Proceedings of the 2nd ACM Workshop on Information Hiding and Multimedia Security, Santa Barbara, CA, USA, 27–28 June 2014; pp. 187–198. [Google Scholar]
Kang, H.E.D.; Kim, D.; Kim, S.; Kim, D.D.; Cheon, J.H.; Anthony, B.W. Homomorphic Encryption as a secure PHM outsourcing solution for small and medium manufacturing enterprise. J. Manuf. Syst. 2021, 61, 856–865. [Google Scholar] [CrossRef]
Kwon, H.Y.; Lee, M.K. Comments on “PassBio: Privacy-Preserving User-Centric Biometric Authentication”. IEEE Trans. Inf. Forensics Secur. 2022, 17, 2816–2817. [Google Scholar] [CrossRef]
Jia, Q.; Guo, L.; Jin, Z.; Fang, Y. Privacy-Preserving Data Classification and Similarity Evaluation for Distributed Systems. In Proceedings of the 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS), Nara, Japan, 27–30 June 2016; pp. 690–699. [Google Scholar]
Barreto, P.S.L.M.; Naehrig, M. Pairing-Friendly Elliptic Curves of Prime Order. In Proceedings of the 12th International Conference on Selected Areas in Cryptography, Kingston, ON, Canada, 11–12 August 2005; Springer: Berlin/Heidelberg, Germany, 2005. [Google Scholar]
El Mrabet, N.; Joye, M. Guide to Pairing-Based Cryptography; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
Fallat, S.M.; Johnson, C.R. Hadamard powers and totally positive matrices. Linear Algebra Appl. 2007, 423, 420–427. [Google Scholar] [CrossRef]
A Library for Doing Numbery Theory (NTL). Available online: https://www.shoup.net/ntl (accessed on 16 January 2023).
A Portable and Fast Pairing-Based Cryptography Library. Available online: https://github.com/herumi/mcl (accessed on 16 January 2023).
The GNU MP Bignum Library. Available online: https://gmplib.org/ (accessed on 16 January 2023).
Wang, C.; Wang, B.; Liu, H.; Qu, H. Anomaly detection for industrial control system based on autoencoder neural network. Wirel. Commun. Mob. Comput. 2020, 2020, 8897926. [Google Scholar] [CrossRef]

Figure 1. Execution time (ms) of the operations of

\prod_{P D I S T}

on a desktop PC.

Figure 1. Execution time (ms) of the operations of

\prod_{P D I S T}

on a desktop PC.

Figure 2. Execution time (ms) of EncodeX and EncodeY on Raspberry Pi.

Figure 3. Example framework for privacy-preserving anomaly detection.

Figure 4. Total execution time for feature extraction and

E n c o d e Y

in Raspberry Pi. The line and bars represent the execution times of feature extraction and

E n c o d e Y

operations, respectively.

Figure 4. Total execution time for feature extraction and

E n c o d e Y

in Raspberry Pi. The line and bars represent the execution times of feature extraction and

E n c o d e Y

operations, respectively.

Table 1. Experimental setup for performance analysis.

	Desktop Computer	Raspberry Pi
CPU	AMD Ryzen5 3600 6-Core @ 3.60 GHz	ARM Cortex-A7 4-Core @ 900 MHz
Memory	16 GB	1 GB
OS	Ubuntu 20.04 (using Windows 11 Pro WSL2)	Raspbian
language	C++11
SW library	FHIPE library [13], NTL 11.3.2 [36], MCL 1.51 [37], GMP 6.1.2 [38]

Table 2. Comparison of privacy-preserving anomaly detection methods (* computed in plaintext).

Method	Cryptographic Primitive	Protection	Metric	p
[23]	HE	Client data	Euclidean distance	2
[24]	HE	Client data	Euclidean distance	2
[25]	HE (Additive only)	Client data	Q-function * (customized metric)	–
[26]	Differential Privacy	Train data	Mean absolute error *	1
Proposed	FE	Client data	$L_{p}$ distance	any even p

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ryu, D.-H.; Jeon, S.-Y.; Hong, J.; Lee, M.-K. Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection. Sensors 2023, 23, 4169. https://doi.org/10.3390/s23084169

AMA Style

Ryu D-H, Jeon S-Y, Hong J, Lee M-K. Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection. Sensors. 2023; 23(8):4169. https://doi.org/10.3390/s23084169

Chicago/Turabian Style

Ryu, Dong-Hyeon, Seong-Yun Jeon, Junho Hong, and Mun-Kyu Lee. 2023. "Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection" Sensors 23, no. 8: 4169. https://doi.org/10.3390/s23084169

APA Style

Ryu, D. -H., Jeon, S. -Y., Hong, J., & Lee, M. -K. (2023). Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection. Sensors, 23(8), 4169. https://doi.org/10.3390/s23084169

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection

Abstract

1. Introduction

1.1. Related Works

1.2. Contributions

2. Preliminaries

2.1. Barreto–Naehrig Curve (BN Curve)

2.2. Cryptographic Pairing

2.3. FHIPE

2.4. $L_{1}$ and $L_{2}$ Distances over FHIPE

3. Proposed Method

4. Performance Analysis

5. Applications

5.1. Smart Building Management

5.2. Remote Device Diagnosis

5.3. Possible Attacks and Mitigation on Our Systems

6. Limitation

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Efficient Lp Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection

Abstract

1. Introduction

1.1. Related Works

1.2. Contributions

2. Preliminaries

2.1. Barreto–Naehrig Curve (BN Curve)

2.2. Cryptographic Pairing

2.3. FHIPE

2.4. L 1 and L 2 Distances over FHIPE

3. Proposed Method

4. Performance Analysis

5. Applications

5.1. Smart Building Management

5.2. Remote Device Diagnosis

5.3. Possible Attacks and Mitigation on Our Systems

6. Limitation

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Efficient L_p Distance Computation Using Function-Hiding Inner Product Encryption for Privacy-Preserving Anomaly Detection

2.4. $L_{1}$ and $L_{2}$ Distances over FHIPE