Next Article in Journal
Blockchain-Based Data Sharing and Trading Model for the Connected Car
Next Article in Special Issue
A Novel Location Privacy-Preserving Approach Based on Blockchain
Previous Article in Journal
A Metaheuristic Optimization Approach for Parameter Estimation in Arrhythmia Classification from Unbalanced Data
Previous Article in Special Issue
A Taxonomy of DDoS Attack Mitigation Approaches Featured by SDN Technologies in IoT Scenarios
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Combinatorial Subset Difference—IoT-Friendly Subset Representation and Broadcast Encryption

1
Department of Information System, Hanyang University, Seoul 04763, Korea
2
Department of Security Enhanced Smart Electric Vehicle, Kookmin University, Seoul 02707, Korea
3
Department of Electrical Engineering, Kookmin University, Seoul 02707, Korea
*
Author to whom correspondence should be addressed.
Sensors 2020, 20(11), 3140; https://doi.org/10.3390/s20113140
Submission received: 28 April 2020 / Revised: 27 May 2020 / Accepted: 27 May 2020 / Published: 2 June 2020
(This article belongs to the Special Issue Security and Privacy Techniques in IoT Environment)

Abstract

:
In the Internet of Things (IoT) systems, it is often required to deliver a secure message to a group of devices. The public key broadcast encryption is an efficient primitive to handle IoT broadcasts, by allowing a user (or a device) to broadcast encrypted messages to a group of legitimate devices. This paper proposes an IoT-friendly subset representation called Combinatorial Subset Difference (CSD), which generalizes the existing subset difference (SD) method by allowing wildcards (*) in any position of the bitstring. Based on the CSD representation, we first propose an algorithm to construct the CSD subset, and a CSD-based public key broadcast encryption scheme. By providing the most general subset representation, the proposed CSD-based construction achieves a minimal header size among the existing broadcast encryption. The experimental result shows that our CSD saves the header size by 17% on average and more than 1000 times when assuming a specific IoT example of IP address with 20 wildcards and 2 20 total users, compared to the SD-based broadcast encryption. We prove the semantic security of CSD-based broadcast encryption under the standard l-BDHE assumption, and extend the construction to a chosen-ciphertext-attack (CCA)-secure version.

1. Introduction

In the recent applications, the Internet of Things (IoT) systems are more likely to involve multicast of privacy-sensitive information; for example, an IoT sensor network in the smart city involves personal whereabouts transmitted to multiple devices, and an IoT medical system requires sensitive health information to be delivered to the authorized devices. In these IoT secure communications, cryptographic primitives can provide useful functionalities and efficiencies. Especially, advanced encryption protocols like attribute-based encryption or broadcast encryption can handle simultaneous multicast efficiently; attribute-based encryption is often applied to the IoT devices [1,2], and broadcast encryption, which is a specific and efficient version of the attribute-based encryption, is also considered [3] for secure firmware updates in IoT.
Until now, existing cryptographic applications in IoT systems such as those in References [1,2,3,4] assumed device ID as an intrinsic identity string. However, in a network environment in which the IoT device is identified by an IP address, it is useful to use IP address as an ID. Moreover, an arbitrary network group can be effectively represented by IP address including wildcard (*), and if a part of IP is connected to attributes, a device group specified by a set of attributes can also be effectively expressed. This flexible representation denoted by IP characteristics or attributes covers any specific group even without knowing every individual ID predefined. This paper focuses on the efficient and scalable public key broadcast encryption scheme that transmits a message securely to any group of legitimate receiver represented as an flexible IP address in the IoT environment.
Broadcast Encryption. The public key broadcast encryption is an effective cryptosystem for a secure group communication which allows a user (or a device) to broadcast secure messages so that only privileged users (or devices) can decrypt them. To minimize the broadcasting payload, the design of broadcast encryption adopts the hybrid encryption; an original message is often encrypted with a simple symmetric key encryption (e.g., advanced encryption standard (AES)), and the applied symmetric key becomes an official message for public key broadcast encryption. The encrypted key is called a header, which is often regarded as an official ciphertext in the broadcast encryption literature. For the transmission, the privileged users are organized within multiple subsets according to the subset representation. Then the broadcast encryption algorithm is repeatedly applied to each subset; the headers are collected into a vector so that the receiver can find and decrypt the header for his own subset. The symmetric key encryption of the original message is broadcast to the entire users, but only privileged users can decrypt the symmetric key from the header and obtain the message.
Subset Representation. In the broadcast encryption, a subset representation is an important factor which strictly determines the number of subsets. The number of subsets decides the total header size since each subset requires an individual header. Therefore, it is recommended to maintain the subset representation as general as possible so that the representation can cover as many as privileged users, which can decrease the number of subsets. In the broadcast encryption, the subset representation is considered separately: the construction assumes there already exists pre-defined subsets covering the given privileged users.
Existing Methods. The well-known subset representations are complete subtree (CS), subset difference (SD), and interval. They are all based on the binary tree structure, where the users are denoted as leaf nodes of a binary tree. Figure 1 visualizes each representation with an example where green nodes are privileged and red nodes are revoked.
The CS representation covers the privileged users by denoting the parent node; each parent node itself is a subset which includes the descendant leaf nodes. For the example in Figure 1, node 9 covers users 17 and 18, and node 19 covers only user 19. The CS representation requires 8 subsets to cover the privileged users in Figure 1.
The SD representation extends the CS by enhancing expressiveness; its covers users by denoting the difference of two subtrees as ( c , d ) where c is a covered set and d is a revoked set. For the example in Figure 1, a subset ( 5 , 20 ) which subtracts subtree 20 from subtree 5 covers users 17, 18, and 19. Since it generalizes the CS representation by adding a revoked set expression, the SD representation requires 4 subsets to cover the privileged users which is less than the CS representation.
The interval representation covers users within an interval, by denoting the left and right edge of the interval. For the example in Figure 1, a subset (17∼19) includes users 17, 18, and 19. The interval representation requires 4 subsets to cover the privileged users, which is comparable to the SD  representation.
Limitations. The main restriction of the existing methods is that they are all limited to the hierarchical tree structure. When expressing the node with a binary label instead of a number, tree-based representations cannot effectively handle non-hierarchical combinations required to represent flexible IP addresses we consider. That is, as in Figure 1, each node can be expressed with a bitstring label which consists of { 0 , 1 , * } , by considering the left branch as 0, right branch as 1, and both as wildcard (*) starting from the root. For instance, node 8 can be expressed as a label 011, and a label 01* can include nodes 7 (010) and 8 (011). In this case, the limitation is that each bit is determined from the root to the leaf node following the tree hierarchy; the wildcard cannot exist before 0 or 1 since the parent should be determined before the child nodes. For example, a label * 00 is not expressible since the wildcard stands before 0, which makes it as a non-existent node in a binary tree.
A main concern is that this limitation prevents the broadcast encryption from being efficiently applied to the IoT systems. Consider the secure group communication of IoT systems where a device is identified by its IP address combined with attributes which is not restricted to the hierarchy. In other words, the wildcard occurs in any position to express a specific group representation  (e.g., 0 * * * . * * 1 . * * * . 001 ). However, since existing representation methods limit the position of wildcards as we described above, they cannot cover the general and flexible IP addresses within a single subset, which leads to a large number of subsets.
Combinatorial Subset Difference. To overcome the hierarchical limitations, we propose a new IoT-friendly subset representation called combinatorial subset difference (CSD), which extends the SD representation by allowing any combinatorial labels, that is, allowing wildcards in any position. Specifically, the CSD expresses a subset similar to SD as ( c , d ) , but where c is a covered label and d is a revoked label that can allow wildcards in any bits. Our CSD is the most general representation among the existing methods, indicating that any subset expressed in the existing representation can be transformed to the CSD subset. Therefore, the CSD can achieve minimal number of subsets in any cases. To cover the privileged users in the example of Figure 1 (by expressing each user as a label), the CSD requires only a single subset ( 0 * * * * , 0 * * 11 ) while SD and interval representations require 4 different subsets.
To adopt the CSD representation in practical broadcast encryption, it is required to construct an algorithm that can output appropriate CSD subsets from the given input of privileged users. Thus, we propose a heuristic algorithm which generates CSD subsets from the list of users. We first construct an SD algorithm that can output SD subsets from the input of privileged users, then extend the algorithm to cover the general CSD cases. The proposed CSD algorithm generates r subsets for the worst case while the SD generates 2 r 1 for the worst case, where r is the number of revoked users. For any cases, the algorithm guarantees that the number of CSD subsets are always no more than the SD subsets.
CSD-based broadcast encryption. Designing the broadcast encryption is independent of the subset representation; since we proposed a new subset representation, it is also required to design a new broadcast encryption that can adopt the input of CSD subsets. In this paper, we propose a new CSD-based public key broadcast encryption that has a minimal total header size among the existing broadcast encryption due to the minimal number of subsets. The minimal header size indicates minimal transmission cost, which can be suitable for the IoT system applications. Moreover, our CSD-based broadcast encryption improves the encryption time and decryption time while maintaining the order of key sizes.
Table 1 shows the comparison between the CSD and the existing public key broadcast encryption schemes. The public key SD is from the specifically measured by applying the public key lifting transformation [5], which combines the hierarchical identity-based encryption (HIBE) [6] to the symmetric key SD-based broadcast encryption [7,8] from the advanced access content system (AACS) DVD standard [9]. The interval refers to the Lin’s construction [10], which is based on a new interval representation from binary trees. The revocable SD refers to Lee and Park’s construction [11], which is based on the same SD representation, but improves the order of key sizes and enhances identity revocation algorithm. The elements in revocable SD is within a composite order group, where the element size is 8 times larger than the normal prime order group for the security; the size of composite order group elements are denoted as × 8 . The header size refers to the total header size in the worst case, which is the product of the number of group elements per subset and the number of subsets. Among the state-of-the-art public key broadcast encryption, our CSD achieves a minimal header size of 2 r , due to the minimal number of subsets.
The CSD also improves the encryption time and decryption time, compared to the other constructions. Nevertheless, the CSD still maintains the public key size within O ( log n ) and the secret key size within O ( log 2 n ) , which is comparable to the existing public key broadcast encryption. To verify the theoretical improvements, we implemented the CSD, SD, and interval broadcast encryption on the Intel Edison embedded systems: the experimental results show that our CSD reduces the header size by 1000 times on average when applied to the IP address examples, with also improving both encryption and decryption time, compared to the existing methods.
The security of our CSD-based public key broadcast encryption is proven under the standard bilinear diffie-hellman exponent ( l B D H E ) assumption; we first prove the semantic security (CPA-secure) of our original construction, and then extend the construction to a CCA-secure version.
Contributions. We now summarize the contribution of our paper, in the sense of practicality and theoretical advances.
  • CSD subset representation: we propose a new IoT-friendly subset representation called combinatorial subset difference (CSD), the most general representation that achieves minimal number of subsets among the existing methods.
  • CSD covering algorithm: we construct a subset generation algorithm for the proposed CSD representation, which can output CSD subsets from the given input of privileged users.
  • CSD-based broadcast encryption: we propose a new public key broadcast encryption based on the CSD representation, which achieves a minimal header size due to the minimal number of subsets, suitable for the secure group communication in IoT systems. It also improves encryption and decryption time compared to the existing public key broadcast encryption.
  • Security proof: we provide a formal security proof for the semantic security of our construction under the l B D H E assumption, in the standard model.
  • CCA-secure extension: we also propose an extended CCA-secure version of our CSD-based public key broadcast encryption, along with a formal proof for the CCA security.
  • Implementation: we implemented our construction on the actual IoT device to verify the performance, and provide experimental results showing comparison between the CSD and the existing broadcast encryption schemes.
This paper is organized as follows. Section 2 states related works. Section 3 describes the subset construction algorithm. Section 4 organizes preliminaries for broadcast encryption systems. We present our broadcast encryption scheme in Section 5, and prove the security for security analysis in Section 6. Then we extend it to a CCA-secure broadcast encryption scheme in Section 7, and prove the security in Section 8. Section 9 shows experimental results. Section 10 makes conclusion.

2. Related Work

2.1. Secure Multicast

Secure multicast is a traditional encryption model for multi-user infrastructure, which considers an environment that encrypts messages to multiple users simultaneously. Most of the secure multicast protocols work with a central manager responsible for the broadcast, and focuses on improving the efficiency of key distribution and transmission.
Among the secure multicast research, Chu et al. [12] proposed a notable end-to-end protocol which can secure both message and the copyright of the content. Their work support the dynamic group, which can let users join/leave or expel users during the system running. However, the security of their protocol relies on informal analysis against few attacks, and the fundamental technique remains on multiple encryption which let the encryption required for the number of total users in the group.
For the efficiency, Kumar et al. [13] proposed a protocol based on the centralized key distribution. By providing more efficient key distribution from the hierarchical tree structure, their protocol could achieve more flexible group control and efficient computation. Still, the total communication relies on the individual transmissions, which leads to a considerably large transmission load.
Recently, Wang et al. [14] implemented the concept of secure multicast on the IoT environment, by proposing an end-to-end key agreement protocol for the IoT devices. It focuses on the authentication of devices when interacting with the user’s server, where the user can open a secure channel if the key is once agreed. For the security, their protocol deploy fuzzy extractor, which can use user’s biometric information such as fingerprints as a secret key.

2.2. Broadcast Encryption

The broadcast encryption is a traditional cryptosystem where one can encrypt a message to the group of multiple users. In the broadcast encryption, the encryption can be handled for each group rather than individuals; it is an efficient primitive to be deployed on the secure multicast.
There are many constructions for the broadcast encryption, with different functionalities and improvements [5,6,8,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]. In the cryptographic literature, the broadcast encryption is categorized by two different criteria: symmetric/public key, and stateful/stateless.
Symmetric vs. Public. Similar to the categorization of standard encryption systems, the broadcast encryption can be either be based on the symmetric key settings or the public key (or asymmetric) settings. If the broadcast encryption is based on the symmetric key [25,29], only a user with the key can encrypt the message, that is, only the central authority can broadcast messages to other users. On the other hand, if the broadcast encryption is based on the public key [8,10,23], anyone can broadcast messages to other users.
Stateful vs. Stateless. The stateful broadcast encryption and stateless broadcast encryption is determined by the key update. In the stateful broadcast encryption [28], the secret key of users can be updated time to time, in order to help the revocation (proper broadcast). However, in the stateless broadcast encryption [8,25,26], the system allows the key exchange for only once in the initial setup. Stateful broadcast encryption can be easier to design with the help of key updates, but it also involves a price of simultaneous key managements. On the other hand, in the stateless broadcast encryption, the system can change revocation without any key management after the initialization.
  In many interactive IoT systems, it is often required to allow each device to broadcast messages rather than restricting the broadcast to the central manager, which requires public key broadcast encryption. Also, stateless broadcast encryption is more appropriate than the stateful settings for the IoT systems, since small and numerous IoT devices are not suitable for frequent updates and managements. In these sense, the proposed broadcast encryption for combinatorial subset difference is a stateless public key broadcast encryption, which satisfies the IoT requirements.

3. Proposed Subset Construction Algorithm

Despite the efficiency of the combinatorial subset difference (CSD) representation, it is a remaining work to construct a concrete algorithm which returns CSD subsets from a list of privileged/revoked users. In this section, we propose a heuristic algorithm for the CSD subset construction. The proposed algorithm does not assure the optimal minimization of the subsets. However, it guarantees the number of subsets in CSD is no more than the existing subset difference (SD) algorithm [8]. For the worst case, it generates no more than r subsets while the existing SD algorithm generates 2 r 1 subsets.
Since CSD is the generalization of SD, we first explain the how the SD algorithm minimizes the number of subsets. Then we show how to extend it to the CSD algorithm, by focusing on the difference between the SD and the CSD.
In the SD (and CSD) representation, a subset is represented as S = ( c , d ) , where c is a covered set (nodes to include), and d is a revoked set (nodes to exclude). For instance, in a tree structure, a subset S includes all descendants of the node c, except all descendants of the node d.
In the SD algorithm, an internal node c can be categorized as one of the following three types, as illustrated in Figure 2, depending on its children nodes.
  • Type ⊤: There is no revoked descendant of c.
  • Type ⊥: Node c is excluded by its parent, because either it is already covered by another subset or its all descendants are revoked.
  • Type S: Node c becomes a subset ( c , d ) , where all descendants of d are either revoked or already covered by other subsets.
The node type is determined by the types of its children. Therefore, the internal node type is determined from bottom to top, recursively. As in Figure 3, combining two internal nodes identifies the type (Figure 2) of their parent.
For both SD and CSD, the examples in Figure 3 are valid. However, as in Figure 4, the CSD is different from SD in case when two S nodes are combined; the SD let the parent of two S be a type ⊥, while CSD let it be another S. In the SD, a set can be fixed only if its descendants are all ⊤ or ⊥. Therefore, two S nodes cannot be merged into a single set; the parent node becomes ⊥ since two S type nodes are determined as individual subsets in SD. However, in CSD, the parent node becomes another S by merging two S nodes, without generating two subsets.
Figure 5 shows an example where SD and CSD outputs different results, where users (nodes) 1 and 3 are revoked among 8 total users. When representing the revoked nodes in a binary format, 001 and 011 are revoked. The parent nodes are denoted as *. For instance, 00 * is a parent node of 000 and 001, and 01 * is a parent node of 010 and 011. The type of node 000 is ⊤ since nothing is revoked, and the type of node 001 is ⊥ since it is revoked. The type of node 00 * is S, since its children 000 is included while its other children 001 is excluded, and it requires a subset ( 00 * , 001 ) to be covered. Similarly, the type of node 01 * is S. In case of the SD algorithm, the node 0 * * with type S children cannot be an inclusion label since both of its children are labeled as two different subsets ( ( 00 * , 001 ) and ( 01 * , 011 ) ) which cannot be merged. Thus, 0 * * should be excluded which determines its type as ⊥. Then the type of the root note is S since its left child is ⊥ and its right child is ⊤. As a result, three subsets are generated as ( * * * , 0 * * ) , ( 00 * , 001 ) , and ( 01 * , 011 ) .
On the other hand, in the CSD algorithm, node 0 * * is treated specially, different from the SD. Although the type of its children are both S, it is not necessary to exclude the node; if the node can be merged with a sibling node of type ⊤, it is possible to represent the node without generating a new subset. In Figure 5, the node 0 * * has two subsets ( 00 * , 000 ) and ( 01 * , 010 ) , and it has a sibling node 1 * * of type ⊤. When merging subsets ( 00 * , 000 ) and ( 01 * , 010 ) with ( 1 * * , ) , it generates ( * 0 * , 000 ) and ( * 1 * , 010 ) where * 0 * and * 1 * can include all eight users while 000 and 010 can exclude 000 and 010 since the CSD allows * in the middle of the representation regardless of the tree hierarchy. As a result, the CSD generates two subsets as ( * 0 * , 001 ) and ( * 1 * , 011 ) , which does not generates subsets more than the number of revoked users.
Table 2 describes node type resolution in SD and CSD, where T 0 is the type of left child and T 1 is the type of right child. The subset generation indicates whether a new subset generation is required for the resolved node type; a new subset should be generated when one child is included while the other child is excluded. The only difference between SD and CSD occurs when the type of both children are S: the node is merged into type ⊥ in SD while it is merged into another S type in CSD.
Algorithm 1 shows how to build subsets in both SD and CSD methods. The only difference appears at line 27: the CSD proceeds an additional line 27 while the SD does not. Consequently, as shown in Table 2, line 29 ( r e s o l v e T y p e ( T 0 , T 1 ) ) returns different values.
Algorithm 1 Subset construction for SD and CSD.
1:
function s d ( S e t , U n r e s o l v e d , i , p r e f i x , r e v S e t )
2:
if i = d e p t h then
3:
  if | r e v S e t | > 0 then
4:
   return
5:
  else
6:
   return
7:
  end if
8:
else if | r e v S e t | = 0 then
9:
  return
10:
end if
11:
S 0 = { r | r r e v S e t , b i t ( r , i + 1 ) = 0 }
12:
S 1 = { r | r r e v S e t , b i t ( r , i + 1 ) = 1 }
13:
T 0 s d ( S e t , U 0 , i + 1 , p r e f i x | | 0 , S 0 )
14:
T 1 s d ( S e t , U 1 , i + 1 , p r e f i x | | 1 , S 1 )
15:
if ( T 0 = ) ( T 1 = ) then
16:
   s c r e a t e S u b s e t ( p r e f i x , i , 0 )
17:
   U n r e s o l v e d s ; S e t S e t s
18:
else if ( T 0 = ) ( T 1 = ) then
19:
   s c r e a t e S u b s e t ( p r e f i x , i , 1 )
20:
   U n r e s o l v e d s ; S e t S e t s
21:
else if ( T 0 = S ) ( T 1 = ) then
22:
   ( c , d ) U 0 , c i * ; U n r e s o l v e d U 0
23:
else if ( T 0 = ) ( T 1 = S ) then
24:
   ( c , d ) U 1 , c i * ; U n r e s o l v e d U 1
25:
else if C o m b i n a t o r i a l ( T 0 = S ) ( T 1 = S ) then
26:
  // Only used for CSD
27:
   U n r e s o l v e d U 0 U 1
28:
end if
29:
return r e s o l v e T y p e ( T 0 , T 1 )
30:
end function
31:
procedure m a i n
32:
r e v S e t { r e v o k e d u s e r I D s }
33:
h e a d e r s d ( , , 0 , , r e v S e t )
34:
end procedure
In the algorithm, the subset construction starts with an ID set of revoked users as an input. Function s d reads each ID of revoked users from the most significant bit (MSB) to the least significant bit (LSB), where i denotes the current index. When i reaches the LSB, the user is determined as either revoked or legitimate. Therefore, it returns ⊥ or ⊤ as in lines from 2 to 7. Even if i does not reach LSB, it returns ⊤ if there is no revoked user. In line 11 through 12, the revoked users are divided into two sets, depending on the next bit of user’s ID, and the function s d is called recursively for each set denoting tree traverse in lines 13 through 14. If the types of each children is ⊥ and ⊤, a new subset should be created as described in lines 16 and 19. Function c r e a t e S u b s e t ( p r e f i x , i , b ) is for creating a subset ( c , d ) where c denotes an inclusion label and d denotes an exclusion label. Label c covers all users with p r e f i x 1 , , p r e f i x i , * , , and label d covers all users with p r e f i x 1 , , p r e f i x i , b , * , , where p r e f i x k is a k-th bit of p r e f i x . When the type of a child is S and the other one is ⊤, each inclusion label in the unresolved subset is updated to include the type ⊤ child, as in lines 22 and 24. In the CSD, the unresolved subsets for type S children are preserved as unresolved, since inclusion labels in unresolved subsets will be updated in case they include a type ⊤ node as in line 27.
Theorem 1.
For the combinatorial subset difference (CSD) representation, Algorithm 1 generates at most r subsets for r revoked users.
Proof. 
As illustrated in Table 2, a subset is generated when a child node has a ⊥ type. The resolved type becomes ⊥, at least only if a child node is ⊥. A leaf node (user) has type ⊥ only if it is revoked. If there are r revoked nodes, there are at most r leaf nodes with type ⊥. Hence, the number of generated subsets is no more than r. □
Unlike CSD, the number of subsets in the SD can exceed r, since a node can still be type ⊥ even if both children have type S.
For the worst case comparison, it is straightforward to deduce that the number of subsets in CSD does not exceed the number of subsets in SD. For the subset generation, a subset is created when a child is ⊥ and the other child is ⊤, both in SD and CSD algorithm. Also in both algorithms, a node can be type ⊤ when its children are both ⊤, while it should be ⊥ when one child is ⊥ and the other child is ⊥ or S. However, in SD, a node becomes ⊥ when its children are both S, which may require an additional subset. Therefore, the number of subsets in CSD algorithm is no more than that of SD algorithm.

4. Preliminaries

4.1. Public Key Broadcast Encryption

We follow the standard definition of public key broadcast encryption from Reference [23]. In the broadcast encryption, it is often combined with the symmetric key encryption, where the encryption encapsulates the session key instead of arbitrary messages, and the encryption is performed for each subset S. Formally, a public key broadcast encryption Π contains three functions as below:
  • ( P K , { S K I D } I D { 0 , 1 } l ) Setup( l , m ): the setup algorithm initializes the system by setting up public parameters, and generating private keys for stateless devices. From user ID length l and session key length m, the setup returns a public key P K and 2 l secret keys { S K I D } I D { 0 , 1 } l .
  • ( S , H d r , C M ) = ( H d r , K ) Encrypt( P K , S ): the encrypt algorithm is performed for each subset S. It first outputs a header H d r and a message encryption key K K from the input of subset S { 0 , 1 } l and a public key P K . Then, for the message M to be broadcast to a set S, let C M be an encryption of M under the session key K. The transmission for the users in S is selected as ( S , H d r , C M ) , where H d r is called broadcast ciphertext and C M is called broadcast body.
  • K K Decrypt( S , I D , S K I D , H d r ): the decrypt algorithm is performed by the individual privileged user. From the input of subset S { 0 , 1 } l , user ID I D { 0 , 1 } l , secret key S K I D , and a header H d r , if I D S (that is privileged) then it produces the message encryption key K K . Then the key K is utilized to decrypt the C M to retrieve plaintext M.
The encryption system must let every user in S obtain message M correctly. In other words, for all S { 0 , 1 } l and all I D S , if ( P K , ( S K I D 0 l , , S K I D 1 l ) ) $ Setup ( l , m ) , ( H d r , K ) $ Encrypt ( PK , S ) then it should satisfy Decrypt ( S , ID , SK ID , Hdr ) = K .
For the security definition of the broadcast encryption, we describe the selective security for chosen plaintext attacks (IND-sID-CPA-security) and chosen ciphertext attacks (IND-sID-CCA-security) as in Reference [23]. Then we separate the security notions for single-set security and multi-set security, depending on the number of representations of the challenged sets. Finally we show that the single-set security implies the multi-set security.

4.2. Security Definition

We define the single-set security by the selective game between the challenger C and the adversary A . Both C and A knows the ID length l and the session key length m as a default inputs. The security game proceed as follows:
Init:
A begins by selecting a set S * to claim the users which it intends to attack.
Setup:
C runs Setup ( l , m ) to get a public key P K and secret keys S K 0 l , , S K 1 l , and gives A the P K and all secret keys S K I D for I D S * .
Query phase1:
(optional for CCA) A can adaptively issue decryption queries q 1 , , q m ; each query consists of ( I D , S , H d r ) for S S * and I D S . C replies the query by Decrypt ( S , ID , SK ID , Hdr ) .
Challenge:
C runs Encrypt ( S * , PK ) to get ( H d r * , K ) for K K . Then, C flips a coin b { 0 , 1 } . If b = 1 , C sets K * = K , otherwise C chooses a random R K and sets K * = R to respond ( H d r * , K * ) back to A .
Query phase2:
(optional for CCA) A can continue asking decryption queries q m + 1 , ⋯, q q D ; each query consists of ( I D , S , H d r ) for S S * and I D S , but the only constraint is H d r H d r * . C replies the query same as in phase 1.
Guess:
A guesses b { 0 , 1 } for b. If b = b , A wins the game.
Let AdvSSBr A , Π ( l , m ) be the advantage of A to win the game above.
Definition 1.
A public key broadcast encryption Π for a single-set is ( t , ϵ , l , m ) -CPA secure (or ( t , ϵ , l , m , q D ) -CCA secure), if for every t-time adversary A (for CCA: which makes at most q D decryption queries) we have | AdvSSBr A , Π ( l , m ) 1 / 2 | < ϵ.
A multi-set security is also defined by the selective game between the C h and A , similar to the single-set security game above, but the only difference is that the challenged set is represented by multiple subsets.
Init:
A begins by selecting a set S * = ( S 1 * , , S w * ) to claim users which it intends to attack.
Setup:
C runs Setup ( l , m ) to get a public key P K and secret keys S K 0 l , , S K 1 l , and gives A the P K and all secret keys S K I D for I D S * .
Query phase1:
(optional for CCA) A can adaptively issue decryption queries q 1 , , q d ; each query consists of ( I D , S , H d r ) where S S i * for all i and I D S . C replies the query by Decrypt ( S , ID , SK ID , Hdr ) .
Challenge:
C runs Encrypt ( S i * , PK ) for i = 1 , , w to get ( H d r i * , K i ) for K K . Then, C flips a coin b { 0 , 1 } . If b = 1 , C sets K * = ( K 1 , , K w ) , otherwise C chooses randoms R i K for i = 1 , , w and sets K * = ( R 1 , , R w ) to respond ( H d r * , K * ) back to A .
Query phase2:
(optional for CCA) A can continues asking decryption queries q q d + 1 , ⋯, q q D ; each query consists of ( I D , S , H d r ) with S S i * for all i and I D S , but the only constraint is H d r H d r * . C replies the query same as in phase 1.
Guess:
A guesses b { 0 , 1 } for b. If b = b , A wins the game.
Let AdvMSBr A , Π ( l , m ) be the advantage of A to win the above game.
Definition 2.
A public key broadcast encryption Π for multi-set is ( t , ϵ , l , m ) -CPA secure (or ( t , ϵ , l , m , q D ) -CCA secure), if for every t-time adversary A (for CCA: which makes at most q D decryption queries) we have | AdvMSBr A , Π ( l , m ) 1 / 2 | < ϵ.
Finally, we show the single-set security implies the multi-set security.
Theorem 2.
Suppose a single-set public key broadcast encryption Π is ( t , ϵ , l , m , q D ) -CCA secure. Then a multi-set public key broadcast encryption Π is ( t , ϵ , l , m , q D ) -CCA secure for arbitrary ϵ < ϵ * w , where w is the number of subsets [30].
Proof. 
For the proof sketch, the basic idea is to convert the challenge key in the multi-set scheme from a real key set K * to a random key set K ¯ * by using hybrid games which change each key in the single-set scheme from a real key to a random key. If the adversary cannot distinguish the changes of each key in the single-set scheme, then it also cannot distinguish the changes of challenge key in the multi-set scheme since the number of hybrid games is within polynomial. Suppose that a challenge set is given as S * = ( S 1 , S 2 , , S w ) for polynomial w and the corresponding key set is described as K * = ( K 1 , , K w ) s.t. K i is the key for S i . The hybrid games G 0 , , G h , , G w for the proof are defined as follows:
Game  G 0
In this game, all keys K j are real key from an encryption on the set S j . That is, the challenge key K * is a set of real keys.
Game  G h
This game is almost identical to the game G h 1 except the key K h since K h in this game is a random key. Specifically, in this game, the key K j for j h is a random key and the key K j for h < j is a real key.
Game  G w
In this game, all keys K j are random keys. That is, the challenge key K * is a set of random keys K * ¯ .
Let S A G h be the event that A outputs 1 in G h . A distinguishes G h 1 from G h by the advantage of the single-set security. Thus, we have that
P r [ S A G 0 ] P r [ S A G w ] = P r [ S A G 0 ] + h = 1 w 1 ( P r [ S A G h ] P r [ S A G h ] ) P r [ S A G w ] h = 1 w | P r [ S A G h 1 ] P r [ S A G h ] | 2 w · AdvSSBr A ( ˘ ) .
Finally, we obtain the inequality relation as follows:
AdvMSBr A ( ˘ ) = | P r [ b = 1 ] · P r [ b = b | b = 1 ] + P r [ b = 0 ] · P r [ b = b | b = 0 ] 1 2 | =   | 1 2 · P r [ b = 1 | b = 1 ] + 1 2 · ( 1 P r [ b = 1 | b = 0 ] ) 1 2 ] | = 1 2 · | P r [ b = 1 | b = 1 ] P r [ b = 1 | b = 0 ] | 1 2 · | P r [ S A G 0 ] P r [ S A G w ] | w · AdvSSBr A ( ˘ ) .
 □
The above game can be transformed to define semantic security for a public key broadcast encryption system if the attacker is not allowed to issue decryption queries.
Definition 3.
A public key broadcast encryption system is ( t , ϵ , l , m ) semantically secure if it is ( t , ϵ , l , m , 0 ) -CCA secure.
For the scheme construction, we first provide a semantically-secure scheme, and then extend it to gain CCA security.

4.3. Bilinear Groups and Pairings

In References [31,32], the bilinear maps and bilinear map groups are defined as follows:
  • There exists two multiplicative groups G and G 1 , which are cyclic groups of prime order p.
  • Let g be a generator of the group G .
For two groups G and G 1 as above, the bilinear map is defined as a map e : G × G G 1 , which satisfies the following properties:
  • The map is bilinear: for all u , v G and a , b Z , e ( u a , v b ) = e ( u , v ) a b
  • The map is non-degenerate: e ( g , g ) 1 .
G is a bilinear group if the operation in G is efficient and there is a group G 1 with an efficiently computable bilinear map e : G × G G 1 .

4.4. Computational Complexity Assumptions

The security of our scheme is based on the bilinear Diffie-Hellman Exponent (BDHE) assumption which is commonly used as in References [6,23]. It is a natural extension of bilinear Diffie-Hellman Inversion (BDHI) assumption.
Let G be bilinear group of prime order p. The l-BDHE problem in G are stated as follows: given a vector of 2 l + 1 elements
( h , g , g α , g ( α 2 ) , , g ( α l ) , g ( α l + 2 ) , , g ( α 2 l ) ) G 2 l + 1
as input, output e ( h , g ) α l + 1 G 1 . For simplicity, after g and α are determined, we use y i to denote y i = g α i G . An algorithm A has advantage ϵ in solving l-BDHE in G if
P r [ A ( h , g , y 1 , , y l , y l + 2 , , y 2 l ) = e ( h , y l + 1 ) ] ϵ ,
where the probability is over the random choices g , h G , α in Z p , and A ’s random bits. The decisional version of the l-BDHE problem is defined in a similar manner; let y g , α , l = ( y 1 , , y l , y l + 2 , , y 2 l ) . An algorithm B which outputs b { 0 , 1 } has advantage ϵ in solving decisional l-BDHE if
| P r [ B ( h , g , y g , α , l , e ( g ^ , y l + 1 ) ) = 0 ] P r [ B ( h , g , y g , α , l , T ) = 0 ] | ϵ ,
where the probability is over the random choices of g , h G , α in Z p , T G 1 , and B ’s random bits.
Definition 4.
The (decisional) ( t , ϵ , l )-BDHE assumption holds in G , if for any t-time adversary the advantage is less than ϵ on solving the (decisional) l-BDHE problem in G .
The t and ϵ are often omitted, and referred as (decisional) l-BDHE in G .

5. CSD-based Broadcast Encryption

In this section, we present our public key broadcast encryption based on the combinatorial subset difference (CSD) representation. The CSD-based broadcast encryption accepts an input of CSD subset S; as standard broadcast encryption, it runs the broadcast encryption algorithm for each subset S. We first show the formal construction in Section 5.1, along with appropriate examples to aid the comprehension. Then we provide the formal proof for the collusion-resistant semantic security in Section 6.

5.1. Main Scheme

The main formal construction of the CSD-based public key broadcast encryption for a CSD subset S is described as follows:
Setup(l,m): The maximum number of users are set as 2 l , which can be interpreted as l-level public key broadcast encryption, and the message space is set as { 0 , 1 } m . To initialize and generate the keys, the setup algorithm selects random integers α , β Z p , and O ( l ) random group elements g G , g 2 , g 3 , h 1 , 0 , h 1 , 1 , , h l , 0 , h l , 1 , k 1 , 0 , k 1 , 1 , , k l , 0 , k l , 1 G . Then it computes g 1 = g α G . The public key is set as
P K ( g ^ , g 1 ^ , g , g 2 , g 3 , h 1 , 0 , h 1 , 1 , , h l , 0 , h l , 1 , k 1 , 0 , k 1 , 1 , , k l , 0 , k l , 1 ) .
A master secret key is set as m a s t e r k e y = g 2 α .
For an identity I D = b 1 b l , the setup algorithm generates a secret key S K I D from the master secret key. It selects random r 1 , , r l Z p and set S K I D = ( S K I D , 1 , , S K I D , l ) as
S K I D , i = ( g ^ r i , g 2 α ( h 1 , b 1 h l , b l k i , b ¯ i g 3 ) r i , h 1 , b ¯ 1 r i , , h l , b ¯ l r i , k 1 , 0 r i , k 1 , 1 r i , , k i 1 , 0 r i , k i 1 , 1 r i , k i , b i r i , k i + 1 , 0 r i , k i + 1 , 1 r i , , k l , 0 r i , k l , 1 r i ) G 3 l + 1 ,
where b ¯ i represents a bit NOT of b i , that is, 1 b i .
Example 1.
For an identity I D = 010 ,
S K 010 = ( g r 1 , g 2 α ( h 1 , 0 h 2 , 1 h 3 , 0 k 1 , 1 g 3 ) r 1 , h 1 , 1 r 1 , h 2 , 0 r 1 , h 3 , 1 r 1 , k 1 , 0 r 1 , k 2 , 0 r 1 , k 2 , 1 r 1 , k 3 , 0 r 1 , k 3 , 1 r 1 , g r 2 , g 2 α ( h 1 , 0 h 2 , 1 h 3 , 0 k 2 , 0 g 3 ) r 2 , h 1 , 1 r 2 , h 2 , 0 r 2 , h 3 , 1 r 2 , k 1 , 0 r 2 , k 1 , 1 r 2 , k 2 , 1 r 2 , k 3 , 0 r 2 , k 3 , 1 r 2 , g r 3 , g 2 α ( h 1 , 0 h 2 , 1 h 3 , 0 k 3 , 1 g 3 ) r 3 , h 1 , 1 r 3 , h 2 , 0 r 3 , h 3 , 1 r 3 , k 1 , 0 r 3 , k 1 , 1 r 3 , k 2 , 0 r 3 , k 2 , 1 r 3 , k 3 , 0 r 3 ) .
Encrypt( P K , S ): The encrypt algorithm takes a subset corresponding to the combinatorial subset difference (CSD) representation as an input. The CSD subset is represented as S = ( c , d ) , for c , d { 0 , 1 , * } l . The label c determines the covered (included) nodes, and the label d determines the revoked (excluded) nodes. The labels can express wildcard (*), which can embrace multiple nodes. For example, a subset ( * * 0 * , 0 * 01 ) indicates all users * * 0 * except users in 0 * 01 . * * 0 * includes 0000 , 0001 , 0100 , 0101 , 1000 , 1001 , 1100 , 1101 and 0 * 01 includes 0001 , 0101 , therefore, label ( * * 0 * , 0 * 01 ) covers 0000 , 0100 , 1000 , 1001 , 1100 , 1101 .
For each subset S = ( c , d ) , let c = c 1 c l and d 1 d l . To generate the header H d r and encryption key K, the encrypt algorithm selects a random s Z p and set H d r and K as
H d r ( g s , ( h 1 , c 1 h l , c l k 1 , d 1 k l , d l g 3 ) s )
K e ( g 1 , g 2 ) s
where H d r G 2 and K G 1 , h i , * = h i , 0 h i , 1 , and k i , * = 1 .
Example 2.
For subset ( * * 0 * , 0 * 01 ) ,
H d r ( g s , ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) s )
K ( g 1 , g 2 ) s .
Decrypt( S , I D , S K I D , H d r ): Consider an identity I D = b 1 b l and a subset S = ( c , d ) . Let j be an index where the bit in b and d is different, that is, such that b j = 1 and d j = 0 , or b j = 0 and d j = 1 . For example, if d = 0 * 01 and I D = 0000 , then j = 4 since b 4 = 0 and d 4 = 1 .
From S K I D = ( S K I D , 1 , , S K I D , l ) , the decrypt algorithm choose S K I D , j such that index j satisfies the condition stated above. To retrieve decrypt key K from the header H d r = ( A 0 , A 1 ) and the secret key S K I D , j = ( a 0 , a 1 , h 1 , b ¯ 1 r j , , h l , b ¯ l r j , k 1 , 0 r j , k 1 , 1 r j , , k j 1 , 0 r j , k j 1 , 1 r j , k j , b j r j , k j + 1 , 0 r j , k j + 1 , 1 r j , , k l , 0 r j , k l , 1 r j ) , let B = a 1 · i = 1 , c i = * l h i , b ¯ i r j · i = 1 , i j , d i * l k i , d i r j and output
e ( A 0 , B ) e ( a 0 , A 1 ) = K
For a valid ciphertext, it satisfies that
e ( A 0 , B ) e ( a 0 , A 1 ) = e ( g s , g 2 α ( h 1 , c 1 h l , c l k 1 , d 1 k l , d l g 3 ) r j ) e ( g r j , ( h 1 , c 1 h l , c l k 1 , d 1 k l , d l g 3 ) s ) = e ( g , g 2 ) s α = e ( g 1 , g 2 ) s .
For the example above, with I D = 0000 , ( c , d ) = ( * * 0 * , 0 * 01 ) , and H d r = ( g s , ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) s ) . Since b 4 = 0 , d 4 = 1 , and j = 4 , using S K 0000 , 4 = ( g r 4 , g 2 α ( h 1 , 0 h 2 , 0 h 3 , 0 h 4 , 0 k 4 , 1 g 3 ) r 4 , ) , we compute B as follows.
B = a 1 · ( h 1 , 1 h 2 , 1 h 4 , 1 ) r 4 · ( k 1 , 0 k 3 , 0 ) r 4 = g 2 α ( h 1 , 0 h 2 , 0 h 3 , 0 h 4 , 0 k 4 , 1 g 3 ) r 4 · ( h 1 , 1 h 2 , 1 h 4 , 1 ) r 4 · ( k 1 , 0 k 3 , 0 ) r 4 = g 2 α ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) r 4 .
e ( A 0 , B ) e ( a 0 , A 1 ) = e ( g s , g 2 α ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) r 4 ) e ( g r 4 , ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) s ) = e ( g , g 2 ) s α e ( g , h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) s r 4 e ( g , h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 k 1 , 0 k 3 , 0 k 4 , 1 g 3 ) s r 4 = e ( g , g 2 ) s α .
Now, let us analyze the opposite case of I D = 0010 , which is not included by an inclusion label * * 0 * from the subset ( * * 0 * , 0 * 01 ) . Since S K 0010 , i = ( g r i , g 2 α ( h 1 , 0 h 2 , 0 h 3 , 1 h 4 , 0 k i , b ¯ i ) r i , ) and A 1 = ( h 1 , 0 h 1 , 1 h 2 , 0 h 2 , 1 h 3 , 0 h 4 , 0 h 4 , 1 ) s , it is necessary to eliminate h 3 , 1 r i from S K 0010 , i , which is not available. Thus, it is impossible to calculate B using S K 0010 , i .
Let us examine I D = 0001 for instance, which is included in the exclusion label 0 * 01 of subset ( * * 0 * , 0 * 01 ) . Since S K 0001 , 1 = ( g r 1 , g 2 α ( k 1 , 1 ) r 1 , ) , S K 0001 , 2 = ( g r 2 , g 2 α ( k 2 , 1 ) r 2 , ) , S K 0001 , 3 = ( g r 3 , g 2 α ( k 3 , 1 ) r 3 , ) , S K 0001 , 4 = ( g r 4 , g 2 α ( k 4 , 0 ) r 4 , ) , and A 1 = ( k 1 , 0 k 3 , 0 k 4 , 1 ) s , it is necessary to eliminate k 1 , 1 r 1 from S K 0001 , 1 , k 2 , 1 r 2 from S K 0001 , 2 , k 3 , 1 r 3 from S K 0001 , 3 , or k 4 , 0 r 4 from S K 0001 , 4 , which is not possible. Thus, B cannot be computed from S K 0001 , i .
Notice that the encrypt algorithm is applied for each subset S. In other words, the number of H d r is equal to the number of subsets. When the user receives a message, he chooses H d r for the set where he belongs and call Decrypt for that specific H d r .

6. CPA-Security Analysis

In this section, we formally prove the semantic security (IND-CPA-security) of our CSD-based broadcast encryption in Section 5 under the decisional l-BDHE assumption without the random oracle model.
Theorem 3.
Let G be a bilinear group of prime order p. Suppose the (decision) ( t , ϵ , 4 l ) -BDHE assumption holds in G . Then our l-level CSD public key broadcast encryption system is ( t , ϵ , l , m ) semantically secure for arbitrary l, and t < t O ( e l 2 2 l ) , where e is the maximum time for an exponentiation in G .
Proof. 
Suppose that the adversary A has advantage ϵ in attacking the l-level CSD-based public key broadcast encryption. Using A , we build an algorithm B which solves the (decisional) 4 l -BDHE problem in G .
For generators h , g G and α Z p * , let y i = g α i G . As a beginning of the game, B starts a game with the 4 l -BDHE problem, to get a random tuple ( h , g , y 1 , , y 4 l , y 4 l + 2 , , y 8 l , T ) which is either sampled from P B D H E (where T = e ( h , g ) ( α 4 l + 1 ) ) or from R B D H E (where T is uniform and independent in G 1 ). The goal of B ’s is to output 1 if the tuple is sampled from P B D H E , or output 0 otherwise. B proceeds by interacting with A in a following selective subset game:
Init: The game begins with A first outputting a subset S * = ( c * , d * ) that it intends to attack where c * , d * { 0 , 1 , * } l .
Setup: To generate the public key, algorithm B picks a random γ in Z p and sets g 1 = y 1 = g α and g 2 = y 4 l g γ = g γ + ( α 4 l ) . B picks random γ 1 , 0 , γ 1 , 1 , , γ l , 0 , γ l , 1 and ψ 1 , 0 , ψ 1 , 1 , , ψ l , 0 , ψ l , 1 in Z p , to set h i , 0 = g γ i , 0 / y l i + 1 , h i , 1 = g γ i , 1 / y 2 l i + 1 , k i , 0 = g ψ i , 0 / y 3 l i + 1 , k i , 1 = g ψ i , 1 / y 4 l i + 1 for i = 1 , , l . B also picks a random δ in Z p and sets g 3 = g δ i = 1 l y l i + 1 c i , 0 * y 2 l i + 1 c i , 1 * y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * such that c i , 0 * = 0 and c i , 1 * = 1 if c i * = 1 , c i , 0 * = 1 and c i , 1 * = 0 if c i * = 0 , and c i , 0 * = 1 and c i , 1 * = 1 if c i * = * . d i , 0 * and d i , 1 * are similarly defined by d i * except that d i , 0 * = 0 and d i , 1 * = 0 if d i * = * .
Consider a query for the secret key corresponding to I D = b 1 b l { 0 , 1 } l . We now show that the revoked users can be simulated with two cases as follows. Case 1 shows how to simulate users that are not in the inclusion label c * in subset ( c * , d * ). Case 2 shows that we can also simulate users included in the exclusion label d * .
Case 1: If ID Sensors 20 03140 i001 c* there exists k { 1 , , l } such that b k c k * , b k * , and c k * * . We set k to be the smallest of such index. To generate the secret key S K I D , j , B first picks random r ˜ 1 , , r ˜ l in Z p . We pose r j = α ( 3 b k ) l + k + r ˜ j for j = 1 , , l . Next, B generates the secret key S K I D = ( S K I D , 1 , , S K I D , l ) where
S K I D , j = ( g r j , g 2 α ( h 1 , b 1 h l , b l k j , b ¯ j g 3 ) r j , h 1 , b ¯ 1 r j , , h l , b ¯ l r j , k 1 , 0 r j , k 1 , 1 r j , , k j 1 , 0 r j , k j 1 , 1 r j , k j , b j r j , k j + 1 , 0 r j , k j + 1 , 1 r j , , k l , 0 r j , k l , 1 r j )
which is a properly distributed secret key for the identity I D = b 1 b l . B can compute all elements for the secret key, given the values at its disposal, considering the fact that y i α j = y i + j for any i , j . Note that b i , 0 = b ¯ i and b i , 1 = b i if b i * ; otherwise, b i , 0 = b i , 1 = 1 . If b j = * and an equation includes b ¯ j then two equations are created by replacing b ¯ j by 0 and 1. To generate the second component of the secret key, first observe that
( h 1 , b 1 h l , b l k j , b ¯ j g 3 ) r j = ( i = 1 l h i , 0 b i , 0 h i , 1 b i , 1 · k j , b ¯ j · g 3 ) r j = ( i = 1 l ( g γ i , 0 y l i + 1 ) b i , 0 ( g γ i , 1 y 2 l i + 1 ) b i , 1 · g ψ j , b ¯ j y ( 3 + b ¯ j ) l j + 1 · g δ i = 1 l y l i + 1 c i , 0 * y 2 l i + 1 c i , 1 * y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) r j = ( g δ + ψ j , b ¯ j + i = 1 , i k l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) i = 1 , i k l ( y l i + 1 c i , 0 * b i , 0 y 2 l i + 1 c i , 1 * b i , 1 ) · i = 1 l ( y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) y 3 l j + 1 b j , 0 y 4 l j + 1 b j , 1 y l k + 1 c k , 0 * b k , 0 y 2 l k + 1 c k , 1 * b k , 1 ) r j = ( g δ + ψ j , b ¯ j + i = 1 , i k l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) i = 1 , i k l ( y l i + 1 c i , 0 * b i , 0 y 2 l i + 1 c i , 1 * b i , 1 ) · i = 1 l ( y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) y 3 l j + 1 b j , 0 y 4 l j + 1 b j , 1 y ( 1 + b ¯ k ) l k + 1 c k , b ¯ k * y ( 1 + b k ) l k + 1 1 ) r j .
Let Z r j denote the product of y except the last one.
Z r j = ( g δ + ψ j , b ¯ j + i = 1 , i k l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) i = 1 , i k l ( y l i + 1 c i , 0 * b i , 0 y 2 l i + 1 c i , 1 * b i , 1 ) · i = 1 l ( y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) y 3 l j + 1 b j , 0 y 4 l j + 1 b j , 1 y ( 1 + b ¯ k ) l k + 1 c k , b ¯ k * ) r j = y ( 3 b k ) l + k δ + ψ j , b ¯ j + i = 1 , i k l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) · i = 1 , i k l ( y ( 4 b k ) l i + k + 1 c i , 0 * b i , 0 y ( 5 b k ) l i + k + 1 c i , 1 * b i , 1 ) · i = 1 l ( y ( 6 b k ) l i + k + 1 d i , 0 * y ( 7 b k ) l i + k + 1 d i , 1 * ) · y ( 6 b k ) l j + 1 b j , 0 y ( 7 b k ) l j + 1 b j , 1 y ( 4 b k + b ¯ k ) l + 1 c k , b ¯ k * Z r ¯ j .
Since ( 3 b k ) l + k < 4 l , ( 4 b k ) l i + k + 1 = ( 4 b k ) l + 1 + ( k i ) 4 l + 1 and ( 5 b k ) l i + k + 1 4 l + 1 due to k i , ( 7 b k ) l i + k + 1 > ( 6 b k ) l i + k + 1 4 l + k + 1 > 4 l + 1 , ( 7 b k ) l j + k + 1 > ( 6 b k ) l j + k + 1 5 l l + k + 1 > 4 l + 1 , ( 4 b k + b ¯ k ) l + 1 = 3 l + 1 or 5 l + 1 , and Z is computable, B is able to compute all the terms in Z r j . Next, when observing the last term, y ( 1 + b k ) l k + 1 r j is:
y ( 1 + b k ) l k + 1 r j = y ( 1 + b k ) l k + 1 r ˜ j y ( 1 + b k ) l k + 1 α ( 3 b k ) l + k = y ( 1 + b k ) l k + 1 r ˜ j y 4 l + 1 1 .
Hence, the first component in the secret key is equal to:
g 2 α ( h 1 , b 1 h l , b l k j , b ¯ j g 3 ) r j = ( y 4 l + 1 y 1 γ ) Z r j ( y ( 1 + b k ) l k + 1 r ˜ j y 4 l + 1 1 ) = y 1 γ y ( 1 + b k ) l k + 1 r ˜ j Z r j .
Notice that the unknown term y 4 l + 1 cancels out, which indicates that B can compute the first secret key component. The first component, g r j , is y ( 3 b k ) l + k g r ˜ j which can be computed by B . In a similar manner, the remaining elements h 1 , b ¯ 1 r j , , h l , b ¯ l r j , k 1 , 0 r j , k 1 , 1 r j , , k j 1 , 0 r j , k j 1 , 1 r j k j , b j r j , k j + 1 , 0 r j , k j + 1 , 1 r j , , k l , 0 r j , k l , 1 r j can be computed by B since they do not involve a y 4 l + 1 term such that h i , b ¯ i r j = ( g γ i , b ¯ i y ( 1 + b ¯ i ) l i + 1 ) r j = ( y ( 3 b k ) l + k · g r ˜ j ) γ i , b ¯ i · ( y ( 4 b k + b ¯ i ) l i + k + 1 y ( 1 + b ¯ i ) l i + 1 r ˜ j ) 1 and k i , t r j = ( g ψ i , t y ( 3 + t ) l i + 1 ) r j = ( y ( 3 b k ) l + k · g r ˜ j ) ψ i , t · ( y ( 6 b k + t ) l i + k + 1 · y ( 3 + t ) l i + 1 r ˜ j ) 1 . Thus, B can derive h i , b ¯ i r j since ( 4 b k + b ¯ i ) l i + k + 1 ( 4 b k + b ¯ i ) l + 1 when i k and ( 4 b k + b ¯ i ) l i + k + 1 = 3 l + 1 or 5 l + 1 when i = k . Moreover, B can derive k i , t r j since ( 3 b k ) l + k 4 l , ( 6 b k + t ) l i + k + 1 5 l i + k + 1 > 4 l + 1 , and ( 3 + t ) l i + 1 4 l .
Case 2: If I D d * , to generate the secret key S K I D , j for identity I D = b 1 b l , B first selects a random r ˜ 1 , , r ˜ l in Z p . We pose r j = α ( 1 b ¯ j ) l + j + r ˜ j for j = 1 , , l . Next, B generates the secret key S K I D = ( S K I D , 1 , , S K I D , l ) The secret key parameters are computed similarly. To generate the second component of the secret key, first observe that
( h 1 , b 1 h l , b l k j , b ¯ j g 3 ) r j = ( g δ + ψ j , b ¯ j + i = 1 l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) i = 1 l ( y l i + 1 c i , 0 * b i , 0 y 2 l i + 1 c i , 1 * b i , 1 ) i = 1 , i j l ( y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) · y ( 3 + b j ) l j + 1 d j , b j * · y ( 3 + b ¯ j ) l j + 1 1 ) r j .
Let Z r j denote the product of y except the last one. That is
Z r j = ( g δ + ψ j , b ¯ j + i = 1 l ( b i , 0 γ i , 0 + b i , 1 γ i , 1 ) i = 1 l ( y l i + 1 c i , 0 * b i , 0 y 2 l i + 1 c i , 1 * b i , 1 ) i = 1 , i j l ( y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) · y ( 3 + b j ) l j + 1 d j , b j * ) r j .
B can compute all the terms in Z r j given the values at its disposal since l i + 1 + ( 1 b ¯ j ) l + j ( 2 b ¯ j ) l i + j + 1 3 l , 2 l i + 1 + ( 1 b ¯ j ) l + j 4 l , 3 l i + 1 + ( 1 b ¯ j ) l + j = ( 4 b ¯ j ) l i + j + 1 4 l + 1 , 4 l i + 1 + ( 1 b ¯ j ) l + j = ( 5 b ¯ j ) l i + j + 1 4 l + 1 due to i j , and ( 3 + b j ) l j + 1 + ( 1 b ¯ j ) l + j = ( 4 + b j b ¯ j ) l + 1 = 3 l + 1 or 5 l + 1 . Note that if b j = * then d j , 0 * = d j , 1 * = 0 , and each case that b j = 0 or b j = 1 is considered. Next observe that the last term, namely y ( 3 + b ¯ j ) l j + 1 r j , is:
y ( 3 + b ¯ j ) l j + 1 r j = y ( 3 + b ¯ j ) l j + 1 r ˜ j y ( 3 + b ¯ j ) l j + 1 α ( 1 b ¯ j ) l + j = y ( 3 + b ¯ j ) l j + 1 r ˜ j · y 4 l + 1 1 .
Therefore, the first component in the secret key is equal to:
g 2 α ( h 1 , b 1 h l , b l k j , b ¯ j g 3 ) r j = ( y 4 l + 1 y 1 γ ) Z r j ( y ( 3 + b ¯ j ) l j + 1 r ˜ j / y 4 l + 1 ) = y 1 γ y ( 3 + b ¯ j ) l j + 1 r ˜ j Z r j .
Notice that the unknown y 4 l + 1 cancels out, which indicates that B can compute the first secret key component. The first component, g r j , is y ( 1 b ¯ j ) l + j g r ˜ j which can be computed by B . In a similar manner, B can compute the remaining elements h 1 , b ¯ 1 r j , ,
h l , b ¯ l r j , k 1 , 0 r j , k 1 , 1 r j , , k j 1 , 0 r j , k j 1 , 1 r j , k j , b j r j , k j + 1 , 0 r j , k j + 1 , 1 r j , , k l , 0 r j , k l , 1 r j since they do not involve y 4 l + 1 as h i , b ¯ i r j = ( g γ i , b ¯ i y ( 1 + b ¯ i ) l i + 1 ) r j = ( y ( 1 b ¯ j ) l + j · g r ˜ j ) γ i , b ¯ i · ( y ( 2 + b ¯ i b ¯ j ) l i + j + 1 y ( 1 + b ¯ i ) l i + 1 r ˜ j ) 1 and k i , t r j = ( g ψ i , t y ( 3 + t ) l i + 1 ) r j = ( y ( 1 b ¯ j ) l + j · g r ˜ j ) ψ i , t · ( y ( 4 + t b ¯ j ) l i + j + 1 · y ( 3 + t ) l i + 1 r ˜ j ) 1 . Thus, B can derive a valid secret key for I D since ( 4 + t b ¯ j ) l i + j + 1 ( 4 + t b ¯ j ) l + 1 when i j and ( 4 t + b ¯ j ) l i + j + 1 = 3 l + 1 or 5 l + 1 due to t = b j when i = j .
Therefore, B can derive a valid secret key for I D . Finally, B gives A public key P K = ( g , g 1 , g 2 , g 3 , h 1 , 0 , h 1 , 1 , , h l , 0 , h l , 1 , k 1 , 0 , k 1 , 1 , , k l , 0 , k l , 1 ) and S K I D such that ID Sensors 20 03140 i001 c* or I D d * . All the values are within an independent and uniform distribution from G , as required. The master key for the corresponding system is supposed to be g 2 α = g α ( α 4 l + γ ) = y 4 l + 1 y 1 γ , which is not known to B since B does not know y 4 l + 1 .
For the challenge, B generates H d r * as ( h , h δ + i = 1 l ( γ i , 0 c i , 0 * + γ i , 1 c i , 1 * + ψ i , 0 d i , 0 * + ψ i , 1 d i , 1 * ) ) . It sets K * = T · e ( y 1 , h γ ) and gives ( H d r * , K * ) to A , as a challenge. If T = e ( g , h ) 4 l + 1 (i.e., the input to B is from the 4 l -BDHE) then the challenge ( H d r * , K * ) is valid from A’s view as in the real attacking game. This can be seen by writing h = g c for some (unknown) c Z p as follows:
h δ + i = 1 l ( γ i , 0 c i , 0 * + γ i , 1 c i , 1 * + ψ i , 0 d i , 0 * + ψ i , 1 d i , 1 * ) = ( i = 1 l ( g γ i , 0 y l i + 1 ) c i , 0 * ( g γ i , 1 y 2 l i + 1 ) c i , 1 * ( g ψ i , 0 y 3 l i + 1 ) d i , 0 * ( g ψ i , 1 y 4 l i + 1 ) d i , 1 * · ( g δ i = 1 l y l i + 1 c i , 0 * y 2 l i + 1 c i , 1 * y 3 l i + 1 d i , 0 * y 4 l i + 1 d i , 1 * ) ) c = ( h 1 , c 1 * h l , c l * k 1 , d 1 * k l , d l * g 3 ) c
and
e ( g , h ) ( α 4 l + 1 ) · e ( y 1 , h γ ) = ( e ( y 1 , y 4 l ) · e ( y 1 , g γ ) ) c = e ( y 1 , y 4 l g γ ) c = e ( g 1 , g 2 ) c .
Thus, by definition, H d r * is a valid encryption of e ( y 4 l + 1 , g ) c . Also, observe that e ( y 4 l + 1 , g ) c = e ( g , h ) 4 l + 1 = T = K b , which confirms that ( H d r , K * ) is a valid challenge to the A . On the other hand, if T is random in G (i.e., the input to B is a random tuple), then K * is just a random independent key of K from A ’s view.
Guess:
Finally, A outputs a guess b { 0 , 1 } . Algorithm B concludes its own game by outputting the b . If b = 1 then B outputs 1 meaning T = e ( g , h ) ( α 4 l + 1 ) . Otherwise, it outputs 0 meaning T is random in G T .
If the input tuple was originated from P B D H E ( T = e ( g , h ) ( α 4 l + 1 ) ), A ’s view is the same as the view in a real attacking game. Therefore, A satisfies | P r [ b = 1 ] 1 / 2 | ϵ . When the input tuple was originated from R B D H E (T is uniform in G 1 ) then P r [ b = 1 ] = 1 / 2 . Hence, with g , h uniform in G , α uniform in Z p , and T uniform in G 1 we have that
| P r [ B ( g , h , y g , α , 4 l , e ( g , h ) ( α 4 l + 1 ) ) = 0 ] P r [ B ( g , h , y g , α , 4 l , T ) = 0 ] | | ( 1 / 2 + ϵ ) 1 / 2 | = ϵ .
as required, which completes the proof of the theorem. □

7. CCA-secure Broadcast Encryption

In this section, we extend our proposed CPA-secure combinatorial subset difference public key broadcast encryption scheme to obtain CCA-security by using similar technique from Reference [33] which transforms a CPA-secure identity-based encryption to the CCA-secure encryption. However, to apply the similar technique, it is required to allow wildcards (*) in the ID (as well as the subset) since the technique is based on the identity-based encryption. Therefore, we first construct a general ID construction, which is a generalized version of our CSD-based broadcast encryption which allows wildcards in the user ID in Section 7.1. Then, based on the general ID construction, we build a CCA-secure CSD-based public key broadcast encryption in Section 7.2. Finally, we provide a formal proof for the CCA-security in Section 8.

7.1. General ID Construction

In this section, we extend the proposed broadcast encryption scheme to a generalized ID version, which allows * in each ID (as well as subsets). This general ID scheme is a basic building block for building a CCA-secure broadcast encryption in Section 7.
For labels x and y, we define x y to indicate that x is covered by y, and x Sensors 20 03140 i001 y to denote that x is not covered by y. For instance, 0 * 00 0 * * 0 and 0*00 Sensors 20 03140 i001 1**0.
Definition 5.
x is covered by y, or x y iff i , x i = y i or y i = * for x = x 1 x l and y = y 1 y l .
Definition 6.
x is not covered by y, or x Sensors 20 03140 i001 y iff a x , i , a i y i , x i * , and y i * .
Setup(l,m): The setup is similar to the main scheme. The generation of public key is equivalent and the generation of secret key is similar to Equation (1), except that h i , * = 1 and h i , * ¯ r j populates two values of h i , 0 r j and h i , 1 r j . Similarly, since interpretation of k i , * covers both k i , 0 and k i , 1 , the secret key S K I D , i should be doubled into S K I D , i , 0 and S K I D , i , 1 if b i = * . The resulting secret key for ID is S K I D = ( S K I D , 1 , , S K I D , l ) where if b i * then i-th secret key is the same as the key generation in the main scheme:
S K I D , i = ( g r i , g 2 α ( h 1 , b 1 h l , b l k i , b ¯ i g 3 ) r i , h 1 , b ¯ 1 r i , , h l , b ¯ l r i , k 1 , 0 r i , k 1 , 1 r i , , k i 1 , 0 r i , k i 1 , 1 r i , k i , b i r i , k i + 1 , 0 r i , k i + 1 , 1 r i , , k l , 0 r i , k l , 1 r i ) G 3 l + 1
else if b i = * then i-th secret key duplicated into two elements as follows: S K I D , i = ( S K I D , i , 0 , S K I D , i , 1 ) with
S K I D , i , 0 = ( g r i , 0 , g 2 α ( h 1 , b 1 h i 1 , b i 1 h i + 1 , b i + 1 h l , b l k i , 0 g 3 ) r i , 0 , h 1 , b ¯ 1 r i , 0 , , h i , 0 r i , 0 , h i , 1 r i , 0 , , h l , b ¯ l r i , 0 , k 1 , 0 r i , 0 , k 1 , 1 r i , 0 , , k i 1 , 0 r i , 0 , k i 1 , 1 r i , 0 , k i , 1 r i , 0 , k i + 1 , 0 r i , 0 , k i + 1 , 1 r i , 0 , , k l , 0 r i , 0 , k l , 1 r i , 0 ) S K I D , i , 1 = ( g r i , 1 , g 2 α ( h 1 , b 1 h i 1 , b i 1 h i + 1 , b i + 1 h l , b l k i , 1 g 3 ) r i , 1 , h 1 , b ¯ 1 r i , 1 , , h i , 0 r i , 1 , h i , 1 r i , 1 , , h l , b ¯ l r i , 1 , k 1 , 1 r i , 1 , k 1 , 1 r i , 1 , , k i 1 , 0 r i , 1 , k i 1 , 1 r i , 1 , k i , 0 r i , 1 , k i + 1 , 0 r i , 1 , k i + 1 , 1 r i , 1 , , k l , 0 r i , 1 , k l , 1 r i , 1 ) .
Encrypt( P K ,S): Equivalent to Equations (2) and (3).
Decrypt(S, I D , S K I D , H d r ): Consider an identity I D = b 1 b l { 0 , 1 , * } l and a subset S = ( c , d ) . If I D c and ID Sensors 20 03140 i001 c then I D can decrypt the message. Let j be an index such that b j = * and d j * , b j = 0 and d j = 1 , or b j = 1 and d j = 0 . If b j * then the decryption is equivalent to Equation (4). Assume that b j = * . From S K I D , j , we use S K I D , j , d j as secret key. To regenerate decrypt key K using the given header H d r = ( A 0 , A 1 ) and the secret key S K I D , j , d j , compute B = a 1 · i = 1 , c i = * , b i * l h i , b ¯ i r j i = 1 , c i * , b i = * l h i , c i r j i = 1 , c i = * , b i = * l h i , 0 r j h i , 1 r j · i = 1 , i j , d i * l k i , d i r j and output
e ( A 0 , B ) e ( a 0 , A 1 ) = K .

7.2. CCA-Secure Construction

For the following description, a vector V = ( v 1 , , v n ) is represented interchangeably as v 1 v n . With two vectors V = ( v 1 , , v n ) and V = ( v 1 , , v m ) , we denote V | | V = ( v 1 , , v n , v 1 , , v m ) .
Given a strong one-time signature scheme ( S i g K e y g e n , S i g n , V e r i f y ) with verification keys which are mapped to { 0 , 1 } z , we enable construction of an l-level public key broadcast encryption system Π = ( Setup , Encrypt , Decrypt ) secure against chosen-ciphertext attacks using the ( l + z ) -level Π = ( Setup , Encrypt , Decrypt ) semantically secure broadcast encryption scheme. The intuition is that I D = ( b 1 , , b l ) { 1 , 0 , * } l in Π is mapped to I D = I D | | * z = ( b 1 , , b l , * , , * ) { 1 , 0 , * } l + z in Π . Thus, the secret key S K I D for I D in Π is the secret key S K I D in Π . Recall that S K I D | | * z can generate secret keys of all descendants of node I D | | * z , that is, S K I D | | 0 z , , S K I D | | 1 z . When encrypting a key K K to I D in Π , the sender generates a z-bit verification key V s i g = ( e 1 , , e z ) { 0 , 1 } z and then encrypts K to the I D = I D | | V s i g using Π .
In more details, l-level Π is constructed using ( l + z ) -level Π and an one-time signature scheme as follows:
Setup( l , m ): Let 2 l be the maximum number of users and { 0 , 1 } m be the message space. Assume the signature verification key space as { 0 , 1 } z . To obtain the public key and master secret key, run semantically secure broadcast encryption Π .
P K , m a s t e r k e y , { S K I D } I D { 0 , 1 } l + z Setup ( l + z ) .
To generate secret key S K I D for an identity I D = b 1 b l using the master secret, encode I D to I D = I D | | * * * z . The secret key S K I D is generated from the key generation algorithm in Setup of Π . Let S K I D = S K I D = ( S K I D , 1 , , S K I D , l ) . and output P K , m a s t e r k e y , { S K I D } I D { 0 , 1 } l
Encrypt( P K , S ): Run S i g K e y G e n ( 1 z ) algorithm to receive a signature signing key K s i g and a verification key V s i g . Assume that V s i g = e 1 e z . For a given S = ( c , d ) , run Encrypt’ to obtain header H d r and encryption key K
H d r , K Encrypt ( P K , S )
and output the pair ( H d r , K ) .
Decrypt( S , I D , S K I D , H d r ): Parse the header as H d r = ( ( C 0 , C 1 ) , σ , V s i g ) .
  • Verify the signature σ , for ( C 0 , C 1 ) and V s i g . If the signature is invalid, output ⊥.
  • Otherwise, let I D to I D = I D | | * * * z , run Decrypt’( S , I D , S K I D , H d r ) and output encryption key K.
In a similar way, if ID contains * the extension in Section 7.1 is applied.
The correctness is straightforward from a similar calculation as in Section 5. Note that the user key size increases from O ( l 2 ) to O ( ( l + z ) l ) and the header size is enlarged by the size of a signature and a verification key.

8. CCA-Security Analysis

In this section, we formally prove the security against the chosen-ciphertext attack (CCA-security) of the CCA-secure CSD-based broadcst encryption in Section 7 assuming the presence of a one-time signature scheme.
Theorem 4.
Let G be a bilinear group of prime order p. For all positive integer l, the above public key broadcast encryption Π is ( t , ϵ 1 + ϵ 2 , l , m , q D ) CCA-secure when assuming the public key broadcast encryption Π is ( t , ϵ 1 , l + z , m , 0 ) semantically secure in G and the signature scheme is ( t , ϵ 2 , z , 1 ) strongly and existentially unforgeable. And t < t ( 2 ( l + z ) a + 2 p ) q D t s , where a is point addition time, p is pairing time, and t s is sum of S i g K e y G e n , S i g n and V e r i f y computation time.
Proof. 
Suppose there exists a t-time adversary, A , such that | A d v B r A , Π 1 / 2 | > ϵ 1 + ϵ 2 . We build an algorithm B , which has an advantage | A d v B r B , Π 1 / 2 | > ϵ 1 in G . Algorithm B proceeds as follows.
Init: Algorithm B runs A and receives set S * in which users A wishes to be challenged on. And B runs the S i g K e y G e n to get a signature signing key K s i g * and a verification key V s i g * { 0 , 1 } z . Let V s i g * = e 1 e z , then B makes S * * = { U | | V s i g * | U S * } and outputs it.
Setup: ( l , m ) B gets the public key P K of Π and also gets secret keys S K I D for revoked I D S * * from challenger C . Note that I D S * * iff x such that x I D , x S * * . In addition, I D S * iff x such that x I D , x S * .
Since Π can generate secret keys in a compressed manner using *, wlog, I D can be categorized into the following two formats:
  • I D = I D | | * * * z for I D S *
  • I D = I D | | * * k 1 e ¯ k * * z k for I D S * and k { 1 , , z } .
B responds with P K and secret keys S K I D of the first type of I D . (Recall that the secret key S K I D = S K I D where I D = I D | | * * * z .) The secret keys S K I D of the second type of I D are used to respond to the decryption queries of A as described in the below.
Query phase1:
A can adaptively issue decryption queries. The decryption query consists of ( I D , S , H d r ) , where S S * , I D S , and H d r = ( ( C 0 , C 1 ) , σ , V s i g ) . For the query, B replies as follows:
  • Run V e r i f y to check the validity of signature σ on ( C 0 , C 1 ) by using the verification key V s i g . If the signature is not valid, B replies with ⊥.
  • If V s i g = V s i g * , an event forge happens, B outputs a random bit b $ { 0 , 1 } , and aborts the simulation.
  • Otherwise, B decrypts the header using the second type of secret keys. Let V = * * k 1 e ¯ k * * * z k where k { 1 , , z } . Since V s i g V s i g * , V s i g V . Hence, B can compute S K I D | | V s i g from S K I D | | V . Using S K I D | | V s i g , B can regenerate K Decrypt ( S , I D , S K I D | | V s i g , ( C 0 , C 1 ) ) .
Challenge: B gets the challenge ( H d r , K * ) from C . To generate challenge for A , B computes H d r * as follows:
σ * S i g n ( H d r , K s i g * ) H d r * ( H d r , σ * , V s i g * )
B replies with ( H d r * , K * ) to A .
Query phase2: Same as in query phase1.
Guess: The A outputs a guess b { 0 , 1 } . B outputs b.
We see that algorithm B can simulate all queries to run A . B ’s success probability as follows:
| A d v B r B , Π 1 2 | | A d v B r A , Π 1 2 | P r [ forge ] > ( ϵ 1 + ϵ 2 ) P r [ forge ] .
To conclude the proof, it is required to bound the probability of B aborting the simulation from forge. It can be seen that P r [ forge ] < ϵ 2 ; otherwise, the adversary A can be utilized to forge signatures with a probability of at least ϵ 2 . In brief, we can construct another simulator which knows the secret key, but receives K s i g * for the challenge in a forgery game. In the experiment above, A causes an abort by submitting a query which includes an existential forgery under K s i g * on some ciphertexts. Our simulator can use A to win the existential forgery game. During the game, the adversary makes only a single chosen message query to generate the signature required for the challenge ciphertext. Thus, P r [ forge ] < ϵ 2 , which concludes that B ’s advantage is at least ϵ 1 as required. □

9. Experiments

In this section, we show the experimental results on the proposed combinatorial subset difference (CSD), and compare the result to the other public key broadcast encryption. We implemented our CSD subset representation and CSD-based broadcast encryption, and also implemented the subset difference (SD) [7,8] and interval [10] representation along with its broadcast encryption on the Intel Edison IoT device with a 500 Mhz 32 bit Atom processor. Specifically, the SD-based broadcast encryption is implemented as a public key broadcast encryption by applying the public key lifting transformation [5], which combines the hierarchical identity-based encryption (HIBE) [6] to the original symmetric Naor SD construction [7,8] from the advanced access content system (AACS) DVD standard [9]. For the comparison, we first show the number of subsets generated from each algorithm to confirm the efficiency of CSD subset representation. We also verify that the CSD-based broadcast encryption achieves the minimal header size, due to the least number of subsets. Then for the broadcast encryption, we observe the practicality of our CSD-based broadcast encryption by comparing the key size and the performance (i.e., encryption time and decryption time) to the other constructions.
Subset Representation. The proposed CSD subset representation is the most general representation among the existing subset representations. Therefore, it can cover more users within a subset, which decreases the number of total subsets. When considering the IoT application, the improvement is more emphasized; the non-hierarchical IP address generates numerous subsets in the existing hierarchy-based representations (e.g., SD or interval), while the CSD can cover the non-hierarchical IP address in a single representation. For example, for a binary IP address 0 * * . * * 1 . * * * . 001 , existing representation methods end up generating many subsets, while the CSD can efficiently cover it with a single subset.
Figure 6 shows the subset generations in the IP address application: Figure 6a presents the number of subsets when randomly increasing the number of wildcards in a single binary IP address, and Figure 6b presents the number of subsets when increasing the random IP addresses. The suffix of representations denotes a total bit of a user: for example, S D 10 indicates an SD representation for 10-bit users, or 2 10 total users. In both (a) and (b), the CSD representation can cover an IP address with a single subset regardless of the total users. When observing (a), the number of subsets in SD and interval representation when the IP address includes more wildcards or the total users increase, while the CSD can cover the address within a single subset in any case. When assuming a binary IP address with 20 wildcards in 2 20 total users, the CSD can remain a single subset while the SD requires more than 1000 subsets; the CSD can save the header size 1000 times compared to the current standard SD representation. In (b), the SD and interval representation generates more subsets for more IP addresses or more total users, while the number of CSD subsets are same as the number of IP addresses.
Header Size. Since the CSD representation generates minimal subsets, the CSD-based broadcast encryption achieves minimal (total) header sizes when applied to the broadcast encryption. We verify the result by implementing and running the broadcast encryption constructions on the IoT machine, and measuring the size of the total header output size.
Figure 7 compares the header size of each broadcast encryption, by randomly revoking the users: by varying the total users to (a) 2 10 , (b) 2 15 , and (c) 2 20 . The header size is measured by the number of group elements, where a single group element is a 20-bytes object from the pairing-based cryptography (PBC) library. In all cases, the CSD-based broadcast encryption outputs the smallest headers, compared to the SD-based and interval-based broadcast encryption.
Key Size. The CSD-based broadcast encryption is also efficient in terms of key sizes; it maintains a public key and secret key size comparable to the standard broadcast encryption such as SD-based and interval-based broadcast encryption.
Table 3 shows the public key (PK) size and secret key (SK) size of the CSD-based, SD-based, and interval-based broadcast encryption. The public key size in CSD-based broadcast encryption is proportional to the user depth n, within the same order of O ( log n ) as the SD-based and interval-based construction. The secret key size is similar to the interval-based construction, within the same order of O ( log 2 n ) for user depth n.
Performance. For the performance of encryption and decryption, the CSD-based broadcast encryption improves the encryption and decryption time compared to the SD-based and interval-based broadcast encryption. The main factor is that our CSD-based construction does not involve any public parameter related computation, while the other constructions require additional delegation from the public parameters which is proportional to the user depth.
Table 4 shows the encryption time and decryption time of the CSD-based, SD-based, and interval-based broadcast encryption, by varying the user depth from 10-bits to 20-bits (total users of 2 10 , 2 15 , 2 20 ). Our CSD-based broadcast encryption maintains almost the same performance for all cases, while the other constructions suffer from the increase of encryption and decryption time proportional to the user depth.
Open Source. The experiment resources are publicly available at GitHub as an open source (https://github.com/snp-lab/CSD), for the open science and reproducible research. The resources consist of subset construction algorithm for each combinatorial subset difference (CSD), subset difference (SD), and interval, and broadcast encryption implementation for CSD-based, SD-based, and interval-based scheme.

10. Conclusions

We propose the combinatorial subset difference (CSD), which is a new subset representation for the broadcast encryption. The CSD representation is the most general representation compared to state-of-the-arts such as subset difference (SD) or interval representations. Due to the general coverage, the CSD representation generates minimal subsets which lead to the minimal header size when applied to the broadcast encryption. The CSD representation is the only representation to cover the non-hierarchical IP addresses, making it as a suitable choice for IoT network multicast. We design and implement the algorithm for CSD subset generation, and verify the practicality of our CSD in the IoT systems.
We also propose the CSD-based public key broadcast encryption, which can be efficiently applied to the IP multicast in IoT systems. The CSD-based broadcast encryption achieves minimal (total) header size due to the minimal subsets of the CSD representation, and it also improves the encryption time and decryption time compared to the existing broadcast encryption. Our experimental results verify that the CSD-based broadcast encryption reduces the header size 31% for the worst cases, improves encryption time by 6 times, and improves decryption time by 10 times compared to the SD-based broadcast encryption. For the security, we prove the collusion-resistance and semantic security of the CSD-based construction under the standard l-BDHE assumption, and extend the construction to the CCA-secure broadcast encryption.

Author Contributions

Conceptualization, J.K. and H.O.; methodology, J.K. and H.O.; software, J.L. and S.L.; validation, J.L. and S.L.; writing–original draft preparation, J.L and S.L; writing–review and editing, J.K., J.L., S.L., and H.O. All authors have read and agree to the published version of the manuscript.

Funding

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Ministry of Science and ICT Korea (2017-0-00661, 2016-6-00599), and a Semiconductor Industry Collaborative Project between Hanyang University and Samsung Electronics Co. Ltd.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yao, X.; Chen, Z.; Tian, Y. A lightweight attribute-based encryption scheme for the Internet of Things. Future Gener. Comput. Syst. 2015, 49, 104–112. [Google Scholar] [CrossRef]
  2. Belguith, S.; Kaaniche, N.; Laurent, M.; Jemai, A.; Attia, R. Phoabe: Securely outsourcing multi-authority attribute based encryption with policy hidden for cloud assisted iot. Comput. Netw. 2018, 133, 141–156. [Google Scholar] [CrossRef] [Green Version]
  3. Kim, J.Y.; Hu, W.; Sarkar, D.; Jha, S. ESIoT: Enabling secure management of the internet of things. In Proceedings of the 10th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Boston, MA, USA, 18–20 July 2017; pp. 219–229. [Google Scholar]
  4. Mao, Y.; Li, J.; Chen, M.R.; Liu, J.; Xie, C.; Zhan, Y. Fully secure fuzzy identity-based encryption for secure IoT communications. Comput. Stand. Interfaces 2016, 44, 117–121. [Google Scholar] [CrossRef]
  5. Dodis, Y.; Fazio, N. Public Key Broadcast Encryption for Stateless Receivers. Lect. Notes Comput. Sci. 2002, 2696, 61–80. [Google Scholar]
  6. Boneh, D.; Boyen, X.; Goh, E. Hierarchical Identity Based Encryption with Constant Size Ciphertext. In Proceedings of the Advances in Cryptology - EUROCRYPT 2005, 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, 22–26 May 2005; pp. 440–456. [Google Scholar]
  7. Bhattacherjee, S.; Sarkar, P. Complete tree subset difference broadcast encryption scheme and its analysis. Des. Codes Cryptogr. 2013, 66, 335–362. [Google Scholar] [CrossRef] [Green Version]
  8. Naor, D.; Naor, M.; Lotspiech, J. Revocation and Tracing Schemes for Stateless Receivers. Lect. Notes Comput. Sci. 2001, 2139, 41–62. [Google Scholar]
  9. Henry, K.; Sui, J.; Zhong, G. An Overview of the Advanced Access Content System (AACS). Cent. Appl. Cryptogr. Res. (Cacr) 2007, 25, 1–24. [Google Scholar]
  10. Lin, H.; Cao, Z.; Liang, X.; Zhou, M.; Zhu, H.; Xing, D. How to construct interval encryption from binary tree encryption. In Proceedings of the International Conference on Applied Cryptography and Network Security, Bogota, Colombia, 5–7 June 2019; pp. 19–34. [Google Scholar]
  11. Lee, K.; Park, J.H. Identity-Based Revocation from Subset Difference Methods under Simple Assumptions. IEEE Access 2019, 7, 60333–60347. [Google Scholar] [CrossRef]
  12. Chu, H.; Qiao, L.; Nahrstedt, K.; Wang, H.; Jain, R. A secure multicast protocol with copyright protection. Acm Sigcomm Comput. Commun. Rev. 2002, 32, 42–60. [Google Scholar] [CrossRef]
  13. Kumar, V.; Kumar, R.; Pandey, S. A computationally efficient centralized group key distribution protocol for secure multicast communications based upon RSA public key cryptosystem. J. King Saud-Univ.-Comput. Inf. Sci. 2018, in press. [Google Scholar] [CrossRef]
  14. Wang, X.; Zhu, H. A Novel Two-party Key Agreement Protocol with the Environment of Wearable device using Chaotic maps. DSPR 2019, 3, 12–23. [Google Scholar]
  15. Naor, M.; Pinkas, B. Efficient trace and revoke schemes. Int. J. Inf. Sec. 2010, 9, 411–424. [Google Scholar] [CrossRef]
  16. Boneh, D.; Waters, B. A fully collusion resistant broadcast, trace, and revoke system. In Proceedings of the 13th ACM Conference on Computer and Communications Security, Alexandria, VA, USA, 30 October–3 November 2006; pp. 211–220. [Google Scholar]
  17. Delerablée, C. Identity-based broadcast encryption with constant size ciphertexts and private keys. In Proceedings of the ASIACRYPT:International Conference on the Theory and Application of Cryptology and Information Security, Kuching, Malaysia, 2–6 December 2007; pp. 200–215. [Google Scholar]
  18. Boneh, D.; Hamburg, M. Generalized identity based and broadcast encryption schemes. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Melbourne, VIC, Australia, 7–11 December 2008; pp. 455–470. [Google Scholar]
  19. Boneh, D.; Sahai, A.; Waters, B. Fully collusion resistant traitor tracing with short ciphertexts and private keys. Adv. Cryptol. Eurocrypt 2006, 2006, 573–592. [Google Scholar]
  20. Gentry, C.; Waters, B. Advances in Cryptology—EUROCRYPT 2009: 28th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Cologne, Germany, 26–30 April 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 171–188. [Google Scholar]
  21. Yao, D.; Fazio, N.; Dodis, Y.; Lysyanskaya, A. ID-based Encryption for Complex Hierarchies with Applications to Forward Security and Broadcast Encryption. In Proceedings of the ACM Conference on Computer and Communications Security, New York, NY, USA, 25–29 October 2004; pp. 354–363. [Google Scholar]
  22. Canetti, R.; Halevi, S.; Katz, J. Chosen-Ciphertext Security from Identity-Based Encryption. In Proceedings of the Advances in Cryptology—EUROCRYPT 2004, International Conference on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; pp. 207–222. [Google Scholar]
  23. Boneh, D.; Gentry, C.; Waters, B. Collusion Resistant Broadcast Encryption with Short Ciphertexts and Private Keys. Lect. Notes Comput. Sci. 2005, 3621, 258–275. [Google Scholar]
  24. Fiat, A.; Naor, M. Broadcast Encryption. Lect. Notes Comput. Sci. 1993, 773, 480–491. [Google Scholar]
  25. Goodrich, M.T.; Sun, J.Z.; Tamassia, R. Efficient Tree-Based Revocation in Groups of Low-State Devices. Lect. Notes Comput. Sci. 2004, 3152, 511–527. [Google Scholar]
  26. Halevy, D.; Shamir, A. The LSD Broadcast Encryption Scheme. Lect. Notes Comput. Sci. 2002, 2442, 47–60. [Google Scholar]
  27. Wallner, D.M.; Harder, E.J.; Agee, R.C. Key Management for Multicast: Issues and Architectures. Available online: https://www.hjp.at/(de,st_b)/doc/rfc/rfc2627.html (accessed on 2 June 2020).
  28. Canetti, R.; Garay, J.A.; Itkis, G.; Micciancio, D.; Naor, M.; Pinkas, B. Multicast Security: A Taxonomy and Some Efficient Constructions. Proc. IEEE 1999, 2, 708–716. [Google Scholar]
  29. Cheon, J.H.; Jho, N.; Kim, M.; Yoo, E.S. Skipping, Cascade, and Combined Chain Schemes for Broadcast Encryption. IEEE Trans. Inf. Theory 2008, 54, 5155–5171. [Google Scholar] [CrossRef]
  30. Lee, K.; Koo, W.K.; Lee, D.H.; Park, J.H. Public-key revocation and tracing schemes with subset difference methods revisited. Eur. Symp. Res. Comput. Secur. 2014, 8713, 1–18. [Google Scholar]
  31. Joux, A. Multicollisions in iterated hash functions. Application to cascaded constructions. Annu. Int. Cryptol. Conf. 2004, 3152, 306–316. [Google Scholar]
  32. Boneh, D.; Franklin, M. Identity-based encryption from the Weil pairing. Siam J. Comput. 2003, 32, 586–615. [Google Scholar] [CrossRef] [Green Version]
  33. Boneh, D.; Canetti, R.; Halevi, S.; Katz, J. Chosen-Ciphertext Security from Identity-Based Encryption. SIAM J. Comput. 2007, 36, 1301–1328. [Google Scholar] [CrossRef]
Figure 1. Example: subset representations in existing methods and Combinatorial Subset Difference (CSD).
Figure 1. Example: subset representations in existing methods and Combinatorial Subset Difference (CSD).
Sensors 20 03140 g001
Figure 2. The definition of node types. Node O is a legitimate node and node X is a revoked node.
Figure 2. The definition of node types. Node O is a legitimate node and node X is a revoked node.
Sensors 20 03140 g002
Figure 3. Node type examples in Subset Difference (SD) (and CSD).
Figure 3. Node type examples in Subset Difference (SD) (and CSD).
Sensors 20 03140 g003
Figure 4. Category difference between (a) SD and (b) CSD.
Figure 4. Category difference between (a) SD and (b) CSD.
Sensors 20 03140 g004
Figure 5. Subset construction in (a) SD, and (b) CSD.
Figure 5. Subset construction in (a) SD, and (b) CSD.
Sensors 20 03140 g005
Figure 6. The number of subsets in IP address multicast: (a) for a single binary IP address (b) for multiple addresses.
Figure 6. The number of subsets in IP address multicast: (a) for a single binary IP address (b) for multiple addresses.
Sensors 20 03140 g006
Figure 7. Comparison of header sizes in broadcast encryption.
Figure 7. Comparison of header sizes in broadcast encryption.
Sensors 20 03140 g007
Table 1. Size and performance comparison among public key broadcast encryptions.
Table 1. Size and performance comparison among public key broadcast encryptions.
Public Key SD ([6,7])Interval [10]Revocable SD [11]CSD
PK (size) log n 2 log n 7 × 8 4 log n
SK (size) log 3 n 2 log 2 n 4 log 2 n × 8 3 log 2 n
Hdr (size) 4 r 3 r ( 3 × 8 + 1 ) · 2 r 2 r
# subsets 2 r r 2 r r
# group elements23 3 × 8 + 1 2
Enc (computation) 2 r e log n 4 r e 12 r e 3 r e
Dec (computation) e log n + 2 p 2 e log n + 4 p e + 4 p 2 e + 2 p
Assumption l B D H I l B D H E D B D H l B D H E
e = point multiplications, p = pairings, n = total users, r = revoked users.
Table 2. Type solution in SD and CSD algorithms.
Table 2. Type solution in SD and CSD algorithms.
T 0 T 1 Type in SDType in CSDSubset Generation
0
SS1
S0
0
SSS0
SSS0
Table 3. Comparison of key sizes in broadcast encryption.
Table 3. Comparison of key sizes in broadcast encryption.
DepthPublic Key SD ([6,7])Interval [10]CSD (Ours)
PK (KB)10-bit0.250.490.83
15-bit0.370.731.24
20-bit0.500.981.65
SK (KB)10-bit19.614.035.92
15-bit66.189.0713.32
20-bit156.8816.1323.69
Table 4. Comparison of encryption time and decryption time (per subset) in broadcast encryption.
Table 4. Comparison of encryption time and decryption time (per subset) in broadcast encryption.
DepthPublic Key SD ([6,7])Interval [10]CSD (Ours)
Enc (s)10-bit0.700.240.20
15-bit0.940.250.20
20-bit1.180.250.20
Dec (s)10-bit1.021.670.17
15-bit1.372.410.17
20-bit1.753.170.17

Share and Cite

MDPI and ACS Style

Lee, J.; Lee, S.; Kim, J.; Oh, H. Combinatorial Subset Difference—IoT-Friendly Subset Representation and Broadcast Encryption. Sensors 2020, 20, 3140. https://doi.org/10.3390/s20113140

AMA Style

Lee J, Lee S, Kim J, Oh H. Combinatorial Subset Difference—IoT-Friendly Subset Representation and Broadcast Encryption. Sensors. 2020; 20(11):3140. https://doi.org/10.3390/s20113140

Chicago/Turabian Style

Lee, Jiwon, Seunghwa Lee, Jihye Kim, and Hyunok Oh. 2020. "Combinatorial Subset Difference—IoT-Friendly Subset Representation and Broadcast Encryption" Sensors 20, no. 11: 3140. https://doi.org/10.3390/s20113140

APA Style

Lee, J., Lee, S., Kim, J., & Oh, H. (2020). Combinatorial Subset Difference—IoT-Friendly Subset Representation and Broadcast Encryption. Sensors, 20(11), 3140. https://doi.org/10.3390/s20113140

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop