1. Introduction
The concept of smart cities has become central to contemporary discussions on urban development, where the integration of Information and Communication Technology (ICT) is pivotal in transforming city infrastructure and services [1,2]. Smart cities utilize advanced data analytics and IoT technologies to optimize resources, improve service delivery, and enhance the quality of urban life. These urban areas are defined by their ability to efficiently manage vast amounts of data generated from a multitude of sources, ranging from traffic sensors to healthcare records, with the aim of improving sectors such as energy, healthcare, and community governance, as Figure 1 shows. Despite these advantages, the challenge of data acquisition persists, exacerbated by strict data protection regulations and the growing demand for privacy, which contribute to the formation of fragmented data ecosystems, or 'data islands', within urban settings. In response, federated learning has emerged as an effective approach to navigating these challenges. This method allows for the decentralized training of models on local data held by various stakeholders, thereby addressing privacy concerns without centralizing sensitive information. Since its initial introduction by Google [3], the application of federated learning has expanded, driven by ongoing research aimed at enhancing its efficiency and accuracy [4,5,6,7]. However, the implementation of federated learning within smart cities is fraught with obstacles, such as high communication costs; difficulties in achieving model convergence in diverse, non-IID data environments; and the critical need for robust security measures to safeguard against potential data breaches during the model training process [8,9,10,11].
In existing federated learning systems, the number of clients involved in each round of updates is usually fixed. In the context of a smart city, federated learning schemes normally select a small number of clients at random to participate in each round, owing to the limitations of the participants' states and network conditions. However, as there is a mass of heterogeneous clients in reality, such random selection of clients amplifies the adverse impact of data heterogeneity [12]. It is therefore very important to select appropriate clients for training. Current schemes either select clients with higher statistical utility, based on a measurement of their contributions to model updates [13], or select clients based on computing resources and communication constraints [14]. Although these schemes achieve certain effects, some challenges remain. For example, some schemes need to analyze the private gradients uploaded by participants or consume substantial resources for learning and testing, while others can only select participants at a coarse-grained level.
Federated learning prevents the direct upload of private data, but the issue of privacy leakage has not been completely resolved. Traditional client selection schemes in federated learning typically allow participants to train models with local datasets and upload gradients to update the global model, so that the central server can use this information to avoid model poisoning and select participants for the next round of training to favor model convergence [15,16]. However, some scholars have pointed out that this also causes serious privacy disclosures [8]. To solve this problem, some studies have used homomorphic encryption [17] and differential privacy [18] to mask the gradient, but this undoubtedly prevents the central server from selecting participants, because the server cannot obtain valid information from an encrypted or obfuscated gradient. In addition, existing federated learning schemes usually assume that the participants unconditionally use local resources to train the models and upload gradients to the central server, which is not sustainable in reality [19]. Some scholars have looked at federated learning from the perspective of crowdsourcing [20]. Inspired by this, we believe that, in smart cities, the publisher of a federated learning task should have no control over the participants, and the clients should choose whether or not to use local data for training. Therefore, it is necessary to set up an incentive mechanism to attract participants to join the training [11].
In the context of smart cities, we have sufficient reason to design a federated learning framework from the perspective of crowdsourcing. Such a framework should select participants during training to improve the training efficiency, block malicious adversaries before training, and encourage more high-quality clients to participate in constructing the models. In recent years, attribute-based encryption has been widely studied as a promising direction of functional encryption [21]. Ciphertext-policy attribute-based encryption (CP-ABE) can conduct fine-grained access control for users conforming to specific policies without revealing any private data. This enables us to separate the participant selection module from the federated learning module, making complete privacy protection, including the use of homomorphic encryption on gradients, possible. It is worth noting that there has been no research on its application in federated learning. In addition, a consortium blockchain is a tamper-resistant and traceable distributed ledger that can be used to record the contributions of participants.
To better understand our scheme, let us consider a scenario in which a company needs to train a model of people’s desire to consume different goods. It is hoped that as many clients as possible in the region will participate, even if this is done at a cost. At the same time, the company wishes to eliminate malicious attacks from competitors and select participants with an appropriate data distribution in training to improve the learning efficiency. Although stringent data confidentiality regulations prevent it from deducing the appropriateness from gradients, it can still apply an attribute-based encryption scheme to select participants. Specifically, the task publisher develops a policy for each round of training so that only those who meet this policy can decrypt and participate in subsequent training. At the same time, participants can record decryption logs in a blockchain, which can provide both non-repudiation credentials to incentivize the participants and an auditing report to trace the transactions if a malicious adversary tries to disrupt the model.
The contributions of this article are as follows.
We propose a client selection framework for federated learning based on ciphertext-policy attribute-based encryption, which extends traditional federated learning from the perspective of crowdsourcing. Our scheme can select appropriate participants while protecting gradient privacy.
We propose a blockchain-based incentive mechanism, so that the profits from participating in training belong to the clients. The use of immutable smart contracts can greatly improve clients' enthusiasm for participating in federated learning.
We prove the security of the proposed scheme and evaluate its performance. The experiments show that the method proposed in this paper outperforms existing methods.
The rest of our article is organized as follows.
Section 2 presents an analysis of related work.
Section 3 briefly describes the preliminaries, including the security model of this scheme.
Section 4 describes the workflow and the architecture of the proposed CP-ABE scheme.
Section 5 characterizes the IND-CPA security model and describes other security proofs.
Section 6 compares the performance of our proposed scheme with that of other recent schemes. Finally, Section 7 draws the conclusions.
2. Related Work
The concept of federated learning was proposed by researchers at Google [3], who devised an interesting virtual keyboard application. Federated learning, as defined by Kairouz et al. [9], is a machine learning setting in which multiple entities (clients) collaborate in solving machine learning problems under the coordination of a central server or service provider. Each client's raw data are stored locally and are not exchanged or transferred. A typical federated learning process consists of five steps: client selection, broadcast, client computation, aggregation, and model updates. Among these, selecting appropriate clients during training, rather than performing random selection, is a very challenging task, and some problems remain unsolved in the existing client selection schemes.
Zhang et al. [14] selected clients according to the resource information sent by them, such as their computing ability and channel state. However, this may mean that clients with a large amount of data are unlikely to participate in training. Chai et al. [12] stratified the clients and adaptively selected those with similar training performance per round in order to mitigate heterogeneity without compromising the model accuracy, but this means that the central server has to control all participants to capture the training time on the fly. Fan et al. [22] used importance sampling to select clients, i.e., to select clients by utility. In addition, they developed an exploration-exploitation strategy to select participants. However, each of these clients was designed to upload complete model updates to the central server in each round, ignoring the fact that not all model updates contribute equally to the global model. As an improvement on this work, Li et al. [23] proposed PyramidFL, which calculates the importance ranking of each client based on feedback from past training rounds to determine a list of qualified clients for the next round of training, but the central server still obtains private information, such as the gradients and losses uploaded by clients. Wang et al. [24] put forward an experience-driven federated learning framework (Favor) based on reinforcement learning, which can intelligently select the clients participating in each round of federated learning to offset the deviation caused by non-IID data. However, the disadvantage is that the efficiency of reinforcement learning restricts the performance of the system, and it is sometimes unclear why it is effective.
We can consider federated learning from the perspective of crowdsourcing, which may be an important direction for future federated learning, because few companies have as many registered users as Google. Thus, we have a strong motivation to respect participants' willingness to participate in training while fully protecting their data. An additional challenge that must be addressed when applying federated learning in smart city scenarios is participant motivation [11], as most existing federated learning schemes assume that the participants use local data for training and upload model updates unconditionally. This is not realistic, as participants have the right to claim remuneration for the resources that they consume to participate in training. In order to provide appropriate incentives, Sarikaya et al. [25] designed a Stackelberg game to motivate participants to allocate more computing resources. Richardson et al. [26] designed payment structures based on the impact of data points on the model loss function to motivate clients to provide high-quality data as soon as possible. In many applications, blockchain is considered the best solution for implementing an incentive mechanism, because it is immutable and auditable and has inherent consensus [27]. Almutairi et al. [28] proposed a solution integrating federated learning with a lightweight blockchain, enhancing the performance and reducing the gas consumption while maintaining security against data leaks. Weng et al. [29] proposed a value-driven incentive mechanism based on blockchain to force participants to behave correctly. Bao et al. [30] designed a blockchain platform that allows honest trainers to earn a fair share of the profits from trained models based on their contributions, while malicious parties can be promptly detected and severely punished. Most of these blockchain platforms complete the verification and auditing of gradient updates via the blockchain itself, while ignoring the costs. Moreover, these pure blockchains overemphasize transactions, without taking into account the difference in data value between different participants. We believe that, from the perspective of crowdsourcing, it is natural for the task publisher to pay the high-value participants who meet his/her requirements.
In order to achieve a balance between privacy, performance, and incentives in federated learning, we introduce ciphertext-policy attribute-based encryption into participant selection. Sahai and Waters [31] proposed an attribute-based encryption scheme in 2005. Their scheme used a single threshold access structure: only when the number of attributes owned by a user is greater than or equal to a threshold value in the access policy can the ciphertext be decrypted successfully. Bethencourt et al. [32] proposed the first ciphertext-policy attribute-based encryption scheme in 2007. The keys were associated with an attribute set, and the access structure was embedded in the ciphertext; only when a user's own attribute set satisfies the access structure set by the data owner can the user successfully decrypt the ciphertext, and an access tree structure is used in this scheme. In order to reduce the storage and transmission overhead of the CP-ABE scheme, Emura et al. [33] were the first to propose a scheme with a fixed ciphertext length, which improved the efficiency of encryption and decryption. However, all of these schemes adopt a simple "AND" gate access structure. Waters [34] proposed a new linear secret sharing scheme (LSSS) to represent the access structure, which can realize any monotone access structure, such as "AND", "OR", and threshold operations on attributes. This scheme is more expressive, flexible, and efficient.
In smart city scenarios, there are many complex situations, such as the revocation of participants' attributes. Updating participants' attributes in a timely and effective manner guarantees system security. Pirretti et al. [35] proposed a CP-ABE scheme with indirect attribute revocation in order to solve the loose coupling problem in social networks. Zhang et al. [36] proposed a CP-ABE scheme based on an "AND" gate structure with attribute revocation, but this scheme has poor access structure expressiveness. Hur et al. [37] proposed an access control scheme with coercive revocation capabilities to solve a problem in the access permissions caused by changes in the users' identities in the system. They introduced the concept of attribute groups: users with the same attributes belong to the same attribute group and are assigned the same attribute group key. Once a member of the attribute group is revoked, a new group key is generated and sent to all group members except the revoked user. The ciphertext is updated in the cloud with the new group key, which makes it impossible for the revoked user to decrypt the data. However, their scheme does not prevent a collusion attack between current and revoked users. In order to prevent cooperative decryption between users with revoked attributes and users without the required attributes, Li et al. [38] proposed a CP-ABE scheme that resists collusion attacks and supports attribute revocation. However, the computational complexity of their scheme is still too high.
To address the challenges identified in the related work, our study introduces a novel federated learning framework that utilizes ciphertext-policy attribute-based encryption (CP-ABE) and a consortium blockchain. This methodology combines the strengths of CP-ABE to provide fine-grained access control and ensure privacy with the transparency and traceability of blockchain to manage and audit participant contributions effectively. The selection of participants based on attribute encryption ensures that only those who meet pre-defined criteria can access and process the training data, thereby enhancing the privacy and security of the data used in our federated learning model. Additionally, the consortium blockchain serves as a decentralized ledger to record all participant activities, which supports non-repudiation and helps in maintaining a trustworthy environment for all parties involved.
3. Preliminary
3.1. Federated Learning
Federated learning is a promising research area for distributed machine learning that protects privacy. In the process of federated learning, the task publisher can train models with the help of other participants. Instead of uploading private data to the central server, participants obtain a shared global model from the server and train it on a local dataset. These participants then upload the gradients or weights of the local model to the task publisher to update the global model. In particular, taking FedAvg as an example, the objective function under federated learning is rewritten with the non-convex loss function of a typical neural network:

$$\min_{w} f(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w), \qquad F_k(w) = \frac{1}{n_k} \sum_{i \in \mathcal{P}_k} f_i(w).$$

Here, $K$ represents the total number of participants, $n_k$ represents the number of training set samples held by the $k$-th participant, and $n = \sum_k n_k$. The specific algorithm is quite simple. Firstly, some nodes are selected in each round for local epoch training, and each node then uploads its weight update $w_{t+1}^k$ to the server. The server collects all the $w_{t+1}^k$ to obtain the weighted average of the new global model,

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} w_{t+1}^k,$$

which is then sent to each participant. Finally, each participant replaces the weights calculated in the last epoch with the delivered update to train a new epoch. The system repeats the above three steps until the server determines that $w$ has converged.
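As an illustrative sketch (not the exact implementation used in this paper), the weighted aggregation step of FedAvg can be written in a few lines of Python, where each client update is a flat list of parameter values:

```python
def fedavg_aggregate(updates, sample_counts):
    """Weighted average of client weight vectors (the FedAvg server step).

    updates: list of equally shaped parameter lists, one per client.
    sample_counts: n_k, the number of local training samples per client.
    """
    total = sum(sample_counts)
    num_params = len(updates[0])
    new_global = [0.0] * num_params
    for w_k, n_k in zip(updates, sample_counts):
        for i, value in enumerate(w_k):
            new_global[i] += (n_k / total) * value
    return new_global

# Two clients: one with 100 samples, one with 300 samples.
# Weighted average: 0.25 * [1, 2] + 0.75 * [5, 6] = [4.0, 5.0]
w = fedavg_aggregate([[1.0, 2.0], [5.0, 6.0]], [100, 300])
```

In a real deployment, `updates` would be the flattened model weights uploaded by the selected clients in one round, and the result would be broadcast as the next global model.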
3.2. Bilinear Pairing
Bilinear pairing, also known as bilinear mapping, was introduced to build functional encryption schemes. At present, most ABE schemes [39] are based on bilinear pairing cryptography, and its security has been recognized by many experts. The general definition of bilinear pairing is given below.
Consider three cyclic groups $\mathbb{G}_1$, $\mathbb{G}_2$, and $\mathbb{G}_T$, each of prime order $p$. Typically, $\mathbb{G}_1$ and $\mathbb{G}_2$ are groups of points on an elliptic curve over a finite field, and $\mathbb{G}_T$ is a multiplicative group of a finite field. A bilinear pairing is a map $e: \mathbb{G}_1 \times \mathbb{G}_2 \rightarrow \mathbb{G}_T$ that satisfies the following properties.

Bilinearity: For all elements $u \in \mathbb{G}_1$ and $v \in \mathbb{G}_2$ and all exponents $a, b \in \mathbb{Z}_p$, the pairing respects the group operations: $e(u^a, v^b) = e(u, v)^{ab}$. This property is fundamental in enabling many cryptographic protocols because it allows the pairing operation to "interact" with the group operations in a predictable way.

Non-degeneracy: The pairing is non-trivial in the sense that there exist some $u \in \mathbb{G}_1$ and $v \in \mathbb{G}_2$ such that $e(u, v) \neq 1$ in $\mathbb{G}_T$. This ensures that the pairing map is not constantly trivial and is thus useful for cryptographic applications. It is often required that, for generators $g_1$ of $\mathbb{G}_1$ and $g_2$ of $\mathbb{G}_2$, $e(g_1, g_2) \neq 1$.

Symmetry (in some cases): For some pairings, particularly symmetric pairings, $\mathbb{G}_1 = \mathbb{G}_2$, and the pairing satisfies $e(u, v) = e(v, u)$. This symmetry is not always required or desired, depending on the cryptographic application.

Computability: There must be an efficient algorithm to compute $e(u, v)$ for all $u \in \mathbb{G}_1$ and $v \in \mathbb{G}_2$. The efficiency of this computation is critical because the practicality of pairing-based cryptographic protocols depends heavily on the ability to compute these pairings quickly.
Bilinear pairings are not only theoretical constructs but are practically implemented using specific types of elliptic curves, such as supersingular curves or curves with a low embedding degree, which provide the necessary mathematical structure to support efficient and secure pairings. These properties make bilinear pairings powerful tools in modern cryptographic systems, providing functionalities that are not feasible with traditional cryptographic primitives.
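To make the bilinearity property concrete, the following toy Python sketch emulates a symmetric pairing over a small subgroup of $\mathbb{Z}_{23}^*$ by brute-forcing discrete logarithms. It is purely illustrative and insecure: real pairings are computed on elliptic curves without ever solving a discrete logarithm, which is exactly what makes them non-trivial to construct.

```python
# Toy "pairing" over the subgroup of Z_23* with prime order 11 (g = 2).
# Insecure by design: dlog() brute-forces discrete logs, which is only
# feasible at toy sizes.
P, Q, G = 23, 11, 2  # modulus, subgroup order, generator

def dlog(h):
    """Brute-force discrete log of h base G within the order-Q subgroup."""
    for x in range(Q):
        if pow(G, x, P) == h:
            return x
    raise ValueError("element not in the subgroup")

def pairing(u, v):
    """e(g^a, g^b) = gT^(a*b mod q); here gT = g for simplicity."""
    a, b = dlog(u), dlog(v)
    return pow(G, (a * b) % Q, P)

# Check bilinearity: e(u^a, v^b) == e(u, v)^(a*b).
u, v = pow(G, 3, P), pow(G, 7, P)
a, b = 4, 9
lhs = pairing(pow(u, a, P), pow(v, b, P))
rhs = pow(pairing(u, v), (a * b) % Q, P)
assert lhs == rhs
```

The final assertion is the "interaction" property used throughout pairing-based ABE: exponents applied inside either argument can be moved out in front of the pairing.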
3.3. Consortium Blockchain
Blockchain is essentially a decentralized database. It adopts distributed accounting and relies on ingenious algorithms based on cryptography to achieve the characteristics of tamper-proofing and traceability. These features can establish a foundation of trust for a fair distribution of incentives in federated learning [10].
There are three main types of blockchain, namely the public chain, the private chain, and the consortium chain. The essential differences between them concern who has write permission and how distributed they are. The public chain is highly decentralized, so anyone can access it and view the other nodes, but the cost is that the ledgers are very slow to update. At the other extreme is the private chain, where access and write permissions are entirely controlled by a single agency, but this also leads to an excessive concentration of power. The most appropriate blockchain for federated learning is the consortium chain, which is jointly maintained by its members and is highly suitable for transaction clearing within the consortium. It is more reliable than a purely private chain and performs better than a public chain.
Regardless of the type of blockchain applied in a specific scenario, the data structure is a linked list of ledgers containing transaction records, as Figure 2 shows. Each block in the linked list contains the hash value of the previous block, a new transaction record, and other information, such as timestamps. This structure ensures that each block cannot be tampered with, and any node can easily trace back each transaction along the pointers.
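The hash-linked structure described above can be sketched in a few lines of Python (a minimal illustration only; timestamps, signatures, and consensus are omitted, and the transaction strings are hypothetical):

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over the block's canonical JSON serialization."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, transactions):
    """Append a block that commits to the previous block's hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "tx": transactions, "height": len(chain)})
    return chain

def verify_chain(chain):
    """Recompute every link; any tampering breaks a prev_hash pointer."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
append_block(chain, ["genesis"])
append_block(chain, ["client A decrypted round-1 policy"])  # hypothetical log
append_block(chain, ["client B decrypted round-1 policy"])
assert verify_chain(chain)

chain[1]["tx"] = ["forged record"]  # tamper with history
assert not verify_chain(chain)
```

Rewriting any historical block changes its hash, so every later `prev_hash` pointer becomes invalid; this is the property that makes the recorded decryption logs suitable as non-repudiation credentials.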
3.4. Security Model
Let $\Pi$ be our scheme. To define a selective IND-CPA security model for $\Pi$, the following game is designed, involving a PPT adversary $\mathcal{A}$ and a PPT challenger $\mathcal{C}$.

Init: The adversary $\mathcal{A}$ controls a series of attribute authorities (where at least two authorities are not controlled by $\mathcal{A}$), and the remaining authorities are controlled by the challenger $\mathcal{C}$. The adversary chooses the access structure $(M^*, \rho^*)$ to be challenged and sends it to the challenger $\mathcal{C}$.

Setup: $\mathcal{C}$ runs the setup algorithm to obtain the master keys $MSK$ and the public parameters $PP$. Subsequently, challenger $\mathcal{C}$ sends the public parameters $PP$ to adversary $\mathcal{A}$. Meanwhile, challenger $\mathcal{C}$ initializes the user list, which includes the authorized attributes and the challenged access structure $(M^*, \rho^*)$.

Phase 1: $\mathcal{A}$ adaptively sends sets of attributes $S$. $\mathcal{C}$ generates the corresponding secret keys $SK_S$, which are returned to $\mathcal{A}$.

Challenge: $\mathcal{A}$ submits two messages $m_0$ and $m_1$ of equal length, together with the access structure $(M^*, \rho^*)$, to $\mathcal{C}$. It is required that no set $S$ queried by $\mathcal{A}$ satisfies $(M^*, \rho^*)$. $\mathcal{C}$ flips a coin $b \in \{0, 1\}$ and encrypts $m_b$ under the access structure $(M^*, \rho^*)$ to obtain $CT^*$. Finally, $\mathcal{C}$ sends the ciphertext $CT^*$ to $\mathcal{A}$.

Phase 2: Repeat Phase 1, with the restriction that no queried set $S$ satisfies the access structure $(M^*, \rho^*)$.

Guess: $\mathcal{A}$ outputs a guess $b'$ for $b$ and wins the game if $b' = b$.

The advantage of $\mathcal{A}$ in this game is defined as $Adv_{\mathcal{A}} = \left| \Pr[b' = b] - \tfrac{1}{2} \right|$.

We note that the model can easily be extended to handle chosen-ciphertext attacks by allowing for decryption queries in Phase 1 and Phase 2.
Definition 1. The protocol $\Pi$ is CPA-secure if no probabilistic polynomial-time (PPT) adversary has a non-negligible advantage in the above game.
Under our security model, the task publisher and its central servers are considered to be honest but curious. In other words, they do not counterfeit, attack, or try to decipher the data uploaded by the owners, and they faithfully execute the algorithms. However, they may have a certain degree of curiosity and may bypass some restrictions to access users’ data or the system parameters directly. Meanwhile, the participants may be malicious, and they may attempt to access data that exceed their permissions in collusion with others.
5. Security Analysis
Before we begin our security analysis, we need to clarify the security assumptions of the various entities in the system. First, attribute authorities are considered to be fully trusted entities, similar to certificate authorities, generally initiated by city governments. The task publisher can be a commercial institution, which is reflected in the system as honest and curious, i.e., they faithfully execute the algorithms that they are responsible for without maliciously destroying the ciphertext uploaded by the clients, but they may spy on or infer the clients’ private information from the access record. Finally, there may be malicious clients in the system, trying to collude with other clients to obtain data beyond their own permissions or trying to destabilize the system.
5.1. Selective CPA Security
Theorem 1. There is no polynomial-time adversary that can selectively break our system with a challenge matrix of size $\ell^* \times n^*$, where $\ell^*, n^* \leq q$, when the decisional q-parallel BDHE assumption holds.
Proof. Inspired by Waters [34], we can build a simulator $\mathcal{B}$ that solves the decisional q-parallel BDHE problem with a non-negligible advantage, under the prerequisite that none of the updated secret keys generated from both the queried secret keys and the update keys can decrypt the challenge ciphertext. This is based on the assumption that we have an adversary $\mathcal{A}$ that chooses a challenge matrix $M^*$ with at most $q$ columns and achieves a non-negligible advantage $\epsilon$ in the selective security game against our construction. The proof proceeds through a series of interactions between the challenger and the attacker in the game. Because the mathematical discussion of the game details is beyond the scope of this article and it resembles Waters' work, it is omitted. □
5.2. Data Security
In our scheme, only users with specific attributes can obtain the corresponding keys through the attribute authorities. Since the underlying protocol is based on elliptic curves, and the ECDLP is intractable, clients without the correct attributes cannot obtain any information about the private keys from the corresponding public keys in polynomial time.
Based on the training progress and results, the task publisher selects the access policy and the flag $f$ of the training round, which is hidden in the ciphertext $C$. Since the secret $s$ is randomly chosen by the task publisher, it is a random number in the eyes of an attacker; thus, the attacker cannot obtain any valuable information about $f$. With a linear secret sharing scheme, $s$ is divided into shares $\lambda_i$ by the share-generating matrix $M$ and can only be recovered if there are enough shares; in other words, the ciphertext can only be decrypted if the participant holds a set of attributes that matches the access policy. Any invalid user who does not have the attributes declared by the access policy lacks the attributes corresponding to the rows of $M$, so there exist no constants $\{\omega_i\}$ that make $\sum_i \omega_i M_i = (1, 0, \ldots, 0)$ true. Such a user therefore cannot compute $\sum_i \omega_i \lambda_i$, which equals the first element of the sharing vector, namely $s$. Therefore, this scheme ensures data security.
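A tiny worked example may help here. The sketch below (a toy, not the paper's construction) implements an LSSS for the policy "attr_A AND attr_B" with the share matrix rows $M_1 = (1, 1)$ and $M_2 = (0, 1)$: the secret can be recovered with the coefficients $\omega = (1, -1)$, but a single share alone is uniformly random and reveals nothing about $s$.

```python
# Toy LSSS for the policy "attr_A AND attr_B" over Z_p.
# lambda_i = M_i . v with v = (s, y); recovering s needs coefficients
# omega with sum(omega_i * M_i) = (1, 0), which only the full set has.
import secrets

P = 2**61 - 1  # toy prime modulus

def share(s):
    y = secrets.randbelow(P)
    lam_a = (s + y) % P  # row (1, 1) . (s, y), held by attr_A
    lam_b = y            # row (0, 1) . (s, y), held by attr_B
    return lam_a, lam_b

def reconstruct(lam_a, lam_b):
    # omega = (1, -1): 1*(1,1) + (-1)*(0,1) = (1, 0), so s = lam_a - lam_b.
    return (lam_a - lam_b) % P

s = 123456789
lam_a, lam_b = share(s)
assert reconstruct(lam_a, lam_b) == s
# Either share alone is a uniformly random element of Z_p, independent
# of s, so a participant with only one attribute learns nothing.
```

In the actual scheme, the shares are embedded in the ciphertext components and the reconstruction happens "in the exponent" via pairings, but the combinatorial argument is the same.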
5.3. Forward and Backward Security
Forward security means that any client that has been revoked cannot access subsequent data unless the client's remaining set of attributes still satisfies the access structure. In the scheme proposed in this paper, if the attributes of a client are revoked, only some of the keys and the ciphertext are updated by the central server, which not only reduces the local computational overhead but also effectively prevents clients who have lost access permissions from posing threats to the updated ciphertext in the system, thereby ensuring forward security. Considering that a revoked client already had permission to read the old ciphertext, the central server must restrict such clients from downloading the old ciphertext.
Backward security means that new clients cannot decrypt previously encrypted data. Note that a version parameter is used to control the ciphertext version; thus, new clients cannot decrypt the old ciphertext using the latest version of the attribute keys.
5.4. Collusion Attack
Theorem 2. The scheme is secure under a multi-user collusion attack.
Proof. In the proposed scheme, the attribute authority assigns a random value $t$ to each participant. Even if multiple participants have exactly the same attributes, the value $t$ differs in the keys obtained by them. In the decryption algorithm, $t$ must be consistent across all key components used, so no client can conspire with other users or groups of users to illegally decrypt the data. For example, suppose one participant holds attribute $a_1$ and another holds attribute $a_2$; under the access policy "$a_1$ AND $a_2$", neither participant can decrypt the data alone. Even if they combine their attribute keys, which embed different random values $t_1$ and $t_2$, the calculation cannot eliminate $t$; thus, they are unable to perform decryption. □
Tseng et al. [40] found that some attribute-based encryption (ABE) schemes [41,42] based on elliptic curve scalar multiplication are vulnerable to collusion attacks, because users with the same attributes can obtain the system's attribute private key set by solving linear equations. Our scheme does not have this problem, because we use bilinear pairing instead of scalar multiplication, and no party can obtain the secret parameters of the system by solving such equations.
6. Performance Comparison and Evaluation
In this section, we use public datasets to evaluate the performance of our scheme and compare it with previous work. In particular, in addition to showing how the proposed scheme improves the model accuracy in federated learning, we analyze the impact of using attribute-based encryption on the computational efficiency.
First, in Table 1, we present the characteristics of the currently popular federated learning client selection schemes. It can be seen that our proposed scheme comprehensively considers the dimensions of the client data quantity, data distribution, and computing power, while avoiding complex importance measurements and reinforcement learning. We then qualitatively evaluate our work against some of the known incentive mechanisms. As shown in Table 2, most of the existing schemes use either the quantity or the quality of data to distribute revenues fairly. Fortunately, the task publisher in our scheme can consider both aspects when formulating an access policy, which is more applicable in reality. With the help of the blockchain, we can easily implement auditing and traceability. This is also why we use post-training allocation rather than simultaneous allocation during training: it reduces the cost of evaluating the contribution of each participant.
Next, we describe some details of the experiments.
6.1. Setup
We trained popular convolutional neural network models on two benchmark datasets, FashionMNIST and CIFAR-10. The convergence speed and the final model accuracy of the proposed ABEFedAvg algorithm are compared with those of three other federated learning aggregation algorithms, FedAvg [3], FedProx [50], and FedIR [51], with randomly selected clients. The specific experimental settings are as follows:
Hardware and software setup: The experiments were conducted on a set of Linux servers, each running one experimental task. After all resources were allocated, the hardware and software setup of each server was as shown in Table 3.
Dataset: We comprehensively evaluate the efficiency of ABEFedAvg in simulation experiments using two datasets, FashionMNIST and CIFAR-10, which contain numerous fixed-size images and have been used in a large number of studies. The training, validation, and test sets are allocated to different parties with different data distribution patterns according to a Dirichlet distribution in order to evaluate the performance of ABEFedAvg under non-independent and identically distributed (non-IID) data. The FashionMNIST dataset is a classic dataset in the field of machine learning. It consists of 60,000 training samples and 10,000 test samples, each of which is a 28 × 28 pixel image representing an item labeled from 0 to 9. The CIFAR-10 dataset has a total of 60,000 color images, each 32 × 32 pixels, divided into 10 categories with 6000 images each. Of these, 50,000 images are used for training, forming five training batches of 10,000 images each, and the remaining 10,000 images are used for testing, forming a separate test batch.
Party: This paper uses the method in [52] to generate the non-IID partition. Specifically, the concentration parameter of the Dirichlet distribution controls how unevenly the dataset is split across parties: the larger the parameter, the closer each party's data is to being independent and identically distributed; the smaller it is, the more uneven the distribution. Three distribution settings are used: a large concentration parameter simulates the ideal case in which the data is completely independent and identically distributed, as Figure 4 shows; a moderate value simulates a mildly non-IID scenario, which is common in the real world, as Figure 5 shows; and a small value simulates a worst-case distribution in which almost every party holds only 3–4 classes, as Figure 6 shows.
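As an illustration of the Dirichlet-based partitioning described above, the following stdlib-only Python sketch splits sample indices across parties class by class (the helper name and rounding details are our own; [52] should be consulted for the exact procedure). Dirichlet proportions are drawn as normalized Gamma samples; a smaller concentration parameter yields a more skewed, non-IID split.

```python
import random

def dirichlet_partition(labels, n_parties, alpha, seed=0):
    """Assign sample indices to parties with per-class Dirichlet proportions.

    A large alpha approaches a uniform (near-IID) split; a small alpha
    concentrates each class on a few parties (strongly non-IID).
    """
    rng = random.Random(seed)
    parties = [[] for _ in range(n_parties)]
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        # Dirichlet(alpha, ..., alpha) via normalized Gamma draws.
        gams = [rng.gammavariate(alpha, 1.0) for _ in range(n_parties)]
        total = sum(gams)
        start, cum = 0, 0.0
        for p, g in enumerate(gams):
            cum += g / total
            # Cumulative cut points keep every index assigned exactly once.
            end = len(idx) if p == n_parties - 1 else min(len(idx), round(len(idx) * cum))
            parties[p].extend(idx[start:end])
            start = end
    return parties
```

Every index is assigned to exactly one party, so the union of the party index lists recovers the full dataset regardless of the concentration parameter.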
Model: The model used in this paper is the LeNet-5 convolutional neural network (CNN), which is commonly used for image classification. LeNet-5 consists of convolutional, pooling, and fully connected layers: the convolutional and pooling layers extract local image features, and the fully connected layers map the features to class probabilities. The first and third layers are convolutional layers with 6 and 16 kernels, respectively, each of size 5 × 5 with stride 1. Each convolutional layer is followed by an average pooling layer with a 2 × 2 kernel and no padding, whose role is to downsample the input feature map and reduce its size. The last three layers are fully connected, with 120, 84, and 10 neurons, respectively. ReLU is used as the activation function after the convolutional and fully connected layers to avoid the vanishing gradient problem. For the FashionMNIST dataset the input image is 28 × 28, while for the CIFAR-10 dataset it is 32 × 32.
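As a quick sanity check on the layer sizes above, this minimal sketch traces the feature-map dimensions through the LeNet-5 pipeline, assuming the standard 5 × 5 convolution kernels (stride 1, no padding) and 2 × 2 average pooling (the helper name is ours):

```python
def lenet5_shapes(h, w):
    """Trace feature-map sizes through the LeNet-5 pipeline:
    conv 5x5 (stride 1, no padding) -> 2x2 pool -> conv 5x5 -> 2x2 pool.
    Returns the (height, width) after each stage and the flattened feature
    length fed into the first fully connected (120-neuron) layer."""
    conv = lambda h, w, k=5: (h - k + 1, w - k + 1)  # valid convolution
    pool = lambda h, w, k=2: (h // k, w // k)        # non-overlapping pooling
    stages = []
    h, w = conv(h, w); stages.append(("conv1, 6 maps", h, w))
    h, w = pool(h, w); stages.append(("pool1", h, w))
    h, w = conv(h, w); stages.append(("conv2, 16 maps", h, w))
    h, w = pool(h, w); stages.append(("pool2", h, w))
    return stages, 16 * h * w
```

For a 32 × 32 input this yields a 5 × 5 map over 16 channels (400 flattened features) before the 120-neuron layer; a 28 × 28 input yields 4 × 4 × 16 (256 features).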
Performance index: To evaluate the degree to which the proposed attribute-based-encryption party selection mechanism optimizes various synchronous federated learning algorithms, this paper uses test-set accuracy as the main performance indicator. Models are trained on the FashionMNIST dataset for 500 rounds and on the CIFAR-10 dataset for 1000 rounds, and the test-set accuracy curves are plotted. The average and highest accuracies are then calculated, where accuracy is defined as the ratio of correctly classified images to the total number of test images and lies between 0 and 1. To evaluate the convergence speed of the proposed ABEFedAvg algorithm, the number of communication rounds needed for the model to converge to a target accuracy, ToA@x, is used as the main metric of training efficiency, where x denotes the target accuracy.
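The ToA@x metric defined above can be computed directly from a recorded accuracy curve; a minimal sketch (the function name is ours):

```python
def toa(accuracy_curve, target):
    """ToA@x: the first communication round (1-indexed) at which the
    test-set accuracy reaches the target x, or None if it is never reached
    within the recorded rounds."""
    for rnd, acc in enumerate(accuracy_curve, start=1):
        if acc >= target:
            return rnd
    return None
```

Returning `None` when the target is never reached corresponds to the '-' entries reported in the result tables.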
6.2. Experimental Results
6.2.1. Effect of the Number of Selected Participants on Performance
Firstly, we study how the stringency of the access policy in the proposed attribute-based-encryption participant selection scheme, and the participant selection score of the baseline FedAvg algorithm, affect the performance of federated learning. We assume a region containing a fixed pool of parties, select three access policies with stringency set to “strict”, “moderate”, and “loose”, and compare them against three corresponding participant selection scores for FedAvg. The performance evaluation of the different access policies and selection scores on the FashionMNIST and CIFAR-10 image datasets is shown in
Table 4.
For the FedAvg algorithm, the test-set accuracy is highest at the largest selection score and decreases as the selection score decreases. At the smallest score the accuracy is only 0.8318, and the training curve fluctuates the most; with few parties selected in each round of training, the model learns from fewer samples. When the selection score is 0.3, the training time is the shortest, requiring only 127 rounds. Reducing the score to 0.2 increases the number of communication rounds by 48, and at the smallest score the training time grows substantially, requiring 393 rounds of communication to reach the target accuracy, an increase of 266 rounds over the setting with a score of 0.3. Therefore, to balance model accuracy and training time overhead, the party selection score is set to 0.3 in the following experiments.
For the ABEFedAvg algorithm, the number of selected parties is determined mainly by the stringency of the access policy. With the “strict” policy, the central server admits only the parties with the largest sample counts and the most uniform data distributions, whereas the “loose” policy also accepts participants whose data departs strongly from being independent and identically distributed. The experimental results show that the model is most accurate under the “moderate” policy, reaching average accuracies of 0.8943 and 0.7508 on the FashionMNIST and CIFAR-10 datasets, as shown in Figure 7 and Figure 8, respectively. This is because a more stringent access policy improves the quality of the selected parties but also rejects data samples that are still valuable for training, while a more relaxed policy weakens the effect of access control and admits more parties with uneven local data. Based on this, the “moderate” access policy is used in the following experiments.
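For illustration, the effect of the three stringency levels can be modeled as plain predicate checks over client attributes. In the actual scheme the policy is enforced cryptographically by CP-ABE (only clients whose attribute keys satisfy the policy can decrypt the task ciphertext); the attribute names and thresholds below are assumptions for the sketch, not values from the paper.

```python
# Hypothetical attribute thresholds per stringency level; in the real scheme
# these would be encoded in the CP-ABE access structure, not checked in code.
POLICIES = {
    "strict":   {"min_samples": 5000, "min_balance": 0.8},
    "moderate": {"min_samples": 2000, "min_balance": 0.5},
    "loose":    {"min_samples": 500,  "min_balance": 0.2},
}

def select_parties(clients, stringency):
    """Return the ids of clients whose attributes satisfy the chosen policy.

    `samples` models local data volume; `balance` models how uniform the
    local label distribution is (1.0 = perfectly balanced).
    """
    policy = POLICIES[stringency]
    return [c["id"] for c in clients
            if c["samples"] >= policy["min_samples"]
            and c["balance"] >= policy["min_balance"]]
```

Tightening the policy shrinks the admitted set, mirroring the accuracy trade-off observed above: higher-quality parties, but fewer training samples overall.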
6.2.2. Influence of Independent and Identically Distributed Data on Performance
It is well known that in real scenarios, the degree to which each participant's data in federated learning is independent and identically distributed (IID) is often unpredictable. Generally speaking, the closer the participants' data is to IID, the better the accuracy and generalization of the trained model. Therefore, this section verifies the effectiveness and robustness of the proposed scheme under the three data distribution scenarios described in
Section 6.1.
According to the experimental results shown in
Table 5, it is clear that when each party's local data is IID, the proposed scheme brings only a limited improvement to training accuracy: compared with the original algorithm, it gains just 1.05 and 1.17 percentage points on the FashionMNIST and CIFAR-10 datasets, respectively. The reason is that in such an ideal federated learning environment, randomly selected clients have almost the same data distribution as the clients that satisfy the access policy.
However, under the mildly non-IID setting, the accuracy of the CNN model trained with ABEFedAvg is significantly higher than that of FedAvg with randomly selected clients: 3.12% and 4.62% higher on the FashionMNIST and CIFAR-10 datasets, as shown in Figure 9 and Figure 10, respectively. In the more extreme non-IID case, the advantage of our scheme is even more prominent, outdoing the random selection strategy of traditional federated learning by 7.81% and 6.06% on the two datasets, respectively. The reason is straightforward: the proposed scheme adaptively selects participants matching the access policy in each round of training, which keeps the participants' data distribution within a better range and thus achieves higher training accuracy. It is worth mentioning that, under the most skewed setting, the moderate access policy may in some rounds select fewer parties than the default number, but the experimental results show that this factor has very limited influence on training accuracy. In addition, the ‘-’ in Table 5 indicates that an algorithm cannot reach the target accuracy within the given number of rounds. For example, under the most skewed setting on the FashionMNIST dataset, neither algorithm reaches a test-set accuracy of 0.85 within 500 communication rounds; on the CIFAR-10 dataset under the same setting, the traditional FedAvg algorithm cannot reach an accuracy of 0.70 within 1000 communication rounds, whereas ABEFedAvg reaches this target within 237 rounds.
6.2.3. Impact of Federated Learning Algorithms on Performance
This section investigates the applicability of the proposed attribute-based-encryption party selection algorithm, used as an embeddable module, to two synchronous federated learning aggregation algorithms, FedProx and FedIR. Although these recent schemes introduce many improvements to parameter aggregation and improve model performance to some extent, most of them still select participants at random, which strongly affects model accuracy. Therefore, this paper applies the client selection scheme as a pluggable module to each mainstream algorithm to show its optimization effect on each aggregation strategy.
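The “pluggable module” view can be sketched as follows: the selection rule is passed in as a function, independent of the aggregation rule, so the same ABE-based filter can sit in front of FedAvg, FedProx, or FedIR. The sketch below uses plain FedAvg-style sample-size-weighted averaging over flattened updates; the client record fields are assumptions for illustration.

```python
def fedavg_aggregate(updates, weights):
    """Sample-size-weighted average of flattened client updates (FedAvg)."""
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, vals)) / total
            for vals in zip(*updates)]

def run_round(clients, select_fn, train_fn):
    """One synchronous round. The selection rule (random sampling, a
    utility-based rule such as Newt, or ABE policy matching) plugs in via
    select_fn, independently of the aggregation rule."""
    chosen = select_fn(clients)
    updates = [train_fn(c) for c in chosen]      # local training per client
    weights = [c["samples"] for c in chosen]     # weight by local data volume
    return fedavg_aggregate(updates, weights)
```

Swapping `select_fn` is the only change needed to move from random selection to the proposed attribute-based filter, which is what makes the module couplable with different aggregation strategies.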
Table 6 details the performance metrics for accuracy and processing time using two different datasets.
Figure 11 and
Figure 12 show the training curves of each algorithm on the FashionMNIST and CIFAR-10 datasets, respectively. The performance of the different algorithms is broadly consistent across the two datasets. In general, all three algorithms reach the target accuracy within the given number of communication rounds, with FedAvg performing worst, followed by FedProx and then FedIR. Although FedIR achieves higher accuracy, its training curve fluctuates considerably due to the additional weight information it introduces. For example, on the FashionMNIST dataset FedAvg reaches an accuracy of 0.8631, while FedProx and FedIR reach 0.8747 and 0.8786, respectively. After adding the attribute-based-encryption selection module, the performance of every algorithm clearly improves, with accuracy increasing by 3.12, 2.23, and 2.39 percentage points over the three baseline algorithms, respectively; the optimization effect of the proposed scheme is most pronounced for FedAvg.
On the CIFAR-10 dataset, the proposed scheme obtains even clearer advantages. The original FedAvg algorithm achieves an average accuracy of 0.7046 on this dataset, FedProx reaches 0.7100, and FedIR is the most accurate at 0.7202. Adding the proposed ABE selection module improves the overall test-set performance of all of these algorithms: the accuracies of FedProx and FedIR with the ABE filtering module are 0.7597 and 0.7725, respectively, 4.97 and 5.23 percentage points higher than with the random selection scheme. In addition, although introducing the encryption and decryption mechanism in the participant selection phase increases the time overhead, the number of communication rounds is greatly reduced once appropriate participants are selected: the results show reductions of 502, 255, and 158 rounds for the three schemes, respectively. We conclude that the proposed scheme has a strong optimization effect on various synchronous federated learning aggregation algorithms.
6.2.4. Comparison with Other Participant Selection Schemes
The comparison between ABEFedAvg and other party selection schemes is discussed in the related work section. The most notable recent works are Newt, proposed by Zhao et al. [53], and FedFNS, proposed by Wu et al. [45]. The former seeks a balance between accuracy and execution time in each round based on weight differences: the weight change between two adjacent rounds is defined as a utility that rewards fast convergence, and since clients with large data volumes may negatively affect training time, the ratio of the local dataset size to the total data size is added as a coefficient of the client utility. Because it is not always necessary to reselect participants in every round, the authors also designed a feedback control component that dynamically adjusts the frequency of client selection. The latter is based on probabilistic assignment: it designs an aggregation algorithm that determines the optimal subset of local model updates by excluding unfavorable local updates, and proposes a probabilistic node selection framework (FedPNS) that dynamically adjusts each device's selection probability according to its contribution to the data distribution of the model.
Next, the performance of the proposed scheme is compared with these two recent federated learning participant selection schemes. As before, this section uses the classical FedAvg aggregation algorithm and evaluates test-set accuracy and stability on the two datasets under the non-IID settings described in Section 6.1. The experimental results are shown in
Table 7. On the FashionMNIST dataset, the proposed attribute-based-encryption access control scheme achieves an average accuracy of 0.8943, compared with 0.8782 for the scheme of Zhao et al. and 0.8715 for the scheme of Wu et al., improvements of 1.83% and 2.62%, respectively. On the CIFAR-10 dataset, the average accuracy of the proposed scheme reaches 0.7508, versus 0.7294 and 0.7148 for the other two schemes, improvements of 2.93% and 5.04%, respectively. We further evaluate the number of communication rounds the ABEFedAvg algorithm and the other two schemes require during federated learning training to reach the target accuracy. As shown in
Figure 13 and
Figure 14, to reach accuracies of 0.85 on the FashionMNIST dataset and 0.70 on the CIFAR-10 dataset, the proposed scheme needs only 29 and 167 rounds, respectively. Although Newt and FedFNS greatly improve on the original FedAvg random selection strategy, they remain weaker than the proposed ABEFedAvg scheme on this metric. In summary, the attribute-based-encryption party selection strategy proposed in this paper shows clear advantages even over the latest work and has substantial application and promotion value.
7. Conclusions
In conclusion, our study introduces an innovative attribute-based participant selection scheme for federated learning within smart city frameworks, leveraging the integration of ciphertext-policy attribute-based encryption (CP-ABE) and a consortium blockchain. This approach enhances both the security and efficiency of participant selection, mitigating common risks associated with privacy breaches and malicious attacks.
Our findings demonstrate that the proposed scheme significantly improves the efficiency of federated learning processes by enabling precise participant selection based on detailed attribute criteria, rather than relying on the traditional methods of random or resource-based selection. The attribute-based method ensures that only participants meeting specific pre-defined criteria contribute to the model training, thus optimizing the quality and relevance of the aggregated data.
Moreover, the incorporation of consortium blockchain technology provides a robust incentive mechanism and audit trail that ensures participant accountability and motivates continued engagement. This novel integration not only supports the scalability and sustainability of federated learning projects but also enhances their transparency and trustworthiness.
7.1. Theoretical and Practical Implications
Our research introduces a novel attribute-based participant selection scheme enhanced with blockchain technology for federated learning in smart cities. This approach theoretically expands the understanding of federated learning by integrating privacy-preserving techniques (CP-ABE) and blockchain to safeguard against unauthorized access and ensure data integrity. Practically, the scheme provides a reliable and scalable solution for smart city administrators to deploy machine learning models that comply with stringent privacy regulations while maintaining high efficiency and participant motivation.
The implementation of our scheme in smart cities could significantly enhance the operational efficiency of various urban systems, such as public transportation networks, healthcare services, and emergency response systems. By ensuring that only qualified and authorized participants contribute to federated learning tasks, our model promotes the creation of more accurate and reliable predictive models, driving smarter decision-making in urban management.
7.2. Limitations
While our approach offers substantial improvements in privacy and efficiency, there are several limitations to consider. The complexity of CP-ABE may lead to an increased computational overhead, particularly as the number of attributes grows. This could potentially slow down the process in scenarios where real-time data processing is crucial. Additionally, our study’s focus on theoretical design and simulated environments may not fully capture the practical challenges encountered in real-world implementations. The effectiveness and efficiency of the encryption might vary significantly under different operational conditions and with different data volumes.
7.3. Future Research Directions
Considering the identified limitations, future research should focus on optimizing the efficiency of attribute-based encryption techniques to reduce the computational demands, particularly in environments with extensive attributes. Further empirical research is also necessary to test the scheme across various real-world settings in smart cities, to evaluate its practicality and performance under diverse conditions. Such studies could help to refine the model, making it more robust and adaptable to different types of data and applications.
Exploring the application of our federated learning scheme in other domains, such as healthcare and public safety, could provide insights into its adaptability and effectiveness in other critical areas of smart city development. Moreover, integrating advanced machine learning techniques, such as deep learning, might enhance the predictive capabilities of the models trained using our scheme, thus broadening its applicability and impact.