Article

LFL-COBC: Lightweight Federated Learning on Blockchain-Based Device Contribution Allocation †

1
Xi’an Aeronautics Computing Technique Research Institute, Aviation Industry Corporation of China, Xi’an 710065, China
2
The School of Cyber Engineering, Xidian University, Xi’an 710071, China
3
Information Technology Center, Tsinghua University, Beijing 100084, China
4
China Electronics Standardization Institute, Andingmen East Street, Dongcheng District, Beijing 100007, China
*
Authors to whom correspondence should be addressed.
This paper is an extended version of our paper published in the 2024 International Conference on Networking and Network Applications under the title "LFL-COBC: Lightweight Federated Learning on Blockchain-Based Device Contribution Allocation".
Electronics 2024, 13(22), 4395; https://doi.org/10.3390/electronics13224395
Submission received: 29 September 2024 / Revised: 31 October 2024 / Accepted: 7 November 2024 / Published: 9 November 2024
(This article belongs to the Special Issue AI in Blockchain Assisted Cyber-Physical Systems)

Abstract:
In the distributed cyber-physical systems (CPSs) within the industrial domain, the volume of data produced by interconnected devices is escalating at an unprecedented pace, presenting novel opportunities to enhance service quality through data sharing. Nevertheless, data privacy protection emerges as a significant challenge for data providers in wireless networks. This paper puts forward a solution integrating blockchain and lightweight federated learning, designated as LFL-COBC, which aims to tackle the issues related to data privacy and device performance optimization. We initially analyze multiple dimensions influencing the performance of computing devices, such as mining capacity, data quality, computational efficiency and local device deviation, which are crucial for augmenting user engagement. Based on these dimensions, we deduce a set of cooperation strategies for selecting the optimal committee members and rewarding the contributions of node devices equitably, thereby stimulating cooperation between users and servers. To intelligently and automatically detect device anomalies and alleviate the operational burden, a convolutional neural network (CNN) model is employed. Additionally, to address the escalating cost of customer participation and the potential data explosion issue, a near-optimal model pruning algorithm is designed. This algorithm can make the model obtained from the training of node equipment lightweight, thereby reducing the load of federated learning and the blockchain, as well as enhancing the overall efficiency of the system. The efficacy of our approach is demonstrated through numerical experiments on the HDFS and BGL public data sets. Experimental results indicate that the LFL-COBC scheme can effectively safeguard data privacy and optimize device performance concurrently, providing an effective solution for device anomaly detection in CPSs.

1. Introduction

With the introduction of a large number of intelligent and unmanned equipment, and with the increasingly complex program structures of this equipment making the maintenance and overhaul of industrial equipment ever more difficult, real-time artificial intelligence anomaly monitoring for distributed industrial equipment is the future development trend [1]. However, many industries now place higher requirements on the ability of industrial equipment to provide services. With the value brought by data, there are serious concerns about the protection of device privacy [2,3].
At present, secure multi-party computation, fully homomorphic encryption, and other technologies are often used to protect device data privacy [4], but they may cause huge computing and resource consumption whilst also not supporting real-time overhaul and maintenance of current industrial distributed equipment. In order to maintain an efficient and safe working state of computing devices and to ensure the operation of devices’ data, federated learning coordinates multiple IoT devices to protect privacy and automate networks. Many neural networks are over-parameterized [5,6], enabling the compression of each layer or the entire network [7]. Some compression methods achieve more efficient calculations with pruning parameters [8], training quantization [9], or blockchain data storage technology [10,11,12]. To mitigate the computation and resource consumption resulting from these techniques, we devise a lightweight privacy protection mechanism, enabling it to support real-time overhaul and maintenance of industrial distributed devices. Therefore, prior to the server aggregating the models, the models of each node device need to undergo lightweight processing. Currently, there exists a correlation among different computing devices within the industrial Internet ecosystem. Taking this correlation into account, we incorporate blockchain technology to enhance the security and stability of the system. Once the data of computing devices are leaked or subject to deliberate attacks, the intervention of the blockchain can guarantee the security and stability of the entire industrial architecture [13].
In order to balance the overhead and security of the system, Lu et al. [10] designed a reward mechanism based on the amount of node data, but they ignored the impact of data quality on the effectiveness of client-side model training. Kim et al. [14] analyzed the end-to-end delay model of the system and studied the factors that significantly affect the efficiency of each client, paving the way for our design of the blockchain reward mechanism. According to the characteristics and requirements of the industrial Internet, the system should follow the principles of ensuring the safety of devices in complex environments and promoting the fair and equal participation of multiple organizations [15,16]. We design an incentive mechanism based on device mining, data collection, computing power, and local-global dissimilarity to improve the enthusiasm of client devices and promote the sharing of distributed data among multiple untrusted parties. To improve the efficiency of federated learning, we design a reward mechanism based on mining ability, node data volume, and work efficiency, while also considering the impact of data quality on the effectiveness of client-side model training, which ensures the accuracy and reliability of model training. The convergence of the blockchain and federated learning enables data sharing and privacy protection in industrial infrastructure, facilitating progress in the field while ensuring data security for clients.
Based on the blockchain-federated learning architecture proposed by [17], we extend and make the following contributions:
  • This paper presents a privacy protection scheme for the industrial environment that integrates the blockchain and federated learning (LFL-COBC), reducing the central server's burden of aggregating local models by pruning the models trained by node devices.
  • In order to enhance data privacy and alleviate the burden on the blockchain in the industrial Internet environment, a lightweight federated learning model for device anomaly detection is proposed. Since a global model trained on imbalanced data is difficult to generalize well to each client's data, we make global and local adjustments to adapt to the local data sets.
  • We further update the blockchain incentive mechanism to achieve distributed multi-party data sharing, reduce the disparity between local and global models, and verify data computations, thereby enhancing the fairness and effectiveness of each node device's operation.

2. Paper Organization

The rest of this article is organized as follows. Firstly, the proposed global scheme overview for computer devices in industry and detailed structural design flow is discussed in Section 3. We then present the implementation of the blockchain and federated learning algorithms in Section 4. Section 5 elaborates on the proposed approach for detecting anomalies in industrial equipment and conducting lightweight operations. Next, we present the experimental results, which are described in Section 6. Finally, this article is concluded in Section 7.

3. Lightweight Federated Learning on Blockchain-Based Device Contribution Allocation

3.1. System Model

Traditional machine learning aggregates data from different devices to a central server, which may lead to data leakage and privacy infringement. Users are unwilling to share data, which leads to the data island problem [18]. FL has been studied by many researchers in order to alleviate these problems. However, applying FL technology to devices faces many challenges. For example, devices upload local gradients or models to a central station, which imposes resource overhead and security risks. It is essential to integrate the blockchain and federated learning to protect data privacy.
Blockchain technology together with a decentralized federated learning architecture can solve many of these problems. The blockchain is a decentralized, trusted database. It can use digital encryption and timestamp methods to implement point-to-point transactions without mutual trust. Our proposed global architecture is shown in Figure 1. The proposed system consists of two modules: a blockchain module and a federated learning module. The blockchain establishes secure connections among all the end-user computing devices through its encrypted records, which are maintained by the entities equipped with computing and storage resources. There are two types of transactions in our permissioned blockchain: data sharing transactions and data computation transactions [15]. To ensure fairness, we evaluate each client based on their mining and working abilities. Additionally, to minimize resource waste and alleviate the load on the blockchain, we employ compression and lightweight techniques to share the model results from each node device.
In this paper, we adopt a classic blockchain system architecture, which is divided into four layers: the cloud layer, the smart consensus layer, the edge layer, and the local layer. The federated learning model is stored in the edge layer. The cloud layer and the edge layer jointly support the operation of the system and the training and updating of the model, while the smart consensus layer supports data isolation and privacy protection for each distributed device in the whole system.
The cloud encompasses a cloud server possessing potent computing, communication, and storage capabilities, and the task publisher is an organization or enterprise with FL tasks. The task publisher transmits the task information, encompassing the task publisher’s identity, task description, and model accuracy requirements to the cloud server. The cloud server supervises the task training process and conveys the global model to the task publisher upon meeting the accuracy requirements.
The smart consensus layer contains all the information of the blockchain, and the blockchain layer is based on the POW-POD-Deviation-Quantity (PODQ) consensus for recording the global model and edge model. In this layer, each edge server corresponds to a consensus node or leader node, and each cloud server corresponds to a unique monitoring node. In this scenario, edge servers and consensus nodes are referred to as edge nodes. Each consensus node stores its edge model and verifies edge models from other consensus nodes. The leader node manages data communication and transfers between nodes, and the monitoring node generates and broadcasts blockchain information containing task information and aggregates local models to generate the global model. The blockchain supports off-chain storage of private data and on-chain storage of hashes, suitable for storing user data and ensuring the trustworthiness and availability of the model, improving system performance.
The edge layer mainly realizes the rapid consensus and calculation of transactions and data by highly scattered nodes in the whole network. The main element of this layer is the edge node, which has fixed computing and communication capabilities. The edge nodes are deployed close to the terminals to reduce the communication pressure on the cloud. There can be overlap between edge nodes. At each epoch, the edge nodes aggregate the locally uploaded models to generate edge models.
The local layer consists of identically distributed local devices with limited computing and communication capabilities, which are assembled as device nodes. At each epoch, each local device trains a local model using its own data and uploads it to the relevant edge node.

3.2. The Constituents of the Overall Architecture

The proposed LFL-COBC architecture is shown in Figure 2. The system consists of two modules: a blockchain module and a federated learning module. Blockchain technology combined with a federated learning architecture can effectively address the problems currently faced by distributed industrial equipment. Participants can train their models locally and upload the model parameters to a central server for aggregation. Where data privacy is a concern, blockchain technology can provide a secure way to share and transmit data, allowing participants to train models together without exposing raw data [19,20].
The overall system architecture of LFL-COBC consists of the following three important components:
  • Local devices: The local industrial equipment trains the model based on the collected data and transmits the trained model to the edge cloud.
  • Edge cloud: The edge cloud is capable of uploading the model from the local device to the blockchain, selecting the committee through an incentive consensus mechanism, specifying the highest score for summarizing the local device’s model, and updating the global model.
  • Blockchain: The blockchain is a trusted database that establishes a secure connection between terminal computing devices through its encrypted records. It can use digital encryption and timestamping methods to realize peer-to-peer trustless transactions [19].
In summary, the implementation of the proposed blockchain-based lightweight federated learning method is given in Algorithm 1. Let N_iter denote the number of federated learning iterations and N_epoch the number of iterations in the local training process. M_init represents the initialization model, N_client the number of client users, and M_{N_iter}^i the model of client i after N_iter federated learning iterations.
Algorithm 1 Proposed blockchain-based lightweight federated learning approach
Require: model structure, N_iter, N_epoch, M_init, N_client, reward
Ensure: Device fault detection model {M_{N_iter}^i}_{i=1}^{N_client}
1: Build the blockchain, initialize the fault diagnostic model as M_0, and set the reward score of each client to reward_i
2: for j = 1 … N_iter do
3:     Select N_cmt committee members based on each client's reward score
4:     for each non-committee client i do
5:         Download the latest global model from the blockchain: M_j^i = M_{j-1}
6:         for k = 1 … N_epoch do
7:             The client trains and updates the local model M_j^i
8:             The model is compressed to M_j^i
9:         end for
10:        Send the trained model to the committee members
11:    end for
12:    Calculate the score of each non-committee member with Equation (5)
13:    The miner computes the global model with Equation (9)
14:    The blockchain is extended with the updated global model M_j
15: end for
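The control flow of Algorithm 1 can be sketched in Python. This is an illustrative sketch, not the authors' implementation: `train_local`, `compress`, `score`, and `aggregate` are placeholder callables standing in for the local training, pruning, PODQ-scoring, and aggregation steps described in the text.

```python
def run_lfl_cobc(n_iter, n_epoch, n_client, n_cmt, init_model,
                 train_local, compress, score, aggregate):
    """Sketch of Algorithm 1: committee election, local training,
    compression, PODQ scoring, and global aggregation per round."""
    chain = [init_model]                       # blockchain of global models
    rewards = {i: 0.0 for i in range(n_client)}
    for _ in range(n_iter):
        # elect the n_cmt highest-scoring clients as committee members
        committee = set(sorted(rewards, key=rewards.get, reverse=True)[:n_cmt])
        updates = {}
        for i in range(n_client):
            if i in committee:
                continue
            model = chain[-1]                  # download latest global model
            for _ in range(n_epoch):
                model = train_local(i, model)  # one local training epoch
            updates[i] = compress(model)       # lightweight pruning before upload
        for i, m in updates.items():
            rewards[i] = score(i, m)           # PODQ score (Equation (5))
        chain.append(aggregate(chain[-1], updates))  # global update (Equation (9))
    return chain
```

A toy run with scalar "models" shows the loop shape: each round the committee sits out, the remaining clients train and upload compressed models, and the chain grows by one global model per round.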
The architecture design flow of the proposed scheme in the industrial Internet environment is shown in Figure 3. The overall system architecture workflow based on the proposed lightweight blockchain-based federated learning is as follows:
  • Node registration. All computing devices need to be registered on the blockchain before joining the network. This can be achieved by a decentralized registration mechanism that uses smart contracts to verify the identity of devices and issue unique identifiers.
  • The excitation of the node. With the help of the PODQ incentive policy, each node device is scored regularly, and then the members of the committee are elected according to the scores of each node device. According to the scores of nodes, rewards are automatically allocated through smart contracts to encourage nodes to actively participate in network maintenance and data processing.
  • Block generation and verification. Each node device mines on the blockchain to find available blocks through the PODQ mechanism. After a block is generated, it needs to be verified by other nodes in the network. Byzantine Fault Tolerance (BFT) can be implemented to ensure that more than 2/3 of the nodes reach a consensus on a new block. Once a block has received sufficient verifications, it is added to the blockchain, ensuring the immutability and security of the data.
  • Model upload. After scoring each node device based on its mining ability, the quality of its local owned data, and the efficiency of its work, the node device uses its local data to train the model or participates in the federated learning process to cooperate with other nodes to train a shared model. Each node can choose to upload its own model or aggregate the models of other nodes based on its score and computing power. The uploaded model needs to be verified by the network to ensure the validity and security of the model. The model verification process can be implemented through smart contracts. The model that passes the verification will be added to the blockchain for use by other nodes or for further training.
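The 2/3 Byzantine fault-tolerance threshold mentioned in the workflow above reduces to a one-line quorum rule; this is a generic BFT acceptance check, not the paper's exact consensus protocol.

```python
def bft_approved(approvals: int, total_nodes: int) -> bool:
    """A block is accepted only when strictly more than 2/3 of the
    network's nodes have verified and approved it (integer-safe form
    of approvals / total_nodes > 2/3)."""
    return 3 * approvals > 2 * total_nodes
```

For example, 7 approvals out of 10 nodes pass the threshold, while 6 out of 9 (exactly 2/3) do not, since the rule is strict.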

3.3. Attack Model

The types of attacks that can be encountered in blockchain-federated learning architectures include tampering and impersonation. In this attack model, the attacker's goal is to undermine the results and credibility of federated learning by tampering with or falsifying data, and they may try to inject wrong data into the federated learning model to cause the model to produce wrong outputs [2,18]. The attacker may be a malicious data provider, a participant, or another malicious entity in the network.
In the attack model of tampering and impersonation, the attacker can take several paths to carry out the attack.
  • Data tampering: An attacker could manipulate certain samples or features within the data set or might inject counterfeit data samples into the federated learning model to disrupt the training and prediction process of the model.
  • Malicious model update: An attacker may modify model parameters or updates to produce erroneous model output results. They can do this by injecting malicious code into the model updates.
  • Collusion attack: The attacker may collude with other participants to increase the success rate of the attack by jointly tampering with or falsifying data. This type of attack can affect the results of federated learning more severely.

4. Blockchain-Enabled Federated Learning Approach for Industrial Equipment Inspection

The blockchain and federated learning are two popular technology areas that have a wide range of applications in different application scenarios. The blockchain is a distributed ledger technology that enables secure sharing and transmission of data. Federated learning is a machine learning technology that enables model training and updating through model aggregation without exposing original data.

4.1. Data Structure Design for Blockchain

The blockchain is a storage method for electronic data. The data are obtained in the form of blocks, which are linked together to provide immutability to their internal data. Once a block is linked to the chain, its internal data can no longer be changed. Once a block is added to the chain, the data inside are publicly visible to anyone. This technology can be used to record almost any information we can think of (e.g., property rights, identities, balances, medical records, etc.) without the risk of tampering with the records. Current practices require the existence of a decentralized industrial Internet system that allows for direct peer-to-peer monetary transactions between trading parties. In the transaction process between nodes and the blockchain, processing of data is involved, so it is especially important to design a reasonable blockchain data structure.
Based on the incentive mechanism and the demand of federated learning for data, we can reason out the blockchain block data structure, as shown in Figure 4.
The block header uses the hash of the previous block to maintain the chain structure, and the Merkle tree root summarizes the transaction information in the block; nodes can use the Merkle tree to quickly verify transactions. In addition, the proof, base model, accuracy, and quantity fields store the node's mining computing power, the final trained model, the model accuracy, and the data quality owned by the node, all of which are prepared for the later reward and punishment of each node device and for model aggregation. Since each industrial equipment node changes dynamically, it transmits the updated model to the blockchain for a data update. Therefore, a data structure for data transactions must also be created under the block body, as shown in Figure 4.
Block bodies mainly hold information regarding hundreds or thousands of transactions. Once a transaction is sent to the blockchain network, it is packaged into the block. In the transaction information of this block, the type determines whether the update model information of the obtained node is acquired through a global download or a self-update.
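Assuming the fields named above (previous hash, Merkle root, proof, accuracy, quantity, and a per-transaction type), the block structure might be modeled as follows. The class layout and the simplified Merkle construction are illustrative sketches, not the paper's exact on-chain format.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List

def merkle_root(tx_hashes: List[str]) -> str:
    """Pairwise-hash transaction hashes up to a single root,
    duplicating the last hash on odd-sized levels."""
    if not tx_hashes:
        return hashlib.sha256(b"").hexdigest()
    level = tx_hashes[:]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256((a + b).encode()).hexdigest()
                 for a, b in zip(level[::2], level[1::2])]
    return level[0]

@dataclass
class Transaction:
    node_id: str
    tx_type: str      # "global_download" or "self_update" (Section 4.1)
    model_hash: str   # hash of the (compressed) model stored off-chain
    def hash(self) -> str:
        payload = json.dumps(self.__dict__, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

@dataclass
class Block:
    prev_hash: str    # links the chain structure
    proof: float      # node mining computing power
    accuracy: float   # accuracy of the trained model
    quantity: int     # data quality/volume owned by the node
    txs: List[Transaction] = field(default_factory=list)
    @property
    def root(self) -> str:
        """Merkle root summarizing the block's transactions."""
        return merkle_root([t.hash() for t in self.txs])
```

Storing only the model hash on-chain while the model itself stays off-chain matches the on-chain/off-chain split described in Section 3.1.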

4.2. Incentive: Certificate of Training Quality and Data Management (PODQ)

Transforming data sharing issues into model sharing brings significant benefits in terms of data security. By sharing only the model without disclosing the original data, the privacy of the data owner can be effectively protected. However, existing incentive mechanisms like Proof of Work (POW) either impose high computational and communication resource costs or lack a comprehensive understanding of node allocation contributions [4,16]. To tackle this challenge, we propose the Training Quality Data Proof (PODQ) protocol for federated learning authorization. PODQ integrates data model training and the quantity of data processed into the consensus process, thereby optimizing the utilization of computing resources across nodes.
POW: The incentive mechanism of Proof of Work (POW) in the blockchain is primarily designed for public blockchains. In POW, nodes engage in a competition to secure accounting rights by solving complex mathematical puzzles, known as hashing algorithms. The ability to successfully find the correct numerical solution and generate blocks is a tangible demonstration of a node’s computational power. Nodes with a higher computational power have a greater likelihood of solving these numerical values and gaining accounting rights [4].
P_i = \frac{\varphi_i}{\sum_{j=1}^{N} \varphi_j}
where i indexes each participating node, N is the total number of nodes, and \varphi_i represents the computing power of node i.
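The POW selection probability above is each node's computing power normalized over all nodes; a minimal sketch:

```python
def pow_probability(powers):
    """P_i = phi_i / sum_j phi_j: the chance that node i wins
    accounting rights, proportional to its computing power."""
    total = sum(powers)
    return [p / total for p in powers]
```

For two nodes with powers 1 and 3, the second node is three times as likely to generate the next block.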
Training quality and data volume-based incentive: The committee node selection process involves a subset of all participants, and it aims to strike a balance between costs, security, and fairness. To achieve a consensus on data sharing, we have introduced additional factors, such as proof of training workload and data capability. The selection of committee leaders is based on the quality, volume of data, and mining ability of the training model. Since each committee node trains a local data model, it becomes essential to validate and measure the quality of these models during the consensus process. The performance of the trained local model is quantified using prediction accuracy. By incorporating these metrics, we can effectively assess and compare the performance of local models.
\mathrm{MAE}(m_i) = \frac{1}{n} \sum_{k=1}^{n} \left| y_k - f(x_k) \right|
where f(x_k) is the prediction of model m_i on sample x_k and y_k is the real value of the record. The lower the MAE of model m_i, the higher the accuracy of m_i.
After model training, we obtain a trained global data model M and a local model m_i for each committee node. The incentive process is performed by the committee. As a proof of training work, D_j verifies all transactions it receives by calculating the MAE defined in Equation (3). The MAE for D_j, \mathrm{MAE}_u(D_j), is calculated as follows:
\mathrm{MAE}_u(D_j) = \gamma + \mathrm{MAE}(m_j)
where \mathrm{MAE}(m_j) is the MAE of the locally trained model m_j, and \gamma is the weight parameter denoting the contribution of D_j to the global model, determined by the training data size of P_j and the other participants: \gamma = |d_j| / \sum_i |d_i|.
Local device deviation: In the context of the distributed industrial Internet, the issue of collaborative learning with streaming data entails multiple nodes collecting data incrementally. Since data are scarce at the onset of data collection, collaboration through data sharing can furnish nodes with machine learning models trained on shared data. Moreover, nodes with higher contributions (e.g., richer nodes with superior local data) can obtain better incentives. However, this can widen the “gap” between richer and less-resourced nodes, possibly impeding the participation of less-resourced nodes, which is undesirable. Subsequently, the score calculation for each node device also needs to incorporate the gap between local devices and global devices D F i to provide equal and comprehensive incentives for maintenance equipment.
DF_i = \left\| f_i(w) - g_t \right\|
where f_i(w) is the model trained on the i-th local device and g_t is the global model; DF_i measures the deviation gap between them.
PODQ: Combining the POW mechanism with training quality and data volume, the score formula is as follows:
S_j = \gamma + \mathrm{MAE}(m_j) + P_j + DF(m_j)
where S_j denotes the score obtained by the j-th node through the reward mechanism, P_j represents the mining computational power provided by node j on the blockchain, and DF(m_j) is the dissimilarity between the local and global models.
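Putting the pieces of the PODQ score together, a node's score could be computed as below. The helper names are illustrative, the deviation term is reduced to a scalar for brevity, and the linear combination follows the score formula as written in the text.

```python
def mae(preds, labels):
    """Mean absolute error of a local model (Equation (2));
    lower MAE means a more accurate model."""
    return sum(abs(y - p) for y, p in zip(labels, preds)) / len(labels)

def podq_score(local_data_size, all_data_sizes, preds, labels,
               mining_power, local_model, global_model):
    """S_j = gamma + MAE(m_j) + P_j + DF(m_j), per Equation (5).
    gamma weights the node's share of the total training data."""
    gamma = local_data_size / sum(all_data_sizes)   # data-volume weight
    df = abs(local_model - global_model)            # local-global deviation (scalar sketch)
    return gamma + mae(preds, labels) + mining_power + df
```

A node holding half the data, with a perfectly accurate model identical to the global one and mining power 0.5, thus scores 0.5 + 0 + 0.5 + 0 = 1.0.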
When the incentive process starts, the committee node with the highest score at that time will be elected as the committee leader through PODQ-based voting. The leader is responsible for driving the consensus process among the participating nodes, and the committee nodes audit the blocks by verifying the model transaction traces in the same way that a verification node verifies the amount of Bitcoin transactions. The leader broadcasts the task of generating a new block to all members of the committee for approval, and if the block containing all transactions is approved by each committee node, the leader sends the block data signed with their signature to all nodes. Then, the records are stored in the blockchain, which are tamper-proof.

4.3. Federated Learning

In the multi-party environment of the industrial Internet, the distribution of data sets is fragmented. Based on the principle that the data do not move but the model moves, so that data are available but not visible [3], the local devices are trained on their own data sets, and the model parameters are uploaded to the central server for aggregation. After aggregation, new model parameters are sent to each device [10]. In this process, the blockchain can remove the trusted central authority. Therefore, it can solve the trust problem of the central server in federated learning and prevent single points of failure [20,21]. Rather than pursuing only a single superior global model, we strike a balance between the local model and the global model, making global and local adjustments to adapt to the local data set.
In traditional federated learning, we need to solve the following problem across N clients: find a global model w that minimizes the loss function f ( w ) for each client.
\min_{w} \left\{ f(w) := \frac{1}{N} \sum_{i=1}^{N} f_i(w) \right\}
In LFL-COBC, we add an l_2 regularization term to the loss function:
f_i(w) = f(\theta_i) + \frac{\lambda}{2} \left\| \theta_i - w \right\|^2
where \theta_i represents the personalized model of client i and \lambda is a hyperparameter controlling the degree of personalization: the larger \lambda is, the lower the degree of personalization.
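The regularized local objective can be illustrated with a small numeric example. This is a sketch under the assumption of a squared l2 proximal penalty; `base_loss` is a placeholder for f(θ_i), and the penalty pulls the personalized model θ_i toward the global model w.

```python
def local_objective(theta, w, base_loss, lam):
    """f_i(w) = f(theta_i) + (lambda / 2) * ||theta_i - w||^2.
    theta and w are plain lists of floats; lam controls how strongly
    the personalized model is pulled toward the global one."""
    prox = 0.5 * lam * sum((t - g) ** 2 for t, g in zip(theta, w))
    return base_loss(theta) + prox
```

With a zero base loss, theta = [1, 2], w = [0, 0], and lam = 2, the penalty alone contributes 0.5 * 2 * (1 + 4) = 5, showing how larger lam makes deviation from the global model more expensive and hence lowers personalization.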
In order to improve the communication efficiency, when the gradient is uploaded to the central server, the method employs the PODQ mechanism to judge the rewards provided by each node. The detailed process of FL is shown below.
Step 1: Global model initialization. All industrial devices are registered on the blockchain to participate in federated learning. When each industrial device completes the local modeling task, the edge cloud aggregates the local models and sends the initialized global model g 0 to the blockchain.
g_0 \rightarrow D_r
where g_0 denotes the initial global model and D_r denotes the r-th industrial equipment.
Step 2: Local model training. Each industrial device trains a local model on its local data starting from the global model g_t and obtains an updated local model L_{k+1}, optimized via the value of the loss function.
Step 3: Aggregate the model. According to the PODQ reward mechanism, the node with the highest reputation is selected to aggregate the blockchain model, and the elected node aggregates the local model through global training to obtain the globally updated model g t + 1 . Then, the server broadcasts the new model to all devices.
g_{t+1} = g_t + \lambda \sum_{i=0}^{m} \left( L_i^{t+1} - g_i^t \right)
where g_{t+1} denotes the global model parameters updated in round t+1, L_i^{t+1} denotes the model of the i-th client after the local update in round t+1, g_i^t denotes the model of the i-th client after the t-th round of global update, and \lambda denotes the reward weight obtained by each industrial equipment through the PODQ mechanism.
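The aggregation rule above can be sketched element-wise. Models are represented as plain lists of floats for illustration, and `lam` stands for the PODQ-derived reward weight; this is a sketch of the update formula, not the authors' implementation.

```python
def aggregate(global_model, local_models, prev_locals, lam):
    """g_{t+1} = g_t + lam * sum_i (L_i^{t+1} - g_i^t):
    the global model moves by the weighted sum of each client's
    change relative to its previous globally synced model."""
    updated = list(global_model)
    for new, old in zip(local_models, prev_locals):
        for k in range(len(updated)):
            updated[k] += lam * (new[k] - old[k])
    return updated
```

For a single-parameter model with two clients moving from 0 to 1 and 2 respectively and lam = 0.5, the global parameter moves by 0.5 * (1 + 2) = 1.5.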

5. Real-Time State Anomaly Detection of Industrial Equipment

The increase in scale and complexity of modern systems makes manual detection infeasible, and in the current industrial Internet environment, intelligent automatic detection of equipment anomalies is an urgent need in order to reduce the manual workload [6]. The log sequence is constructed and the TF-IDF method is used to build feature vectors, after which a CNN model performs device anomaly detection [22] on the extracted features. At the same time, in order to reduce the communication overhead and the resource allocation burden on the blockchain, the model needs to be compressed into a smaller network before being uploaded by the computing device.

5.1. Industrial Equipment Anomaly Detection

With the rapid growth in the scale and complexity of modern systems, traditional manual detection approaches have been unable to fulfill the requirements of the current industrial Internet environment. In this context, intelligent and automatic detection of equipment anomalies has emerged as an urgent necessity, which can effectively alleviate the manual workload and enhance the detection efficiency and accuracy.
To achieve this objective, a diverse range of advanced technologies and approaches can be employed. Firstly, the working log of the computing device offers a detailed account of the events generated by the system. Nevertheless, traditional anomaly detection methods based on keywords or regular expressions do not yield satisfactory performance in sequential pattern-based anomaly detection. As modern systems expand in size and increase in complexity, manual detection becomes unfeasible. By constructing log sequences and utilizing the TF-IDF method to construct feature vectors, crucial information can be effectively extracted, laying a foundation for subsequent anomaly detection. The TF-IDF method can measure the relative significance of terms in a document and reduce the influence of common yet unimportant terms through an inverse document frequency, thereby enabling a more accurate identification of key terms in a document. This approach has many applications in document classification, keyword extraction, and recommender systems.
Next, a convolutional neural network (CNN) model is utilized to detect device anomalies based on the extracted features, further enhancing the detection accuracy. The CNN model has distinct advantages in extracting abnormal waveform features, can handle nonlinear relationships, and has a flexible network structure, making it suitable for various business scenarios and improving the universality and accuracy of the model.
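To make the detection pipeline concrete, the following NumPy sketch shows a single 1-D convolution layer with ReLU, global max pooling, and a sigmoid classifier head applied to a feature vector. This is an illustrative forward pass under assumed weights, not the paper's actual CNN architecture:

```python
import numpy as np

def conv1d_forward(x, kernels, bias):
    """Valid 1-D convolution of a feature vector with a bank of kernels."""
    k = kernels.shape[1]
    out_len = len(x) - k + 1
    out = np.empty((kernels.shape[0], out_len))
    for i in range(out_len):
        # Each output position is a dot product of the window with every kernel
        out[:, i] = kernels @ x[i:i + k] + bias
    return np.maximum(out, 0.0)  # ReLU activation

def predict_anomaly(features, kernels, bias, w_out, b_out):
    """Score a feature vector: conv -> global max pool -> linear -> sigmoid."""
    fmap = conv1d_forward(features, kernels, bias)
    pooled = fmap.max(axis=1)              # global max pooling per channel
    logit = float(w_out @ pooled + b_out)  # linear classifier head
    return 1.0 / (1.0 + np.exp(-logit))    # probability of "anomalous"
```

In practice the kernels and classifier weights are learned during federated training; here they are placeholders to show the shape of the computation.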

5.2. Lightweight Model Pruning

Over-parameterization primarily means that, during the training phase, the network needs a large number of differentiable parameters [23] to capture minute variations in the data. Once iterative training is complete, the model no longer requires this many parameters for inference. The pruning algorithm is founded on this over-parameterization property [24]. Its core idea is to reduce the number of parameters and computations in the network model while keeping the model's performance as unaffected as possible. Within an AI framework, the main effect of pruning is to shrink the end-side model so that small IoT devices such as tablets, phones, watches, and headphones can easily run the AI model [2,25]. From a macro perspective, pruning techniques fall into two classic categories, namely drop-out and drop-connect. In this paper, the drop-connect approach is employed for pruning, as shown in Figure 5.
  • Drop-Out: On the left of Figure 5, the output of a neuron is randomly set to zero.
  • Drop-Connect: On the right of Figure 5, the solid lines are the retained input connections and the dotted lines marked with a red x are the redundant ones. During training of the neural network model, instead of randomly setting the output of a hidden-layer node to 0, drop-connect sets the weight of each redundant input connection to the node to 0, while the retained input connections keep a mask value of 1 (i.e., remain unchanged).
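The drop-connect operation above can be sketched as a random binary mask applied to the weight matrix; this is a minimal illustration of the masking idea, not the paper's training loop:

```python
import numpy as np

def drop_connect(weights, drop_prob, rng):
    """Drop-connect: zero each weight independently with probability drop_prob.

    Retained weights keep a mask value of 1 (i.e., stay unchanged), so the
    weight matrix itself becomes sparse rather than the layer's output.
    """
    mask = rng.random(weights.shape) >= drop_prob
    return weights * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
w_sparse = drop_connect(w, drop_prob=0.5, rng=rng)
```

Contrast with drop-out, which would instead zero entire output activations of the layer; masking individual connections yields the sparse weight connection matrix exploited later for model compression.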
Due to the blockchain's limited resource-carrying capacity, it is necessary to exploit the redundancy in neural network parameters and network structure to streamline the model. This enables us to obtain a model with fewer parameters and a simpler structure without affecting task completion. A deep learning network model, from the convolutional layers through the fully connected layers, contains a large number of redundant parameters, with the activation values of many neurons tending to 0. Removing these neurons leaves the model's expressive ability unchanged; pruning algorithms rest on this over-parameterization property. In this paper, the connections between some neurons are randomly set to zero, making the weight connection matrix sparse [11]. The procedure has three main parts: training, pruning, and fine-tuning. Figure 6 and Algorithm 2 illustrate model pruning, with the following steps:
  • Pre-training. The network model is first trained to construct a baseline model for the pruning algorithm to obtain the original model trained on the specific underlying task.
  • Pruning. The magnitude of the weight values is ranked, and connections below a preset pruning threshold or ratio are removed to obtain the pruned network.
  • Fine-tuning. Fine-tune the pruned network to recover lost performance, and then continue with the second step, alternating in this order until the termination condition is satisfied, e.g., the accuracy drops within a certain range.
Algorithm 2 Model pruning algorithm
Input: Trained model model, pruning threshold threshold
Output: Pruned model pruned_model
  • for each layer layer in model.layers do
  •     if layer is of type Conv2D then
  •         weights ← layer.get_weights()[0]
  •         weight_abs ← abs(weights)
  •         threshold_weights ← weight_abs > threshold
  •         for i ← 0 to weights.size − 1 do
  •             if threshold_weights[i] == False then
  •                 weights[i] ← 0
  •             end if
  •         end for
  •         layer.set_weights([weights] + layer.get_weights()[1:])
  •     end if
  • end for
Return: model
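The magnitude-pruning step of Algorithm 2 can be sketched in NumPy as follows. The dict-of-arrays layer representation is an assumption for illustration, standing in for a framework object with `get_weights`/`set_weights`:

```python
import numpy as np

def prune_layer(weights, threshold):
    """Zero out every weight whose magnitude is at or below the threshold."""
    keep = np.abs(weights) > threshold   # boolean mask of surviving connections
    return weights * keep

def prune_model(layers, threshold):
    """Apply magnitude pruning to the kernel of every Conv2D-style layer.

    Each layer is a dict {"type": ..., "kernel": ndarray, "bias": ndarray};
    biases are left untouched, mirroring get_weights()[1:] in Algorithm 2.
    """
    for layer in layers:
        if layer["type"] == "Conv2D":
            layer["kernel"] = prune_layer(layer["kernel"], threshold)
    return layers

layers = [{"type": "Conv2D",
           "kernel": np.array([[0.05, -0.8], [0.3, -0.02]]),
           "bias": np.zeros(2)}]
pruned = prune_model(layers, threshold=0.1)
```

After this step, the weights 0.05 and −0.02 fall below the threshold and are zeroed, while −0.8 and 0.3 survive; fine-tuning would then recover any accuracy lost to the removed connections.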

6. Experiment and Evaluation

In this section, we investigate model aggregation for device clustering and the impact of lightweighting large models on device log anomaly detection. We then compare our scheme with other schemes using self-built federated learning models and real data sets.

6.1. Data Set

  • HDFS [26]: HDFS (Hadoop Distributed File System) is a distributed file system that runs on general-purpose hardware to support Hadoop, a big data processing framework. There are three groups of HDFS logs: HDFS v1, HDFS v2, and HDFS v3. HDFS v1 was generated using a benchmark workload on a 203-node HDFS cluster and labeled via hand-crafted rules to identify anomalies. In addition, HDFS v1 provides specific anomaly-type information and allows the study of duplicate problem identification. HDFS v2 was collected by aggregating logs from an HDFS cluster in our laboratory environment, which includes one name node and 32 data nodes. HDFS v3 is an open data set from trace-oriented monitoring, collected by instrumenting HDFS systems with MTracer 2.1 in a real IaaS environment.
  • BGL [26]: BGL is an open log data set gathered from the BlueGene/L supercomputer system, which has 131,072 processors and 32,768 GB of memory and is located at Lawrence Livermore National Laboratory (LLNL), Livermore, California [27]. The log contains both alert and non-alert messages, identified through alert category labels: a '-' in the first column of the log indicates a non-alert message, whereas any other value signifies an alert message. The label information supports studies on alert detection and prediction.

6.2. Security Analysis

In this experiment, we conduct security analysis from three aspects: data privacy, network attacks, and malicious participants.
  • Data privacy: Malicious participants may attempt to obtain sensitive data of other participants and violate users' privacy rights. In this paper, we adopt federated learning techniques, which perform calculations without exposing raw data, protecting data privacy while ensuring the accuracy of the experimental results.
  • Data tampering: Attackers may attack communication channels to steal data or inject fake data into transmissions. In this experiment, we use blockchain technology; moreover, data uploaded to the blockchain can only be stored once more than 2/3 of the clients pass the verification test, which ensures the integrity and security of the data during transmission.
  • Malicious participants: Malicious participants in blockchain-federated learning experiments may try to corrupt the results or obtain sensitive information of other participants, for example through collusion. However, we use a PODQ-based incentive mechanism for committee election, ensuring that only legitimate participants with the highest rewards can take part in the experiment, which guarantees its reliability.
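The 2/3 verification rule above can be sketched as a simple quorum check; this is an illustrative vote count, not the paper's on-chain verification logic, which is not specified here:

```python
def accept_update(passing_votes, total_clients):
    """A model update is stored on-chain only if strictly more than 2/3 of
    the clients verify it successfully."""
    # Integer arithmetic avoids floating-point edge cases at the boundary
    return passing_votes * 3 > total_clients * 2
```

With 9 clients, 7 passing verifications accept the update, while exactly 6 (which is 2/3, not more) reject it.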

6.3. The Performance of the Proposed Scheme

In this subsection, we investigate the impact of the proposed scheme on the model under different model lightweighting and aggregation operations.

6.3.1. Comparison and Evaluation of Incentive Mechanisms

We use three non-interfering virtual machines to conduct experiments and explore the impact of the three incentive mechanisms, PODQ, POW, and POD, on the training of the model at each node, as shown in Figure 7 and Figure 8. As the number of training batches increases, under our proposed incentive mechanism (PODQ) the training accuracy of the three devices is the best or close to the best among the incentives, and the loss rate is not only the lowest but also continues to decrease. Overall, the multi-model training results of PODQ under the system architecture are superior to those of POW and POD.

6.3.2. Model Pruning Evaluation

In this subsection, we compare the model device working time and efficiency before and after model pruning.
  • Comparison of aggregation times before and after model pruning. We compared the changes in model aggregation time before and after pruning. As shown in Figure 9, the aggregation time of the models after pruning is much shorter. At the same time, the model size after pruning was reduced by 87%, which improves the efficiency and resource utilization of the overall system architecture to a certain extent.
  • Comparison of node device work efficiency before and after model pruning. To investigate the effect of model pruning on the training results of node devices, we compared the training results of the virtual machines before and after pruning, as shown in Figure 10. Model lightweighting had almost no effect on model accuracy; moreover, the loss rate of the node devices was lower after pruning. These results suggest that the original model is over-parameterized and that lightweight model operations are essential.

7. Conclusions and Discussion

In this paper, we discuss the emerging issues of LFL-COBC on incentives and resource maximization. Furthermore, we investigate the impact of updating the incentives of multidimensional clients and model lightweighting on the overall system architecture. We also derive the collaborative interactions among individual nodes by deducing their equilibrium strategies based on the rewards received by node devices. We further propose a near-optimal pruning algorithm that minimizes the loss of model accuracy while minimizing model size, thus further improving the efficiency of individual nodes and resource utilization. In future work, we will explore how to perform model aggregation on the blockchain in a decentralized manner, to avoid the privacy and security issues that arise when aggregation is performed on a designated client. Additionally, we will design a practical blockchain-federated learning system based on our analysis results.

Author Contributions

Conceptualization, Q.L. and K.G.; methodology, M.W. and K.F.; software, Y.S. and X.Z.; formal analysis, Q.L.; writing—original draft preparation, Y.S.; writing—review and editing, Y.S. and N.X.; supervision, N.X.; project administration, N.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Major Research Plan of the National Natural Science Foundation of China (Grant Nos. 92167203, 92267204, and 62232013).

Data Availability Statement

The Hadoop Distributed File System (HDFS) dataset used in this study is publicly available at https://gitcode.com/gh_mirrors/lo/loghub/tree/master/HDFS, accessed on 24 May 2024. The BlueGene/L (BGL) log dataset used for the analysis is accessible at https://gitcode.com/gh_mirrors/lo/loghub/tree/master/BGL, accessed on 24 May 2024.

Conflicts of Interest

Authors Qiaoyang Li and Ke Gao were employed by the company Xi’an Aeronautics Computing Technique Research Institute. These two authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Author Kefeng Fan was employed by the company China Electronics Standardization Institute. The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Chen, Y.; Poskitt, C.M.; Sun, J. Learning from Mutants: Using Code Mutation to Learn and Monitor Invariants of a Cyber-Physical System. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–23 May 2018; pp. 648–660. [Google Scholar]
  2. Yeom, S.; Giacomelli, I.; Fredrikson, M.; Jha, S. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting. In Proceedings of the 2018 IEEE 31st Computer Security Foundations Symposium (CSF), Oxford, UK, 9–12 July 2018; pp. 268–282. [Google Scholar] [CrossRef]
  3. Li, Q.; Wen, Z.; Wu, Z.; Hu, S.; Wang, N.; Li, Y.; Liu, X.; He, B. A survey on federated learning systems: Vision, hype and reality for data privacy and protection. IEEE Trans. Knowl. Data Eng. 2021, 35, 3347–3366. [Google Scholar] [CrossRef]
  4. Qu, X.; Wang, S.; Hu, Q.; Cheng, X. Proof of federated learning: A novel energy-recycling consensus algorithm. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2074–2085. [Google Scholar] [CrossRef]
  5. Cheng, Y.; Wang, D.; Zhou, P.; Zhang, T. A Survey of Model Compression and Acceleration for Deep Neural Networks. arXiv 2017, arXiv:1710.09282. [Google Scholar] [CrossRef]
  6. Guo, Y.; Wu, Y.; Zhu, Y.; Yang, B.; Han, C. Anomaly Detection using Distributed Log Data: A Lightweight Federated Learning Approach. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Virtual, 18–22 July 2021; pp. 1–8. [Google Scholar] [CrossRef]
  7. Zhang, C.; Xie, Y.; Bai, H.; Yu, B.; Li, W.; Gao, Y. A survey on federated learning. Knowl. Based Syst. 2021, 216, 106775. [Google Scholar] [CrossRef]
  8. Luo, J.; Wu, J. An Entropy-based Pruning Method for CNN Compression. arXiv 2017, arXiv:1706.05791. [Google Scholar] [CrossRef]
  9. Han, S.; Mao, H.; Dally, W.J. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding. arXiv 2015, arXiv:1510.00149. [Google Scholar]
  10. Lu, Y.; Huang, X.; Dai, Y.; Maharjan, S.; Zhang, Y. Blockchain and federated learning for privacy-preserved data sharing in industrial IoT. IEEE Trans. Ind. Inform. 2019, 16, 4177–4186. [Google Scholar] [CrossRef]
  11. Singh, S.K.; Yang, L.T.; Park, J.H. FusionFedBlock: Fusion of blockchain and federated learning to preserve privacy in industry 5.0. Inf. Fusion 2023, 90, 233–240. [Google Scholar] [CrossRef]
  12. Li, Y.; Chen, C.; Liu, N.; Huang, H.; Zheng, Z.; Yan, Q. A Blockchain-Based Decentralized Federated Learning Framework with Committee Consensus. IEEE Netw. 2021, 35, 234–241. [Google Scholar] [CrossRef]
  13. Ali, S.; Li, Q.; Yousafzai, A. Blockchain and federated learning-based intrusion detection approaches for edge-enabled industrial IoT networks: A survey. Ad Hoc Netw. 2024, 152, 103320. [Google Scholar] [CrossRef]
  14. Kim, H.; Park, J.; Bennis, M.; Kim, S.L. Blockchained On-Device Federated Learning. IEEE Commun. Lett. 2020, 24, 1279–1283. [Google Scholar] [CrossRef]
  15. He, Y.; Li, H.; Cheng, X.; Liu, Y.; Yang, C.; Sun, L. A blockchain based truthful incentive mechanism for distributed P2P applications. IEEE Access 2018, 6, 27324–27335. [Google Scholar] [CrossRef]
  16. Han, R.; Yan, Z.; Liang, X.; Yang, L.T. How Can Incentive Mechanisms and Blockchain Benefit with Each Other? A Survey. ACM Comput. Surv. 2022, 55, 1–38. [Google Scholar] [CrossRef]
  17. Li, Q.; Sun, Y.; Xi, N. LFL-COBC:Lightweight Federated Learning On Blockchain-based Device Contribution Allocation. In Proceedings of the 2024 International Conference on Networking and Network Applications (NaNA), Yinchuan, China, 9–12 August 2024; pp. 1–7. [Google Scholar] [CrossRef]
  18. Hewa, T.M.; Hu, Y.; Liyanage, M.; Kanhare, S.S.; Ylianttila, M. Survey on blockchain-based smart contracts: Technical aspects and future research. IEEE Access 2021, 9, 87643–87662. [Google Scholar] [CrossRef]
  19. Xu, S.; Liu, S.; He, G. A Method of Federated Learning Based on Blockchain. In Proceedings of the 5th International Conference on Computer Science and Application Engineering, Sanya, China, 19–21 October 2021. [Google Scholar] [CrossRef]
  20. Mohammed, M.A.; Lakhan, A.; Abdulkareem, K.H.; Khanapi Abd Ghani, M.; Abdulameer Marhoon, H.; Nedoma, J.; Martinek, R. Multi-objectives reinforcement federated learning blockchain enabled Internet of things and Fog-Cloud infrastructure for transport data. Heliyon 2023, 9, e21639. [Google Scholar] [CrossRef] [PubMed]
  21. Nguyen, D.C.; Ding, M.; Pham, Q.V.; Pathirana, P.N.; Le, L.B.; Seneviratne, A.; Li, J.; Niyato, D.; Poor, H.V. Federated learning meets blockchain in edge computing: Opportunities and challenges. IEEE Internet Things J. 2021, 8, 12806–12825. [Google Scholar] [CrossRef]
  22. Xu, W.; Huang, L.; Fox, A.; Patterson, D.; Jordan, M.I. Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, Big Sky, MT, USA, 11–14 October 2009; pp. 117–132. [Google Scholar] [CrossRef]
  23. Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 308–318. [Google Scholar]
  24. Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends® Mach. Learn. 2021, 14, 1–210. [Google Scholar] [CrossRef]
  25. Bagdasaryan, E.; Veit, A.; Hua, Y.; Estrin, D.; Shmatikov, V. How to backdoor federated learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Online, 26–28 August 2020; pp. 2938–2948. [Google Scholar]
  26. He, S.; Zhu, J.; He, P.; Lyu, M.R. Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics. arXiv 2020, arXiv:2008.06448. [Google Scholar] [CrossRef]
  27. Oliner, A.; Stearley, J. What Supercomputers Say: A Study of Five System Logs. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Edinburgh, UK, 25–28 June 2007. [Google Scholar] [CrossRef]
Figure 1. The system model of LFL-COBC.
Figure 2. The framework of LFL-COBC.
Figure 3. Flow of LFL-COBC system operation.
Figure 4. Structure design of the blockchain.
Figure 5. Comparison of drop-out and drop-connect approaches.
Figure 6. Model pruning flow chart.
Figure 7. Training accuracy based on different incentives under multiple node devices.
Figure 8. Training loss rate based on different incentives under multiple node devices.
Figure 9. Comparison plot of model aggregation time before and after lightweight operation.
Figure 10. Training results before and after model pruning.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
