1. Introduction
In the context of new power-system construction strategies [1], new business systems such as virtual power plants (VPPs) and load integrators, which involve multiple interactions and cross-network operations, have emerged in large numbers. As the volume of business continues to grow, the interaction between entities on the public and private sides is becoming increasingly extensive and complex. VPPs place higher requirements on the flexible scheduling of communication networks, differentiated service guarantees and network security [2,3]. As a regional multi-energy aggregation model that enables the large-scale connection of renewable energy generation to the grid, a virtual power plant aggregates distributed power sources [4], controllable loads and energy storage devices in the grid into a virtual controllable aggregate through a distributed power management system. However, the virtual-power-plant business system contains a series of new energy sites and low-voltage distributed power generation systems, which may involve multiple stakeholders; as a result, many problems may arise during power-system operation, such as random fluctuations in the output of new energy, multi-level and multi-time-scale stability, and the balance of the power system. Therefore, based on existing information-physical system theory and analysis, the dynamic interaction of information and energy flows on both sides of the virtual power plant is modelled to realize state sensing, fusion analysis and coordinated control of the virtual-power-plant system [5]. On the other hand, to regulate the source, network, load and storage in the virtual-power-plant system flexibly and efficiently, the information nodes and power nodes must execute regulation and control instructions and upload physical-equipment status information through frequent data interaction. These frequently exchanged data contain a large amount of multi-source heterogeneous data, e.g., structured data in the form of data tables and unstructured data such as text and images. If these heterogeneous data cannot be digitally transformed in real time, the accuracy and stability of the model in the information-physical system of the virtual power plant will be seriously affected. However, cross-type data analysis techniques for new power systems such as VPPs are currently very weak [6]; the capability for correlation analysis and mining between state quantities is insufficient, which makes front-end and back-end data fusion difficult to operate effectively [7]. Therefore, an integrated and unified model for multi-source heterogeneous interactive data in a public–private interactive power-grid environment can help to fuse such data efficiently and to improve the performance of virtual-power-plant information-physical system models.
In recent years, given the increasing presence of big data in smart-grid construction, some scholars have started to investigate the data generated by the interaction of public and private information in virtual power plants [8]. Currently, work on the integration of multi-source heterogeneous data in virtual power plants is mainly focused on sensor-node data fusion. Current power-data-fusion techniques mainly target the fusion of data from multiple power-equipment sources, and most of them address fault location in power equipment and the identification and calibration of multi-source data in the distribution network. Jiao et al. [9] used the distance measurement results of distance relays in the substations at both ends of a transmission line to propose a new method, based on multi-source data fusion, for improving transmission-line fault-location accuracy. Based on an analysis of existing fault-location algorithms for series-compensated transmission lines, some scholars introduced artificial-intelligence techniques into the differential-equation mathematical model and fused the model with neural networks to design a new power-fault location method [10,11]. For the multiple data sources in distribution-network data collection, a practical multi-source data pre-processing technique can repair some of the bad data and improve the quality of the state-estimation input data, so that the advantages of redundant data can be fully utilized and misjudgements and omissions in data pre-processing can be avoided. In response to the complex rectification, difficult coordination and poor adaptability of traditional protection methods for smart distribution networks, Lin et al. [12] developed a condition-monitoring and fault-handling method based on big-data analysis of smart distribution networks. According to the network correlation matrix and regional difference rules, the current and power data collected by the integrated measurement and control terminals at each node are pre-processed and the results are fused in time and space to generate a high-dimensional spatial-temporal condition-monitoring matrix. However, the above methods consider only the fusion error as an indicator when fusing multi-source data. As multi-source data contain a large amount of redundancy, transmitting the redundant data wastes bandwidth and, in severe cases, even causes network congestion, reducing the fusion efficiency [13]. Therefore, how to screen feature attributes efficiently under a guaranteed fusion error while reducing the transmission of redundant data is a focus of multi-source data-fusion processing. Recently, deep-learning-based methods have been applied to the detection of false-data-injection attacks (FDIAs), e.g., the CNN-LSTM scheme [14] and a multi-head-attention-like scheme [15]. They can achieve better results than traditional machine-learning methods, with advantages in classification accuracy and training speed. Nevertheless, because virtual power plants involve multiple business entities, the data obtained from different entities come from multiple sensing sources and are clearly heterogeneous, and they are fragmented across different systems. These deep-learning-based methods can effectively process only data from a single sensor source; for heterogeneous data from different sources, their fusion efficiency is greatly reduced, which lowers FDIA-detection accuracy.
Overall, a unified data-fusion model can effectively integrate multi-source sensor data by eliminating data redundancy. The fused data can improve FDIA-detection performance, providing a reliable guarantee for the stable operation of the power system. However, existing deep-learning-based FDIA-detection schemes usually focus only on extracting features that separate false data from normal data, ignoring the feature correlation that easily produces diverse data redundancies, which makes false-data-injection attacks considerably harder to detect. To address this problem, we propose a multi-source self-attention data-fusion model for FDIA detection. The proposed model first employs a temporal-alignment technique to bring the collected multi-source sensing data onto the same time dimension. Subsequently, a hybrid deep-learning network is built by combining long short-term memory (LSTM) and a convolutional neural network (CNN), which can effectively extract hybrid features from different multi-source sensing data. Furthermore, we design a self-attention module to further eliminate hybrid-feature redundancy and aggregate the differences between attack-data features and normal-data features. Finally, the extracted features and their weights are integrated by a single convolution operation to implement false-data-injection-attack detection. Extensive simulations were performed on the IEEE 14-node test system, and the experimental results demonstrate that our model obtains better data-fusion effects and superior detection performance compared with the state-of-the-art.
In general, compared to existing works, we make the following novel contributions:
- We build a hybrid deep-learning network by combining long short-term memory and a convolutional neural network, which can effectively extract hybrid features from different multi-source sensing data. The proposed network model is an attempt to balance efficient feature construction against high-accuracy attack detection.
- We design a self-attention module that further eliminates hybrid-feature redundancy by aggregating the differences between attack-data features and normal-data features. Since the proposed self-attention module works in a plug-and-play mode and further optimizes the model scale, it does not add to the overall network training time.
- Comprehensive experiments performed on the IEEE14 and IEEE118 node test systems demonstrate that our model can outperform existing methods in terms of feature effectiveness, FDIA-detection accuracy and network training complexity.
The rest of this paper is organized as follows. Section 2 presents the related work on data-fusion schemes and self-attention mechanisms. In Section 3, we describe the details of the multi-source self-attention data-fusion model for FDIA detection. Comprehensive experiments were performed to evaluate the performance of the proposed scheme; the experimental results and corresponding discussions are presented in Section 4. Finally, Section 5 concludes the paper.
4. Experimental Results and Discussions
4.1. Experimental Setup
In our experiments, data for the IEEE 14-node test system were selected as the experimental data. The real load data were collected from the New York Independent System Operator from 1 January 2020 to 1 May 2022. Eleven regions were selected to represent the 11 load buses of the IEEE 14-node test system. The individual bus-state variables were obtained by performing a power-flow calculation on the system. Moreover, the multi-source sensor data for the experiments in this paper were formed by simulating a split of the IEEE 14-node data, as shown in Figure 5. Since each node in the IEEE 14-node test system independently monitors the load information of the test system at that node, dividing the nodes of the system can be regarded as dividing the sensor data sources that monitor the same system.
In our experiments, the data for the training and validation sets are divided from the normal data of the system. For the training set, the attack data are generated using a false-data-injection method, and the attack strength c and the measurement noise e are combined to evaluate the detection performance of the model. The attack strength c is a parameter used to measure the degree of impact of attackers on the system. It is typically used to measure the percentage of attackers who can successfully inject false data and influence system behavior. The measurement noise e, also known as noise variance, is commonly used to describe the degree to which the distribution of a random variable deviates from its mean; Gaussian noise is often added to the input data during training to improve the robustness and generalization ability of the model. In addition, we designed a random FDIA with an attack duration of 1–5 times and a total of 5000 attacks. The attack strength and the measurement noise follow a Gaussian distribution, where the measurement noise e = 0.25, 0.3, 0.35, 0.4, 0.45, 0.5. The training and validation sets were divided from 30,000 normal data points with a validation-set ratio of 0.3. The testing set consists of 13,298 normal data points and 270 false-attack data points, for a total of 13,568 data points.
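The data preparation described above can be illustrated with a minimal sketch. It is not the authors' generator: the function names, the synthetic sine-wave load, the interpretation of the attack duration as 1–5 consecutive samples, and the choice of a single Gaussian offset per attack (scaled by a hypothetical `strength` value) are all assumptions made for illustration. Only the 30,000-point split with a validation ratio of 0.3, the 5000 attacks and the noise level e = 0.25 come from the text.

```python
import math
import random


def split_train_val(data, val_ratio):
    """Split a sequence into (train, validation), holding out the last val_ratio share."""
    n_val = int(round(len(data) * val_ratio))
    split = len(data) - n_val
    return data[:split], data[split:]


def inject_fdia(series, n_attacks, strength, noise_var, rng):
    """Add Gaussian measurement noise, then inject random false-data bursts.

    Returns (measurements, labels); labels[i] == 1 marks an attacked sample.
    Assumption: each attack shifts 1-5 consecutive samples by one Gaussian offset.
    """
    noise_std = math.sqrt(noise_var)  # e is treated as a variance, as in the text
    measurements = [x + rng.gauss(0.0, noise_std) for x in series]
    labels = [0] * len(series)
    for _ in range(n_attacks):
        duration = rng.randint(1, 5)  # inclusive on both ends
        start = rng.randrange(len(series) - duration + 1)
        offset = rng.gauss(0.0, strength)  # hypothetical attack magnitude
        for i in range(start, start + duration):
            measurements[i] += offset
            labels[i] = 1
    return measurements, labels


rng = random.Random(0)
# Synthetic stand-in for 30,000 normal load samples (not the NYISO data).
normal = [100.0 + 10.0 * math.sin(i / 50.0) for i in range(30000)]
train, val = split_train_val(normal, val_ratio=0.3)
attacked, labels = inject_fdia(train, n_attacks=5000, strength=5.0, noise_var=0.25, rng=rng)
print(len(train), len(val), sum(labels))
```

With a validation ratio of 0.3, the 30,000 normal points split into 21,000 training and 9,000 validation points. Attacks may overlap in this sketch, so the number of labelled samples can be smaller than the sum of the attack durations.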
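To fully characterize the performance of our model, we use four metrics: accuracy, recall, precision and F1 score. True Positive (TP) denotes the amount of false data correctly detected as false, True Negative (TN) the amount of normal data detected as normal, False Positive (FP) the amount of normal data incorrectly detected as false, and False Negative (FN) the amount of false data incorrectly detected as normal. The four metrics can be correspondingly expressed as:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN},$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$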
In the above metrics, accuracy reflects the proportion of correct classifications, recall indicates the percentage of false data that we successfully detected out of all false data, precision means the percentage of correctly detected false data out of all data predicted as false, and the F1 score is the harmonic mean of recall and precision [35]. Overall, a larger F1 score implies a better overall performance of the model, while a larger area under the ROC curve [36] indicates better performance.
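As a small worked example of these four metrics, the sketch below computes them from confusion-matrix counts. The function name and the counts are hypothetical; the counts were chosen only so that they add up to the test-set sizes reported above (270 attack points and 13,298 normal points) and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts: 250 of 270 attacks caught, 50 of 13,298 normal points flagged.
metrics = detection_metrics(tp=250, tn=13248, fp=50, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts the accuracy is about 0.995 while the F1 score is only about 0.877, which shows why the F1 score is the more informative metric when attack samples are rare.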
4.2. Effectiveness Verification for Proposed Model
In this experiment, we first carried out a series of tests to verify the effectiveness of our proposed network model. All simulation experiments were implemented on a platform with an Intel(R) Core(TM) i5-8250U CPU @ 1.60 GHz and 8 GB of memory. The running time was 1064 s. In general, our network model consists of four stages during normal operation: data pre-processing, model training, threshold selection and model testing. In the data pre-processing phase, a temporal-alignment operation was performed on the data using Pearson coefficients. The model-training phase used the proposed CNN-LSTM network structure combined with self-attention to train the classification model, where the number of training rounds was 15, the batch size was 128, the loss function was the MSE loss, and the optimizer was Adam with default parameters (0.9, 0.99). The initial learning rate was and the weights were weakened in each round. During the training of the model, the optimal model parameters were fixed using threshold selection and were used for FDIA detection. We tested the change in loss during model training when data from two sensors and from three sensors were fused, respectively. Figure 6 shows the test results. It can be seen from this figure that our data-fusion network model achieves a rapid reduction in training loss and eventually becomes stable, whether two or three sensors are used. This indicates that our network model is effective in achieving fast data fusion. This phenomenon can be explained as follows. Unlike multi-head self-attention, which uses complex batch matrix multiplication, our model uses simple numerical operations (e.g., multiplication and summation), which allows the training model to converge rapidly and, ultimately, reduces time consumption.
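The pre-processing step mentions temporal alignment with Pearson coefficients, but this section does not spell out the procedure. One plausible reading, sketched below purely as an assumption, is to search for the sample lag between two sensor series that maximizes their Pearson correlation and then trim both series to the overlapping part. The helper names (`pearson`, `shifted_pair`, `best_lag`) and the synthetic series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def shifted_pair(reference, other, lag):
    """Pair reference[j] with other[j + lag], trimmed to the overlapping part."""
    if lag >= 0:
        ref, oth = reference, other[lag:]
    else:
        ref, oth = reference[-lag:], other
    n = min(len(ref), len(oth))
    return ref[:n], oth[:n]


def best_lag(reference, other, max_lag):
    """Return (lag, r) for the lag in [-max_lag, max_lag] with the highest Pearson r."""
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        ref, oth = shifted_pair(reference, other, lag)
        if len(ref) < 2:
            continue  # not enough overlap to correlate
        r = pearson(ref, oth)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic example: the second sensor reports the same signal three samples late.
sensor_a = [math.sin(i / 7.0) for i in range(200)]
sensor_b = [0.0] * 3 + sensor_a[:-3]
lag, r = best_lag(sensor_a, sensor_b, max_lag=10)
aligned_a, aligned_b = shifted_pair(sensor_a, sensor_b, lag)
print(lag, round(r, 4), len(aligned_a))
```

An exhaustive lag search is quadratic in the worst case, which is acceptable for short windows; for long series a cross-correlation computed by FFT would be the usual alternative.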
Furthermore, a series of experiments were conducted to validate the effect of the self-attention module on the IEEE14 and IEEE118 system data sets. In this test, four basic deep-learning network models (the CNN model, LSTM model, MLP model and CNN-LSTM model) were used to test the effect of self-attention. We tested the fusion of data from two sensors and constructed the FDIA described in Section 4.1, using the F1 score as the evaluation metric. The corresponding results are shown in Figure 7. We can observe from this figure that the attack-detection performance of the four basic models in a multi-source data environment is significantly improved by adding the self-attention module. To be specific, on IEEE14, the F1 score of the CNN model improves by about 26%; LSTM and MLP gain about 2% and 3% in F1 score, respectively; and the CNN-LSTM hybrid model achieves a 3% improvement in F1 score. Similarly, on the IEEE118 system data set, our fusion mechanism achieves an improvement in F1 score of about 21% for the CNN model, 3% for LSTM, and 1% for the CNN-LSTM hybrid model. This shows that our self-attention module effectively improves the usability of the data after fusion. As the spatial and temporal features can be extracted separately using our designed CNN-LSTM hybrid network, the redundancy in the original data can be effectively eliminated while the two kinds of features do not affect each other. Moreover, the self-attention module can further learn deep internal relationships between features and assign weights to them, so that the network model pays more attention to the differences between attack features and normal features. Accordingly, it greatly enhances the usability of the fused data and, ultimately, improves the accuracy of FDIA detection.
4.3. Performance Comparison with the State-of-the-Art
To gain more insight, we tested the overall performance of different data-fusion methods before and after the addition of the self-attention module. Similar to the previous experiments, we performed a further comparison among three deep-detection methods (CNN, LSTM and the CNN-LSTM hybrid model) to show the advantages of the proposed self-attention module. In this experiment, we tested the fusion effectiveness using two sensors and three sensors, respectively, and then compared the detection performance based on accuracy and F1 scores on IEEE14 and IEEE118.
The corresponding test results are shown in Table 2 and Table 3. In these tables, the detection accuracy of the four basic models was above 90% under the thresholds set by the F1 score and by the ROC curve. With the addition of the self-attention module, these basic models showed different degrees of improvement. Specifically, when our self-attention module was added to each basic detection model, the accuracy gains of CNN, LSTM, and CNN-LSTM were 0.5%, 0.08%, and 0.13%, respectively, using the maximum F1 score as the threshold criterion. As the ROC curves focus on the detection of abnormal and normal data, the corresponding gains were 3.8%, 0.35%, and 0.3%, respectively, when the optimal ROC threshold is used. Overall, the data-fusion method proposed in this paper achieves the highest values of 99.46% and 99.08% based on the F1 score and the ROC curve [35,36], respectively; on the IEEE118 dataset, improvements can also be seen for the different models. This demonstrates that our model can achieve the best detection performance.
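The maximum-F1 threshold criterion mentioned above can be sketched as a simple search over candidate thresholds. This is an illustrative assumption about how such a threshold could be chosen, not the authors' code: the function names and the toy scores are hypothetical, a sample is flagged as an attack when its score is at or above the threshold, and the ROC-based criterion is not shown.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 score when samples with score >= threshold are flagged as attacks."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_threshold(scores, labels):
    """Return (threshold, f1) maximizing F1 over the observed score values."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        f1 = f1_at_threshold(scores, labels, t)
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy validation scores (hypothetical): label 1 marks an attack sample.
val_scores = [0.1, 0.2, 0.3, 0.8, 0.9, 0.4]
val_labels = [0, 0, 0, 1, 1, 0]
threshold, f1 = select_threshold(val_scores, val_labels)
print(threshold, f1)
```

The threshold would be chosen on validation data and then kept fixed for the test set, which matches the stage order (training, threshold selection, testing) described in Section 4.2.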
We can explain this phenomenon as follows. The CNN model does not capture the temporal features of the multi-source data. The LSTM model considers the temporal features, but it completely ignores the spatial correlation between the multi-source data. Although the CNN-LSTM hybrid model involves both spatial and temporal features, the deep temporal information of the fused spatial-temporal features is still discarded during the feature-fusion process. In our CNN-LSTM hybrid model with the self-attention mechanism, the self-attention module focuses on the internal structure of the features, looks for intrinsic dependencies between different features, and provides a specific weight for each feature. Since different features usually have different levels of importance to the classification, these differences can be described by the weights. Overall, our model not only considers spatial and temporal features, but also uses the self-attention mechanism to further mine the internal connections between features and assign different weights to them according to their importance; finally, it obtains the deep temporal information of the convolved features using one-dimensional convolution operations. Our model greatly enhances the usability of the data and ultimately results in a significant improvement in detection performance. In addition, similar results can also be observed when using three sensors for data fusion in Table 3, which also verifies the above conclusion.
Additionally, we tested how the performance of the self-attention module changes when its parameters are adjusted. We changed the output dimensionality of the linear layer from one to 2, 3 and 4 (i.e., out_features = 2, 3, 4). Similar to the previous experiments, we compared the accuracy and F1 scores to evaluate the performance of the proposed data-fusion scheme with different parameters. The corresponding detection results are shown in Figure 8. As can be seen from this figure, in terms of both F1 score and accuracy, the proposed data-fusion scheme achieves the best performance when the output dimension is 1 (out_features = 1), and the overall performance gradually decreases as the output dimension increases. We explain this phenomenon as follows. As the output dimension increases, the performance of the trained models tends towards a finite limit, and the resources used for training show a similar tendency. In other words, additional resources do not yield additional benefit. Therefore, our experiments set the output dimension of self-attention as out_features = 1 to balance training resources and output performance.
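To make the role of a single-output linear layer concrete, the sketch below shows one common way such a layer is used for attention weighting: each feature vector is mapped to one scalar score (the out_features = 1 case), the scores are normalized with a softmax, and the features are combined as a weighted sum. This is a generic attention-pooling sketch under our own assumptions, not the authors' module; the function name, the toy feature vectors and the weights are hypothetical.

```python
import math


def attention_pool(features, weight, bias=0.0):
    """Score each feature vector with one linear unit, softmax the scores,
    and return (weighted sum of the feature vectors, attention weights)."""
    # Linear layer with a single output: one scalar score per feature vector.
    scores = [sum(w * x for w, x in zip(weight, f)) + bias for f in features]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    alphas = [e / norm for e in exps]
    # Weighted sum uses only multiplications and additions.
    dim = len(features[0])
    pooled = [sum(a * f[j] for a, f in zip(alphas, features)) for j in range(dim)]
    return pooled, alphas


# Three toy feature vectors of dimension 2 (hypothetical values).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, alphas = attention_pool(feats, weight=[10.0, 0.0])
print([round(a, 4) for a in alphas], [round(p, 4) for p in pooled])
```

For T feature vectors of dimension d, this costs on the order of T·d multiplications and additions, which is consistent with the text's remark that simple numerical operations replace batch matrix multiplication.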
4.4. Computational-Complexity Comparison
We further tested the computational complexity of our self-attention module when different parameters are used. To be specific, we compared the complexity in terms of running time for different input sizes by changing only the parameter k (see Figure 3); the self-attention module alone was used to produce the test results. As shown in Figure 9, linear complexity is generally maintained across different input sizes, although the running time continues to change. This is mainly because, in branch I of the self-attention model, a k-dimensional sequence can be obtained using a linear projection, which can be computed relative to the latent node; in addition, it is not necessary to use the batch matrix approach of multi-head self-attention to process the computation. Therefore, the complexity can be significantly reduced, to linear in the input size. Apparently, modifying the output dimension may only increase the cost of the computation, not the computational complexity. Therefore, the time complexity of our self-attention module remains linear as the input size increases, because the running time is only related to k. This demonstrates that our proposed self-attention mechanism can improve the efficiency of data fusion without significantly increasing the time complexity of the network model. Therefore, compared to other data-fusion models, our model has a superior overall performance.
5. Conclusions
In this paper, we propose a hybrid deep data-fusion framework that combines LSTM and convolutional neural networks with self-attention. The proposed approach first processes the data using temporal-alignment techniques and mines spatial-temporal features by building a symmetric hybrid CNN-LSTM network model. It then applies a self-attention module to mine the internal relationships between features and assign different weights to them and, finally, performs convolution with a separate LSTM to further mine the deep temporal information of the spatial-temporal features. The proposed model is validated on a load dataset of an IEEE 14-bus system, and the experimental results show that the proposed fusion model achieves better detection performance than the original multi-source heterogeneous sensing data. The proposed method addresses the problem of efficiently fusing multi-sensor heterogeneous data in smart grids, and also provides a solid data basis for attack detection in smart grids.
While our proposed method showed superior performance in the FDIA-detection tests, we should note that our data-fusion scheme mainly uses the load data set from the IEEE 14-bus system to simulate multi-source sensing data fusion on the electrical side. In fact, for VPP scenarios in current new power systems, multi-source data fusion that considers the public–private sides may be more practical. In addition, although our method achieves good performance on continuous data, a power cyber-physical system contains not only continuous data but also discrete data, which have a significant impact on FDIA detection. Considering the above problems, we plan to further improve our work in two ways. First, we will try to introduce a multi-modal attention mechanism to address multi-source heterogeneous data fusion on the public–private sides in new power systems, which remains an open challenge. Second, we will investigate making the deep hybrid network model more lightweight.