1. Introduction
With the rapid integration of DGs [1,2,3,4,5], distribution networks are being transformed into active distribution networks (ADNs), which have more complex structures and impose higher requirements on monitoring and control [6,7]. To ensure the secure and stable operation of an ADN, its system states must be monitored efficiently and accurately in real time. It is therefore important to study the state estimation (SE) of ADNs.
At present, traditional SE models are mostly based on the weighted least squares (WLS) method, which has been widely used in different power-system scenarios and has given rise to many variants [8]. Considering the characteristics of distribution networks (DNs), such as radial structure, a large number of network nodes, and low measurement coverage, various SE methods for DNs have been proposed, including the nodal-voltage method [9,10,11], the branch-current method [12,13,14], and the branch-power method [15,16,17]. However, WLS theory relies on the assumption that the measurement noise follows a Gaussian distribution, and the estimation accuracy of WLS is greatly degraded if the measurements contain non-Gaussian noise or data with significant deviations (hereinafter referred to as bad measurements). For this reason, robust SE methods with nonquadratic estimation criteria have been proposed, such as weighted least absolute value (WLAV) estimation [18,19], Huber-M estimation [20,21], and exponential objective function estimation [22,23]. In addition, to improve the accuracy of SE using phasor measurement units (PMUs), SE methods that combine PMU and supervisory control and data acquisition (SCADA) measurements have been proposed, including linear SE that incorporates SCADA into PMU [24,25], nonlinear SE that incorporates PMU into SCADA [26], and two-stage models combining linear and nonlinear estimation [27,28]. The above research provides a rich reference for SE in ADNs. However, due to the complexity of multisource data types and frequent changes in network topology in ADNs, traditional models have the following limitations.
First, traditional SE methods have difficulty simultaneously satisfying the requirements of robustness and fast computation. These methods typically rely on the system’s topology and measurements, so incorrect topology or measurements result in significant estimation biases and can even cause the model to fail to converge. To enhance robustness, methods such as bad measurement detection and robust computation have been proposed. However, these methods are computationally complex and time-consuming, which makes them unable to meet the requirements of ADNs. With the increasing penetration of DGs, the number of operating states and the load fluctuations in DNs increase significantly, and traditional SE can no longer meet the requirements of robustness and speed at the same time.
Second, traditional SE methods have not achieved a proper fusion of PMU and SCADA data when performing the estimation. On the one hand, PMU and SCADA data have inconsistent updating cycles [29], with PMU data being updated much more frequently. Existing linear and nonlinear fusion models typically perform the fusion calculation only at the time instants when both PMU and SCADA data are updated, which does not fully leverage the PMU data [30]. On the other hand, some works have attempted to generate pseudo-SCADA measurements for time instants with only PMU data. However, pseudo-measurements suffer from low accuracy and difficult parameter configuration and are highly influenced by system states [28,31,32], leading to unstable estimations. The high sampling rate and accuracy of PMUs make it possible to enhance the frequency and accuracy of SE, so there is an urgent need for more effective PMU–SCADA fusion schemes.
Thanks to the rich measurement data in ADNs, data-driven SE has attracted attention in recent years [33,34,35,36,37,38]. Refs. [23,24,25,26,27,28,29,30,31,32,33,34,35] performed robust SE based on deep neural networks (NNs). Ref. [36] used convolutional NNs to develop a data-fusion SE method. Refs. [37,38] used NNs to identify DN topology. However, these methods still have shortcomings when performing SE of ADNs: (1) they only consider preknown bad measurements and perform poorly under random noise [33,34,35]; (2) they neglect topology changes in ADNs and have low SE accuracy under different topologies [37,38]; and (3) they fail to achieve a proper fusion of PMU and SCADA data, as discussed above [36].
Considering this research gap, this paper proposes a fast and robust SE method for ADNs, which fuses PMU and SCADA data and adapts to topology changes in ADNs. The contributions are threefold:
- (1) To address the low accuracy and poor robustness of SE in ADNs, this paper proposes a fast robust SE model (FRSEM) that integrates a data-driven model with a classic model. It first constructs a multioutput NN for state pre-estimation, which directly associates the measured data with the true state values. It then corrects the NN outputs using a linear SE model, which increases the data redundancy for the SE. This not only improves the SE accuracy but also greatly enhances the robustness under high-level noise and random topological changes of ADNs.
- (2) To achieve a proper fusion of PMU and SCADA data, the proposed model includes different combinations of measurements in the NN’s training. The trained NN can perform SE using SCADA data alone, PMU data alone, or fused PMU and SCADA data. Even in scenarios with inadequate PMU measurements, the proposed model can still utilize PMU data for SE, which significantly reduces the time interval between two estimations and enables real-time estimation.
- (3) In terms of estimation speed, because iterations are eliminated, the proposed FRSEM computes faster than traditional methods. As verified by simulations, it is six times faster than the WLS method.
The remainder of the paper is organized as follows. Section 2 introduces the overall framework of the proposed FRSEM. The NN and its offline training are discussed in Section 3. Based on the NN’s outputs, the linear estimation is conducted in Section 4. Section 5 provides the simulation results. Finally, Section 6 concludes the paper.
3. Neural Network Design and Training in the SE of ADNs
This section discusses the NN model and its offline training for SE, which address the issues of bad measurement interference, data fusion, and topological changes. The hyperparameters, such as the number of hidden layers and the learning rate, are the best values obtained after repeated training runs.
3.1. NN Design
To achieve SE and topology identification, this paper designs a multioutput SE neural network (MSENN) based on the input–output characteristics and internal mechanisms of traditional methods. The model takes measurement data from various nodes in the distribution network as the input (1) and directly obtains the state parameters (voltage magnitude and phase angle) of each node and the network’s topological status through network computation (2). The specific design of the network is as follows.
In (1) and (2), the input matrix of the network collects the measurement values, and its size is the total number of measurements. The output matrix of the network has three parts. The first is the binary matrix indicating the topological status, in which each entry represents the probability of the result being the corresponding topological structure, with one entry for each topological state the system can have. The second is the output voltage-magnitude matrix, whose entries are the node voltage magnitudes. The third is the output node-voltage vector, whose entries are the node phase angles (in degrees). The second and third parts each have one entry per node.
- (1) Network Architecture and Activation Functions
In existing research, state-estimation neural networks mostly have between three and six layers. On this basis, this paper compares neural networks with three to six layers, and the results are shown in Appendix A. In addition, because the functional characteristics of the three outputs are not consistent, the network is designed in a hierarchical form. First, shared layers extract the global characteristics of the measurement data; then, independent layers are designed for the different tasks to achieve functional differentiation. As shown in Figure 2, the network consists of one input layer, five feature-extraction layers, and three output layers. Each layer is described as follows:
Input layer: This layer receives the input matrix which contains the measurement values;
Feature-extraction layers 1–2: These layers extract the overall features of the measurement data. Since the input can have both positive and negative values, the leaky ReLU activation function is used to ensure comprehensive feature extraction;
Feature-extraction layer 3: This layer extracts the topological information from the overall features and uses the ReLU activation function to confine the output within the non-negative range;
Feature-extraction layer 4: This layer extracts the voltage-magnitude information from the overall features. Since the voltage magnitudes of the system nodes typically fluctuate around one, the tanh activation function is used to bound the output within a magnitude of one;
Feature-extraction layer 5: This layer extracts the voltage phase angle information and uses the leaky ReLU activation function, which passes both positive and negative values, so that phase angles in the range [−180, 180] can be output;
Output layer 1: This layer outputs the topological states in the features using the softmax activation function;
Output layers 2 and 3: These layers use the tanh and leaky ReLU activation functions, respectively, to output the voltage magnitudes and phase angles.
The detailed parameter settings for each layer are shown in Table 1. The number of neurons in each layer was also obtained experimentally (see Appendix A).
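The branched structure described above can be sketched as a plain forward pass. This is only an illustrative sketch, not the authors' implementation: the layer widths (`hidden=16`), the random initialization, the leaky ReLU slope, and all function names are assumptions, since the actual settings are given in Table 1 and Appendix A.

```python
import math
import random

def dense(x, w, b):
    """Fully connected layer: w @ x + b for list-based vectors."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def leaky_relu(v, slope=0.01):  # slope is an assumed value
    return [a if a > 0 else slope * a for a in v]

def relu(v):
    return [max(0.0, a) for a in v]

def tanh(v):
    return [math.tanh(a) for a in v]

def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def build_msenn(n_meas, n_topo, n_nodes, hidden=16, seed=0):
    """Create the five feature-extraction layers and three output layers."""
    rng = random.Random(seed)
    sizes = {
        "fe1": (n_meas, hidden), "fe2": (hidden, hidden),   # shared layers
        "fe3": (hidden, hidden), "out1": (hidden, n_topo),  # topology branch
        "fe4": (hidden, hidden), "out2": (hidden, n_nodes), # magnitude branch
        "fe5": (hidden, hidden), "out3": (hidden, n_nodes), # angle branch
    }
    return {name: init_layer(i, o, rng) for name, (i, o) in sizes.items()}

def forward(net, z):
    """Return (topology probabilities, voltage magnitudes, phase angles)."""
    h = leaky_relu(dense(z, *net["fe1"]))
    h = leaky_relu(dense(h, *net["fe2"]))
    topo = softmax(dense(relu(dense(h, *net["fe3"])), *net["out1"]))
    vmag = tanh(dense(tanh(dense(h, *net["fe4"])), *net["out2"]))
    vang = leaky_relu(dense(leaky_relu(dense(h, *net["fe5"])), *net["out3"]))
    return topo, vmag, vang

net = build_msenn(n_meas=10, n_topo=4, n_nodes=33)
topo, vmag, vang = forward(net, [0.5] * 10)
```

The three outputs share layers 1 and 2 and differ only in their branch-specific layers, which mirrors the hierarchical design described in the text.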
- (2) Loss Function
In MSENN, identifying the topological state is a classification problem, while estimating the voltage of each node is a regression problem. Therefore, different types of loss functions are used to train these two kinds of outputs. For topology, the cross-entropy loss function is used to directly calculate the topological loss. For SE, the loss is the mean squared error (MSE) between the network’s voltage magnitude/angle outputs and the corresponding true values, following the traditional WLS method used in SE. The three loss values remain of the same order of magnitude during training, so no loss imbalance affects the network’s performance. The overall loss function of the network is therefore obtained by directly summing them, as shown in the following formula:
where the three terms are the topological, voltage-magnitude, and voltage-angle losses, respectively, and their sum is the overall loss of the network. The reference values are the true topological state, voltage magnitude, and voltage angle, respectively, of each node.
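As a rough illustration of the summed loss, the sketch below combines a cross-entropy term for the topology output with two MSE terms for the voltage outputs. The function names and the representation of the true topology as a class index are assumptions made for this example.

```python
import math

def cross_entropy(probs, true_index, eps=1e-12):
    """Cross-entropy for one sample given softmax probabilities."""
    return -math.log(max(probs[true_index], eps))

def mse(pred, true):
    """Mean squared error over all nodes."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

def msenn_loss(topo_probs, topo_true, vmag_pred, vmag_true, vang_pred, vang_true):
    """Unweighted sum of the topology, magnitude, and angle losses."""
    l_topo = cross_entropy(topo_probs, topo_true)
    l_mag = mse(vmag_pred, vmag_true)
    l_ang = mse(vang_pred, vang_true)
    return l_topo + l_mag + l_ang, (l_topo, l_mag, l_ang)

total, parts = msenn_loss(
    [0.5, 0.5], 0,           # two candidate topologies, true class 0
    [1.0, 0.9], [1.0, 1.0],  # voltage magnitudes (pu)
    [0.0, 2.0], [0.0, 0.0],  # phase angles (degrees)
)
```

An unweighted sum is only reasonable when the three terms have comparable scale, which is the condition the text states for this network.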
3.2. Data Set Generation
Training MSENN so that it generalizes well requires historical measurement data. Following [39], the historical measurement data are replaced with noisy power-flow measurements under different load conditions. The actual load data for one week in a region of Belgium [40] are averaged, and the resulting load curve is normalized to 0~1 to obtain the baseline daily load curve shown in Figure 3 (where one day has 96 time instants with 15 min intervals). Gaussian noise with a standard deviation of 5% is added to generate the actual injected power at each node. A power-flow calculation is then carried out in Matpower [41] to obtain the true values of voltage magnitude and angle, injected power, and line power at each ADN node. On this basis, normally distributed measurement noise is added to generate SCADA and μPMU measurement data, with the same error settings as in [42].
To account for measurement data in the distribution network that may have significant deviations, noise with a mean of ±50% and a standard deviation of 20% is added to 5% to 20% of the data in each group. These data are used as fault data and included in the network’s training, so that the filtering property of the NN reduces the interference of deviant data on the results. Generally, measurement data with an error greater than ±6σ can be considered fault data [43], and the error set in this paper meets this practical requirement. Regarding topological changes, when calculating the true values for each data set, the topological structure is randomly modified to generate measurement data reflecting different network topologies. This simulates the topology changes that may occur in the distribution network, and the topology status of each data set is recorded.
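The noise and bad-measurement injection can be sketched as follows. This is a minimal sketch under stated assumptions: errors are treated as relative (multiplicative) errors, the sign of the ±50% mean is chosen at random for each corrupted value, and the 1% measurement noise level in the example is hypothetical (the paper takes its error settings from [42]).

```python
import random

def add_measurement_noise(true_values, sigma_rel, rng):
    """Add zero-mean Gaussian noise with relative standard deviation sigma_rel."""
    return [v * (1.0 + rng.gauss(0.0, sigma_rel)) for v in true_values]

def inject_bad_data(measurements, fraction, rng, mean_rel=0.5, std_rel=0.2):
    """Corrupt a fraction of the measurements.

    Each corrupted value gets a relative error drawn from a Gaussian with
    mean +mean_rel or -mean_rel (random sign) and standard deviation std_rel.
    """
    corrupted = list(measurements)
    n_bad = round(fraction * len(measurements))
    bad_idx = rng.sample(range(len(measurements)), n_bad)
    for i in bad_idx:
        sign = rng.choice((-1.0, 1.0))
        corrupted[i] *= 1.0 + rng.gauss(sign * mean_rel, std_rel)
    return corrupted, sorted(bad_idx)

rng = random.Random(1)
true_values = [1.0] * 100                       # placeholder power-flow results
noisy = add_measurement_noise(true_values, 0.01, rng)
fraction = rng.uniform(0.05, 0.20)              # 5% to 20% of each group
bad, bad_idx = inject_bad_data(noisy, fraction, rng)
```

In a full pipeline the placeholder `true_values` would come from the power-flow solution for each load condition and topology.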
In this study, for the case of inconsistent sampling periods between PMU and SCADA measurement data, a direct physical fusion approach is employed. At any given moment, the current monitoring data is directly fed into the network for estimation using the corresponding neural network neurons, while data not within the sampling period are inputted as zero to the corresponding neural network neurons. It should be noted that, apart from the aforementioned physical fusion, no additional data-processing techniques are applied to eliminate bad data. The purpose of using neural networks for estimation is to treat the neural network as a filter to mitigate interference caused by data anomalies, thereby enhancing the accuracy and performance of subsequent estimations.
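The zero-filling fusion described above can be illustrated with a short sketch. The function name, the fixed SCADA-then-PMU ordering of the input slots, and the update pattern in the example are assumptions for illustration only.

```python
def build_input(scada, pmu, scada_updated, pmu_updated):
    """Build the network input vector for one time instant.

    Each measurement keeps a fixed input slot; a source that has no fresh
    sample at this instant contributes zeros to its slots.
    """
    scada_part = list(scada) if scada_updated else [0.0] * len(scada)
    pmu_part = list(pmu) if pmu_updated else [0.0] * len(pmu)
    return scada_part + pmu_part

# Hypothetical pattern: PMU data arrive at every step, SCADA only at step 0.
steps = [
    build_input([0.8, 0.3], [1.01, -2.5], scada_updated=(k == 0), pmu_updated=True)
    for k in range(3)
]
```

Because the input length never changes, the same trained network can be evaluated at every PMU update, with or without fresh SCADA data.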
Table 2 lists five methods used for combining SCADA and μPMU measurement data. On this basis, a training dataset of m×1 dimensions can be obtained. Among them, the role of combination methods 1 and 2 is mainly to enhance the global features learned by the network during the training process and accelerate the convergence speed and accuracy of the network. Combinations 3, 4, and 5 correspond to scenarios where the network contains only μPMU data, only SCADA data, or a combination of both, respectively.
The flow chart of the data generation is shown in Figure 4. All the measured data are combined to construct the data set. In this paper, 100 days of measured data are generated, giving 9600 groups of data for network training and testing.
3.3. NN Training
Of the 9600 data groups, 6000, 1200, and 2400 were used as the training, validation, and test sets, respectively, and the network was trained for 10 rounds using the Adam optimizer. The initial learning rate used for training was
, with a decay of 0.75 per round; the network was trained 600 iterations per round, with 10 sets of data input each time. Since the randomly added noise has sufficiently prevented the overfitting of the network, no regularization strategy is taken. The loss decay of the network training is given in
Figure 5.
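The data split and the per-round learning-rate decay can be sketched as below. Two points are assumptions: the decay of 0.75 per round is interpreted as a multiplicative factor, and the initial learning rate of `1e-3` is a placeholder chosen only to make the example runnable, not the value used in the paper.

```python
import random

def learning_rates(initial_lr, decay=0.75, rounds=10):
    """Learning rate for each round, assuming multiplicative decay."""
    return [initial_lr * decay ** r for r in range(rounds)]

def split_indices(n_total=9600, n_train=6000, n_val=1200, seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices()
lrs = learning_rates(1e-3)       # placeholder initial learning rate
iterations_per_round = 600
batch_size = 10
samples_per_round = iterations_per_round * batch_size
```

With 600 iterations of 10 samples each, one round covers 6000 samples, which matches the size of the training set.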
As can be seen from Figure 5, MSENN converges well on the dataset, and the two loss curves follow the same trend, indicating that the network generalizes well. Because different bad measurements are randomly added for each training run, the loss values continue to fluctuate in the later stage of training.
5. Simulation and Analysis
To verify the performance of the proposed FRSEM, simulations are conducted on the IEEE 33 distribution system with DGs. Section 5.1 verifies the accuracy and robustness of the proposed model. Section 5.2 and Section 5.3 demonstrate the model’s performance in two scenarios: data fusion and topology changes. Section 5.4 analyzes the model’s timeliness.
Figure 6 shows the studied distribution system, with 33 nodes and 37 lines. The load data for each node are sourced from Matpower. DGs with rated capacities of 400 kW, 500 kW, 350 kW, and 450 kW are connected to nodes 7, 10, 14, and 33, respectively. Following [44], the output range of each DG is 0~1.5 times its rated capacity. Table 3 lists four reconstructed networks with the corresponding DG outputs.
In Section 5.1, measurement devices for node and branch power measurements are installed at all nodes and branches, respectively. Additionally, a voltage-magnitude measurement is installed at node 1 to provide reference voltage values for SE.
In Section 5.2 and Section 5.3, in addition to the aforementioned measurements, eight μPMUs are installed at nodes 3, 6, 10, 13, 17, 21, 25, and 29. These μPMUs provide synchronized voltage and current measurements.
The mean absolute error (MAE) and mean relative error (MRE) are used as evaluation metrics for SE accuracy, and the topology identification accuracy (TIA) is used as the metric for topology identification. The formulas are as follows:
where the formulas compare the estimated value with the true value over the total number of test-set groups (2400), and the TIA additionally uses the number of groups with correct topology estimation.
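The three metrics can be sketched with their standard definitions. The exact formulas are not reproduced in this text, so the forms below are assumptions: MAE as the mean of absolute errors, MRE as the mean of absolute errors divided by the true value, and TIA as the fraction of test groups with a correctly identified topology.

```python
def mae(estimates, truths):
    """Mean absolute error."""
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)

def mre(estimates, truths):
    """Mean relative error; every true value must be nonzero."""
    return sum(abs(e - t) / abs(t) for e, t in zip(estimates, truths)) / len(truths)

def tia(predicted_topologies, true_topologies):
    """Fraction of groups whose topology is identified correctly."""
    correct = sum(p == t for p, t in zip(predicted_topologies, true_topologies))
    return correct / len(true_topologies)

mae_value = mae([1.0, 2.0], [1.5, 2.0])
mre_value = mre([1.1, 2.0], [1.0, 2.0])
tia_value = tia([0, 1, 2, 2], [0, 1, 2, 3])
```

The relative error is undefined for a true value of zero, such as the phase angle of a reference node, so such entries would need to be excluded or handled separately.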
The simulation was performed on a personal computer equipped with an Intel(R) Core i7-12700 CPU and MATLAB 2022a environment. The computer has 64 GB of memory and Matpower version 4.01 installed. The simulation code was executed using the CPU.
5.1. Comparison of Estimation Accuracy and Robustness
SE simulations are conducted using both the traditional WLS method and the proposed FRSEM in two SCADA measurement scenarios: with normal data and with bad measurements. The estimation accuracy and robustness are compared. The MAE for each node is shown in Figure 7, while the MAE and TIA of the system are presented in Table 4.
As can be seen from the graphs, WLS and FRSEM obtain stable results in both scenarios, and for a given method and scenario the accuracy is roughly consistent across nodes, staying within a relatively small fluctuation range. Under normal measurement conditions, the voltage-magnitude errors for WLS and FRSEM fluctuate around 0.0210 pu and 0.0029 pu, respectively, while the voltage-angle errors fluctuate around 0.0942° and 0.0529°. The estimation performance of FRSEM significantly surpasses that of WLS, especially in terms of voltage-magnitude errors, which are almost an order of magnitude smaller than those of WLS. When bad measurements are introduced, the voltage-magnitude errors of WLS increase significantly, while its voltage-angle errors fluctuate more but remain nearly unchanged overall. In contrast, most nodes in FRSEM maintain unchanged voltage-magnitude errors, with only a few nodes showing minor increases. The voltage-angle errors of each node also increase somewhat, but the overall phase-angle error is still much smaller than that of WLS. Additionally, FRSEM is capable of topology identification, which conventional WLS is not, and it accurately identifies the system’s topology in both scenarios.
The experiments demonstrate that FRSEM achieves higher estimation accuracy and robustness compared to WLS, confirming the expected results. This is because traditional WLS is essentially a probabilistic estimation model that relies on known measurement-error probability distributions to infer the most probable system-state distribution. The estimation accuracy is limited within a certain range due to fixed measurement-error sizes. Furthermore, when bad measurement disturbs the correct measurement-error distribution, the estimation accuracy decreases significantly. Unlike traditional WLS, the FRSEM proposed in this paper utilizes NNs as its core component. It directly trains the network using the true system states, establishing a stronger correlation between measurement values and true values. Moreover, the inherent filtering of NNs helps reduce the impact of bad measurement, resulting in better performance than traditional WLS in both scenarios.
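The sensitivity of WLS to bad measurements can be seen in a toy case that is not taken from the paper: estimating a single quantity from five direct measurements. For this linear case the WLS estimate reduces to a weighted mean with weights equal to the inverse measurement variances. The measurement values, the standard deviations, and the use of the median as a simple robust comparison are all assumptions for illustration.

```python
import statistics

def wls_scalar(measurements, sigmas):
    """WLS estimate of one state measured directly: inverse-variance weighted mean."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * z for w, z in zip(weights, measurements)) / sum(weights)

sigmas = [0.01] * 5
clean = [1.01, 0.99, 1.00, 1.02, 0.98]  # true value is 1.00 pu
bad = clean[:-1] + [1.50]                # last measurement replaced by a gross error

wls_clean = wls_scalar(clean, sigmas)
wls_bad = wls_scalar(bad, sigmas)
median_bad = statistics.median(bad)      # minimizes the sum of absolute deviations
```

A single gross error shifts the WLS estimate by about 0.1 pu, while the median of the same data stays within 0.01 pu of the true value, which is the motivation for robust criteria and for the filtering behavior attributed to the NN in the text.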
5.2. Comparison of SE Accuracy under Fused Measurement Data
The estimation accuracy of WLS [9] and FRSEM with mixed measurements (μPMU and SCADA) is compared. The comparison is conducted for two cases: one with both types of measurements and the other with only μPMU measurements. The results are shown in Figure 8 and Table 5. Note that the eight μPMUs do not provide sufficient observability to support estimation using WLS alone, so results for WLS + μPMU are not included.
As shown in Figure 8, after μPMU measurements are introduced, the voltage-magnitude errors in WLS fluctuate significantly, while the voltage-magnitude errors at the μPMU-connected nodes are significantly reduced. The voltage-angle errors show some fluctuations but remain unchanged overall. FRSEM, on the other hand, shows a significant reduction in both voltage-magnitude and phase-angle errors, with the angle error at the μPMU-connected nodes almost reaching zero. Although the reduction in voltage-magnitude error of FRSEM is not as great as that of WLS, the voltage-magnitude errors at all nodes in FRSEM are still smaller than in WLS. Additionally, the results in Table 5 demonstrate that, unlike WLS, FRSEM can still perform SE even with insufficient μPMU measurements, and the missing SCADA measurements cause only slight accuracy degradation.
Simulation results show that μPMU measurements significantly enhance the effectiveness of both methods, but the FRSEM proposed in this paper performs better. Moreover, by supplementing the measurement data with NNs, this method extends the scenarios in which μPMU measurements can be used for SE.
5.3. Comparison of SE Accuracy under ADN Topology Changes
To demonstrate the impact of topology changes on the estimation accuracy and highlight the necessity of correcting the topology state, this subsection compares the SE accuracy of WLS and FRSEM before and after correcting the topology information (based on the topology state output by MSENN).
Each day is divided into 96 time instants (with 15 min intervals). The ADN topology changes at the 42nd and 50th time instants. The resulting MAE is shown in Figure 9.
The results show that during time instants 0~42, there are no topology changes in the system, and the results obtained by both methods before and after topology correction completely overlap. However, after the topology changes at the 42nd and 50th time instants, correcting the topology leads to a significant reduction in errors. Because voltage-magnitude measurement data are available, the output voltage magnitude is less sensitive to topology changes, and the reduction in angle error is more obvious [45,46]. In addition, topology changes have a greater impact on WLS, because the network pre-estimation in the first stage of FRSEM does not require topology information, and incorrect topology information only affects the linear SE, which uses μPMU measurements consisting entirely of voltage magnitude and angle. This greatly reduces the extent to which topology affects the estimation.
The above analysis indicates that using MSENN’s topology output to correct the network topology before estimation effectively enhances the SE accuracy under topology changes. This is particularly significant for distribution networks with frequent topology changes caused by DER integration.
5.4. Time-Sensitive Analysis
To verify the efficiency of the proposed method, the computation time for running both methods 100 times was measured and compared. Additionally, to clarify the impact of the linear SE on the computation time and accuracy of FRSEM, the computation time for the standalone MSENN was also included in the comparison. The accuracy of each method is presented in Table 6.
Compared to WLS, the proposed MSENN replaces the iterative Newton–Raphson solution with a forward computation through the trained network and uses parallel instead of serial computing, resulting in significant speed advantages. Although the linear SE in FRSEM increases the computation time to some extent, the overall computation time is still much smaller than that of WLS, so both MSENN and FRSEM are more efficient than the traditional WLS. In terms of overall accuracy, FRSEM is the most accurate of the three methods but is slower than MSENN, while MSENN is the fastest but slightly less accurate than FRSEM. Therefore, either method can be chosen flexibly to meet the practical requirements of a given scenario.
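A timing comparison over repeated runs can be sketched as follows. The helper name and the stand-in workload are hypothetical; in practice the callable would wrap one complete estimation by WLS, MSENN, or FRSEM.

```python
import time

def time_runs(estimator, n_runs=100):
    """Total wall-clock time in seconds for n_runs calls of estimator()."""
    start = time.perf_counter()
    for _ in range(n_runs):
        estimator()
    return time.perf_counter() - start

def dummy_estimator():
    """Stand-in workload used only to make the sketch runnable."""
    return sum(i * i for i in range(1000))

elapsed = time_runs(dummy_estimator, n_runs=100)
average_per_run = elapsed / 100
```

Measuring the total over many runs and dividing by the run count reduces the influence of timer resolution and one-off startup costs on the comparison.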
5.5. Simulation on P&F 69 System
To further demonstrate the generality of the proposed method, additional tests were conducted on the P&F 69 distribution system. The results are shown in Figure 10 and Table 7. Compared to the 33-node system, the accuracy fluctuates more from node to node, while the overall average accuracy remains almost unchanged. Additionally, due to the increased number of input measurements, the overall processing time increases slightly.