This section describes the materials and methods used to select and evaluate the desired features. The load identification method consists of three parts: (1) event detection, (2) feature extraction, and (3) fuzzy evaluation and identification. This work focuses on feature extraction; the whole framework is illustrated in Figure 1. For event detection, the bilateral cumulative sum control chart (CUSUM) algorithm, combined with the minimum inner-class variance rule, was used so that load events and their change points were detected accurately. Based on these change points, a resampling method was introduced to reduce the influence of voltage and current fluctuations. The extracted features were then validated by fuzzy evaluation so that they can be applied to load identification. Each part is described below.
2.1. Event Detection Algorithm
Detecting load events, defined as changes in load characteristics caused by switching on/off or by state changes of individual devices [13], is the first and a significant step in load identification. In practical applications, the reliability and accuracy of event detection can be affected by unpredictable switching and by interference from voltage and current fluctuations. In this paper, a non-parametric cumulative sum control chart (CUSUM) event detection algorithm was used. This method accumulates the small deviations of the sampled data from the process mean, and a load event is declared when the accumulated value becomes significantly large. In addition, because turn-on and turn-off events usually occur in pairs, the method can be extended to a bilateral CUSUM algorithm.
Let the time series of the extracted load data be $X = \{x(k)\}$, $k = 1, 2, \ldots$ The statistic function of the nonparametric bilateral CUSUM algorithm is defined as:
where $\mu_0$ is the average value before the occurrence of the load event, $\theta$ is random noise introduced from outside, and $f_k^+$ and $f_k^-$ are random variables with zero mean (i.e., random fluctuations around zero). When a load is turned on, $x_k$ increases and $f_k^+$ shows an increasing trend. Conversely, when a load is turned off, $f_k^-$ decreases. A load event can therefore be detected when the change exceeds the threshold $h$. Usually, $h$ is set according to the lowest power value among the loads.
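As a reading aid, one common form of the two-sided CUSUM recursion that matches the description above (an increasing $f_k^+$ at switch-on, a decreasing $f_k^-$ at switch-off) is sketched here. It is an assumption rather than the paper's own Equation (1); in particular, the zero initial values and the placement of $\theta$ as a slack term are illustrative choices.

```latex
\begin{aligned}
f_0^{+} &= f_0^{-} = 0, \\
f_k^{+} &= \max\bigl(0,\; f_{k-1}^{+} + x_k - (\mu_0 + \theta)\bigr), \\
f_k^{-} &= \min\bigl(0,\; f_{k-1}^{-} + x_k - (\mu_0 - \theta)\bigr).
\end{aligned}
```

Under this assumed form, a switch-on event would be flagged when $f_k^{+} > h$ and a switch-off event when $f_k^{-} < -h$.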
To make Equation (1) easier to understand, Figure 2 gives a detailed description of the CUSUM procedure. The load changes continuously along the time axis, i.e., the mean $\mu_0$ in Equation (1) varies with time. A sliding window model was therefore constructed to constrain the accumulated sum so that load events are captured accurately. Two windows, W1 and W2, were used in this paper. The W1 window was used to calculate the mean value $\mu_0$ of the sampling sequence. The W2 window served as the basis for judging whether a load event has occurred. From Equation (1), the value of $f_k^+$ in window W2 gradually accumulates as the value of $x_i$ increases, so a load event is declared when $f_k^+$ exceeds a certain threshold $h$. Conversely, if no load event is present, the value of $f_k^+$ in window W2 fluctuates within a small range. In this case, the W1 and W2 windows slide to the new sampling point and detection continues.
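The sliding-window procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `detect_events`, the max/min form of the recursion, the restart rule after an event, and all parameter values are hypothetical.

```python
from typing import List, Tuple


def detect_events(x: List[float], w1: int, w2: int,
                  theta: float, h: float) -> List[Tuple[int, str]]:
    """Two-sided CUSUM over sliding windows W1 (mean) and W2 (detection).

    Returns (sample index, "on" | "off") for each detected event.
    The recursion used here is an assumed, commonly used form.
    """
    events: List[Tuple[int, str]] = []
    start = 0
    while start + w1 + w2 <= len(x):
        # W1: reference mean mu0 of the most recent steady samples.
        mu0 = sum(x[start:start + w1]) / w1
        f_pos = 0.0  # accumulates upward deviations (switch-on)
        f_neg = 0.0  # accumulates downward deviations (switch-off)
        hit = None
        # W2: accumulate deviations and compare against threshold h.
        for k in range(start + w1, start + w1 + w2):
            f_pos = max(0.0, f_pos + x[k] - (mu0 + theta))
            f_neg = min(0.0, f_neg + x[k] - (mu0 - theta))
            if f_pos > h:
                hit = (k, "on")
                break
            if f_neg < -h:
                hit = (k, "off")
                break
        if hit is None:
            start += 1            # no event: slide both windows by one sample
        else:
            events.append(hit)
            start = hit[0] + 1    # assumed rule: restart after the event sample
    return events


# Illustrative (made-up) active power trace: a 1000 W load on, then off.
trace = [100.0] * 20 + [1100.0] * 20 + [100.0] * 20
print(detect_events(trace, w1=5, w2=5, theta=10.0, h=500.0))
```

Recomputing $\mu_0$ from W1 at every slide lets the reference level follow the load, which is the reason the two windows are kept separate.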
Because the threshold $h$ is a global parameter, it is usually determined by the minimum load characteristic value. To reduce the influence of this manually set threshold, the minimum inner-class variance rule can be used as the change point detection method. In the load event detection window, the active power samples are divided into two classes, $C_0 = \{x_1, x_2, \ldots, x_k\}$ and $C_1 = \{x_{k+1}, x_{k+2}, \ldots, x_V\}$, where $V$ is the number of samples in the window. Let:
When the objective function reaches its minimum and $|m(C_1) - m(C_0)|$ is greater than the set active power change value, the time of the change point can be found.
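The rule can be illustrated with a short Python sketch. It assumes that $m(\cdot)$ denotes the class mean and that the objective is the sum of squared deviations within the two classes; both are assumptions, since the displayed equations are not reproduced here, and the name `find_change_point` is hypothetical.

```python
from typing import Optional, Sequence


def find_change_point(window: Sequence[float],
                      min_delta: float) -> Optional[int]:
    """Return the split index k (C0 = window[:k], C1 = window[k:]) that
    minimises the assumed inner-class objective, or None if the jump
    between class means does not exceed min_delta."""
    v = len(window)
    best_k = None
    best_cost = float("inf")
    best_gap = 0.0
    for k in range(1, v):
        c0, c1 = window[:k], window[k:]
        m0 = sum(c0) / len(c0)
        m1 = sum(c1) / len(c1)
        # Assumed objective: within-class sum of squared deviations.
        cost = (sum((s - m0) ** 2 for s in c0)
                + sum((s - m1) ** 2 for s in c1))
        if cost < best_cost:
            best_k, best_cost, best_gap = k, cost, abs(m1 - m0)
    if best_k is not None and best_gap > min_delta:
        return best_k
    return None


# Illustrative (made-up) window: a step of about 500 W after four samples.
samples = [100, 102, 98, 101, 600, 603, 598, 601]
print(find_change_point(samples, min_delta=200))
```

Because the split is chosen by the data inside the window, only the minimum power change `min_delta` has to be set by hand.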
2.2. Identification Feature Extraction and Resampling
Once the change point has been found by the above method, the features of the load can be obtained from the load characteristics that change at that point [14]. A load event is usually determined from physical changes in current, voltage, and other power information. The load characteristics of these changes can therefore be regarded as the characteristics of the switching of the electrical device. For example, Figure 3 illustrates three types of load, namely resistive, capacitive, and inductive loads, which exhibit different current phases because of their different impedance characteristics. The active power $P$ follows from the voltage $U$, the current $I$, and their phase difference $\varphi$:

$$P = UI\cos\varphi \tag{5}$$

Similarly, the reactive power $Q$ can be described as:

$$Q = UI\sin\varphi \tag{6}$$

Active power and reactive power can be calculated by Equations (5) and (6), and their values distinguish between the different types of loads. Moreover, both can be captured by low-frequency meters. In this paper, active power $P$ and reactive power $Q$ were adopted as the features.
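A small numerical illustration of the standard relations $P = UI\cos\varphi$ and $Q = UI\sin\varphi$ shows how the sign of $Q$ separates the three load types. The sketch assumes RMS values for $U$ and $I$ and the convention that $\varphi$ is the voltage phase minus the current phase (positive for inductive loads); the function name `power_features` and the numbers are illustrative only.

```python
import math
from typing import Tuple


def power_features(u_rms: float, i_rms: float, phi: float) -> Tuple[float, float]:
    """Active power P (W) and reactive power Q (var) from RMS voltage,
    RMS current, and phase difference phi in radians."""
    p = u_rms * i_rms * math.cos(phi)
    q = u_rms * i_rms * math.sin(phi)
    return p, q


# Assumed example values: 220 V, 5 A.
resistive = power_features(220.0, 5.0, 0.0)            # current in phase
inductive = power_features(220.0, 5.0, math.pi / 3)    # current lags
capacitive = power_features(220.0, 5.0, -math.pi / 3)  # current leads
for name, (p, q) in [("resistive", resistive),
                     ("inductive", inductive),
                     ("capacitive", capacitive)]:
    print(f"{name}: P = {p:.1f} W, Q = {q:.1f} var")
```

With this convention a purely resistive load has $Q = 0$, an inductive load has $Q > 0$, and a capacitive load has $Q < 0$, so the pair $(P, Q)$ carries more information than $P$ alone.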
To illustrate the extraction of the load features, including the active and reactive characteristics, take the active power $P(t)$ as a function of time $t$ as an example. It is usually statistically stable. However, when a load changes state, $P(t)$ may change sharply at that moment. The difference of $P$ therefore represents the change of state of the device, and $P(t)$ can be disaggregated accordingly. Here, we denote $\Delta P = P(t + \Delta t) - P(t)$ as the difference of $P$, and $P(t)$ satisfies the following condition:
where $P(t)$ is the active power at time $t$; $m$ is the total number of loads in the database; $a_i$ marks the state of load $i$, where $a_i = 1$ indicates the running state and $a_i = 0$ means turned off; and $T$ is the time interval.
Similarly, the reactive characteristic (the difference of $Q$), $\Delta Q = Q(t + \Delta t) - Q(t)$, at time $t$ satisfies the following condition:
From Equations (7) and (8), it can be observed that the extraction of load characteristics from power load switching is related to the time of change point, i.e., the time interval T.
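The dependence on $T$ can be made concrete with a short Python sketch. It simply evaluates the differences $\Delta P$ and $\Delta Q$ at a change point for several interval lengths; the function name `delta_features` and the sample traces (a switch-on with a short inrush transient) are made up for illustration.

```python
from typing import Sequence, Tuple


def delta_features(p: Sequence[float], q: Sequence[float],
                   t: int, interval: int) -> Tuple[float, float]:
    """Differences of P and Q between sample t and sample t + interval."""
    return p[t + interval] - p[t], q[t + interval] - q[t]


# Made-up traces: the load switches on after index 2 with an inrush peak.
p_trace = [100, 100, 100, 900, 1300, 1150, 1100, 1100]
q_trace = [20, 20, 20, 300, 260, 225, 220, 220]

for interval in (1, 2, 4):
    print(interval, delta_features(p_trace, q_trace, t=2, interval=interval))
```

In this example the three intervals give three different feature pairs for the same switching event, which is the ambiguity that motivates a more robust choice of the steady-state value.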
Although the change point is found by the minimum inner-class variance rule, voltage and current fluctuations make it difficult to determine the time interval $T$. Different values of $T$ generally yield different $P$ and $Q$ features. In some papers [15,16,17], $T$ is selected as the point after the change point. In some situations, the resulting changes in $P$ and $Q$ may then fail to match the load during load identification. Therefore, in this paper, the importance resampling method was proposed to avoid the uncertainty of the time interval $T$ and the influence of fluctuations in the power load information.
Resampling is often used to remedy the weight degeneracy of the sequential importance sampling algorithm [18]. At present, there are many kinds of resampling algorithms [19,20]. In this paper, the importance resampling algorithm is adopted.
Figure 4 describes the resampling process. The method treats the characteristic at each time instant as a particle and resamples each particle according to its importance, based on the distribution of particles before and after the change point. The specific process is as follows:
Step 0: Assume that a load has been put into operation and the change point time $t$ has been obtained. Set $k = 0$, randomly draw $N$ particles between the current time $t$ and the next change point, and initialize each particle $x_i$ with equal weight $1/N$, $i = 1, \ldots, N$.
Step 1: Importance sampling is used to assign a weight to each particle. For each particle $i = 1, \ldots, N$, estimate the importance weight $\omega_k$ according to its degree of deviation from the center:
where $\Omega$ is the particle set. The weights are then normalized over $\Omega$ to obtain the new weights.
Step 2: We discard the particles with smaller weights and replace them with samples drawn near the particles with larger weights.
Step 3: Set $k = k + 1$ and repeat Steps 1 and 2 to minimize the variation of the variance in the particle set $\Omega$.
Step 4: We take the difference between the current load characteristics in the particle set and the previously recorded load characteristics to extract the load variation characteristics.
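Steps 0 to 4 can be sketched in Python as follows. This is a simplified illustration under explicit assumptions, not the authors' algorithm: the weight is taken as a Gaussian kernel of the deviation from the particle median (the actual weight equation is not reproduced here), replacement particles are copies of high-weight particles rather than new samples drawn near them, and the names `resample_level` and `load_feature` are hypothetical.

```python
import math
import random
import statistics
from typing import Sequence


def resample_level(samples: Sequence[float], n_particles: int = 200,
                   n_iter: int = 5, tol: float = 1e-3, seed: int = 0) -> float:
    """Estimate the steady-state level after a change point."""
    rng = random.Random(seed)
    # Step 0: draw N particles at random; equal weights are implicit.
    particles = [rng.choice(samples) for _ in range(n_particles)]
    scale = statistics.pstdev(particles) or 1.0  # fixed kernel width (assumed)
    prev_var = statistics.pvariance(particles)
    for _ in range(n_iter):
        # Step 1: weight by deviation from the centre (assumed Gaussian kernel).
        center = statistics.median(particles)
        weights = [math.exp(-0.5 * ((x - center) / scale) ** 2)
                   for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Step 2: low-weight particles die out, high-weight ones are duplicated.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Step 3: stop once the particle variance has settled.
        var = statistics.pvariance(particles)
        if abs(prev_var - var) <= tol * prev_var:
            break
        prev_var = var
    return statistics.fmean(particles)


def load_feature(samples_after: Sequence[float], previous_level: float) -> float:
    """Step 4: difference between the resampled level and the recorded one."""
    return resample_level(samples_after) - previous_level


# Made-up post-event samples: level near 1100 W plus three transient outliers.
after = [1100 + d for d in (-3, -2, -1, 0, 1, 2, 3)] * 5 + [1300, 1290, 900]
print(statistics.fmean(after), resample_level(after))
```

In this example the plain mean is pulled away from the steady level by the transient samples, whereas the resampled estimate stays close to it, so the extracted $\Delta P$ no longer depends on which single sample after the change point is chosen.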
2.3. Fuzzy Evaluation Method
After the identification features have been obtained by the resampling method, an evaluation method based on fuzzy membership is needed. The concept of the fuzzy set was first introduced in [21]. Fuzzy theory deals with uncertainty through the degree of membership [22], so it can evaluate the relationship between the load identification features and the real load characteristics in the database. Because $P$ and $Q$ are used as identification features, a two-dimensional fuzzy set is used in this paper.
Suppose that there are $n$ loads, $A_1, A_2, A_3, \ldots, A_n$, with two evaluation factors, active power ($f_1$) and reactive power ($f_2$). Consider $m$ linguistic hedges $\Psi$. Note that an order-preserving bijection can be established between the finite chain $L$ and the ordinal scale $\Psi$. Thus, each normal convex fuzzy subset defined on the ordinal scale $\Psi$ can be regarded as a discrete fuzzy number with support $L$, $L = \{1, 2, \ldots, m\}$. The following data can then be set for the load $A_i$ ($i = 1, 2, \ldots, n$):
where $A_{i1}$ and $A_{i2}$ are two discrete fuzzy numbers of the metric feature, $m$ is the evaluation coefficient level, and $x_{ijk}$ is the evaluation factor of the object $A_i$ ($i = 1, 2, \ldots, n$).
Then, the mean value can be worked out as . Let elements in K be the number (or numbers) that is (or are) closest to the mean value μ(Aij) in L = {1, 2, …, m}, i.e., . It is obvious that the number of the elements of K can only be one or two. Then the following method can be given to construct one-dimensional discrete fuzzy number : R → [0, 1] for any i = 1, 2, …, n and any j = 1, 2.
When $K$ has only one element (denoted by $k_0$), this fuzzy number can be defined as:
When $K$ has two elements (denoted by $k_0$ and $k_0 + 1$), this fuzzy number can be defined as:
where
,
,
,
, and
i = 1, 2, …, n,
j = 1, 2. We stipulate that
as
,
as
, and
as
in Equations (10) and (11).
Then, the two one-dimensional discrete fuzzy numbers can be combined into a two-dimensional joint discrete fuzzy number that expresses device $A_i$ for any $X = (x_1, x_2) \in R^2$ ($i = 1, 2, \ldots, n$). Next, the centroid can be calculated based on the resulting matrix:
To obtain the final evaluation value, the two coordinates of the centroid must be combined using the weights $p = (p_1, p_2)$, where $p_1$ and $p_2$ describe the importance of the corresponding features of the centroid. Because combining the centroid with the weights is more conducive to a comprehensive evaluation of the possibility of each category, the metric can be established as follows:
Finally, by comparing the $v$ values of the different objects, the object with the highest evaluation value is found. That is, the candidate load for which the obtained identification features give the largest evaluation value is taken as the actual load.
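The evaluation pipeline (mean level, set $K$, one-dimensional fuzzy numbers, two-dimensional fuzzy number, centroid, weighted score) can be sketched in Python. Because the displayed equations are not reproduced here, every formula in the sketch is an assumption: $x_{ijk}$ is read as the amount of evidence assigned to level $k$, the membership away from $K$ is built from cumulative shares, the two-dimensional membership is the minimum of the two one-dimensional ones, and $v = p_1 c_1 + p_2 c_2$. All function names and data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fuzzy_number(counts: Sequence[float]) -> List[float]:
    """Membership over levels 1..m from counts[k-1] (assumed meaning of x_ijk)."""
    levels = range(1, len(counts) + 1)
    mu = sum(k * c for k, c in zip(levels, counts)) / sum(counts)
    d_min = min(abs(k - mu) for k in levels)
    # K: the one or two levels closest to the mean.
    closest = [k for k in levels if math.isclose(abs(k - mu), d_min)]
    lo, hi = closest[0], closest[-1]
    left_total = sum(counts[:lo])        # evidence on levels 1..lo
    right_total = sum(counts[hi - 1:])   # evidence on levels hi..m
    u = []
    for k in levels:
        if lo <= k <= hi:
            u.append(1.0)
        elif k < lo:
            u.append(sum(counts[:k]) / left_total if left_total else 0.0)
        else:
            u.append(sum(counts[k - 1:]) / right_total if right_total else 0.0)
    return u


def centroid(u1: Sequence[float], u2: Sequence[float]) -> Tuple[float, float]:
    """Centroid of the 2-D fuzzy number with membership min(u1, u2)."""
    total = c1 = c2 = 0.0
    for a, ua in enumerate(u1, start=1):
        for b, ub in enumerate(u2, start=1):
            w = min(ua, ub)
            total += w
            c1 += a * w
            c2 += b * w
    return c1 / total, c2 / total


def scores(data: Dict[str, Tuple[Sequence[float], Sequence[float]]],
           p: Tuple[float, float]) -> Dict[str, float]:
    """Weighted centroid score v for each candidate load."""
    result = {}
    for name, (counts_f1, counts_f2) in data.items():
        c1, c2 = centroid(fuzzy_number(counts_f1), fuzzy_number(counts_f2))
        result[name] = p[0] * c1 + p[1] * c2
    return result


# Made-up evaluation data for two candidate loads on a 5-level scale.
candidates = {
    "A": ([0, 0, 1, 3, 6], [0, 1, 2, 5, 2]),
    "B": ([5, 3, 2, 0, 0], [4, 4, 2, 0, 0]),
}
v = scores(candidates, p=(0.5, 0.5))
print(v, max(v, key=v.get))
```

In this made-up example, candidate "A" concentrates its evidence on the high levels for both factors and therefore receives the larger score, so it would be selected as the identified load.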