To address the aforementioned issues, this paper proposes an encoding decomposition-based multi-scale graph neural network (CMNN). After decomposing electricity data using EEMD to obtain multiple components, the CMNN fits the high-order components and residuals with Multi-scale Bi-directional Long Short-Term Memory (MBLSTM); for the lower-order components that exhibit volatility, the CMNN constructs a low-order component graph based on temporal correlation and numerical similarity. Then, the CMNN builds a Gaussian Graph Attention Auto-Encoder based on this low-order component graph, which learns temporal features and numerical correlation features and encodes them as the expectations and covariance of a Gaussian distribution to predict the low-order components. Finally, the CMNN combines the predictions of all component levels to produce the final electricity consumption forecast results. The model structure of the CMNN is shown in the
Figure 1:
Next, this paper details the model specifics.
3.1. Analysis of Periodic Characteristics of Electricity Consumption Data
To capture the periodic characteristics of electricity consumption data, the CMNN employs a data decomposition approach, which breaks down the electricity data into components exhibiting various periodic features. This allows for detailed feature extraction and the analysis of components that demonstrate significant periodicity. Specifically, the CMNN utilizes the EEMD technique to decompose the electricity consumption data and extract the corresponding IMF components. The purpose of the CMNN is to model the approximate periodic characteristics of electricity consumption data across different time resolutions by employing different-order IMF components. This facilitates the capture of the approximate periodicity of electricity data across multiple temporal scales and enables predictions for the upcoming time slices of the IMF components. To achieve this, the CMNN incorporates a MBLSTM to forecast both high-order IMF components and residuals, allowing it to learn the periodic features of the electricity data effectively.
Specifically, the CMNN first decomposes the electricity consumption data using EEMD, represented by Equation (1):
where
x is electricity consumption data,
J is the index of the residual component of the IMF,
rJ(
t) represents the residual term of the generated Intrinsic Mode Function (IMF), and
Cj(
t) represents the
j-th order component of the IMF. Then, for high-order IMF components and residuals, incorporating additional weather and date information, the CMNN constructs a composite input vector that includes this additional information. Based on this composite input vector, the CMNN designed a MBLSTM to learn the periodic features of the high-order IMF components and residuals and predict the values of the next time slice of high-order IMF components and residuals. The structure of the MBLSTM model is shown in
Figure 2:
In
Figure 2, the feature extraction process marked in blue is carried out on each node with 1-scale Bi-LSTM, the feature extraction process marked in purple is carried out on the adjacent two nodes using a 2-scale Bi-LSTM, and then the extracted features are combined for prediction. The MBLSTM can be achieved simply by integrating multiple different-scale bi-directional LSTMs. After features are extracted with these bi-directional LSTMs, the extracted features are merged to form a new feature vector. The feature extraction process for the
j-th order IMF component is represented as Equation (2):
where
Multi_scale_Bi_LSTM contains several
Bi_LSTMs in different scales and, among them,
Bi_LSTM_K is the
Bi_LSTM used with a
K scale,
add(
t) represents the encoded weather and date information at time
t,
Cj(
t) represents the
j-th order component of the IMF, and
fIMF_
j represents the feature of the
j-th order component of the IMF, where
j > 1. Based on the extracted features, the CMNN predicts the values of high-order components, modeling the periodic features in the high-order components. The prediction result of the high-order components is represented as a combination of the outputs from bi-directional LSTMs across multiple scales in Equation (3):
where
Line is a simple linear layer with
ReLU as the activation function.
To model the volatility of electricity consumption data, the CMNN constructs a low-order component graph based on temporal correlation and numerical similarity for the low-order IMF components that exhibit volatility, as well as a Gaussian Graph Attention Auto-Encoder based on the low-order component graph to learn the volatility features.
3.2. Volatility Characteristics Analysis of Electricity Consumption Data
After the decomposition of the electricity consumption data using EEMD, the volatility of the data is primarily reflected in the lower-order IMF components. The volatility of these IMF components is closely related to industrial production events. The proposed CMNN aims to model these variations in electricity usage brought about by volatility, thereby improving the accuracy of electricity consumption forecasts. To avoid over-smoothing in the forecasts for the lower-order components that primarily exhibit volatility, the CMNN introduces additional correlations among electricity usage by expanding the dimensionality of the lower-order components. This allows for the modeling of the correlations between production events embedded in the volatile electricity data, based on the correlations among electricity values. Consequently, the CMNN constructs a low-order component graph based on temporal correlation and numerical similarity. On this basis, the CMNN builds a Gaussian Graph Attention Auto-Encoder to extract temporal features and numerical correlation features, encoding them as the expectations and covariances of a Gaussian distribution to parameterize the modeling of data changes and volatility. Based on these parameterized data features, the CMNN predicts the low-order components.
The CMNN converts one-dimensional temporal relationships into a two-dimensional graph structure. To preserve both the temporal information and the correlations between electricity values within this two-dimensional framework, the CMNN defines temporal Cosine similarities to assign weights to the edges of the graph during the graph construction process. This approach effectively integrates both temporal dynamics and electricity value correlation information. This information can be represented as Equation (4):
where
temporal(
i,
j) is the temporal feature between nodes
i and
j,
Similar(
i,
j) is the correlation feature between nodes
i and
j, and
sequence(
i,
j) is the duration between time slices
i and
j. Longer intervals weaken the impact of temporality. For the correlation features between nodes
i and
j, the CMNN uses Cosine similarity for measurement, where
fi is the electricity data for node
i. Then, the constructed low-order component graph structure is as shown in
Figure 3:
In
Figure 3, the one-dimensional temporal relations between the low-order components are transferred into a two-dimensional graph, and each edge in the graph is calculated based on the Cosine similarity. Based on the low-order component graph, the CMNN designs a Gaussian Graph Attention Auto-Encoder to extract temporal features and numerical correlation features, encoding them as the expectations and covariances of a Gaussian distribution for the parameterized modeling of data changes and volatility. Specifically, the CMNN uses the constructed low-order component graph as the graph input for the Gaussian Graph Attention Auto-Encoder, and the electricity data with additional information as node features, encoding the extracted features into the form of a parameterized Gaussian distribution in
Figure 4:
In
Figure 4, the Gaussian Graph attention is used for encoding, and a corresponding decoder is constructed to extract attention features. During the encoding computation, the CMNN defines a temporal Cosine attention mechanism to extract features. The temporal Cosine attention mechanism includes a similarity function, attention coefficients, and the construction of attention features. The similarity function measures the associations between electricity usages at the time slices represented by the nodes in the low-order component graph. To express the different impacts caused by different similarities between inputs, the CMNN defines a temporal Cosine similarity function. This function measures the similarity between two input vectors as Equation (5):
where
A is the adjacency matrix of the low-order component graph,
W refers to learnable weights,
SLP() is a single-layer neural network, and sequence(
i,
j) is the duration between time slices
i and
j. Longer intervals weaken the impact of temporality. The temporal Cosine similarity function calculates the angle between vectors. Based on this function, attention coefficients are defined as Equation (6):
where
αij are the attention coefficients for
fi and
fj. Thus, the attention feature for the current node
i can be expressed as Equation (7):
where
V and
U are learnable weights. Based on Equations (5)–(7), the Gaussian Graph Attention Auto-Encoder calculates electricity usage features for each time slice and encodes the low-order component graph. The decoding process of the Gaussian Graph Attention Auto-Encoder can be achieved using a standard attention layer. To effectively model the numerical relationships among electricity values and provide guidance for forecasts, the Gaussian Graph Attention Auto-Encoder introduces a reconstruction loss based on the low-order component graph, and its loss function is expressed as Equation (8):
In Equation (8), Recon is the reconstructed data, Edge
Num is the number of edges, and the second term of the loss introduces a feature consistency function, ensuring that the extracted features are consistent with the low-order component graph in terms of similarity. Based on the extracted features of the low-order component graph, the CMNN uses an output layer to predict the results for the low-order components as Equation (9):
where
Line is a simple linear layer with
ReLU as the activation function. Finally, combining the predictions of the
IMF components and residuals, the final predicted value for electricity data is obtained in Equation (10):
For the effective training of the CMNN, this paper constructs a loss function based on the reconstruction loss, loss_ae, and the minimum mean squared error loss for predictions, expressed as Equation (11):
where
Loss_pre represents the prediction loss, Recon is the reconstructed data, and
MSE is the Mean Square Error. The loss function,
Loss, can be optimized using the gradient descent method, and this paper uses the Adam method.