1. Introduction
Electric load forecasting is an important aspect of modern power system management and a key research focus of power companies [1]. It comprises long-term, medium-term, and short-term forecasting, depending on the specific goals [2]. Notably, short-term load forecasting plays an important role in power generation planning and enables relevant departments to establish appropriate power dispatching plans [3,4], which is crucial for maintaining the safe and stable operation of the power system and enhancing its social benefits [5]. In addition, it facilitates the growth of the power market and boosts economic benefits [6]. Therefore, devising an effective and precise method for short-term load forecasting is of significant importance.
With the need for accurate energy forecasting in mind, various forecasting methods have been developed. Early studies produced several models for short-term power load forecasting, including the Auto-Regressive (AR), Auto-Regressive Moving Average (ARMA), and Auto-Regressive Integrated Moving Average (ARIMA) models. A case in point is the work of Chen et al. [7], who employed the ARMA model for short-term power load forecasting. This method utilizes observed data as the initial input, and its fast algorithm produces predicted load values that are in line with the trend in load variation. However, it falls short in accounting for the factors that affect such variation, leaving room for improvement in prediction accuracy.
In recent years, scholars have turned to machine learning [8] and deep learning [9] to improve electric load forecasting accuracy and uncover complex data patterns. Among traditional machine learning algorithms, the Support Vector Machine (SVM) [10] is the most widely used in the field of electric load forecasting. Its advantages include the need for relatively few training samples and interpretable features. Hong [11] and Fan et al. [12] have demonstrated the high accuracy of SVM in short-term electric load forecasting. However, as the smart grid continues to develop, power load data have become increasingly numerous and multifaceted, and SVM is confronted with the challenge of slow computation in such situations. Compared to traditional machine learning methods, deep learning methods exhibit stronger fitting capacity and produce better results. Currently, a diverse set of deep learning approaches have been applied to load forecasting, including the Gated Recurrent Unit (GRU) [13], Temporal Convolutional Network (TCN) [14], and Long Short-Term Memory (LSTM) [15], as well as other deep learning methods [9,16]. Compared to traditional Recurrent Neural Networks (RNN) and LSTM, the GRU delivers better forecasting results and faster running speed in short-term load forecasting. Wang et al. [17] used the GRU algorithm to extract and learn the temporal characteristics of load consumption; their results showed that predictive accuracy improved by more than 10% compared to RNN. Cai [18] found that the GRU uses fewer parameters while preserving important features, resulting in faster running speeds than LSTM. Imani [19] utilized a Convolutional Neural Network (CNN) to extract the nonlinear relationships of residential loads and achieved remarkably precise outcomes. Song et al. [20] devised a thermal load prediction model utilizing TCN networks, which facilitated the extraction of complex data features and enabled precise load prediction.
Since single prediction models are insufficient, in terms of applicability scenarios and prediction accuracy, to achieve optimal results [21], a considerable amount of literature has employed hybrid models for prediction. Hybrid models combine data preprocessing, feature selection, optimization algorithms, decomposition algorithms, and other technologies to fully utilize the benefits of disparate methods and improve load power prediction accuracy. Research has revealed that the decomposition method and the ensemble learning method are particularly advantageous among hybrid models [22].
According to frequency analysis, the electric load exhibits clear cyclical patterns that result from the underlying superposition of multiple components with varying frequencies [23]. Therefore, decomposing time series has become a widely employed method in the area of electric load forecasting. Sun [24] proposed a short-term load forecasting model utilizing Ensemble Empirical Mode Decomposition (EEMD) and neural networks, considering wind power grid connections, and verified that EEMD achieves better decomposition effects than wavelet decomposition. Liu et al. [25] utilized Variational Mode Decomposition (VMD) to decompose load sequences and developed a hybrid forecasting model for accurate prediction, achieving an accuracy of 99.15%. Irene et al. [26] employed a hybrid prediction model combining Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to enhance the accuracy of energy consumption prediction.
The ensemble learning method combines multiple sets of data with multiple individual learners, whether independent or identical, which have different distributions, to improve predictive performance [27]. Popular ensemble learning algorithms include boosting, bagging, and stacking. Ensemble learning is commonly conducted via stacking-based or weight-based strategies [28]. Rho et al. [29] used a stacking ensemble approach to merge short-term load forecasting models and more accurately predict building electric energy consumption. Massaoudi et al. [30] proposed a stacked XGB-LGBM-MLP model to cope with stochastic variations in load demand. Bento et al. [31] presented an automatic framework using a deep learning-based stacking methodology to select the best Box–Jenkins models for 24 h ahead load forecasting from a wide range of combinations.
Although the above load power prediction models achieve a satisfactory forecasting effect, some limitations persist, and there is still room for improvement. Firstly, current short-term load forecasting models seldom consider detecting and correcting outliers in the original data; studies have demonstrated that adopting outlier correction can significantly improve the performance of pollution forecasting [32]. Secondly, the combination weights of existing load power ensemble prediction models lack diversity and should take into account different weight distribution strategies for the prediction results generated by different base learners. The literature shows that weight ensembles based on reinforcement learning can offer advantages in wind speed prediction [33,34].
To address the aforementioned research gaps, this paper presents a short-term load forecasting model (HI-CEEMDAN-Q-TEG) based on outlier correction, decomposition, and ensemble reinforcement learning. The contributions and novelty of this paper are summarized as follows:
This paper employs an outlier detection method to correct outliers in the original load power data. Such outliers may arise from human error or other causes, and directly feeding the unprocessed data into the model could degrade training. To identify and correct outliers in the data, this paper utilizes the Hampel identifier (HI) algorithm. This step is crucial because it supplies the forecasting model with reliable nonlinear information from the data;
This paper utilizes a decomposition method to fully extract the waveform characteristics of the data. Specifically, the CEEMDAN method is utilized to decompose the raw non-stationary load power data. By decomposing the load power data into multiple sub-sequences through CEEMDAN, the waveform characteristics of the data can be extracted thoroughly, ultimately enhancing the performance of the predictor;
This paper introduces an ensemble learning algorithm based on reinforcement learning. It is necessary to consider varying weights when combining preliminary predictions from different base learners. This study employs three single models to predict processed load power data, followed by the utilization of the Q-learning method to obtain cluster weights that are suitable for the ensemble forecast. Compared to other ensemble learning algorithms, the Q-learning method deploys agents to learn in the environment through trial and error, resulting in an innovative and superior method.
2. Methodology
2.1. Framework of the Proposed Model
This study presents a novel forecasting model, namely the HI-CEEMDAN-Q-TEG, for predicting load power. The model framework, as depicted in Figure 1, consists of three distinct steps with specific details as follows:
Step 1: Using HI to detect and correct outliers. The original load power data is characterized by fluctuations, randomness, and nonlinearity; therefore, outliers can arise as a result of either equipment or human factors. By using HI, outliers can be identified and corrected in the training set, which eliminates the likelihood of their interference with model training. This approach serves as a valuable tool for enhancing the precision of load power prediction;
Step 2: Applying CEEMDAN to decompose original data into subseries. Given its prominent cyclical characteristics, the load power data can be perceived, from a frequency domain perspective, as a composite of several components with varying frequencies. The CEEMDAN method can adaptively decompose this data into multiple subseries, thereby reducing the model’s non-stationarity and enhancing the predictor’s modeling efficiency and capacity;
Step 3: Using the Q-learning ensemble method for prediction. The load power data prediction is achieved by employing three base learners: the temporal convolutional network (TCN), gated recurrent unit (GRU), and extreme learning machine (ELM), collectively referred to as TEG. After outlier correction, the TEG is used to make accurate predictions. Ensemble weights for the different single models are determined using the Q-learning method, which updates the weights repeatedly through trial-and-error learning, thereby optimizing the diversity and appropriateness of the ensemble weights.
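The three steps above can be sketched end to end. The following Python fragment is only an illustration of the data flow, not the paper's implementation: a moving-average split stands in for CEEMDAN, naive persistence forecasts stand in for TCN, ELM, and GRU, and a fixed weight vector stands in for the Q-learning output.

```python
import numpy as np

rng = np.random.default_rng(0)
load = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.1 * rng.standard_normal(200)

# Step 2 (stand-in): split the series into a smooth and a residual subseries;
# CEEMDAN would instead produce several IMFs adaptively.
trend = np.convolve(load, np.ones(5) / 5, mode="same")
subseries = [trend, load - trend]

# Step 3 (stand-in): three naive "base learners" per subseries, combined
# with hypothetical ensemble weights (fixed here, not learned by Q-learning).
weights = np.array([0.5, 0.3, 0.2])

forecast = 0.0
for s in subseries:
    preds = np.array([s[-1], s[-2], s[-3]])  # lag-1/2/3 persistence forecasts
    forecast += weights @ preds              # weighted ensemble, then aggregate

print(forecast)
```

Summing the per-subseries ensemble forecasts reconstructs the prediction for the original series, which is the aggregation step implied by the decomposition framework.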
2.2. Hampel Identifier
HI is a widely used method for detecting and correcting outliers [35]. Owing to its excellent effectiveness, many researchers employ this method. To apply the HI algorithm to input data $x = \{x_1, x_2, \ldots, x_n\}$, set the sliding window half-length to $k$. For each sample $x_i$, obtain the median $m_i$, as well as the median absolute deviation (MAD), from the $2k+1$ samples centered on $x_i$. Set the evaluation parameter to $\kappa$, and calculate the standard deviation estimate $\sigma_i$ using the MAD and $\kappa$ [36]. The formulas for calculating $m_i$, $\mathrm{MAD}_i$, and $\sigma_i$ are as follows [32]:

$$m_i = \operatorname{median}\left(x_{i-k}, \ldots, x_i, \ldots, x_{i+k}\right)$$

$$\mathrm{MAD}_i = \operatorname{median}\left(\left|x_{i-k} - m_i\right|, \ldots, \left|x_{i+k} - m_i\right|\right)$$

$$\sigma_i = \kappa \cdot \mathrm{MAD}_i$$

Based on the $3\sigma$ statistical rule, if the difference between a sample value and the window median exceeds three standard deviations, the window median replaces the sample [37]:

$$x_i^{\ast} = \begin{cases} x_i, & \left|x_i - m_i\right| \leq 3\sigma_i \\ m_i, & \left|x_i - m_i\right| > 3\sigma_i \end{cases}$$
The use of HI allows for the outliers to be corrected in the raw data, which, if left untreated, could potentially disrupt the model training process. The incorporation of HI into data preprocessing leads to an enhanced nonlinear fitting performance of the data.
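As a concrete illustration, a minimal numpy implementation of the Hampel identifier might look as follows. The window half-length and threshold are demonstration choices, not the paper's settings; the factor 1.4826 is the standard MAD-to-sigma scale for Gaussian data.

```python
import numpy as np

def hampel_filter(x, k=3, n_sigma=3.0, kappa=1.4826):
    """Replace any sample deviating from its window median by more than
    n_sigma MAD-based standard deviations with that median."""
    x = np.asarray(x, dtype=float).copy()
    for i in range(len(x)):
        lo, hi = max(0, i - k), min(len(x), i + k + 1)     # window around x[i]
        med = np.median(x[lo:hi])
        sigma = kappa * np.median(np.abs(x[lo:hi] - med))  # sigma from MAD
        if abs(x[i] - med) > n_sigma * sigma:
            x[i] = med                                     # correct the outlier
    return x

series = [1.0, 1.1, 0.9, 50.0, 1.0, 1.2, 0.8]
cleaned = hampel_filter(series)
print(cleaned)  # the spike at index 3 is replaced by the window median
```

Because the median and MAD are robust statistics, a single large spike barely shifts the threshold, so the spike itself is flagged while the surrounding samples pass untouched.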
2.3. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
CEEMDAN is a decomposition algorithm used to analyze time series data for nonlinearity and non-stationarity [38]. By smoothing the overall data and extracting information about multiple frequencies from the original data, CEEMDAN can decompose the data into sub-sequences carrying different frequency and time information. The CEEMDAN algorithm is adaptive, meaning it can automatically select the appropriate noise level based on the unique characteristics of a given signal. This adaptability and robustness make the CEEMDAN algorithm ideal for processing nonlinear and non-stationary signals [39].
Building on the EMD algorithm, the CEEMDAN algorithm makes the decomposition process more stable and accurate by introducing a noise signal. Meanwhile, it adopts multiple decompositions and averaging to improve the accuracy and stability of the signal decomposition [40].
The CEEMDAN algorithm has the advantage of solving mutual interference and noise interference problems between intrinsic mode functions (IMFs). This leads to improved accuracy and stability of signal decomposition.
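In practice, CEEMDAN is usually run via an existing implementation (for example, the PyEMD package). The numpy-only fragment below does not implement CEEMDAN itself; it merely illustrates its core noise-assisted ensemble principle: adding independent white-noise realizations and averaging the resulting decompositions cancels the injected noise. A moving-average residual stands in for a true EMD sifting step, and all signal parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

def high_freq_part(x):
    """Stand-in for one EMD sifting step: residual of a short moving average."""
    return x - np.convolve(x, np.ones(11) / 11, mode="same")

# Noise-assisted ensemble: decompose many noisy copies, then average.
trials = 200
imf1 = np.mean(
    [high_freq_part(signal + 0.2 * rng.standard_normal(len(t)))
     for _ in range(trials)],
    axis=0,
)

mse = np.mean((imf1 - high_freq_part(signal)) ** 2)
print(mse)  # small: the injected noise averages out across the ensemble
```

Averaging over the ensemble reduces the variance of the injected noise roughly in proportion to the number of trials, which is why the averaged component stays close to the noise-free one.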
2.4. Base Learners
2.4.1. Temporal Convolutional Network
The TCN algorithm is a commonly used convolutional network for time series prediction [41]. Because of the causal relationship in load data over time, the prediction at time $t$ depends on previous time steps, and the TCN network effectively maintains this temporal order and causality. TCN consists of three parts: causal convolution, dilated (expansion) convolution, and residual connections.
In TCN, causal convolution ensures that the output of the upper layers of the network at time $t$ depends only on the inputs of the lower layers at or before time $t$. Dilated convolution involves setting the dilation factor hyperparameter to adjust the convolutional interval. To reduce the limitations of downward transmission after nonlinear transformation in the original network structure, TCN adds multiple direct channels, allowing the input information to be transmitted directly to later layers.
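The causal and dilated convolutions described above can be sketched in a few lines of numpy. This is an illustrative fragment, not the paper's implementation; a full TCN would stack such layers with residual connections.

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation=1):
    """Output at time t combines x[t], x[t-d], x[t-2d], ...; left-padding
    with zeros guarantees no future sample influences the output."""
    k = len(weights)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    y = np.zeros(len(x))
    for t in range(len(x)):
        taps = xp[t + pad - np.arange(k) * dilation]  # current and past taps
        y[t] = weights @ taps
    return y

x = [1.0, 2.0, 3.0, 4.0]
print(causal_dilated_conv(x, np.array([0.0, 1.0]), dilation=1))  # [0. 1. 2. 3.]
```

With weights `[0, 1]`, the output is the input shifted right by `dilation` steps, which makes the causality easy to verify: no output value ever draws on a later input.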
2.4.2. Extreme Learning Machine
ELM is an efficient artificial neural network whose principle is based on fully random projection and the least squares method [42]. Fully random projection refers to projecting the input data into a high-dimensional space, which increases the separability of the data in the feature space [43]. Through random initialization of the weights of the input and hidden layers, the ELM algorithm can minimize the training error very quickly, facilitating rapid learning and prediction.
ELM can be expressed mathematically as follows [32]:

$$f(x_j) = \sum_{i=1}^{L} \beta_i \, g\left(w_i \cdot x_j + b_i\right)$$

where $\beta$ represents the output weight matrix, $g(\cdot)$ represents the activation function, $w$ represents the input weight matrix, and $b$ represents the bias vector.

With $H$ representing the hidden-layer output matrix and $T$ representing the true value matrix, the matrix expression for the Extreme Learning Machine (ELM) is as follows:

$$H\beta = T$$

where $H$ is a matrix whose rows represent the output of the hidden layer for each input sample, and $\beta$ is the matrix of output weights.
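The two equations above translate directly into code: the hidden weights are drawn at random and never trained, and only $\beta$ is solved for by least squares. The following numpy sketch uses a toy regression task and an arbitrary hidden-layer size, both illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy regression task (illustrative): learn y = sin(x) from 200 samples.
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)

L = 50                                      # hidden neurons (arbitrary choice)
W = rng.standard_normal((X.shape[1], L))    # random input weights, never trained
b = rng.standard_normal(L)                  # random biases, never trained
H = np.tanh(X @ W + b)                      # hidden-layer output matrix

beta = np.linalg.pinv(H) @ T                # least-squares solve of H @ beta = T

mse = np.mean((H @ beta - T) ** 2)          # training error
print(mse)
```

Because training reduces to a single pseudoinverse, the fit is obtained in one step, which is the source of ELM's speed advantage noted above.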
2.4.3. Gate Recurrent Unit
In 2014, Cho proposed the Gated Recurrent Unit (GRU) as an improvement on Long Short-Term Memory (LSTM) [44]. The GRU has two gates: the reset gate, which determines how much historical information enters the candidate state, and the update gate, which determines how much historical information is retained in the current state. Compared to the LSTM, the GRU uses fewer parameters while preserving the important features, resulting in faster running speeds.
The formulas for the update gate and reset gate are as follows:

$$z_t = \sigma\left(W_z \cdot \left[h_{t-1}, x_t\right]\right)$$

$$r_t = \sigma\left(W_r \cdot \left[h_{t-1}, x_t\right]\right)$$

where $x_t$ represents the current input value; $h_{t-1}$ represents the previous hidden state; and $W_z$, $W_r$ represent the weight matrices.
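One common formulation of these gate equations can be sketched in numpy as a single time step. The dimensions are hypothetical, bias terms are omitted for brevity, and the blend `(1 - z) * h_prev + z * h_tilde` is one of two equivalent sign conventions found in the literature.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU time step acting on the concatenation [h_prev, x_t]."""
    v = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ v)                       # update gate
    r = sigmoid(W_r @ v)                       # reset gate
    h_tilde = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))  # candidate
    return (1 - z) * h_prev + z * h_tilde      # blend old state and candidate

# Run a few steps with hypothetical sizes (hidden = 4, input = 3).
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.standard_normal((4, 7)) for _ in range(3))
h = np.zeros(4)
for _ in range(5):
    h = gru_step(rng.standard_normal(3), h, W_z, W_r, W_h)
print(h)
```

Since the new state is a convex combination of the bounded candidate and the previous state, the hidden state stays bounded, which keeps the recurrence stable over long sequences.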
2.5. Ensemble Reinforcement Learning Method
As a distinct machine learning paradigm, reinforcement learning differs from supervised and unsupervised learning in that an agent continuously interacts with the environment, which guides subsequent actions by providing feedback in the form of rewards, with the aim of maximizing the cumulative reward [45]. The Q-learning method is a value-based reinforcement learning algorithm [46]. Q-learning generates a Q-value table that captures the relationship between each state and the actions taken in it; each value in this table represents the reward obtained for an action taken in a given state.
The Q-table approach selects the action with the highest potential reward and uses a penalty-and-reward mechanism to keep updating the Q-table until the optimal result is achieved, that is, until a stopping condition is met, signifying that the algorithm has found the optimal action for each state [47]. In this study, we employ the Q-learning method to combine the forecasting outcomes of TCN, ELM, and GRU. As a result, different ensemble weights are generated for each base learner, effectively addressing the weak robustness associated with a single weight as well as a single model.
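A minimal single-state sketch of this idea in numpy: candidate weight vectors act as actions, the negative forecast error on a validation window is the reward, and the Q-table is updated by trial and error. The data, grid resolution, and hyperparameters here are all illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical validation window: truth plus three noisy predictions
# standing in for the TCN, ELM, and GRU base learners.
y_true = np.sin(np.linspace(0, 4 * np.pi, 100))
preds = np.stack([y_true + 0.05 * rng.standard_normal(100),
                  y_true + 0.20 * rng.standard_normal(100),
                  y_true + 0.10 * rng.standard_normal(100)])

# Actions: candidate weight vectors on a coarse simplex grid (step 0.1).
grid = [np.array([i, j, 10 - i - j]) / 10.0
        for i in range(11) for j in range(11 - i)]

Q = np.zeros(len(grid))       # single-state Q-table, one entry per action
alpha, eps = 0.1, 0.2         # learning rate, exploration rate
for _ in range(2000):
    a = rng.integers(len(grid)) if rng.random() < eps else int(np.argmax(Q))
    reward = -np.mean(np.abs(grid[a] @ preds - y_true))  # negative MAE
    Q[a] += alpha * (reward - Q[a])                      # trial-and-error update

best_w = grid[int(np.argmax(Q))]
print(best_w)
```

The epsilon-greedy rule balances trying new weight combinations against exploiting the best one found so far; over many episodes the Q-table converges toward the weights with the smallest validation error.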
4. Conclusions
Load forecasting is crucial for maintaining the stable operation of the power grid. This paper proposes an outlier correction, decomposition, and ensemble reinforcement learning model for load power prediction. The HI-CEEMDAN-Q-TEG model uses the HI outlier correction method to detect and correct outliers. The CEEMDAN decomposition method is employed to break down raw load power data into various subseries to reduce volatility. Furthermore, the commonly used reinforcement learning method Q-learning is utilized to generate optimal weights by combining the forecasting results of three single models: TCN, ELM, and GRU. Based on the aforementioned experiments, some conclusions can be drawn as follows:
The utilization of HI significantly improves prediction accuracy. HI detects and eliminates outliers in the original data, reducing their interference in model training, improving its data fitting ability, and ultimately enhancing its forecasting performance;
Using TCN, ELM, and GRU as the base learners confers significant advantages, and the ensemble model employing the Q-learning method yields superior forecasting performance compared to the individual base learners. As a reinforcement learning method, Q-learning optimizes the weights of the base learners via trial and error within the given environment;
Out of the four decomposition algorithms examined in this study, CEEMDAN exhibited superior forecasting performance. Unlike the other algorithms, CEEMDAN effectively handles non-stationary data and mitigates the impact of unsteady components on forecasting results;
The load power prediction model proposed in this study incorporates several techniques to enhance its accuracy. Firstly, it leverages the use of HI to correct any outliers. Next, it combines the strengths of various intelligent models by employing ensemble reinforcement learning. Additionally, CEEMDAN is adopted to further enhance the prediction results, resulting in exceptional load power prediction performance.
However, there are some limitations to the proposed model in this paper: (a) as a short-term forecasting model, the proposed model is designed to capture immediate changes and it may not be able to capture longer-term trends that develop over weeks, months, or years; and (b) the proposed model is relatively time-consuming when using the CEEMDAN decomposition algorithm. Thus, we intend to construct a parallel computing framework to support the proposed method in future work.