Article

Short-Term Load Forecasting Based on Outlier Correction, Decomposition, and Ensemble Reinforcement Learning

Institute of Artificial Intelligence & Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China
*
Author to whom correspondence should be addressed.
Energies 2023, 16(11), 4401; https://doi.org/10.3390/en16114401
Submission received: 8 May 2023 / Revised: 26 May 2023 / Accepted: 28 May 2023 / Published: 30 May 2023
(This article belongs to the Topic Short-Term Load Forecasting)

Abstract

Short-term load forecasting is critical to ensuring the safe and stable operation of the power system. To this end, this study proposes a load power prediction model that utilizes outlier correction, decomposition, and ensemble reinforcement learning. The novelty of this study is as follows: firstly, the Hampel identifier (HI) is employed to correct outliers in the original data; secondly, the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is used to fully extract the waveform characteristics of the data; and, finally, the temporal convolutional network, extreme learning machine, and gated recurrent unit are selected as the basic learners for forecasting load power data. An ensemble reinforcement learning algorithm based on Q-learning is adopted to generate optimal ensemble weights, and the predictive results of the three basic learners are combined. The experimental results of the models for three real load power datasets show that: (a) the utilization of HI improves the model’s forecasting result; (b) CEEMDAN is superior to other decomposition algorithms in forecasting performance; and (c) the proposed ensemble method, based on the Q-learning algorithm, outperforms three single models in accuracy, and achieves smaller prediction errors.

1. Introduction

Electric load forecasting is an important aspect of modern power system management and a key research focus of power companies [1]. It comprises long-term, medium-term, and short-term forecasting, depending on the specific goals [2]. Notably, short-term load forecasting plays an important role in power generation planning and enables relevant departments to establish appropriate power dispatching plans [3,4], which is crucial for maintaining the safe and stable operation of the power system and enhancing its social benefits [5]. In addition, it facilitates the growth of the power market and boosts economic benefits [6]. Therefore, devising an effective and precise method for short-term load forecasting is of significant importance.
With the need for accurate energy forecasting in mind, various forecasting methods have been developed. Early studies produced several models for short-term power load forecasting, including the Auto-Regressive (AR), Auto-Regressive Moving Average (ARMA), and Auto-Regression Integrated Moving Average (ARIMA) models. A case in point is the work of Chen et al. [7], who employed the ARMA model for short-term power load forecasting. This method utilizes observed data as the initial input, and its fast algorithm produces predicted load values that are in line with the trend in load variation. However, it falls short in terms of accounting for the factors that affect such variation, thus leaving room for enhancement in prediction accuracy.
In recent years, scholars have turned to machine learning [8] and deep learning [9] to improve electric load forecasting accuracy and uncover complex data patterns. Among traditional machine learning algorithms, Support Vector Machine (SVM) [10] is the most widely used in the field of electric load forecasting. Its advantages include the need for relatively few training samples and interpretable features. Hong [11] and Fan et al. [12] have demonstrated the high accuracy of SVM in short-term electric load forecasting. However, as the smart grid continues to develop, power load data have become increasingly numerous and multifaceted, and SVM is confronted with the challenge of slow computing in such situations. Compared to traditional machine learning methods, deep learning methods exhibit stronger fitting capacity and produce better results. Currently, a diverse set of deep learning approaches have been implemented for load forecasting, including the Gated Recurrent Unit (GRU) [13], Temporal Convolutional Network (TCN) [14], Long-Short-Term Memory (LSTM) [15], as well as other deep learning methods [9,16]. Compared to traditional Recurrent Neural Networks (RNN) and LSTM, GRU presents better forecasting results and faster running speed in short-term load forecasting. Wang et al. [17] used the GRU algorithm to extract and learn the time characteristics of load consumption. Their results showed that the predictive accuracy improved by more than 10% compared to RNN. Cai [18] found the GRU uses fewer parameters in the model and the important features were preserved, resulting in faster running speeds compared to LSTM. Imani [19] utilized Convolutional Neural Network (CNN) to extract the nonlinear relationships of residential loads and achieved remarkably precise outcomes. Song et al. [20] devised a thermal load prediction model by utilizing TCN networks, which facilitated the extraction of complex data features and enabled precise load prediction.
Since single prediction models are insufficient in terms of applicability scenarios and prediction accuracy to achieve optimal results [21], a considerable amount of literature has employed hybrid models for prediction. Hybrid models combine data preprocessing, feature selection, optimization algorithms, decomposition algorithms, and other technologies to fully utilize the benefits of disparate methods and improve load power prediction accuracy. Research has revealed that the decomposition method and the ensemble learning method are particularly advantageous among the hybrid models [22].
According to frequency analysis, the electric load exhibits clear cyclical patterns that result from the underlying superposition of multiple components with varying frequencies [23]. Therefore, decomposing time series has become a widely employed method in the area of electric load forecasting. Sun [24] proposed a short-term load forecasting model utilizing Ensemble Empirical Mode Decomposition (EEMD) and neural networks, considering wind power grid connections, and verified better decomposition effects of EEMD than wavelet decomposition. Liu Hui et al. [25] utilized Variational Modal Decomposition (VMD) to decompose load sequences and developed a hybrid forecasting model for accurate prediction, achieving an accuracy of 99.15%. Irene et al. [26] employed a hybrid prediction model combining Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to enhance the accuracy of energy consumption prediction.
Ensemble learning combines multiple individual learners, which may be of the same or different types and may be trained on differently distributed data, to improve predictive performance [27]. Popular ensemble learning algorithms include boosting, bagging, and stacking algorithms. Ensemble learning methods are commonly implemented with stacking-based or weight-based strategies [28]. Rho et al. [29] used a stacking ensemble approach to merge short-term load forecasting models to more accurately predict building electric energy consumption. Massaoudi et al. [30] proposed a stacked XGB-LGBM-MLP model to cope with the stochastic variations in load demand. Bento et al. [31] presented an automatic framework using a deep learning-based stacking methodology to select the best Box–Jenkins models for 24-h ahead load forecasting from a wide range of combinations.
Although the above load power prediction models achieve a satisfactory forecasting effect, some limitations persist, and there is still some room for improvement. Firstly, the current short-term load forecasting models seldom consider detecting and correcting outliers in the original data. Studies have demonstrated that adopting outlier correction can significantly improve the performance of pollution forecasting [32]. Secondly, the existing combination weights of load power ensemble prediction models lack diversity and should take into account different weight distribution strategies for the prediction results generated by different base learners. The literature shows that weight ensemble based on reinforcement learning can offer advantages in wind speed prediction [33,34].
To address the aforementioned research gaps, this paper presents a short-term load forecasting model (HI-CEEMDAN-Q-TEG) based on outlier correction, decomposition, and ensemble reinforcement learning. The contributions and novelty of this paper are summarized as follows:
  • This paper employs an outlier detection method to correct outliers in the original load power data. Such outliers may arise due to human error or other situations. Directly inputting the original data into the model without processing could lead to problems. To address this and identify and correct outliers in the data, this paper utilizes the Hampel identifier (HI) algorithm. This step is crucial as it provides the nonlinear information in the data to the forecasting model;
  • This paper utilizes a decomposition method to fully extract the waveform characteristics of the data. Specifically, the CEEMDAN method is utilized in this study to decompose the raw non-stationary load power data. By decomposing the load power data into multiple sub-sequences through CEEMDAN, the waveform characteristics of the data can be extracted thoroughly, ultimately enhancing the performance of the predictor;
  • This paper introduces an ensemble learning algorithm based on reinforcement learning. It is necessary to consider varying weights when combining preliminary predictions from different base learners. This study employs three single models to predict processed load power data, followed by the utilization of the Q-learning method to obtain ensemble weights that are suitable for the ensemble forecast. Unlike other ensemble learning algorithms, the Q-learning method deploys an agent that learns in the environment through trial and error, which allows the weights to be adapted to the prediction results of the base learners.

2. Methodology

2.1. Framework of the Proposed Model

This study presents a novel forecasting model, namely the HI-CEEMDAN-Q-TEG, for predicting load power. The model framework, as depicted in Figure 1, consists of three distinct steps with specific details as follows:
Step 1: Using HI to detect and correct outliers. The original load power data is characterized by fluctuations, randomness, and nonlinearity; therefore, outliers can arise as a result of either equipment or human factors. By using HI, outliers can be identified and corrected in the training set, which eliminates the likelihood of their interference with model training. This approach serves as a valuable tool for enhancing the precision of load power prediction;
Step 2: Applying CEEMDAN to decompose original data into subseries. Given its prominent cyclical characteristics, the load power data can be perceived, from a frequency domain perspective, as a composite of several components with varying frequencies. The CEEMDAN method can adaptively decompose this data into multiple subseries, thereby reducing the non-stationarity of the data presented to the model and enhancing the predictor’s modeling efficiency and capacity;
Step 3: Using the Q-learning ensemble method for prediction. The load power data prediction is achieved by employing three base learners: the temporal convolutional network (TCN), gated recurrent unit (GRU), and extreme learning machine (ELM), which are referred to collectively as TEG. After correcting for outliers, the TEG learners are used to make preliminary predictions. Ensemble weights for the different single models are determined using the Q-learning method. This algorithm updates the weights repeatedly through trial-and-error learning, thereby optimizing the diversity and appropriateness of the ensemble weights.

2.2. Hampel Identifier

HI is a widely used method for detecting and correcting outliers [35]. Due to its excellent effectiveness, many researchers employ this method. To apply the HI algorithm to input data $A = [a_1, a_2, \ldots, a_k]$, set the sliding window length as $w = 2n + 1$. For each sample $a_i$, obtain the median $m_i$, as well as the median absolute deviation (MAD), from the window of $n$ samples on each side of the center point. Set the evaluation parameter as $\alpha = 0.6745$, and calculate the standard deviation $\sigma_i$ using MAD and $\alpha$ [36]. The formulas for calculating $m_i$, MAD, and $\sigma_i$ are as follows [32]:
$$m_i = \operatorname{median}(a_{i-n}, a_{i-n+1}, \ldots, a_i, \ldots, a_{i+n-1}, a_{i+n})$$
$$\mathrm{MAD}_i = \operatorname{median}(|a_{i-n} - m_i|, |a_{i-n+1} - m_i|, \ldots, |a_{i+n-1} - m_i|, |a_{i+n} - m_i|)$$
$$\sigma_i = \mathrm{MAD}_i / \alpha$$
Based on the $3\sigma$ statistical rule, if the difference between a sample value and the window median exceeds three standard deviations, the window median will replace the sample data [37]:
$$|a_i - m_i| > 3\sigma_i$$
The use of HI allows for the outliers to be corrected in the raw data, which, if left untreated, could potentially disrupt the model training process. The incorporation of HI into data preprocessing leads to an enhanced nonlinear fitting performance of the data.
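The rule described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `hampel_correct`, the half-window `n=3`, the truncation of the window at the series edges, and the toy series are all assumptions made for the example.

```python
import numpy as np

def hampel_correct(a, n=3, alpha=0.6745, n_sigmas=3.0):
    """Replace samples that violate |a_i - m_i| > 3*sigma_i with the window median."""
    a = np.asarray(a, dtype=float)
    corrected = a.copy()
    outlier_idx = []
    for i in range(len(a)):
        # Window of up to 2n+1 samples centred on i; it is truncated at the
        # edges of the series (an assumption, the text does not cover edges).
        lo, hi = max(0, i - n), min(len(a), i + n + 1)
        window = a[lo:hi]
        m_i = np.median(window)                  # window median
        mad_i = np.median(np.abs(window - m_i))  # median absolute deviation
        sigma_i = mad_i / alpha                  # robust standard deviation estimate
        if abs(a[i] - m_i) > n_sigmas * sigma_i:
            corrected[i] = m_i
            outlier_idx.append(i)
    return corrected, outlier_idx

# Toy series (hypothetical values) with one obvious spike at index 4.
series = [1.0, 1.1, 0.9, 1.0, 8.0, 1.05, 0.95, 1.0, 1.1]
cleaned, idx = hampel_correct(series, n=3)
print(idx, cleaned[4])
```

Note that when a window is perfectly flat the MAD is zero, so any deviation at all is flagged; a practical implementation may want a small floor on `sigma_i`.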

2.3. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise

CEEMDAN is a decomposition algorithm used to analyze time series data for nonlinearity and non-stationarity [38]. By smoothing the overall data and extracting information about multiple frequencies from the original data, CEEMDAN can decompose the data into sub-sequences with varying frequency and time information. The CEEMDAN algorithm is adaptive, meaning it can automatically select the appropriate noise level based on the unique characteristics of a given signal. This adaptability and robustness make the CEEMDAN algorithm ideal for processing nonlinear and non-stationary signals [39].
Based on the EMD algorithm, the CEEMDAN algorithm makes the signal more stable and accurate in the decomposition process by introducing a noise signal. Meanwhile, it adopts multiple decompositions and average methods to improve the accuracy and stability of signal decomposition [40].
The CEEMDAN algorithm has the advantage of solving mutual interference and noise interference problems between intrinsic mode functions (IMFs). This leads to improved accuracy and stability of signal decomposition.
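For reference, the procedure can be written compactly. The notation below is an assumption introduced for this sketch and follows the commonly cited formulation of CEEMDAN rather than anything stated in this section: $E_k(\cdot)$ denotes the operator that returns the $k$-th mode obtained by EMD, $w^{(i)}$ is the $i$-th of $I$ white-noise realizations, and $\varepsilon_k$ is the noise amplitude at stage $k$.

```latex
% First mode: average the first EMD mode over I noisy copies of the signal x
\mathrm{IMF}_1 = \frac{1}{I}\sum_{i=1}^{I} E_1\!\left(x + \varepsilon_0\, w^{(i)}\right),
\qquad r_1 = x - \mathrm{IMF}_1

% Later modes: add the k-th EMD mode of the noise to the current residue
\mathrm{IMF}_{k+1} = \frac{1}{I}\sum_{i=1}^{I} E_1\!\left(r_k + \varepsilon_k\, E_k\!\left(w^{(i)}\right)\right),
\qquad r_{k+1} = r_k - \mathrm{IMF}_{k+1}

% Completeness: the modes and the final residue sum back to the signal
x = \sum_{k=1}^{K} \mathrm{IMF}_k + r_K
```

The last identity is what allows each subseries to be forecast separately and the forecasts to be summed into a prediction of the original load series.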

2.4. Base Learners

2.4.1. Temporal Convolutional Network

TCN is a convolutional network commonly used in time series prediction [41]. Because of the causal relationship between load data over time, the prediction at time t depends on previous times, and the TCN network effectively maintains this temporal order and causality. TCN consists of three parts: causal convolution, dilated convolution, and residual connections.
In TCN, causal convolution ensures that the output of the upper layers of the network at time t depends only on the inputs of the lower layers at time t and earlier. Dilated convolution introduces a dilation factor, a hyperparameter that sets the spacing between the input samples covered by the kernel. To ease the passing of information through the stacked nonlinear transformations, TCN adds direct (residual) channels to the network structure, allowing the input information to be transmitted directly to later layers.
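A minimal NumPy sketch of these building blocks may help fix the ideas. It is illustrative only: the function names, the kernel values, and the toy input are assumptions, and a real TCN would learn the kernels and stack many such layers.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_j kernel[j] * x[t - j*dilation], with zeros before t = 0."""
    x = np.asarray(x, dtype=float)
    k = len(kernel)
    pad = (k - 1) * dilation
    # Left-pad with zeros so that no output ever looks at a future sample.
    x_padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for t in range(len(x)):
        for j in range(k):
            y[t] += kernel[j] * x_padded[pad + t - j * dilation]
    return y

def residual_block(x, kernel, dilation):
    """Dilated causal convolution + ReLU, added back to the input via a direct path."""
    x = np.asarray(x, dtype=float)
    return x + np.maximum(causal_dilated_conv1d(x, kernel, dilation), 0.0)

x = np.arange(1.0, 9.0)              # toy input 1, 2, ..., 8
kernel = np.array([0.5, 0.3, 0.2])   # kernel size 3 (values are arbitrary)
y = causal_dilated_conv1d(x, kernel, dilation=2)

# Causality check: changing a later input leaves earlier outputs untouched.
x_changed = x.copy()
x_changed[5] = 100.0
y_changed = causal_dilated_conv1d(x_changed, kernel, dilation=2)
print(np.allclose(y[:5], y_changed[:5]))
```

With kernel size 3 and dilation 2, each output covers the inputs at t, t-2, and t-4, so stacking layers with growing dilation widens the receptive field without adding parameters per layer.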

2.4.2. Extreme Learning Machine

ELM is an efficient artificial neural network whose principle is based on fully random projections and the least squares method [42]. Fully random projection refers to the projection of input data into a high-dimensional space. This increases the separability of data in the feature space [43]. Through random initialization of the weights of the input and hidden layers, the ELM algorithm can minimize training errors very quickly, facilitating rapid learning and prediction.
ELM can be expressed mathematically as follows [32]:
$$y_i = \beta\, g(W x_i + b)$$
where $\beta$ represents the output weight matrix, $g(x)$ represents the activation function, $W$ represents the input weight matrix, and $b$ represents the vector of bias.
With $H$ representing the output matrix and $Y$ representing the true value matrix, the matrix expression for Extreme Learning Machine (ELM) is as follows:
$$H\beta = Y$$
where $H$ is a matrix whose rows represent the output of the hidden layer for each input sample, and $\beta$ is a matrix of output weights.
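The two equations above translate almost directly into code. The sketch below is an illustration under stated assumptions, not the configuration used in the experiments: the `tanh` activation, the 50 hidden units, the Gaussian random weights, and the sine-wave toy data are all choices made for the example. The output weights are obtained as the least-squares solution of $H\beta = Y$ via the Moore-Penrose pseudoinverse.

```python
import numpy as np

def elm_fit(X, Y, n_hidden=50, seed=0):
    """Random hidden layer, then solve H @ beta = Y in the least-squares sense."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random biases, never trained
    H = np.tanh(X @ W + b)                       # one row of H per input sample
    beta = np.linalg.pinv(H) @ Y                 # output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression problem (hypothetical data): one period of a sine wave.
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
Y = np.sin(np.pi * X)
W, b, beta = elm_fit(X, Y)
mse = float(np.mean((elm_predict(X, W, b, beta) - Y) ** 2))
print(mse)
```

Because only `beta` is fitted, and in closed form, training requires no iterative gradient descent, which is where the speed of ELM comes from.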

2.4.3. Gated Recurrent Unit

In 2014, Cho proposed the Gated Recurrent Unit (GRU) as an improvement on Long-Short-Term Memory (LSTM) [44]. The GRU has two gates, the reset gate and the update gate, which, respectively, determine whether to add historical information to the current state and the relevance of historical information. Compared to the LSTM, the GRU uses fewer parameters in the model and the important features are preserved, resulting in faster running speeds.
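The formulas for the update gate and the reset gate are as follows:
$$z_t = \sigma(W_z [h_{t-1}, x_t])$$
$$r_t = \sigma(W_r [h_{t-1}, x_t])$$
where $z_t$ and $r_t$ represent the update gate and the reset gate, respectively; $x_t$ represents the current input value; $h_{t-1}$ represents the previous hidden state; $W_z$ and $W_r$ represent the weight matrices of the two gates; and $\sigma$ is the sigmoid activation function.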
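The gate computations can be sketched as a single GRU step in NumPy. In this sketch the update gate is written `z_t` to keep it distinct from the input `x_t`. The candidate state and the final blend are not given in this section; they follow one common convention for the GRU and are included only so that the step is complete. All names, sizes, and random weights are assumptions for illustration, and bias terms are omitted.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(h_prev, x_t, W_z, W_r, W_h):
    """One GRU time step on the concatenated vector [h_{t-1}, x_t]."""
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ concat)   # update gate
    r_t = sigmoid(W_r @ concat)   # reset gate
    # Candidate state and blend: a common GRU convention, assumed here
    # because the text above only gives the two gate equations.
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde
    return h_t, z_t, r_t

hidden, inputs = 4, 3   # toy sizes
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.normal(size=(hidden, hidden + inputs)) for _ in range(3))
h_prev = rng.normal(size=hidden)
x_t = rng.normal(size=inputs)
h_t, z_t, r_t = gru_step(h_prev, x_t, W_z, W_r, W_h)
print(h_t.shape, z_t.min() > 0.0, z_t.max() < 1.0)
```

The GRU needs three weight matrices where an LSTM needs four, which is the source of the smaller parameter count mentioned above.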

2.5. Ensemble Reinforcement Learning Method

As a distinct machine learning method, reinforcement learning is different from supervised learning or unsupervised learning due to its continuous interactions with the environment as an agent, which guides subsequent actions by providing feedback on the reward received, aiming to maximize the rewards [45]. The Q-learning method is a reinforcement learning algorithm based on estimated values [46]. Q-learning generates a Q-value table that captures the relationship between each action taken and state. Each value in this table represents the obtained reward for actions taken in each state.
The Q-table approach selects the action with the highest potential reward and uses a penalty and reward mechanism to keep updating the Q-table until the optimal result is achieved. This happens when a specific condition is met, signifying that the algorithm has found the optimal action for each state [47]. In this study, we employ the Q-learning method to combine the forecasting outcomes of TCN, ELM, and GRU. As a result, different ensemble weights are generated for each base learner to effectively address the issue of weak robustness associated with a single weight as well as a single model.
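One way such a weighting scheme could be realised with tabular Q-learning is sketched below. This section does not specify the state, action, or reward design, so everything about the environment here is an assumption made for illustration: a state is a weight vector on a grid of step 0.1, an action moves 0.1 of weight from one base learner to another, and the reward is the resulting reduction in MAE. The exploration rate, the number of steps per episode, and the synthetic predictions are likewise hypothetical. Only the learning rate (0.95), the discount parameter (0.5), and the iteration count (50) are taken from Table A1.

```python
import numpy as np

def q_learning_weights(preds, y, episodes=50, steps=30, lr=0.95, gamma=0.5,
                       eps=0.2, step=0.1, seed=0):
    """Search for ensemble weights on a grid with epsilon-greedy tabular Q-learning."""
    rng = np.random.default_rng(seed)
    n = preds.shape[0]
    # Action (i, j): move one grid unit of weight from learner i to learner j.
    actions = [(i, j) for i in range(n) for j in range(n) if i != j]
    units = round(1 / step)  # weights are stored as integer units to avoid drift

    def mae(state):
        w = np.array(state) / units
        return float(np.mean(np.abs(w @ preds - y)))

    # Start from weights that are as equal as the grid allows.
    start = tuple(units // n + (1 if k < units % n else 0) for k in range(n))
    Q = {}
    best_state, best_err = start, mae(start)
    for _ in range(episodes):
        state = start
        for _ in range(steps):
            q_row = Q.setdefault(state, np.zeros(len(actions)))
            explore = rng.random() < eps
            a = int(rng.integers(len(actions))) if explore else int(np.argmax(q_row))
            i, j = actions[a]
            if state[i] == 0:
                next_state, reward = state, -1.0  # penalise an impossible move
            else:
                nxt = list(state)
                nxt[i] -= 1
                nxt[j] += 1
                next_state = tuple(nxt)
                reward = mae(state) - mae(next_state)  # positive if the error drops
            next_row = Q.setdefault(next_state, np.zeros(len(actions)))
            # Standard Q-learning update.
            q_row[a] += lr * (reward + gamma * next_row.max() - q_row[a])
            state = next_state
            err = mae(state)
            if err < best_err:
                best_state, best_err = state, err
    return np.array(best_state) / units, best_err

# Synthetic stand-ins for the three base learners' predictions (hypothetical data).
data_rng = np.random.default_rng(1)
y = np.sin(np.linspace(0, 6, 200))
preds = np.stack([y + data_rng.normal(0, s, y.size) for s in (0.05, 0.2, 0.4)])
weights, err = q_learning_weights(preds, y)
print(weights, err)
```

The function returns the best weight vector visited during learning, so the reported error can never exceed that of the near-equal starting weights.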

3. Case Study

3.1. Data Description

To verify the practicality of the proposed model, three sets of load power data from Pecan Street datasets were utilized in this study [48]. The Pecan Street datasets contain the load power data of 25 households in the Austin area of the United States, recorded at a sampling interval of 15 min in 2018. Figure 2 showcases the load power datasets #1, #2, and #3 collected from the 1st to the 15th of each month in January, April, and September, respectively, in the Austin area. Each dataset comprises 1440 samples, divided into two parts: 1240 training set samples and 200 test set samples. The training sets are utilized to train the single models and the Q-learning ensemble method, while the testing set is utilized to evaluate the performance of all the models discussed in this paper.
Table 1 lists the statistical characteristics of three load power datasets. As observed from Figure 2 and Table 1, these three sets of load power data possess distinct statistical characteristics; however, they all exhibit non-stationarity and volatility.

3.2. Performance Evaluation Indexes

To provide a comprehensive evaluation of the forecasting performance of the models, three statistical indexes are employed in this study: mean absolute error (MAE); root mean square error (RMSE); and mean absolute percentage error (MAPE). The smaller the values of these indexes, the higher the model’s prediction accuracy. The definitions of these indexes are shown as follows:
$$\mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T} \left| y(t) - \hat{y}(t) \right|,$$
$$\mathrm{MAPE} = \frac{1}{T}\sum_{t=1}^{T} \left| \frac{y(t) - \hat{y}(t)}{y(t)} \right|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T} \left[ y(t) - \hat{y}(t) \right]^2},$$
where $y(t)$ is the original load power data at time $t$, $\hat{y}(t)$ is the forecasted load power data at time $t$, and $T$ is the number of samples in $y(t)$.
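As a quick illustration, the three indexes can be computed as follows. The toy values are hypothetical, and MAPE is returned as a fraction, matching the definition above; multiply by 100 to report it as a percentage.

```python
import numpy as np

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def mape(y, y_hat):
    # Undefined when any y(t) is zero; the caller must guard against that.
    return float(np.mean(np.abs((y - y_hat) / y)))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

y = np.array([2.0, 4.0, 5.0])      # observed values (toy data)
y_hat = np.array([1.0, 5.0, 5.0])  # forecasts (toy data)
print(mae(y, y_hat), mape(y, y_hat), rmse(y, y_hat))
```

For this example the absolute errors are 1, 1, and 0, giving MAE = 2/3, MAPE = (0.5 + 0.25 + 0)/3 = 0.25, and RMSE = sqrt(2/3).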

3.3. Forecasting Results and Analysis

The experiments aimed to compare the proposed hybrid HI-CEEMDAN-Q-TEG model with other relevant models. The main experimental parameters of our hybrid model are given in Appendix A. The experiments were divided into three parts:
In Part I, the models with HI were compared to those without HI to demonstrate the potential efficacy of HI and the performance improvements attainable by using HI in load power forecasting;
Part II compared four commonly used intelligent models running with HI (namely, HI-TCN, HI-ELM, HI-GRU, and HI-BPNN) to demonstrate the superiority of HI-TCN, HI-ELM, and HI-GRU in different datasets. Furthermore, HI-Q-TEG was compared with HI-TCN, HI-ELM, and HI-GRU to demonstrate the effectiveness of the Q-Learning ensemble method;
Part III aimed to verify the advantages of the decomposition method by comparing the results of the HI-Q-TEG method with those of the HI-CEEMDAN-Q-TEG method, which adds the CEEMDAN decomposition algorithm. In addition, different decomposition algorithms were compared to show the superiority of the CEEMDAN decomposition algorithm adopted in this study.

3.3.1. Experimental Results of Part I

In this part, we investigate the impact of employing HI in load power forecasting. Figure 3 depicts the outlier points and the dissimilarity between the original power load data and the data after HI. Table 2 displays the sample entropy (SampEn) values for both the original load power data and the data post HI application. To further investigate the potential gains from HI, the accuracy of HI-based models is compared to that of models sans HI, and we present the percentage enhancements in all three performance evaluation indices in Table 3.
Based on the results presented in Figure 3 and Table 2 and Table 3, this study draws the following conclusions:
  • The application of HI leads to the identification and correction of outlier points, which improves the overall quality of the dataset. Figure 3 depicts the presence of outlier points in the original power load data, which can interfere with model training and negatively impact forecasting accuracy;
  • The HI model effectively reduces the complexity of the original data, as evidenced by a lowered value of SampEn. SampEn is a statistical measure that quantifies the complexity of a time series. A lower value of SampEn indicates a higher degree of self-similarity in the sequence, whereas a higher value implies greater complexity. Table 2 indicates that for all three datasets, the values of SampEn were lower in the data processed with the HI model compared to the original load power data;
  • The HI model improves forecasting accuracy compared to models without the HI model. The comparative analysis of HI-CEEMDAN-Q-TEG with CEEMDAN-Q-TEG shows an improvement in MAPE accuracy by 2.6104%, 3.7628%, and 3.2095%, respectively, for datasets #1, #2, and #3, as listed in Table 3. The improvement is due to the correction of outliers. The findings demonstrate that the implementation of the HI model reduces the load power prediction error in all three series.
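The improvement percentages reported in the tables are not defined in this section. A common convention, assumed here rather than taken from the text, is the relative reduction of an error index when moving from a baseline model to the proposed model:

```latex
P_{\mathrm{index}} =
  \frac{\mathrm{index}_{\mathrm{baseline}} - \mathrm{index}_{\mathrm{proposed}}}
       {\mathrm{index}_{\mathrm{baseline}}} \times 100\%,
\qquad \mathrm{index} \in \{\mathrm{MAE},\ \mathrm{MAPE},\ \mathrm{RMSE}\}
```

Under this convention a positive value means the proposed model has the smaller error, and a value of 2.6104% for MAPE would mean that the MAPE of the proposed model is 2.6104% lower than that of the baseline.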

3.3.2. Experimental Results of Part II

This part of the experiment compares four commonly utilized single intelligent models (HI-TCN, HI-ELM, HI-GRU, and HI-BPNN) with the HI-Q-TEG method. The MAE values for the four single intelligent models across three datasets are displayed in Figure 4, while Table 4 presents the performance evaluation indexes for all four models. In addition, Figure 5, Figure 6 and Figure 7 provide the forecasting results and errors of HI-Q-TEG, HI-TCN, HI-ELM, and HI-GRU across the three datasets. The effectiveness of the Q-Learning ensemble method is presented in Table 5, which highlights the improvement percentages of each method. Notably, the bolded data within the table represents the model evaluation results that resulted in the lowest forecasting error for the respective dataset.
The findings from Figure 4, Figure 5, Figure 6 and Figure 7 and Table 4 and Table 5 support the following conclusions:
  • The prediction performance of the same single models varied across different datasets due to varying volatility and nonlinearity, as evidenced by the differing precision orders for the same dataset across different performance evaluation indexes. However, overall, HI-TCN, HI-ELM, and HI-GRU exhibited the best prediction accuracy across three different datasets, respectively, with HI-TCN producing the most accurate predictions for Dataset #1, HI-ELM for Dataset #2, and HI-GRU for Dataset #3. Thus, incorporating the three mentioned single models as base learners for the ensemble method is recommended;
  • The Q-Learning ensemble algorithm yielded improved forecasting accuracy for load power compared to single intelligent models. Table 5 highlights that comparing HI-Q-TEG with HI-TCN, the MAPE improvement percentages for Dataset #1, Dataset #2, and Dataset #3 are 8.8436%, 5.7540%, and 12.8483%, respectively. Additionally, Figure 4, Figure 5 and Figure 6 display how the Q-Learning ensemble method effectively combines the strengths of various intelligent models and mitigates the negative effect of performance deficiencies in a single model on forecasting accuracy.

3.3.3. Experimental Results of Part III

This part of the experiment compares four decomposition algorithms (WPD, EMD, EEMD, and CEEMDAN) by showcasing their improvement percentages of three performance evaluation indexes for different datasets in Table 6. Additionally, Figure 8, Figure 9 and Figure 10 depict scatter diagram comparisons between the HI-CEEMDAN-Q-TEG method and other decomposition models. The closer the scatter plot points are to the diagonal line, the better the prediction effect of the corresponding model.
From Table 6 and Figure 8, Figure 9 and Figure 10, the following conclusions could be drawn:
  • When comparing models that utilize decomposition algorithms to those that do not, consistent improvements can be observed. For instance, comparing HI-CEEMDAN-Q-TEG with HI-Q-TEG, the RMSE is reduced by 43.74%, 38.65%, and 33.09% for datasets #1, #2, and #3, respectively. The use of decomposition algorithms breaks down raw load power data into several frequency components, which, in turn, makes the patterns in the data easier for the models to recognize;
  • The proposed decomposition model that is based on the CEEMDAN algorithm provides better forecasting outcomes than other decomposition algorithms. For Dataset #2, the improvement percentage of MAE for HI-WPD-Q-TEG, HI-EMD-Q-TEG, HI-EEMD-Q-TEG, and HI-CEEMDAN-Q-TEG is 25.8923%, 20.9483%, 19.9478%, and 38.3934%, respectively. The CEEMDAN algorithm is highly effective at decomposing both high and low-frequency data, allowing for better handling of the high volatility of raw data. This results in optimal performance for forecasting.

4. Conclusions

Load forecasting is crucial for maintaining the stable operation of the power grid. This paper proposes an outlier correction, decomposition, and ensemble reinforcement learning model for load power prediction. The HI-CEEMDAN-Q-TEG model uses the HI outlier correction method to eliminate outliers. The CEEMDAN decomposition method is employed to break down raw load power data into various subseries to reduce volatility. Furthermore, the commonly used reinforcement learning method Q-learning is utilized to generate optimal weights for combining the forecasting results of three single models: TCN, ELM, and GRU. Based on the aforementioned experiments, the following conclusions can be drawn:
  • The utilization of HI significantly improves prediction accuracy. HI detects and eliminates outliers in the original data, reducing their interference in model training, improving its data fitting ability, and ultimately enhancing its forecasting performance;
  • Using TCN, ELM, and GRU as the base learners confers significant advantages, and the ensemble model employing the Q-learning method yields superior forecasting performance compared to individual base learners. As a type of reinforcement learning method, Q-learning optimizes the weights of base learners via trial and error within the given environment;
  • Out of the four decomposition algorithms examined in this study, CEEMDAN exhibited superior forecasting performance. Unlike the other algorithms, CEEMDAN effectively handles non-stationary data and mitigates the impact of unsteady components on forecasting results;
  • The load power prediction model proposed in this study incorporates several techniques to enhance its accuracy. Firstly, it leverages the use of HI to correct any outliers. Next, it combines the strengths of various intelligent models by employing ensemble reinforcement learning. Additionally, CEEMDAN is adopted to further enhance the prediction results, resulting in exceptional load power prediction performance.
However, there are some limitations to the proposed model in this paper: (a) as a short-term forecasting model, the proposed model is designed to capture immediate changes and it may not be able to capture longer-term trends that develop over weeks, months, or years; and (b) the proposed model is relatively time-consuming when using the CEEMDAN decomposition algorithm. Thus, we intend to construct a parallel computing framework to support the proposed method in future work.

Author Contributions

Conceptualization, J.W., H.L. and G.Z.; methodology, J.W.; software, J.W.; validation, J.W., G.Z. and Y.L.; formal analysis, J.W.; investigation, J.W.; resources, J.W.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, H.L. and S.Y.; visualization, J.W.; supervision, H.L.; project administration, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclatures

HI: Hampel identifier
CEEMDAN: complete ensemble empirical mode decomposition with adaptive noise
AR: auto regressive
ARMA: auto-regressive moving average
ARIMA: auto-regression integrated moving average
SVM: support vector machine
GRU: gated recurrent unit
TCN: temporal convolutional network
LSTM: long-short-term memory
RNN: recurrent neural networks
CNN: convolutional neural network
EEMD: ensemble empirical mode decomposition
VMD: variational modal decomposition
WPD: wavelet packet decomposition
ELM: extreme learning machine
MAD: median absolute deviation
IMFs: intrinsic mode functions
MAE: mean absolute error
MAPE: mean absolute percentage error
RMSE: root mean square error
Q: Q-Learning algorithm
TEG: TCN, ELM and GRU

Appendix A

The main experimental parameters of our hybrid model are given in Table A1.
Table A1. The main experimental parameters.
Model         Name of Parameter       Value
GRU           Size of input units     10
GRU           Size of hidden units    100
GRU           Size of output units    1
GRU           Learning rate           0.01
TCN           Size of kernel          3
TCN           Skip connection         False
TCN           Batchnorm               False
Q-learning    Maximum iteration       50
Q-learning    Learning rate           0.95
Q-learning    Discount parameter      0.5

References

  1. Vanting, N.B.; Ma, Z.; Jørgensen, B.N. A Scoping Review of Deep Neural Networks for Electric Load Forecasting. Energy Inform. 2021, 4, 49.
  2. Gong, R.; Li, X. A Short-Term Load Forecasting Model Based on Crisscross Grey Wolf Optimizer and Dual-Stage Attention Mechanism. Energies 2023, 16, 2878.
  3. Zanib, N.; Batool, M.; Riaz, S.; Afzal, F.; Munawar, S.; Daqqa, I.; Saleem, N. Analysis and Power Quality Improvement in Hybrid Distributed Generation System with Utilization of Unified Power Quality Conditioner. Comput. Model. Eng. Sci. 2023, 134, 1105–1136.
  4. Wang, S.; Zhou, C.; Riaz, S.; Guo, X.; Zaman, H.; Mohammad, A.; Al-Ahmadi, A.A.; Alharbi, Y.M.; Ullah, N. Adaptive Fuzzy-Based Stability Control and Series Impedance Correction for the Grid-Tied Inverter. Math. Biosci. Eng. 2023, 20, 1599–1616.
  5. Li, L.; Guo, L.; Wang, J.; Peng, H. Short-Term Load Forecasting Based on Spiking Neural P Systems. Appl. Sci. 2023, 13, 792.
  6. Ran, P.; Dong, K.; Liu, X.; Wang, J. Short-Term Load Forecasting Based on CEEMDAN and Transformer. Electr. Power Syst. Res. 2023, 214, 108885.
  7. Chen, J.-F.; Wang, W.-M.; Huang, C.-M. Analysis of an Adaptive Time-Series Autoregressive Moving-Average (ARMA) Model for Short-Term Load Forecasting. Electr. Power Syst. Res. 1995, 34, 187–196.
  8. Yildiz, B.; Bilbao, J.I.; Sproul, A.B. A Review and Analysis of Regression and Machine Learning Models on Commercial Building Electricity Load Forecasting. Renew. Sustain. Energy Rev. 2017, 73, 1104–1122.
  9. Shi, H.; Xu, M.; Li, R. Deep Learning for Household Load Forecasting—A Novel Pooling Deep RNN. IEEE Trans. Smart Grid 2018, 9, 5271–5280.
  10. Chen, Y.; Tan, H. Short-Term Prediction of Electric Demand in Building Sector via Hybrid Support Vector Regression. Appl. Energy 2017, 204, 1363–1374.
  11. Hong, W.-C. Electric Load Forecasting by Support Vector Model. Appl. Math. Model. 2009, 33, 2444–2454.
  12. Fan, S.; Chen, L.; Lee, W.-J. Machine Learning Based Switching Model for Electricity Load Forecasting. Energy Convers. Manag. 2008, 49, 1331–1344.
  13. Pan, C.; Tan, J.; Feng, D. Prediction Intervals Estimation of Solar Generation Based on Gated Recurrent Unit and Kernel Density Estimation. Neurocomputing 2021, 453, 552–562.
  14. Wang, Y.; Chen, J.; Chen, X.; Zeng, X.; Kong, Y.; Sun, S.; Guo, Y.; Liu, Y. Short-Term Load Forecasting for Industrial Customers Based on TCN-LightGBM. IEEE Trans. Power Syst. 2021, 36, 1984–1997.
  15. Kong, W.; Dong, Z.Y.; Jia, Y.; Hill, D.J.; Xu, Y.; Zhang, Y. Short-Term Residential Load Forecasting Based on LSTM Recurrent Neural Network. IEEE Trans. Smart Grid 2019, 10, 841–851.
  16. Huang, K.; Hallinan, K.P.; Lou, R.; Alanezi, A.; Alshatshati, S.; Sun, Q. Self-Learning Algorithm to Predict Indoor Temperature and Cooling Demand from Smart WiFi Thermostat in a Residential Building. Sustainability 2020, 12, 7110.
  17. Wang, Y.; Liu, M.; Bao, Z.; Zhang, S. Short-Term Load Forecasting with Multi-Source Data Using Gated Recurrent Unit Neural Networks. Energies 2018, 11, 1138.
  18. Cai, C.; Li, Y.; Su, Z.; Zhu, T.; He, Y. Short-Term Electrical Load Forecasting Based on VMD and GRU-TCN Hybrid Network. Appl. Sci. 2022, 12, 6647.
  19. Imani, M. Electrical Load-Temperature CNN for Residential Load Forecasting. Energy 2021, 227, 120480.
  20. Song, J.; Xue, G.; Pan, X.; Ma, Y.; Li, H. Hourly Heat Load Prediction Model Based on Temporal Convolutional Neural Network. IEEE Access 2020, 8, 16726–16741.
  21. Yue, W.; Liu, Q.; Ruan, Y.; Qian, F.; Meng, H. A Prediction Approach with Mode Decomposition-Recombination Technique for Short-Term Load Forecasting. Sustain. Cities Soc. 2022, 85, 104034.
  22. Li, K.; Huang, W.; Hu, G.; Li, J. Ultra-Short Term Power Load Forecasting Based on CEEMDAN-SE and LSTM Neural Network. Energy Build. 2023, 279, 112666.
  23. Habbak, H.; Mahmoud, M.; Metwally, K.; Fouda, M.M.; Ibrahem, M.I. Load Forecasting Techniques and Their Applications in Smart Grids. Energies 2023, 16, 1480.
  24. Sun, W.L. The Short-Term Load Forecasting Method Based on EEMD and ANN by Considering Grid-Connected Wind Power. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2013.
  25. Hui, L.; Houjun, L.; Yuwei, L.; Qixiao, Z. Power Load Forecasting Method Based on VMD and GWO-SVR. Mod. Electron. Technol. 2020, 43, 167–172.
  26. Karijadi, I.; Chou, S.-Y. A Hybrid RF-LSTM Based on CEEMDAN for Improving the Accuracy of Building Energy Consumption Prediction. Energy Build. 2022, 259, 111908.
  27. Wang, L.; Mao, S.; Wilamowski, B.M.; Nelms, R.M. Ensemble Learning for Load Forecasting. IEEE Trans. Green Commun. 2020, 4, 616–628.
  28. Wang, Y.; Chen, Q.; Sun, M.; Kang, C.; Xia, Q. An Ensemble Forecasting Method for the Aggregated Load With Subprofiles. IEEE Trans. Smart Grid 2018, 9, 3906–3908.
  29. Moon, J.; Jung, S.; Rew, J.; Rho, S.; Hwang, E. Combination of Short-Term Load Forecasting Models Based on a Stacking Ensemble Approach. Energy Build. 2020, 216, 109921.
  30. Massaoudi, M.; Refaat, S.S.; Chihi, I.; Trabelsi, M.; Oueslati, F.S.; Abu-Rub, H. A Novel Stacked Generalization Ensemble-Based Hybrid LGBM-XGB-MLP Model for Short-Term Load Forecasting. Energy 2021, 214, 118874.
  31. Bento, P.M.R.; Pombo, J.A.N.; Calado, M.R.A.; Mariano, S.J.P.S. Stacking Ensemble Methodology Using Deep Learning and ARIMA Models for Short-Term Load Forecasting. Energies 2021, 14, 7378.
  32. Liu, H.; Xu, Y.; Chen, C. Improved Pollution Forecasting Hybrid Algorithms Based on the Ensemble Method. Appl. Math. Model. 2019, 73, 473–486.
  33. Liu, H.; Yu, C.; Wu, H.; Duan, Z.; Yan, G. A New Hybrid Ensemble Deep Reinforcement Learning Model for Wind Speed Short Term Forecasting. Energy 2020, 202, 117794.
  34. Chen, C.; Liu, H. Dynamic Ensemble Wind Speed Prediction Model Based on Hybrid Deep Reinforcement Learning. Adv. Eng. Inf. 2021, 48, 101290.
  35. Liu, H.; Shah, S.; Jiang, W. On-Line Outlier Detection and Data Cleaning. Comput. Chem. Eng. 2004, 28, 1635–1647.
  36. Pearson, R.K. Outliers in Process Modeling and Identification. IEEE Trans. Control. Syst. Technol. 2002, 10, 55–63.
  37. Wang, J.; Du, P.; Hao, Y.; Ma, X.; Niu, T.; Yang, W. An Innovative Hybrid Model Based on Outlier Detection and Correction Algorithm and Heuristic Intelligent Optimization Algorithm for Daily Air Quality Index Forecasting. J. Environ. Manag. 2020, 255, 109855.
  38. Torres, M.E.; Colominas, M.A.; Schlotthauer, G.; Flandrin, P. A Complete Ensemble Empirical Mode Decomposition with Adaptive Noise. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 4144–4147.
  39. Chen, Z.; Jin, T.; Zheng, X.; Liu, Y.; Zhuang, Z.; Mohamed, M.A. An Innovative Method-Based CEEMDAN–IGWO–GRU Hybrid Algorithm for Short-Term Load Forecasting. Electr. Eng. 2022, 104, 3137–3156.
  40. Huang, S.; Zhang, J.; He, Y.; Fu, X.; Fan, L.; Yao, G.; Wen, Y. Short-Term Load Forecasting Based on the CEEMDAN-Sample Entropy-BPNN-Transformer. Energies 2022, 15, 3659.
  41. Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv 2018, arXiv:1803.01271.
  42. Zhu, Q.-Y.; Qin, A.K.; Suganthan, P.N.; Huang, G.-B. Evolutionary Extreme Learning Machine. Pattern Recognit. 2005, 38, 1759–1763.
  43. Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme Learning Machine: Theory and Applications. Neurocomputing 2006, 70, 489–501.
  44. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555.
  45. Wiering, M.A.; Van Otterlo, M. Reinforcement Learning. Adapt. Learn. Optim. 2012, 12, 729.
  46. Szepesvari, C. Algorithms for Reinforcement Learning: Synthesis Lectures on Artificial Intelligence and Machine Learning; Morgan & Claypool: San Rafael, CA, USA, 2010.
  47. Watkins, C.J.C.H.; Dayan, P. Q-Learning. Mach. Learn. 1992, 8, 279–292.
  48. Pecan Street Data. Available online: https://www.pecanstreet.org/dataport (accessed on 12 December 2022).
Figure 1. The framework of the proposed model.
Figure 2. Three original load power datasets.
Figure 3. The HI results of three load power datasets.
Figure 4. The MAE of different single intelligent models.
Figure 5. The forecasting results and errors for Dataset #1: (a) predicted results; (b) error distribution.
Figure 6. The forecasting results and errors for Dataset #2: (a) predicted results; (b) error distribution.
Figure 7. The forecasting results and errors for Dataset #3: (a) predicted results; (b) error distribution.
Figure 8. The prediction comparison results with other decomposition algorithms in Dataset #1.
Figure 9. The prediction comparison results with other decomposition algorithms in Dataset #2.
Figure 10. The prediction comparison results with other decomposition algorithms in Dataset #3.
Table 1. The statistical characteristics of three power consumption datasets [48].
| Dataset | Minimum (kW) | Maximum (kW) | Mean (kW) | Standard Deviation (kW) |
| --- | --- | --- | --- | --- |
| #1 | 133.1340 | 827.8200 | 373.6066 | 121.3303 |
| #2 | 161.3510 | 976.8730 | 351.2301 | 136.5601 |
| #3 | 200.5950 | 1653.8300 | 677.7043 | 288.2719 |
Table 2. The SampEn of load power data before HI and data after HI.
| SampEn | Dataset #1 | Dataset #2 | Dataset #3 |
| --- | --- | --- | --- |
| Data before HI | 6.2364 | 7.0211 | 6.1759 |
| Data after HI | 6.2275 | 7.0139 | 6.1654 |
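
For readers unfamiliar with the Hampel identifier (HI), the sketch below shows one common form of it: each point is compared with the median of a sliding window, and points that deviate by more than a chosen multiple of the scaled median absolute deviation (MAD) are replaced by that median. The window half-width, the threshold of three scaled MADs, and the sample series are assumptions for illustration; the settings used in the experiments are not stated in this part of the article.

```python
import statistics

def hampel_correct(series, half_window=3, n_sigmas=3.0):
    """Return a copy of `series` in which local outliers are replaced by the
    local median. A point is an outlier if it deviates from the window median
    by more than n_sigmas * 1.4826 * MAD."""
    scale = 1.4826  # makes the MAD comparable to a standard deviation for Gaussian data
    corrected = list(series)
    for i in range(len(series)):
        lo = max(0, i - half_window)
        hi = min(len(series), i + half_window + 1)
        window = series[lo:hi]  # always taken from the original, uncorrected data
        med = statistics.median(window)
        mad = statistics.median([abs(x - med) for x in window])
        # Note: if mad is 0, any point that differs from the median is flagged.
        if abs(series[i] - med) > n_sigmas * scale * mad:
            corrected[i] = med
    return corrected

# Hypothetical load values in kW with one obvious spike.
raw = [10.0, 11.0, 10.0, 12.0, 100.0, 11.0, 10.0, 12.0, 11.0]
cleaned = hampel_correct(raw)
print(cleaned)
```

Because the median and the MAD are both robust statistics, a single spike barely shifts the threshold, so the spike itself is corrected while its neighbours are left untouched.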
Table 3. The improvement by using HI.
| Dataset | Model | P_MAE (%) | P_MAPE (%) | P_RMSE (%) |
| --- | --- | --- | --- | --- |
| #1 | HI-TCN vs. TCN | 4.5966 | 4.0256 | 5.3476 |
| #1 | HI-ELM vs. ELM | 6.8284 | 7.0817 | 8.2709 |
| #1 | HI-GRU vs. GRU | 5.3423 | 4.3212 | 4.9438 |
| #1 | HI-Q-TEG vs. Q-TEG | 4.9299 | 5.0250 | 6.2904 |
| #1 | HI-CEEMDAN-Q-TEG vs. CEEMDAN-Q-TEG | 3.4634 | 2.6104 | 3.5044 |
| #2 | HI-TCN vs. TCN | 2.5698 | 3.9567 | 3.0261 |
| #2 | HI-ELM vs. ELM | 3.3538 | 2.5823 | 0.9395 |
| #2 | HI-GRU vs. GRU | 3.3243 | 4.9023 | 4.0294 |
| #2 | HI-Q-TEG vs. Q-TEG | 4.0959 | 3.9908 | 2.9216 |
| #2 | HI-CEEMDAN-Q-TEG vs. CEEMDAN-Q-TEG | 7.6677 | 3.7628 | 0.7874 |
| #3 | HI-TCN vs. TCN | 2.9335 | 4.8224 | 5.5556 |
| #3 | HI-ELM vs. ELM | 0.7656 | 3.9408 | 3.8763 |
| #3 | HI-GRU vs. GRU | 1.3948 | 2.7834 | 3.9056 |
| #3 | HI-Q-TEG vs. Q-TEG | 3.0153 | 4.4914 | 5.2185 |
| #3 | HI-CEEMDAN-Q-TEG vs. CEEMDAN-Q-TEG | 2.7840 | 3.2095 | 4.2586 |
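
The improvement percentages reported for MAE, MAPE, and RMSE are not defined in this part of the article. A common convention, assumed here, is the relative error reduction of the improved model with respect to the baseline. The function name and the numbers below are hypothetical and serve only to illustrate that convention.

```python
def improvement_percent(baseline_error, improved_error):
    """Relative reduction of an error metric, in percent.
    Positive values mean the improved model has the lower error."""
    return 100.0 * (baseline_error - improved_error) / baseline_error

# Hypothetical example: a baseline MAE of 20.0 kW reduced to 19.0 kW.
print(improvement_percent(20.0, 19.0))
```

Under this convention the same function applies unchanged to MAE, MAPE, and RMSE, since each is an error measure for which lower is better.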
Table 4. The evaluation of forecasting results for the single intelligent models ¹.
| Dataset | Model | MAE (kW) | MAPE (%) | RMSE (kW) |
| --- | --- | --- | --- | --- |
| #1 | HI-TCN | **20.3184** | **12.1070** | **33.2972** |
| #1 | HI-ELM | 23.4916 | 12.4653 | 35.7800 |
| #1 | HI-GRU | 22.9000 | 13.1992 | 34.5907 |
| #1 | HI-BPNN | 24.2484 | 15.5232 | 34.9866 |
| #2 | HI-TCN | 11.6604 | 9.5715 | 25.5669 |
| #2 | HI-ELM | **10.2753** | **8.4323** | **24.5534** |
| #2 | HI-GRU | 10.4699 | 8.7681 | 24.7721 |
| #2 | HI-BPNN | 12.9091 | 13.5991 | 27.6510 |
| #3 | HI-TCN | 31.1954 | 13.1659 | 47.4673 |
| #3 | HI-ELM | 30.5709 | 12.9044 | 48.3280 |
| #3 | HI-GRU | **29.4231** | **12.6587** | **47.1891** |
| #3 | HI-BPNN | 34.6515 | 13.7168 | 51.9016 |
¹ The values in bold represent the lowest forecasting error for each dataset.
Table 5. The improvement of the Q-Learning ensemble method.
| Dataset | Model | P_MAE (%) | P_MAPE (%) | P_RMSE (%) |
| --- | --- | --- | --- | --- |
| #1 | HI-Q-TEG vs. HI-TCN | 4.3723 | 8.8436 | 0.7334 |
| #1 | HI-Q-TEG vs. HI-ELM | 6.9519 | 11.4638 | 3.4204 |
| #1 | HI-Q-TEG vs. HI-GRU | 5.6687 | 9.5325 | 1.2771 |
| #2 | HI-Q-TEG vs. HI-TCN | 4.8393 | 5.7540 | 2.2847 |
| #2 | HI-Q-TEG vs. HI-ELM | 3.6675 | 4.3632 | 2.3143 |
| #2 | HI-Q-TEG vs. HI-GRU | 4.2632 | 5.7205 | 2.7620 |
| #3 | HI-Q-TEG vs. HI-TCN | 4.1633 | 12.8483 | 2.6601 |
| #3 | HI-Q-TEG vs. HI-ELM | 3.3152 | 11.0823 | 3.6086 |
| #3 | HI-Q-TEG vs. HI-GRU | 1.7167 | 9.3564 | 2.3495 |
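
The Q-TEG ensemble combines the forecasts of the three base learners (TCN, ELM, GRU) using learned weights. The sketch below illustrates only the combination step, as a weighted sum with weights normalized to sum to one. The normalization, the weight values, and the forecasts are assumptions for this example; in the paper the weights are produced by the Q-learning algorithm.

```python
def combine_forecasts(forecasts, weights):
    """Weighted sum of base-learner forecasts, one value per time step.
    `forecasts` is a list of equally long series, one per base learner."""
    if len(forecasts) != len(weights):
        raise ValueError("need exactly one weight per base learner")
    total = sum(weights)
    normalized = [w / total for w in weights]  # assumed: weights sum to one
    horizon = len(forecasts[0])
    return [
        sum(w * series[t] for w, series in zip(normalized, forecasts))
        for t in range(horizon)
    ]

# Hypothetical forecasts in kW from the three base learners.
tcn = [100.0, 200.0]
elm = [110.0, 190.0]
gru = [90.0, 210.0]
ensemble = combine_forecasts([tcn, elm, gru], [0.5, 0.25, 0.25])
print(ensemble)
```

Keeping the combination step separate from the weight search means the same function can be reused with weights from any optimizer, not only Q-learning.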
Table 6. The improvement percentages of different decomposition algorithms.
| Dataset | Model | P_MAE (%) | P_MAPE (%) | P_RMSE (%) |
| --- | --- | --- | --- | --- |
| #1 | HI-WPD-Q-TEG vs. HI-Q-TEG | 47.3353 | 65.2986 | 39.5611 |
| #1 | HI-EMD-Q-TEG vs. HI-Q-TEG | 26.5890 | 50.7157 | 18.6278 |
| #1 | HI-EEMD-Q-TEG vs. HI-Q-TEG | 35.4821 | 57.8237 | 25.1166 |
| #1 | HI-CEEMDAN-Q-TEG vs. HI-Q-TEG | 52.9294 | 68.9604 | 43.7355 |
| #2 | HI-WPD-Q-TEG vs. HI-Q-TEG | 25.8923 | 57.3398 | 30.6425 |
| #2 | HI-EMD-Q-TEG vs. HI-Q-TEG | 20.9483 | 55.8583 | 22.1706 |
| #2 | HI-EEMD-Q-TEG vs. HI-Q-TEG | 19.9478 | 53.8074 | 23.2814 |
| #2 | HI-CEEMDAN-Q-TEG vs. HI-Q-TEG | 38.3934 | 65.1822 | 38.6528 |
| #3 | HI-WPD-Q-TEG vs. HI-Q-TEG | 31.1852 | 30.2364 | 25.3324 |
| #3 | HI-EMD-Q-TEG vs. HI-Q-TEG | 26.2456 | 25.6166 | 24.6462 |
| #3 | HI-EEMD-Q-TEG vs. HI-Q-TEG | 34.5388 | 26.2974 | 30.2950 |
| #3 | HI-CEEMDAN-Q-TEG vs. HI-Q-TEG | 36.0141 | 32.7952 | 33.0906 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, J.; Liu, H.; Zheng, G.; Li, Y.; Yin, S. Short-Term Load Forecasting Based on Outlier Correction, Decomposition, and Ensemble Reinforcement Learning. Energies 2023, 16, 4401. https://doi.org/10.3390/en16114401

AMA Style

Wang J, Liu H, Zheng G, Li Y, Yin S. Short-Term Load Forecasting Based on Outlier Correction, Decomposition, and Ensemble Reinforcement Learning. Energies. 2023; 16(11):4401. https://doi.org/10.3390/en16114401

Chicago/Turabian Style

Wang, Jiakang, Hui Liu, Guangji Zheng, Ye Li, and Shi Yin. 2023. "Short-Term Load Forecasting Based on Outlier Correction, Decomposition, and Ensemble Reinforcement Learning" Energies 16, no. 11: 4401. https://doi.org/10.3390/en16114401

APA Style

Wang, J., Liu, H., Zheng, G., Li, Y., & Yin, S. (2023). Short-Term Load Forecasting Based on Outlier Correction, Decomposition, and Ensemble Reinforcement Learning. Energies, 16(11), 4401. https://doi.org/10.3390/en16114401
