1. Introduction
To address global climate change and rising global energy demand, researchers have focused on renewable energy alternatives in recent years. Wind power is one of the most important renewable resources for power generation thanks to its wide geographical distribution [1]. However, the need to exploit wind power efficiently as a replacement for conventional generation has created various operation and planning problems in power systems, owing to the stochastic and intermittent nature of wind [1].
Thanks to technological advances, neural networks (NNs) have been extensively used to develop accurate wind power forecasting models able to estimate and control wind power generation. Forecasting models not only predict wind power values, but also help in the organization of electricity markets and the stabilization of power systems [2]. Over the years, NNs have been used as deterministic forecasting models that generate point forecasts and provide the user with an estimated wind power output series that is as accurate as possible. However, such models provide no information on the uncertainty of a prediction. Consequently, given the increasing penetration of wind power into power systems, deterministic models cannot always be used efficiently for real-life problems and decision-making tasks.
To overcome the limitations of deterministic forecasting models in uncertainty estimation, wind power probabilistic forecasting (WPPF) models have become the main focus of researchers in recent years. Such models provide a wider view of the predictive outcome of a forecast in the form of prediction intervals (PIs), quantiles, distributions or scenarios, and thus offer richer predictive information to the user [3].
The most common probabilistic forecasting models take a parametric approach, in which the predictive density is assumed to follow a specific pre-defined distribution shape. However, such an assumption is not always reasonable for real-life problems, and as a result the parametric approach is not ideal for decision-making problems.
Various non-parametric methods have been proposed to construct PIs based on NN technology. The work [4] introduced a direct quantile regression-based methodology to generate predictive quantiles without statistical inference or prior assumptions about the error distribution. The proposed model combined the extreme learning machine with quantile regression for the probabilistic forecasting process. In [5], feed-forward neural network (FFNN)-based models were used for wind power forecasting, and a moving block bootstrap was used to quantify the uncertainty of the forecasts. The work [6] proposed a convolutional neural network (CNN)-based hybrid model along with the wavelet transform (WT) methodology. WT was applied to the wind power time series to decompose it into components at different frequencies, and the resulting data were then fed to a back-propagation CNN to provide the forecasts. In [7], a wavelet-based NN was proposed for PI construction. The proposed methodology was further optimized via an evolving knowledge-based multi-objective artificial bee colony algorithm that was used to improve the NN’s parameters. In [8], a linear NN with tapped delay was proposed for multi-step wind power forecasting. WT was further incorporated into the proposed model to pre-process the raw wind power input data.
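The idea of decomposing a series into components at different frequencies can be illustrated with a minimal sketch. The Haar wavelet is used here purely as an assumption because it is the simplest case; the cited works and the proposed model may use a different wavelet family and number of levels, and the function names and sample values below are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the even-length input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, recovering the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def haar_decompose(signal, levels):
    """Multi-level decomposition: repeatedly split the approximation."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

# Hypothetical normalized wind power readings (illustrative values only).
power = [0.10, 0.12, 0.30, 0.34, 0.50, 0.46, 0.20, 0.18]
approx, details = haar_decompose(power, levels=2)
```

The low-frequency trend ends up in `approx`, while each entry of `details` holds the higher-frequency fluctuations removed at that level; these components could then serve as additional model inputs.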
In [9], a lower upper bound estimation (LUBE) methodology was introduced to efficiently construct PIs. The main advantage of the LUBE method is that it uses an FFNN with two outputs that directly construct the PI: one output represents the lower bound and the other the upper bound. As a result, the whole PI construction process is faster and simpler. Moreover, the LUBE methodology avoids the use of pre-defined distributions of the data.
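A minimal sketch of the two-output idea is given below, assuming a single hidden layer with tanh activations and randomly drawn weights. The layer sizes, the activation function and the ordering of the two outputs with `min`/`max` are assumptions made for illustration, not details taken from [9].

```python
import math
import random

def lube_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer FFNN with two outputs.

    The two outputs are read as the bounds of a prediction interval.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    out = [sum(w * h for w, h in zip(row, hidden)) + b
           for row, b in zip(w_out, b_out)]
    # Assumption: order the outputs so that lower <= upper always holds.
    return min(out), max(out)

rng = random.Random(0)
n_inputs, n_hidden = 3, 4  # hypothetical sizes
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(2)]
b_out = [rng.uniform(-1, 1) for _ in range(2)]

lower, upper = lube_forward([0.2, 0.5, 0.1], w_hidden, b_hidden, w_out, b_out)
```

Because the interval comes straight out of the network, no distributional assumption or separate variance estimate is needed; the quality of the interval depends entirely on how the weights are trained.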
The predictive results of different forecasting methodologies can be further improved by using optimization algorithms appropriately modified for WPPF. Particle swarm optimization (PSO) is a powerful optimization algorithm that can be used to optimize the synaptic weights of NNs. In [10], an enhanced PSO algorithm was used to determine the weight coefficients of an adaptive network-based fuzzy inference system used for the forecasting process. The implementation of a mutation operator could further improve the searching capabilities of the PSO and help it avoid being trapped in local optima. The work [11] proposed a CNN-based model for the wind power forecasting process, in order to exploit its deep-feature extraction potential. The proposed model was further improved by a PSO algorithm that was used to optimize different segments of the wind power sequence.
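The interplay between PSO and a mutation operator can be sketched as follows. This is a generic global-best PSO with a Gaussian perturbation applied to individual coordinates; the inertia and acceleration coefficients, the mutation rate and the toy cost function are all assumptions chosen for illustration and are not the settings of any of the cited works.

```python
import random

def pso_minimize(cost, dim, n_particles=20, n_iter=200, w=0.7, c1=1.5, c2=1.5,
                 mutation_rate=0.1, mutation_sigma=0.1, seed=0):
    """Global-best PSO with a simple Gaussian mutation on positions."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_cost = [cost(p) for p in pos]        # personal best costs
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
                # Mutation: occasionally perturb a coordinate to keep exploring.
                if rng.random() < mutation_rate:
                    pos[i][d] += rng.gauss(0.0, mutation_sigma)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

def sphere(x):
    """Toy cost function standing in for an NN training loss."""
    return sum(v * v for v in x)

best_w, best_c = pso_minimize(sphere, dim=5)
```

In a forecasting setting, each particle position would hold the flattened weights and biases of the NN and `cost` would evaluate the resulting forecasts on the training data.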
This article focuses on: (1) the construction of accurate PIs via developing an efficient wind power probabilistic forecasting model, and (2) the analysis of the results from a seasonal perspective. The LUBE methodology is proposed for efficient PI construction. In order to further optimize the LUBE method’s accuracy, a PSO algorithm is implemented for the optimization of the NN parameters. Aiming to simplify the data pre-processing process of the proposed LUBE-PSO model, the WT is adopted to decompose the original raw data. The evaluation of the proposed model is based on the minimization of the coverage width criterion (CWC) cost function that allows the concurrent optimization of both the calibration and the sharpness of the constructed PIs. The results of the proposed WT-LUBE-PSO-CWC model are studied via a seasonal analysis of the provided datasets.
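As a rough illustration of how a single cost can balance calibration and sharpness, the sketch below uses the form of the CWC that is commonly cited in the LUBE literature: the normalized average width is multiplied by an exponential penalty whenever the coverage falls below the nominal level. This is an assumption; the cost function used in this work is a modified version, and the parameter names `mu` and `eta`, their values and the sample data are hypothetical.

```python
import math

def picp(y, lower, upper):
    """PI coverage probability: share of targets inside their interval."""
    covered = sum(1 for yi, lo, up in zip(y, lower, upper) if lo <= yi <= up)
    return covered / len(y)

def pinaw(lower, upper, target_range):
    """PI normalized average width."""
    widths = [up - lo for lo, up in zip(lower, upper)]
    return sum(widths) / (len(widths) * target_range)

def cwc(y, lower, upper, mu=0.9, eta=50.0):
    """Coverage width criterion (commonly cited form, assumed here).

    mu is the nominal confidence level and eta controls how strongly
    under-coverage is penalized.
    """
    target_range = max(y) - min(y)
    cov = picp(y, lower, upper)
    width = pinaw(lower, upper, target_range)
    gamma = 0.0 if cov >= mu else 1.0  # the penalty is active only below mu
    return width * (1.0 + gamma * math.exp(-eta * (cov - mu)))

# Illustrative values only.
y = [0.2, 0.4, 0.6, 0.8]
lower = [0.1, 0.3, 0.5, 0.7]
upper_ok = [0.3, 0.5, 0.7, 0.9]     # every target covered
upper_miss = [0.15, 0.5, 0.7, 0.9]  # first target falls outside its PI

cwc_ok = cwc(y, lower, upper_ok)
cwc_miss = cwc(y, lower, upper_miss)
```

When coverage meets the nominal level the cost reduces to the width term, so the optimizer is rewarded for narrowing the intervals; as soon as coverage drops below `mu`, the exponential term dominates and pushes the search back toward valid intervals.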
The main contributions of this paper are the following:
A LUBE-based methodology is adopted for accurate PI construction due to its ability to directly create PIs from the outputs of a FFNN. The PSO algorithm along with a mutation operator is implemented in the model in order to optimize the weights and biases of the NNs used. In order to simplify the pre-processing of the wind power data as well as to improve the prediction accuracy of the model, WT is implemented in the provided dataset. By combining the advantages of WT, LUBE and PSO methods, the proposed WT-LUBE-PSO-CWC model: a) provides PIs of high quality and accuracy, and b) decreases the number of predictive errors when compared to state-of-the-art methodology.
The Yam-Chow initialization method is proposed in this work to efficiently initialize the weights of the FFNN used for the training process. Using the Yam-Chow initialization algorithm in the proposed model improves the training speed of the FFNN by decreasing the NN’s initial error, which prevents the network from becoming trapped at its initial weights. A further contribution of this paper is that the Yam-Chow initialization method is modified to fit the proposed LUBE methodology.
A k-fold cross validation is implemented in order to further improve the model’s accuracy. Its aim is twofold: it is first applied to the proposed model to determine the optimal NN structure, and a five-fold cross validation is then implemented to define the optimal number of particles in the swarm for the PSO algorithm.
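The selection procedure can be sketched as below, assuming contiguous folds and a stand-in scoring function. In practice the score would come from training the model on the training folds and evaluating the cost on the held-out fold; `toy_score`, the candidate values and the sample count are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(candidates, n_samples, score_fn, k=5):
    """Return the candidate with the lowest mean validation score."""
    folds = k_fold_indices(n_samples, k)
    mean_scores = {}
    for cand in candidates:
        scores = []
        for i, val_idx in enumerate(folds):
            # All folds except the i-th form the training set.
            train_idx = [j for m, f in enumerate(folds) if m != i for j in f]
            scores.append(score_fn(cand, train_idx, val_idx))
        mean_scores[cand] = sum(scores) / k
    return min(mean_scores, key=mean_scores.get), mean_scores

def toy_score(n_hidden, train_idx, val_idx):
    """Placeholder for 'train the model, return the validation cost'."""
    return abs(n_hidden - 8) / 10.0

best, scores = cross_validate([4, 8, 16], n_samples=103, score_fn=toy_score)
```

The same routine can be reused for either hyperparameter: the candidates may be hidden-layer sizes in one pass and swarm sizes in another.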
The accuracy of the proposed model is tested using publicly available wind power forecasting data from the 2014 Global Energy Forecasting Competition [12]. Moreover, the results of the proposed method are compared with the results of another method applied to exactly the same data.
The analysis of the results using different metrics shows that the proposed method is able to efficiently construct accurate PIs. The forecasting accuracy of the proposed method is further verified by seasonality analysis.
4. Results
Table 1 shows the optimal values of the model’s hyperparameters after fine-tuning, while Table 2 shows the optimal hyperparameters of the BELM model.
Figure 8 shows the convergence speed of the proposed LUBE-PSO-CWC with and without the Yam-Chow initialization technique. The dataset used to generate the results of Figure 8 is the autumn 2012 data of zone 7. With Yam-Chow initialization, the convergence rate is higher and more constant throughout the iterations even though the convergence speed is smaller compared to random initialization. This means that with the Yam-Chow initialization, fewer iterations are needed for the weights and biases to converge to their global optimal values.
Table 3, Table 4, Table 5 and Table 6 present the results of the evaluation of five different training sessions, for each season of zone 1, of the proposed WT-LUBE-PSO-CWC in comparison with the bootstrap extreme learning machine (BELM) [26] for a confidence level of 0.9. The chosen evaluation indices are PICP, PINAW, CWC and CRPS. For each evaluation index, the median value of the five training sessions is calculated, and the best median values are indicated in bold. Similarly, Table 7, Table 8, Table 9 and Table 10 compare the results of the proposed WT-LUBE-PSO-CWC with the results of BELM, for data corresponding to zone 7.
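The aggregation described above (median over five training sessions, then marking the best median per index) can be sketched as follows. The numbers are made-up placeholders and are not taken from Tables 3 to 10; the model names `model_a` and `model_b` and the helper names are hypothetical.

```python
from statistics import median

# Direction of improvement per index: a higher PICP is treated as better,
# while lower PINAW, CWC and CRPS values are better.
HIGHER_IS_BETTER = {"PICP": True, "PINAW": False, "CWC": False, "CRPS": False}

def summarize_runs(runs):
    """Median of each index over a list of per-run result dicts."""
    return {name: median(r[name] for r in runs) for name in runs[0]}

def best_model(medians_by_model, index):
    """Name of the model whose median is best for the given index."""
    pick = max if HIGHER_IS_BETTER[index] else min
    return pick(medians_by_model, key=lambda m: medians_by_model[m][index])

# Made-up results of five training sessions per model (illustration only).
runs_a = [{"PICP": 0.91, "CWC": 0.50}, {"PICP": 0.93, "CWC": 0.48},
          {"PICP": 0.92, "CWC": 0.52}, {"PICP": 0.90, "CWC": 0.49},
          {"PICP": 0.94, "CWC": 0.47}]
runs_b = [{"PICP": 0.88, "CWC": 0.60}, {"PICP": 0.91, "CWC": 0.51},
          {"PICP": 0.89, "CWC": 0.55}, {"PICP": 0.90, "CWC": 0.53},
          {"PICP": 0.87, "CWC": 0.70}]

medians = {"model_a": summarize_runs(runs_a), "model_b": summarize_runs(runs_b)}
winner_picp = best_model(medians, "PICP")
winner_cwc = best_model(medians, "CWC")
```

Using the median rather than the mean keeps a single poorly converged session, such as the last `model_b` run above, from dominating the reported value.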
Tables 3 to 10 show that the proposed WT-LUBE-PSO-CWC outperforms BELM in both PICP and CWC. Specifically, the proposed WT-LUBE-PSO-CWC achieves a higher median PICP in all cases, and a lower median CWC in seven out of eight cases. BELM achieves a slightly better median CWC only for the spring 2013 dataset of zone 1. This means that the PIs generated by WT-LUBE-PSO-CWC have a higher coverage rate, even though they have a smaller range on average. The proposed model also has a more stable response, since in every run of each case, PICP is equal to or higher than the nominal confidence level. On the other hand, the coverage probability of the PIs generated by BELM does not reach the nominal confidence level in several runs, resulting in a much higher CWC value. Furthermore, in terms of the CRPS error metric, the proposed model outperformed the BELM model significantly in seven out of eight cases.
To further support the superiority of the proposed model, comparative results between the proposed WT-LUBE-PSO-CWC model and BELM are presented in Table 11 and Table 12 for confidence levels of 0.85 and 0.95, respectively. Due to space limitations, only the median values over five training sessions of PICP, CWC and CRPS for each model are presented, for every season from summer 2012 to spring 2013. In 14 out of 16 cases, the proposed WT-LUBE-PSO-CWC achieves a higher median PICP, while BELM has a slightly better median PICP in only 2 out of 16 cases. The median PICP values of the proposed model remain equal to or higher than the nominal confidence level in all cases. Concerning the median CWC values, the proposed model outperforms the BELM model in all cases. In terms of CRPS, the proposed model outperformed the BELM model in 13 out of 16 cases. More specifically, for the dataset of Zone 1, the proposed model outperformed the BELM model for both the 0.85 and 0.95 confidence levels. For the dataset of Zone 7, the BELM model gave slightly better results for autumn for both confidence levels, which is probably a result of the quality of the data compared to the dataset of Zone 1.
Figure 9 and Figure 10 show the PIs generated by the WT-LUBE-PSO-CWC and the BELM methods, respectively, for the autumn 2012 dataset of zone 7. In each figure, the upper chart shows the PIs for the first 48 predictions of the test set (first two days), and the lower chart shows the predicted PIs for the whole test set (546 samples, i.e., about 23 days). The red spots in Figure 9 and Figure 10 refer to the spot forecasts in each case, while the blue lines in the upper charts indicate the PIs’ width. The quality of the PIs generated by WT-LUBE-PSO-CWC is generally higher than the quality of the PIs generated by BELM. For the rest of the seasons examined, the conclusions drawn for the PIs of the two methods are generally similar; however, due to space limitations, only one season is presented.
In Table 13, an overall comparison between BELM and the proposed WT-LUBE-PSO-CWC model is made for every case of each zone concerning the 0.9 confidence level. The average values of CWC, PICP and the model’s run time are obtained from the corresponding median values of the eight cases shown in Tables 3 to 10. The average CWC of WT-LUBE-PSO-CWC is 0.487056, which is 5.05% less than the average CWC of BELM. The average PICP of WT-LUBE-PSO-CWC is 0.929234, which is 2.29% more than the average PICP of BELM. The conclusion is that both the coverage rate (PICP) and the combined coverage-width criterion (CWC) are improved with the proposed WT-LUBE-PSO-CWC model. As expected, however, BELM is faster, since its core consists of extreme learning machines.
In order to achieve a more reliable evaluation of the performance of WT-LUBE-PSO-CWC and BELM, the two models are trained and evaluated again, this time with year-long datasets containing information from 1 June 2012 to 31 May 2013 for both zones 1 and 7. Thus, the dataset of each zone consists of 8760 hourly sets of data. Again, the training sets consist of 75% of the initial datasets and the remaining 25% correspond to the test sets. The results are shown in Table 14. The median values of PICP, CWC and CRPS for each zone are obtained after five repetitions of the training and evaluation process. In order to show the effect of the wavelet transform, the results are also obtained for LUBE-PSO-CWC without it; in that case, the input data consist only of the original input data of GEFCom 2014. Again, WT-LUBE-PSO-CWC outperforms BELM in both coverage rate and average PI range, in both zones. Additionally, it can be seen from Table 14 that applying the wavelet transform improves the results of LUBE-PSO-CWC in both PICP and CWC.
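The sizes of the year-long training and test sets follow directly from the 75%/25% split. The sketch below assumes a chronological split, with the first 75% of the hours used for training, which is a common choice for time series but is an assumption here; the function name is hypothetical.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a time-ordered series into leading train and trailing test parts."""
    n_train = int(len(series) * train_fraction)
    return series[:n_train], series[n_train:]

hours_per_year = 365 * 24             # 1 June 2012 to 31 May 2013, hourly
series = list(range(hours_per_year))  # stand-in for the hourly records
train, test = chronological_split(series)
```

Under this assumption the training set holds 6570 hourly records and the test set the remaining 2190, with no test sample preceding any training sample in time.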
In Table 14, the results of the combined BELM and WT-LUBE-PSO-CWC methods are also presented. Combined PIs can improve forecasts in both accuracy and calibration [27]. The PIs generated by both BELM and WT-LUBE-PSO-CWC are characterized by a low level of overconfidence. Furthermore, the correlation between the two models is small. Thus, the best methods to combine PIs are the average method, the median method and the exterior trimming method [28]. Since there are only two methodologies involved, the average (Avg) method is chosen. Combined PIs do not seem to improve the overall performance of the forecasts. Although the results of the combined method are better than those of BELM, the WT-LUBE-PSO-CWC method still has the best performance. However, if more methods were combined and a more complex combination method was used, the results could probably be improved.
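The average combination can be sketched as taking the mean of the lower bounds and the mean of the upper bounds of the two models at each time step. This reading of the average method is an assumption, and the interval values below are illustrative only.

```python
def combine_average(intervals_a, intervals_b):
    """Average two sets of PIs bound by bound.

    Each argument is a list of (lower, upper) tuples aligned in time.
    """
    return [((la + lb) / 2.0, (ua + ub) / 2.0)
            for (la, ua), (lb, ub) in zip(intervals_a, intervals_b)]

# Illustrative intervals from two hypothetical models.
pis_a = [(0.10, 0.40), (0.20, 0.60)]
pis_b = [(0.20, 0.50), (0.10, 0.50)]
combined = combine_average(pis_a, pis_b)
```

With only two models the median and the average coincide, and trimming has nothing to discard, which is consistent with choosing the average method in this setting.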
In order to present the accuracy of the results from a seasonal point of view, Figure 11 shows, for each season in zones 1 and 7, the median CWC averaged over the proposed WT-LUBE-PSO-CWC and BELM models. For both zones, the worst performance is obtained in summer. The average values per season of the values shown in Figure 11 are presented in Table 15. Again, the highest average CWC is observed in summer. This is expected, since in summer the average wind speed is usually lower than during the rest of the year, resulting in a bigger fluctuation in wind power output, or no power output at all, for longer periods of time. On the other hand, the best results are obtained in spring and autumn. Consequently, a greater error rate of the wind power forecasts should be expected in summer, while the smallest error rates should be expected in spring and autumn.
Table 15 also presents the standard deviation of the values of CWC shown in Figure 11. The biggest standard deviation of CWC is obtained in summer, while the lowest is obtained in spring. This means that when using spring data, WT-LUBE-PSO-CWC and BELM have a more stable response, independent of the geographical location of the wind farm. On the other hand, when using summer data, the models have a less stable response, which means that the quality of the results depends much more strongly on the geographical location of the wind farm.
It can be further observed from Figure 11 that there is a slight difference in the median CWC between spring and autumn in Zone 1 and Zone 7. Considering that each data zone concerns wind farms in different locations, the difference in the median CWC could be a result of the different quality of data during those specific seasons. As a result, for Zone 1 the spring data are of better quality than the autumn data, while for Zone 7 the autumn data provide better results. Furthermore, the technical characteristics of the wind farms located in each zone could also play a significant role in the observed difference in the median CWC between spring and autumn in Zone 1 and Zone 7.
Table 16 presents the average number of observations below the lower limit and above the upper limit of the PIs. For the BELM method, the average number of observations above the upper prediction limit is significantly higher than the number below the lower prediction limit, for almost every test case. This is even more evident for the WT-LUBE-PSO-CWC method. This is probably related to the fact that more target values of the datasets lie near 0 than near 1, since the output of a wind farm is rarely close to its nominal level. The fact that the WT-LUBE-PSO-CWC method has fewer observations below the lower prediction limit than the BELM method indicates that the WT-LUBE-PSO-CWC method is more accurate. The only test cases where the average number of observations below the lower prediction limit is higher than the average number above the upper prediction limit are those related to the summer datasets. Due to space limitations, only the observations exceeding the PI limits for a confidence level of 0.9 are presented, since the results for confidence levels of 0.85 and 0.95 were similar to those presented in Table 16.
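Counting the two kinds of violations separately is straightforward, as the sketch below shows. The function name and the sample values are hypothetical and serve only to illustrate the bookkeeping.

```python
def count_violations(y, lower, upper):
    """Return (below, above): targets under the lower bound and over the upper."""
    below = sum(1 for yi, lo in zip(y, lower) if yi < lo)
    above = sum(1 for yi, up in zip(y, upper) if yi > up)
    return below, above

# Illustrative normalized power values and interval bounds.
y = [0.05, 0.50, 0.90, 0.30]
lower = [0.10, 0.20, 0.40, 0.10]
upper = [0.30, 0.40, 0.80, 0.50]
below, above = count_violations(y, lower, upper)
```

Reporting the two counts separately, rather than only the overall coverage, reveals whether a model's misses are concentrated on one side of the interval.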
It is concluded that the proposed WT-LUBE-PSO-CWC model with the proposed Gaussian mutation operator overall achieves better results than the state-of-the-art BELM methodology. Its PIs have on average a bigger coverage rate, while maintaining a smaller average range. These results are observed for both seasonal datasets and year-long datasets, for both zone 1 and zone 7. Additionally, WT-LUBE-PSO-CWC has a much more stable response than BELM, since its coverage rate is equal to or higher than the nominal confidence level in all cases. Initializing the weights and biases of LUBE-PSO-CWC with the Yam-Chow technique leads to a higher convergence rate, while using the wavelet transformation further improves the results of the model in both PICP and CWC. In summer, the generated PIs are of the lowest quality, both in terms of CWC and stability. On the other hand, the best results are obtained in autumn and spring. In spring, not only is the best average CWC obtained, but also the standard deviation of the results is the lowest, resulting in high stability.
5. Discussion and Future Research
A LUBE method was proposed in this paper for the PI construction and was further developed and extended. A wavelet transformation methodology was applied to the wind power data of the publicly available GEFCom2014 database, in order to decompose the wind power series into its components and simplify the preprocessing procedure. In order to evaluate the constructed PIs, the PINRW evaluation index was proposed over PINAW, since PINRW magnifies the contribution of wider PIs, while PINAW gives equal weight to the widths of all PIs. The CWC cost function was developed and further modified in order to treat the case study as a single-objective optimization problem. A PSO algorithm along with a mutation operator was implemented in order to further optimize the WT-LUBE methodology. The proposed WT-LUBE-PSO-CWC model was finally used to minimize the cost function and provide optimal PIs.
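The difference between the two width indices can be illustrated with the definitions commonly used in the PI literature: PINAW is the mean width and PINRW is the root-mean-square width, both normalized by the range of the targets. These formulas are assumed here, since the definitions used in this work appear in a section not reproduced above, and the width values are illustrative.

```python
import math

def pinaw(lower, upper, target_range):
    """Normalized average width: every PI width counts linearly."""
    widths = [u - l for l, u in zip(lower, upper)]
    return sum(widths) / (len(widths) * target_range)

def pinrw(lower, upper, target_range):
    """Normalized root-mean-square width: wide PIs are magnified by squaring."""
    widths = [u - l for l, u in zip(lower, upper)]
    return math.sqrt(sum(w * w for w in widths) / len(widths)) / target_range

zeros = [0.0, 0.0, 0.0, 0.0]
uniform = [0.20, 0.20, 0.20, 0.20]  # four equally wide PIs
uneven = [0.05, 0.05, 0.05, 0.65]   # same mean width, one very wide PI

same_pinaw = abs(pinaw(zeros, uniform, 1.0) - pinaw(zeros, uneven, 1.0)) < 1e-12
pinrw_gap = pinrw(zeros, uneven, 1.0) - pinrw(zeros, uniform, 1.0)
```

Both sets have the same PINAW, but the set containing one very wide interval has the larger PINRW, so a cost built on PINRW discourages occasional excessively wide PIs.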
Datasets from two zones of the provided data were used in this paper in order to evaluate the proposed model’s accuracy, and the seasonality of zone 1 and zone 7 was examined. A five-fold cross validation method was used to define the optimal NN structure as well as to estimate the optimal number of particles of the PSO. Furthermore, being a LUBE-based model, the proposed methodology allowed for an easier and faster PI construction.
Compared to the state-of-the-art BELM model, the proposed methodology constructed higher-quality PIs, achieving higher PICP values as well as lower PINAW values (i.e., narrower PIs). Moreover, the Yam-Chow initialization technique further improved the training speed of the FFNNs, since fewer iterations are needed for the weights and biases to converge to their global optimal values.
Aiming to further improve and develop the efficiency and the accuracy of the proposed methodology, various possible directions exist.
5.1. Multi-Objective Methodology
The work presented in this paper focuses on minimizing the CWC cost function and thus on optimizing a single objective problem. In the future, in order to further highlight the efficiency of the proposed model in real-life problems, the focus should be to develop a multi-objective optimization methodology based on the proposed WT-LUBE-PSO-CWC model. Focusing on evaluating multi-objective optimization problems from different research fields, and adapting them to deal with wind power forecasting problems, could be a possible extension of the proposed methodology.
5.2. Spatio-Temporal Correlation
Recent works highlight the importance of the spatio-temporal correlation between different wind farms’ datasets in order to provide more efficient wind power forecasting models. Instead of limiting the input datasets to one wind farm, exploiting historical or meteorological data from different wind farms could increase the amount of data and consequently improve the accuracy of the proposed model. Developing the proposed method from a spatio-temporal viewpoint and comparing it with novel spatio-temporal-based methodologies, e.g., the calibrated regime-switching method [29], could be a significant extension of our proposed model.
5.3. Data Tests
The proposed methodology relied on the use of the dataset’s wind components and the generated normalized wind power values. Using additional meteorological data or a greater amount of historical data could further improve both the accuracy and the computational cost of the proposed model. Furthermore, adapting the proposed work to more complex data could further improve its accuracy, as well as its adaptability to different situations.