1.2. Literature Review
Forecasting strategies for wind power are divided into four categories: physical, statistical, machine learning, and hybrid methods [3, 6, 7, 8, 9, 10, 11]. Physical methods (such as numerical weather prediction (NWP)) are computationally expensive and inaccurate at short time scales [6, 12, 13], whereas the other three are appropriate for short-term forecasting. Thus, physical methods are reserved for medium-term and long-term forecasting. The most widely used statistical method for short-term wind speed forecasting is the class of well-known Box–Jenkins (1976) autoregressive moving average (ARMA) models, which are based on historical data. Their appeal derives from their simplicity, their ability to capture linear characteristics inherent in the data, and their high prediction accuracy at short-term horizons (see, e.g., [3] for details). However, ARMA models are unable to effectively capture the nonlinearity typically inherent in wind speed data. With the advancement of technology in recent years, machine learning methods such as artificial neural networks (ANNs) have gained popularity in time series forecasting [6]. In addition to identifying trends/patterns, these methods can automate various decision-making processes and can effectively capture nonlinear characteristics in wind speed series that cannot be adequately captured by statistical methods [6]. Hybrid models combine various models, such as statistical and machine learning methods, to form new and improved models (see, e.g., [12, 13]). By retaining the advantages of each technique, hybrid modelling improves forecast accuracy.
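For reference, the ARMA(p, q) model discussed above can be written in its standard textbook form. The symbols (c, the coefficients, and the white-noise term) are notation assumed here for illustration and are not taken from the cited works.

```latex
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j},
\qquad \varepsilon_t \sim \text{white noise}(0, \sigma^2)
```

Every term is linear in past observations or past errors, which is why such models capture linear structure well but cannot represent nonlinear dynamics in wind speed.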
In the literature, various ANNs have been discussed, including feed-forward neural networks (FFNNs) [11, 12, 14], convolutional neural networks (CNNs) [9], and recurrent neural networks (RNNs) [6, 9, 12, 13]. Because they capture nonlinearity and thereby achieve high forecasting accuracy, ANNs have been preferred over the Box–Jenkins (1976) ARMA models over the past decades [14]. The FFNN is an ANN derived from statistical methods [12, 14]. There are no cycles in these simple architectures, and data are channelled in the forward direction only [12, 15]. Wind speed forecasts depend on the current inputs as well as on the continuity and dependence of the input datasets [5]. FFNNs do not capture historical dependencies since their input neurons are independent [12, 13]. In addition, FFNNs have a slow convergence rate and are easily overfitted to unstable and noisy data [16].
RNNs are another variant of ANNs with an internal unit structure that carries the memory of the historical information of the time series [7, 9, 13, 16]. In contrast to FFNNs, this deep learning technique uses recurrent neural connections to handle highly variable time series. Although RNNs can, to some extent, learn time relationships or long-term dependencies of time series [12, 16], they are susceptible to gradient disappearance and explosion (vanishing and exploding gradients) during deep propagation due to the hyperbolic tangent activation function [13, 16, 17]. Consequently, RNNs cannot fully learn the temporal behaviour of the time series, which compromises the learning efficiency of the network [5]. To overcome this deficit, ref. [5] proposed the long short-term memory (LSTM) network, which is an improved and more robust type of RNN. In contrast to FFNNs and traditional RNNs, the LSTM learns temporal behaviour and long-term dependencies while addressing the defects of gradient disappearance and explosion [5, 12, 13]. The structure of LSTMs is such that they are able to forget irrelevant information and, by turning multiplication into addition in the cell-state update, preserve long-range relationships before the gradient reaches zero, thereby eliminating the gradient disappearance problem [9, 12, 13, 16]. The LSTM model can also be used to detect and capture both long-term and short-term dependencies in time series data [6]. Hence, the LSTM network has been widely applied in various areas ranging from speech recognition and machine translation to handwriting recognition [13].
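The "multiplication into addition" idea can be illustrated with a minimal sketch of one step of a standard LSTM cell, reduced to a single scalar input and a single hidden unit. The parameter names and values below are hypothetical and chosen only for illustration; they do not come from any of the cited models.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps hypothetical names to weights."""
    # Gates: w* weights the input, u* the previous hidden state, b* is a bias.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    # Additive cell-state update: old memory is scaled by f and new
    # information is added, rather than being squashed at every step.
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical parameters (assumed values, not fitted to any data).
demo_params = {
    "wf": 0.5, "uf": 0.1, "bf": 1.0,
    "wi": 0.4, "ui": 0.2, "bi": 0.0,
    "wo": 0.3, "uo": 0.1, "bo": 0.0,
    "wg": 0.8, "ug": 0.3, "bg": 0.0,
}

h, c = 0.0, 0.0
for x in [0.5, 0.7, 0.6]:  # illustrative scaled wind speeds
    h, c = lstm_step(x, h, c, demo_params)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is the mechanism by which long-range information is retained.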
The accuracy of LSTMs can be significantly improved with pre-processing through strategies such as variational mode decomposition (VMD), empirical mode decomposition (EMD), and wavelet transformation (WT) [7, 12, 13, 14, 18, 19]. In addition to having excellent localisation properties, the WT strategy can extract patterns, discontinuities, and trends in non-stationary time series [8, 20]. In [14], the authors proposed a combination of the wavelet transform technique (WTT) and a two-hidden-layer neural network (TNN), termed WTT-TNN, for wind speed forecasting. The WTT-TNN model achieved higher accuracy in wind speed forecasting than the persistence (naïve) model (PM), the one-hidden-layer neural network (ONN), and the TNN [14]. A hybrid of WT and linear neural networks with tapped delay (LNNTD), termed WT-LNNTD, was proposed for wind speed forecasting by [21]. The authors found that the WT-LNNTD model improved on the performance of the individual forecasting models. Similar results were obtained in [7, 19], where WT and FFNNs were combined in short-term wind speed forecasting. In [13], the authors combined WT and LSTM in short-term wind speed forecasting. It was found that pre-processing through WT and then predicting each subseries using LSTM significantly improves the accuracy of the forecasts. In a similar study, ref. [5] blended WT and bidirectional LSTM (bi-LSTM). The proposed hybrid model effectively utilised historical and future information of subseries to accurately forecast wind speed data.
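To make the decomposition step concrete, the following is a minimal sketch of a multi-level discrete wavelet decomposition using the Haar wavelet. The Haar wavelet is an assumption made here for simplicity; the studies cited above may use other wavelet families and transform variants. The wind speed values are made up for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[k], signal[k + 1]) for k in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]  # low-frequency part
    detail = [(a - b) / SQRT2 for a, b in pairs]  # high-frequency part
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def haar_decompose(signal, levels):
    """Returns (approximation, details), where details[0] is the finest level."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return approx, details


def haar_reconstruct(approx, details):
    """Rebuilds the signal from the coarsest level back to the finest."""
    signal = list(approx)
    for d in reversed(details):
        signal = haar_idwt(signal, d)
    return signal


# Illustrative (made-up) wind speeds in m/s; length must be divisible by 2**levels.
wind = [4.1, 4.3, 5.0, 4.8, 6.2, 6.0, 5.1, 4.9]
approx, details = haar_decompose(wind, levels=2)
restored = haar_reconstruct(approx, details)
```

The approximation coefficients carry the smooth trend, while each detail level carries progressively finer fluctuations. In a decompose-then-forecast scheme, each of these components would be modelled separately before the forecasts are recombined.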
Among the studies reviewed, none assessed the randomness/complexity of wavelet signals through information theory to determine the most suitable modelling and forecasting approach, which would improve models’ prediction accuracy. In fact, wind speed forecasting models in the literature are rarely optimised and enhanced by coupling sample entropy (SampEn) and wavelet decomposition. Moreover, ref. [5] argued that ensemble or combination methods should be used to account for nonlinear aspects of wind speed forecasts since linear combination techniques, such as those used in [3, 7, 13, 14], are insufficient. There is a significant scarcity of research studies that leverage the combination of the aforementioned approaches for reliable, accurate, and robust short-term wind speed predictions. This gap in the knowledge base needs to be addressed.
In this paper, we formulate a hybrid approach that combines the benefits of WT, SampEn, neural network autoregression (NNAR), LSTM, and gradient boosting machine (GBM) methods to form WT-NNAR-LSTM-GBM. In the proposed model, the WT is used to decompose wind speed data into multiple, less complex components (approximation and detail subseries), while SampEn is used to determine the presence of complex features in each decomposed subseries. The NNAR and LSTM networks are used to independently predict deterministic (low-variant) and complex (high-variant) subseries, respectively. The nonlinear, scalable, and highly accurate GBM is used to optimally reconcile subseries forecasts to obtain final forecasts. The efficacy and robustness of the proposed model, which to the best of our knowledge has not been explored in the wind speed forecasting literature, are tested against the naïve model and four other models, namely NNAR (benchmark), LSTM, and two hybrid models. One of these hybrid models replaces LSTM with k-nearest neighbours (KNN) to form WT-NNAR-KNN-GBM, while the other replaces NNAR with LSTM in WT-NNAR-KNN-GBM to form WT-LSTM-KNN-GBM.
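The SampEn-based routing of subseries described above can be sketched as follows. This is a minimal illustration of the standard sample entropy definition, not the implementation used in this study. The embedding dimension m = 2 and tolerance r = 0.2 standard deviations are conventional defaults assumed here, and the threshold of 1.0, the subseries labels, and the synthetic data are hypothetical.

```python
import math
import random


def sample_entropy(series, m=2, r=0.2):
    """SampEn(m, r), with tolerance r given as a fraction of the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    tol = r * sd

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # excludes self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in theory; treated here as maximally complex
    return -math.log(a / b)


def assign_models(subseries, threshold):
    """Routes each subseries to LSTM (complex) or NNAR (deterministic)."""
    return {
        name: "LSTM" if sample_entropy(values) > threshold else "NNAR"
        for name, values in subseries.items()
    }


# Synthetic stand-ins for decomposed subseries (not real data).
random.seed(0)
smooth = [math.sin(2 * math.pi * t / 20) for t in range(200)]
noisy = [random.random() for _ in range(200)]
routing = assign_models({"approximation": smooth, "detail_1": noisy}, threshold=1.0)
```

A low SampEn value indicates a regular, more predictable subseries, while a high value indicates irregularity. In practice the threshold would need to be chosen from the data.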
The current study addresses the shortcomings identified in the previously reviewed work by enhancing the accuracy of quantifying wind energy resources in the Southern Africa region within the short-term forecasting framework. Considering nonlinear, deterministic, and random facets of wind speed, the study proposes developing a robust, reliable, and comprehensive multi-model framework. In this way, the model will facilitate the seamless and reliable integration of significant volumes of wind energy into electrical grids.
This study uses minutely averaged wind speed data from the Council for Scientific and Industrial Research (CSIR) energy centre, GIZ Richtersveld (RVD), USAid Venda (Venda), and USAid Namibian University of Science and Technology (NUST) radiometric stations downloaded from the Southern African Universities Radiometric Network (SAURAN) website (http://www.sauran.ac.za) (accessed 12 December 2022). An R.M. Young (05103 or 03001) anemometer instrument was used to measure these high-resolution minute-based wind speed data. This instrument, which is durable, corrosion-resistant, and lightweight, has a four-blade helicoid propeller to accurately measure wind speed and a vane to determine wind direction.
1.3. Innovations and Contributions
Wind speed data are complex and volatile. A single model may not fully capture these behaviours, which, in the short term, disrupt real-time grid operations, uniform wind power distribution, and output optimisation. However, these behaviours can be effectively and accurately captured (to a certain degree) using hybrid models. It is necessary to unmask the complexities as well as the deterministic and random patterns of wind speed before fitting appropriate models.
The literature shows that hybrid models are generally implemented by denoising and decomposing the original signal into subseries, modelling and forecasting decomposed subseries, and then combining subseries forecasts. However, this approach often disregards other fundamental and critical elements of the subseries. For instance, when using WTs, the complexity and variability of the subseries at lower levels of decomposition are distinct from those at higher levels of decomposition. Consequently, treating these subseries similarly may compromise the accuracy of the final forecasts (see [2, 3] for details). As such, it is pivotal that each subseries be treated independently, i.e., inspected and modelled relative to its inherent features. In turn, this will improve prediction accuracy. Furthermore, the majority of hybrid models rely excessively on a linear combination of subseries forecasts to generate the final forecast (see, e.g., [4, 6, 14, 21]). Despite being simple and efficient, this linear approach lacks accuracy and stability when combining wind speed subseries forecasts that are inherently nonlinear, leading to excessive error accumulation in the final forecast value (see [2, 5] for details). It is vital to optimise a forecast combination method using other nonlinear prediction methods in various stages of the prediction process, viz., model parameter optimisation and output error correction.
The above-mentioned issues are at the core of the innovation and novelty of the hybrid strategy proposed in this paper. Hence, the main contributions of this paper are as follows:
The proposed hybrid model for wind speed forecasting is predicated upon a multi-model ensemble approach, incorporating data decomposition through WTs, complexity classification through SampEn, individual subseries modelling and prediction using NNAR and LSTM techniques, and forecast combination through GBM strategies. This model presents a comprehensive solution for wind speed forecasting, leveraging the strengths of several techniques to provide a refined and accurate forecast.
WTs play a crucial role in data transformation and decomposition, as they offer exceptional efficiency while minimising random fluctuations in data sequences, thus improving models’ prediction accuracy. As such, these techniques are highly recommended for breaking down irregular wind speed data into low- and high-frequency signals. Significantly, the decomposed signals exhibit more apparent trends and patterns with less variability than the original wind speed signal. This allows for more efficient training and prediction, to a certain extent.
The concept of randomness pertains to the frequency of distinct digits appearing in a given sequence. To gauge randomness, statistical metrics such as mean, standard deviation, skewness, kurtosis, etc., are often employed in the literature. Nevertheless, traditional approaches fall short of addressing certain complexities that emerge when attempting to examine randomness using this methodology (see [22] for details). By contrast, the SampEn criterion enables efficient and effective classification/judgment of decomposed signals according to their complex or deterministic properties. Consequently, the most suitable modelling and forecasting approach can be employed for each signal to improve prediction accuracy.
Besides identifying patterns, NNAR models are resistant to non-stationarity and outliers. These nonlinear approximators are used to predict the subseries that the SampEn criterion classifies as less random and deterministic.
To curb the gradient disappearance and explosion to which NNARs are susceptible, subseries classified as more complex or highly random by the SampEn criterion are modelled and predicted using more reliable, optimised, and robust stateless LSTMs. Unlike a stateful LSTM, a stateless LSTM can effectively and accurately learn patterns in unstable random time series data such as wind speed. Furthermore, stateless LSTMs are preferred over stateful LSTMs for this time series prediction task because of their higher stability, simplicity, and accuracy.
To capture the complex nonlinear structure embedded in wind speed subseries forecasts, it is imperative to employ a nonlinear forecast combination method. Therefore, a highly scalable, robust, nonlinear GBM model is preferred over a linear combination model for combining nonlinear wind speed subseries forecasts.
The study has been conducted in a manner that is reliable and reproducible, and it has provided appropriate and comprehensive assessment metrics (statistical testing) that are suitable for short-term wind speed modelling and prediction.
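The nonlinear forecast combination listed among the contributions above can be illustrated with a toy gradient boosting sketch. A linear combination forms the final forecast as a weighted sum of the subseries forecasts, whereas a boosted model learns an arbitrary function of them. The code below is a deliberately small stump-based booster with squared-error loss, written for illustration only; it is not the GBM implementation used in this study. The data are synthetic and all function names are hypothetical.

```python
import math


def stump_predict(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_stump(X, residuals):
    """Finds the single split that minimises squared error on the residuals."""
    mean = sum(residuals) / len(residuals)
    best_sse, best = float("inf"), (0, float("inf"), mean, mean)  # constant fallback
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best_sse:
                best_sse, best = sse, (f, thr, lm, rm)
    return best


def fit_gbm(X, y, n_rounds=50, learning_rate=0.1):
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the current residual.
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Synthetic subseries forecasts and a target with an interaction term.
approx_fc = [5.0 + math.sin(t / 3.0) for t in range(40)]
detail_fc = [0.5 * math.cos(1.7 * t) for t in range(40)]
X = [[a, d] for a, d in zip(approx_fc, detail_fc)]
y = [a + d + 0.3 * a * d for a, d in zip(approx_fc, detail_fc)]

model = fit_gbm(X, y)
y_mean = sum(y) / len(y)
mse_gbm = sum((predict_gbm(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
mse_mean = sum((y_mean - t) ** 2 for t in y) / len(y)
```

Each row of `X` holds the forecasts of the individual subseries for one time step, and the booster learns how to map them to the observed value. The comparison here is on training data only; a proper evaluation would use held-out observations.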