1. Introduction
South Africa has been seen to be a late participant in the three key industrial revolutions [1]. The use of artificial intelligence (AI) and data is on the rise in South Africa [2,3,4]. This rise means that South Africa might not be a late participant in the fourth industrial revolution. In 2007, 2013, 2018, and 2019, South Africa experienced a shortage in power supply due to various challenges, leading to load shedding [1]. South Africa’s public power utility, Eskom, has on several occasions stated its inability to accurately predict or forecast the unplanned capability loss factor (UCLF) as one of the major factors leading to an unreliable power supply and unpredictable load shedding [5,6]. UCLF is a measure of unplanned plant breakdown. The behavior of the South African UCLF has not been well studied. Pretorius et al. studied the impact of the South African energy crisis on emissions [7]. That study only discusses an increase in UCLF due to maintenance deferral. It does not address how to forecast UCLF, nor the major factors that contribute to UCLF and could help in forecasting it. The UCLF, planned capability loss factor (PCLF), and other capability loss factor (OCLF), together with the installed capacity, determine the power available to supply customers. The PCLF accounts for planned plant outages for the maintenance or refurbishment of the plant. This is typically a value set in advance by the utility, which can decide to change its planned outages depending on different factors. The OCLF accounts for other or random losses and is usually significantly smaller than the UCLF [8]. The installed capacity gives the number of megawatts of the installed power plant units. Micali studied the prediction of new coal power plants’ availability in the absence of data in South Africa [8]. The author mentions that the work is a precursor to predicting UCLF in new plants and proposes using expert opinion with some data from stations where data are available. However, the work in [8] did not focus on the total UCLF, assumed limited availability of data, did not use AI techniques, and depended on expert knowledge. In [9], the authors state that expert knowledge can change from one expert to the next, and thus experts can reach different results from the same data. In addition, the author of [8] did not investigate factors that affect power supply and may influence the UCLF. There is, thus, a gap in South Africa in terms of accurately forecasting UCLF. In addition, the behavior of the total South African UCLF remains unstudied, as only precursor work exists and that work is focused on new plants. Another gap is the use of intelligent systems that are not reliant on human experts in UCLF forecasting.
In addition, knowing when the power system might experience a power shortage is still a topic of interest and is important not only for the utility but also for customers. Knowing when there may be a power shortage, and hence a requirement to reduce consumption, helps customers plan their operations. Unplanned failures have been studied before. In [10], real-time prediction of distribution system outage duration using historical outage records to train neural networks was studied. The Netherlands collects information on unplanned outages from its utilities to inform its maintenance and investment policies [11].
South Africa is the largest producer of electricity in Africa and is among the top 25 producers of power in the world [12,13]. Over 80% of South Africa’s power is produced by coal-fired power stations and a nuclear power station. The total South African power grid UCLF can, thus, be modeled as that of the coal and nuclear power stations. Despite the recent move towards cleaner energy, the largest power-producing countries, such as India and China, still rely heavily on coal-fired power stations [12]. The study of coal thermal power plants and their behavior is, thus, still of interest [14,15,16,17]. The study of the South African coal-fired power station UCLF is, therefore, important, as coal power plants are still widely used and remain a research topic of interest.
Forecasting and prediction have been topics of interest for many researchers [10,18]. This is mainly due to an interest in understanding and predicting the future behavior of certain variables. Artificial intelligence (AI) techniques have become popular in these forecasting and prediction tasks. One of the reasons for this popularity is their ability to model non-linearity with high accuracy. Khoza and Marwala used an ensemble of the multi-layer perceptron and rough set theory to predict the direction that the South African gross domestic product (GDP) would take [18]. Galius proposed a probabilistic model for modeling power distribution network blackouts [19]. In Egypt, power cable failures were analyzed to help prevent future power outages [20]. In [21], bilateral long short-term memory (LSTM) was used to forecast the short-term cycle of wafer lots for the planning and control of wafer manufacturing. The rise of computational power and access to labeled data has led to an increase in the utilization of deep learning techniques [22]. Deep learning techniques have been seen to perform excellently in multiple areas, such as language and speech processing, as well as computer vision [23,24]. Alhussein et al. used a hybrid of convolutional neural networks (CNN) and LSTM to forecast individual house loads [25]. Here, the researchers used CNN to select features from the input data and LSTM to learn the sequence. The authors stated a mean absolute percentage error (MAPE) improvement greater than 4% in comparison to LSTM-based models. Kong et al. also combined CNN and LSTM for short-term load forecasting in Singapore [26]. Pandit et al. compared LSTM and Markov chain models in weather forecasting for German offshore wind farms to improve their wind turbine availability and maintenance [27]. Deep learning has also been used to forecast wind speeds at turbine locations [28]. The authors combined CNN and the gated recurrent unit (GRU) to achieve satisfactory results in comparison to existing models. Deep learning techniques have also been used to forecast the Korean postal delivery service demand [29]. This observed performance of deep learning techniques has also led to their adoption in recent load forecasting studies [30,31]. A gap still exists in applying these state-of-the-art techniques, as used for forecasting in different engineering areas, to forecasting UCLF (and the South African UCLF in particular).
As observed, a number of studies have used a combination of techniques to achieve improved performance [25,26,27,28,29]. Such combinations of techniques are usually termed ensemble or hybrid techniques. Ensemble techniques have also been used for classification in different engineering applications. Ramotsoela et al. used an ensemble of five artificial intelligence techniques to detect intrusion in water distribution systems [32]. The ensemble model used here combined an artificial neural network (ANN), recurrent neural network (RNN), LSTM, GRU, and CNN in a voting system. The ensemble model classified its output as an anomaly if at least two constituent models classified their outputs as an anomaly. CNN models have been combined to determine driver behavior from multiple data streams [33]. The proposed ensemble model incorporated a voting system to enhance the classification accuracy. A double ensemble model of semi-supervised gated stacked auto-encoders has been used to predict industrial key performance indicators [34]. Drif et al. proposed an ensemble of auto-encoders for recommendations [35]. The authors used an aggregation method to combine outputs from the sub-models to form the ensemble model output. Bibi et al. used an ensemble-based technique to forecast electricity spot prices in the Italian electricity market [36]. The authors estimated deterministic components using semi-parametric techniques and then determined stochastic components using time series and machine learning algorithms. The final forecast is obtained from the estimates of both components [36]. Shah et al. used a similar approach to Bibi et al. in short-term electricity demand forecasting for the Nordic electricity market [37]. The similarity is that the authors separated their approach into a deterministic and a stochastic component and then combined the estimates from them to obtain the final forecast. None of the literature covers the use of ensemble techniques in forecasting UCLF. The use of ensemble techniques in UCLF forecasting is, thus, an existing research gap.
This paper introduces the following contributions: (i) A novel study of the South African UCLF behavior using state-of-the-art AI (deep learning and ensemble) techniques. (ii) An investigation of the impact of the installed capacity, historic demand, and PCLF on the UCLF forecasting accuracy. (iii) An introduction of a novel deep-learning ensemble total South African UCLF forecasting system.
The remainder of this paper is arranged as follows: Section 2 presents the techniques used in this research. Section 3 presents the experimental setup. The proposed UCLF forecasting system is presented in Section 4. Section 5 then presents the experimental results and the discussion of the results. The paper conclusions are presented in Section 6. Section 7 presents the limitations of the study as well as future work. The paper flow chart is shown in Figure 1.
3. Experimental Setup
This section presents the experimental setup. It first gives an overview of the South African coal generation plants and describes the data, and then presents the experimental approach, the performance measures used, and the statistical significance test.
3.1. South African Key Coal Power Generation Plants Overview
South Africa has 15 key coal-powered thermal power stations. These stations are owned and operated by Eskom. Two of these stations are the new supercritical power stations, Medupi and Kusile, which are still under construction and at different stages of completion. The power stations are mostly concentrated in Mpumalanga Province, mainly due to the large availability of coal in this province. Twelve coal power stations are located in Mpumalanga, two in Limpopo Province, and one in Free State Province.
Figure 4 shows the location of the South African coal-fired power stations [40]. South Africa also has one nuclear power generation station located in the Western Cape Province. This power station has an installed capacity of 1940 MW. This nuclear station and the coal-fired power stations contribute over 80% of South Africa’s installed capacity and supply the country’s baseload. The PCLF and UCLF data used in this research are from these coal-fired power stations and the nuclear power station, collected from a centralized database.
3.2. Data Description
The data used in this study were real utility data collected from January 2010 to December 2019. Figure 5 shows the different periodicities of the UCLF over time. Figure 5c shows the periodicity over weeks in parts of the South African winter (June–July) and summer (November–December) seasons in the year 2019.
The collected data were for four variables: the installed capacity, demand, PCLF, and UCLF. To investigate how these variables affect the UCLF forecast accuracy of the different techniques, the variables were arranged into five experiments, as shown in Figure 6. A tick indicates that a variable was used in the respective experiment and a cross indicates that the variable was not used in the experiment. The experiment with the best performance will, thus, indicate which variables should be used with which technique to achieve the lowest year-ahead UCLF forecasting error. The installed capacity is the total power that can be generated by the installed power generation plants in megawatts. The demand is the historic total national power demand in megawatts. The PCLF and UCLF are the respective historic variables in megawatts. The UCLF data used as input in the training and testing of the models were split into the UCLF two years before the target UCLF, UCLF T-2 Years, and the UCLF a year before the target, UCLF T-1 Year. The UCLF data used were daily peak values. A variable indicating whether a day is a weekend or a weekday, the Weekend Index, was also used as an input. This variable was 1 for weekends and 0 for weekdays. It was included so that the models could differentiate between weekday and weekend data. This resulted in six input variables. The training period was between 1 January 2012 and 31 December 2018. The testing period was between 1 January 2019 and 31 December 2019. Thus, the forecasts were a daily peak UCLF for the year-ahead forecast period. All the variables, except the weekend index, were normalized to be between 0 and 1. The training input data were, thus, a 2555 × n matrix, where 2555 is the number of daily input values over 7 years and n is the number of variables used in the respective experiment, as described next.
The training input variable matrix sizes were, thus, 2555 × 6 for Exp 1, 2555 × 5 for Exp 2 to Exp 4, and 2555 × 3 for Exp 5.
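The input preparation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the function names, the column names, and the synthetic values are all hypothetical, and the sketch assumes that the weekend index is appended to every experiment's inputs, which the text does not state explicitly.

```python
from datetime import date, timedelta


def weekend_index(day):
    """Return 1 for Saturday or Sunday and 0 for weekdays."""
    return 1 if day.weekday() >= 5 else 0


def min_max_normalize(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_input_matrix(days, columns, selected):
    """Assemble one row per day from the selected variables.

    `columns` maps a (hypothetical) variable name to its daily series.
    Selected variables are normalized; the weekend index is appended
    unnormalized, as described in the text.
    """
    scaled = {name: min_max_normalize(columns[name]) for name in selected}
    return [
        [scaled[name][i] for name in selected] + [weekend_index(day)]
        for i, day in enumerate(days)
    ]


# Tiny synthetic example; the values (in MW) are made up for illustration.
days = [date(2018, 1, 5) + timedelta(days=i) for i in range(4)]
columns = {
    "installed_capacity": [40000, 40000, 40500, 40500],
    "demand": [30000, 27000, 26000, 31000],
    "pclf": [4000, 4200, 4200, 3900],
    "uclf_t_minus_2y": [5000, 5200, 5100, 5600],
    "uclf_t_minus_1y": [6000, 6100, 5900, 6500],
}
X = build_input_matrix(days, columns, list(columns))
print(len(X), len(X[0]))  # number of days x number of input variables
```

Passing a shorter `selected` list yields the narrower matrices of the other experiments; which variables each experiment keeps is defined by Figure 6 and is not reproduced here.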
3.3. Experimental Approach
The models for each technique were developed as follows.
The OP-ELM models were trained by tuning the model dimensions. Different numbers of hidden nodes were used to train the model in the respective experiments. Optimal pruning using the LOO method was key in determining the model’s dimensions. Various dimensions were investigated, and the model with the lowest errors in each experiment was captured and is presented in the results section.
LSTM-RNN models were trained with different numbers of stacked hidden LSTM units. The variation of the hidden units was consistent in all the different experiments. Similar to the OP-ELM, the performance results for the model with the lowest obtained UCLF forecast errors were captured.
Single-layered DBN models were developed with the number of hidden units varied for the respective models. The lowest number of hidden units used was four and the highest was sixteen.
The aggregation ensemble approach was used for the ensemble of the three techniques. These ensembles were of two techniques at a time. Here, the respective parameters per technique were tuned and the resulting models were combined to form different ensemble models. The forecast results with the lowest errors were captured per experiment. For each technique and experiment, the other hyperparameters, such as training rate and the number of layers, were kept the same. In future work, the effect of optimizing the hyperparameters can be investigated.
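A pairwise aggregation ensemble of this kind can be sketched as below. The text does not specify the aggregation rule, so the sketch assumes an equally weighted mean of the two constituent forecasts; the function names, model labels, and numbers are hypothetical and serve only to illustrate the selection of the pair with the lowest error.

```python
from itertools import combinations


def aggregate(forecasts, weights=None):
    """Combine several forecast series into one weighted mean per time step.

    Assumption: equal weights unless given; the study's rule is not stated.
    """
    n_models = len(forecasts)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    return [
        sum(w * series[k] for w, series in zip(weights, forecasts))
        for k in range(len(forecasts[0]))
    ]


def mean_absolute_error(forecast, target):
    return sum(abs(f - t) for f, t in zip(forecast, target)) / len(target)


def best_pair(model_forecasts, target):
    """Evaluate every two-model ensemble and return (error, names) of the best."""
    results = []
    for a, b in combinations(sorted(model_forecasts), 2):
        combined = aggregate([model_forecasts[a], model_forecasts[b]])
        results.append((mean_absolute_error(combined, target), (a, b)))
    return min(results)


# Made-up normalized forecasts, for illustration only.
target = [0.50, 0.60, 0.55]
model_forecasts = {
    "lstm": [0.52, 0.58, 0.56],
    "dbn": [0.46, 0.63, 0.50],
    "op_elm": [0.58, 0.66, 0.61],
}
error, pair = best_pair(model_forecasts, target)
print(pair, round(error, 4))
```

Averaging lets errors of opposite sign in the constituent models partly cancel, which is one reason an ensemble can achieve a lower error than either model alone.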
3.4. Performance Measures Used
Each model’s performance was measured using three key performance measures: symmetric mean absolute percentage error (sMAPE), mean absolute error (MAE), and root-mean-square error (RMSE). Motepe et al. state that the MAE, RMSE, MPE, MAPE, and sMAPE are common forecasting error measurements [30]. They further state the challenge that the MAPE faces when target values are too small, which leads to errors being too large. The three performance measures used in this research are presented in (17)–(19).
where F_k is the forecasted value, T_k is the target value, and N is the number of forecasted values.
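For illustration, the three measures can be computed as in the sketch below, which uses the commonly encountered forms with forecasts F_k, targets T_k, and N values. The sMAPE has several variants in the literature; the version here, which divides by the mean of |F_k| and |T_k| and reports a percentage, is an assumption and may differ from the form used in (17)–(19).

```python
import math


def smape(forecast, target):
    """Symmetric MAPE in percent (assumed variant: mean of |F| and |T| below)."""
    total = 0.0
    for f, t in zip(forecast, target):
        denom = (abs(f) + abs(t)) / 2.0
        # A zero denominator means f == t == 0, which contributes no error.
        total += abs(f - t) / denom if denom else 0.0
    return 100.0 * total / len(target)


def mae(forecast, target):
    """Mean absolute error, in the units of the data."""
    return sum(abs(f - t) for f, t in zip(forecast, target)) / len(target)


def rmse(forecast, target):
    """Root-mean-square error, in the units of the data."""
    squared = sum((f - t) ** 2 for f, t in zip(forecast, target))
    return math.sqrt(squared / len(target))


# Made-up values for illustration.
forecast = [110.0, 90.0]
target = [100.0, 100.0]
print(smape(forecast, target), mae(forecast, target), rmse(forecast, target))
```

Unlike the MAPE, the sMAPE denominator includes the forecast as well as the target, so a very small target value alone does not inflate the error.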
3.5. Statistical Significance Test
After model performance is measured, the results of different models may turn out not to be statistically different from each other. This means that although one model achieves a lower error than another, it does not necessarily outperform the model it is being compared to. A statistical test can be used to determine whether model results differ significantly. One such test is the t-test. The t-test uses the mean and the variance to check whether two samples come from the same population. The test calculates a significance value, also termed the p-value. A p-value below the chosen significance level means that the samples being compared differ significantly, and vice versa. A significance level of 0.05, which is commonly used in scientific studies, was used in this study. The statistical significance test was performed, for each technique, between the results with the lowest overall errors and the results with the lowest errors from Exp 1, Exp 2, Exp 3, Exp 4, and/or Exp 5.
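A test of this kind can be sketched as below. The text does not say which form of the t-test was applied, so the sketch assumes Welch's two-sample t-test on two sets of forecast errors; the function names and the sample values are hypothetical. The p-value is obtained by numerically integrating the Student's t density, so that only the Python standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), using Simpson's rule on the density over [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # `steps` must be even for Simpson's rule
    area = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return max(0.0, 1.0 - 2.0 * area)


def welch_t_test(a, b):
    """Return (t statistic, two-sided p-value) for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, two_sided_p_value(t_stat, df)


# Made-up absolute errors of two models, for illustration only.
errors_a = [0.10, 0.11, 0.09, 0.10, 0.12]
errors_b = [0.20, 0.21, 0.19, 0.22, 0.20]
t_stat, p_value = welch_t_test(errors_a, errors_b)
print("significantly different" if p_value < 0.05 else "not significantly different")
```

If the two sets of errors come from forecasts of the same days, a paired t-test on the per-day differences would be an alternative choice.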
6. Conclusions
This paper contributed to the body of knowledge about South African UCLF forecasting. (i) A novel study of the South African UCLF behavior using state-of-the-art AI (deep learning and ensemble) techniques was presented. LSTM-RNN, DBN, and OP-ELM models, and ensembles of these three techniques’ models, were investigated in South African UCLF forecasting. (ii) An investigation of the impact of the installed capacity, historic demand, and PCLF on the UCLF forecasting accuracy was presented. It was found that the installed capacity had the biggest impact on the UCLF forecasting error, with the exclusion of this variable doubling the errors for the respective techniques used. (iii) A novel deep-learning ensemble total South African UCLF forecasting system was introduced. It was found that an ensemble of LSTM models achieved the lowest errors, with an sMAPE of 6.43%, MAE of 7.36%, and RMSE of 9.21%. The lowest achieved LSTM model UCLF forecast errors were an sMAPE of 7.95%, MAE of 9.14%, and RMSE of 11.42%. The lowest achieved DBN model UCLF forecast errors were an sMAPE of 9.74%, MAE of 11.52%, and RMSE of 13.74%. The lowest achieved OP-ELM model UCLF forecast errors were an sMAPE of 10.21%, MAE of 11.57%, and RMSE of 14.65%. The lowest attained error was, thus, given by the ensemble model, followed by LSTM-RNN. The lowest error achieved by the non-deep learning technique was higher than the lowest errors achieved by the other techniques. Thus, ensemble deep learning techniques can be used to effectively forecast the total South African UCLF and, thus, load shedding.