1. Introduction
At present, China is formulating a national energy strategy that relies on a high proportion of renewable energy as the core means of achieving the national non-fossil energy development goals for 2020 and 2030 and of realizing a revolution in energy production and consumption. A high proportion of renewable energy in terminal power consumption has become the basic layout of the energy structure for 2050. The China Energy Bureau announced that by 2050 China will form an integrated energy system based on renewable energy, in which renewable energy accounts for more than 60% of terminal energy consumption, renewable energy generation accounts for more than 85% of total generation, the electrification of terminal energy consumption exceeds 50%, total electricity consumption rises to 13.5~15 trillion kWh, and per capita electricity consumption reaches 10,000~11,000 kWh [1]. The central role of electricity in achieving a high proportion of renewable energy development is objectively determined by the characteristics of electricity, resource endowments and energy development. Whether viewed from the relationship between electricity and other energy sources, or from the aspects of ensuring energy security, optimizing the energy structure, and promoting the construction of an ecological civilization, studying the proportion of renewable energy in electricity consumption is crucial to achieving the government’s strategic goals. In terms of significance, wind power, solar energy, and electricity have provided considerable macroeconomic and environmental benefits for achieving a high proportion of renewable energy development, which has also significantly replaced coal consumption. Through the high proportion of renewable energy development, the total emissions of the major atmospheric pollutants (SO2 and NOx) will continue to fall and will be kept within 2.5 million tons and 2.7 million tons, respectively, in 2050, so that the emission of major pollutants (including heavy metals such as mercury) returns to the 1980 level [2], thus fulfilling China’s responsibility for global environmental protection.
The energy consumption of society as a whole is directly or indirectly affected by many factors, and many scholars have studied the factors affecting energy consumption [3,4,5,6]. Electricity consumption forecasting is a complex non-linear problem, for which scholars have proposed various prediction models, such as grey theory [7,8,9], multiple regression [10,11,12], and time series models [13,14,15]. In recent years, various intelligent algorithms have also been applied to power consumption prediction [16,17,18,19,20,21,22].
Meng et al. [3] proposed a three-dimensional decomposition model and a mixed trend extrapolation model to explore the factors driving the growth of household electricity consumption in China, predicted the development trend up to 2030, and summarized the main influencing factors. Akay et al. [4] used the Grey Prediction with Rolling Mechanism (GPRM) method to predict Turkey’s total and industrial electricity consumption, adopting both social and economic factors in the forecast. Castillo et al. [5] used a unified data set of 13 household income and expenditure surveys to assess changes in electrical and electricity consumption, taking income distribution, GDP, population, etc. as indicators for the impact assessment. Pablo-Romero [6] analyzed the relationship between electricity consumption and tourism growth in hotels and restaurants in 11 EU countries between 2005 and 2012, and modelled energy use based on three variables: energy price, income and climate. The results showed that both income and climate have a significant impact on electricity consumption, while energy prices have no effect.
Meng et al. [7] proposed an improved grey model GM(1,1) that combines residual correction with artificial neural network sign estimation, and successfully predicted the power consumption in Taiwan. The case results showed that the improved grey prediction model had higher prediction accuracy. Wang et al. [8] considered the stability of power consumption prediction to be more important than its accuracy. Therefore, they proposed a hybrid prediction model based on an improved grey prediction model optimized by a multi-objective ant colony optimization algorithm to improve the prediction stability. Chiang et al. [9] combined a neural network with grey theory to predict electrical loads. The proposed grey correlation analysis can select the most effective influencing factors, which makes the results better than those of single-model schemes and statistical autoregressive methods.
Mikayilov et al. [10] used a time-varying coefficient cointegration method to study how the response of electricity demand to income and price changes over time, and proposed policy recommendations on income and price for electricity consumption in Azerbaijan. Azadeh [11] estimated and predicted power consumption in an uncertain environment with a fuzzy regression algorithm consisting of 16 fuzzy regression models, and used historical data from Iran to demonstrate the algorithm’s superiority. Mohamed [12] selected economic and demographic variables as influencing factors to analyze the changing characteristics of New Zealand’s annual electricity consumption, and developed a multiple linear regression model to predict the changes in New Zealand’s electricity consumption. The experimental results showed that electricity consumption is significantly related to all of the variables.
Azadeh et al. [13] used a simulation-based integrated fuzzy regression time series model to estimate and predict seasonal and monthly changes in power demand in developing countries such as China and Iran. The case results demonstrated the effectiveness and accuracy of the model. Kumar [14] used three time series models to predict India’s conventional energy consumption: a grey Markov model for crude oil consumption, a grey model with rolling mechanism for coal and utility electricity consumption, and singular spectrum analysis (SSA) for natural gas consumption. A comparison with the Indian Planning Commission’s projections indicated that these time series models can be considered a viable alternative for energy consumption prediction. Hussain [15] applied the Holt-Winters and Autoregressive Integrated Moving Average models to secondary time series data from 1980 to 2011 to predict Pakistan’s total and component electricity consumption.
In recent years, many scholars in the field of energy research have studied machine learning and other intelligent algorithms for prediction. Li et al. [16] used a new meta-heuristic algorithm, the Drosophila optimization algorithm (DOA), to determine the values of two parameters of the least squares support vector machine (LSSVM). Based on this, an annual power load forecasting model was constructed. Meng et al. [17] considered the importance of monthly power consumption forecasting for the generation and distribution planning of electric utilities, and used the discrete wavelet transform to derive three relatively simple sequences, which represent an ascending trend and periodic waves, respectively. Kandananond [18] forecasted power demand from Thailand’s population, GDP, stock index, exports, etc., and compared the performance of prediction models including the autoregressive integrated moving average (ARIMA), artificial neural network (ANN) and multiple linear regression (MLR). The results showed that the ARIMA and MLR models are preferable to ANN because of their simple structure. Zhao [19] proposed a new hybrid power consumption prediction method, namely the grey model GM(1,1) optimized by the moth flame optimization (MFO) algorithm with a rolling mechanism. The case study demonstrated the better performance of the proposed method. Ma et al. [20] proposed a new power load forecasting method based on fuzzy reasoning and an artificial neural network, and verified that the method can improve the prediction accuracy. Liang [21] proposed a prediction model in which an improved fruit fly algorithm optimizes the parameters of the support vector machine (SVM) to improve the prediction accuracy. Wang et al. [22] used support vector regression optimized by a differential evolution algorithm to predict power consumption.
The grid-connected operation of renewable energy has been studied by many scholars. The factors affecting the integration of renewable energy can be divided into technology, operational cost-effectiveness, and power system planning.
Eissa et al. [23] noted that, with various renewable energy sources as well as information and communication technology components, the grid will become more complex. Based on this, a method based on a wide area monitoring system (WAMS) was developed to solve technical difficulties in connecting renewable energy to smart grids. Dominguez-Navarro et al. [24] held that renewable energy and storage systems can effectively increase the profitability of electric vehicles (EVs) and reduce the high energy demand placed on the grid. The Monte Carlo method was used to simulate electric vehicle demand and renewable energy generation. Denholm et al. [25] argued that, for a large share of variable generation (VG) resources, system flexibility can be improved by supporting measures such as changes in grid operation and the deployment of energy storage, and simulated scenarios with three different proportions of wind and solar power generation. Bornapour [26] proposed a stochastic model for the coordinated scheduling of renewable heat units in renewable energy power dispatching, considering proton exchange membrane fuel cells, wind and photovoltaics, etc., and then used the modified teaching-learning-based optimization (MTLBO) algorithm to solve the problem. Emanuele et al. [27] believed that the integration of variable renewable energy (VRE) improves the flexibility and decentralization of power systems, and that electric vehicles (EVs) can increase the integration of VRE and capture potential advantages for power systems. Angenendt et al. [28] considered the economics of grid connection from the perspective of residential photovoltaic battery energy storage. Operational strategies were evaluated by simulating DC-coupled PV and battery systems, and were expected to reduce the levelized cost of electricity by 12%.
In recent years, more and more scholars have applied the principle of signal decomposition to prediction and decision-making, using time series decomposition techniques from signal science to decompose the original signal sequence into several sub-sequences. Among these techniques, empirical mode decomposition (EMD) and wavelet decomposition are commonly used. An et al. [29] used EMD to decompose wind farm power into several intrinsic mode function (IMF) components and a residual component, and used different models to predict each component. The results showed that the decomposed series were more suitable for short-term wind farm power prediction. Kim et al. [30] used feature decomposition for deep learning to decompose the load profile into a weekly load profile, and then trained a long short-term memory (LSTM) network model with three-step regularized three-dimensional input data to predict the demand-side load. The experimental results showed the validity of the proposed model. Pang et al. [31] analyzed the original vibration signal of a rotor with a joint time-frequency method combining improved singular spectrum decomposition (ISSD) and the Hilbert transform (HT). Xie et al. [32] proposed a method based on modified ensemble empirical mode decomposition (MEEMD) to decompose deformation time series into a series of subsequences with significantly different complexity, and then established an approximation for each new subsequence. Xiao et al. [33] obtained the intrinsic mode functions (IMFs) by improved empirical mode decomposition (IEMD). The Particle Swarm Optimization (PSO) algorithm was used to optimize the LSSVM algorithm to accurately identify the misalignment type of a large doubly-fed wind turbine (DFWT). Zhao et al. [34] used correlation coefficient analysis to determine three improved IMFs that were close to the original signal, and then used multi-scale fuzzy entropy to calculate the entropy of the IMFs.
With the wide application of intelligent algorithms, more and more scholars apply them to forecasting and decision making in various fields. The extreme learning machine (ELM) is one of the most widely used intelligent prediction algorithms, with high accuracy and applicability. Many scholars have studied the parameter optimization of ELM and the choice of activation function for its single hidden layer. Li et al. [35] proposed using the kernel function of SVM instead of the connection weight matrix between the hidden layer and the output layer in the original ELM algorithm. Li et al. [36] proposed a new Laplacian twin extreme learning machine (LapTELM), which fully exploits large numbers of unlabeled samples while preserving the learning power and efficiency of the twin extreme learning machine (TELM). Fang et al. [37] introduced a hierarchical ELM framework for multimodal data and demonstrated that ELM has better learning efficiency than gradient-based multimodal deep learning methods. Shang et al. [38] developed a new predictive model based on the classification and regression tree (CART) and the ensemble extreme learning machine (EELM) method, which improved the accuracy of hourly PM2.5 concentration prediction. Ming et al. [39] proposed two parallel variants of ELM, local data and model parallel ELM (LDMP-ELM) and global data and model parallel ELM (GDMP-ELM), and used parallel technology to improve the parallelism and scalability of ELM.
In order to accurately predict the amount and proportion of China’s renewable energy terminal power consumption, this paper proposes a combined forecasting model. We optimize the original ELM model with the inverse square root linear unit (ISRLU) activation function and name the result the improved extreme learning machine (IELM) algorithm. Based on EMD and the bacterial foraging algorithm (BFO), the combined EMD-BFO-IELM forecasting model is proposed to predict the amount and proportion of renewable energy power consumption in China. The remainder of the article is organized as follows. The second part introduces the mathematical principles of the EMD, BFO and IELM algorithms and presents the flow chart of the overall forecasting model. In the third part, the proposed EMD-BFO-IELM model is applied to predict China’s renewable energy terminal power consumption. A comparison with the IELM and BFO-IELM models shows that the accuracy and training speed of the EMD-BFO-IELM model are better than those of the others. Finally, we apply this model to predict China’s renewable energy terminal power consumption from 2018 to 2030 and explore how it changes over time. The fourth part presents further discussion and forward-looking conclusions.
2. Forecasting Model Including Materials and Methods
2.1. Empirical Mode Decomposition
The empirical mode decomposition (EMD) algorithm converts an irregular, multi-frequency signal into a number of single-frequency waves plus a residual wave. The basic principle of EMD is to determine the “instantaneous equilibrium position” from the average of the upper and lower envelopes in order to extract the intrinsic mode functions (IMFs); that is, a complex signal is decomposed into a finite number of IMFs and a residual. Each IMF component contains the local characteristic signals of the original signal at a different time scale, so the characteristics of the original data are preserved as far as possible. The IMFs are nearly orthogonal to each other and together represent the original signal well, and the residual is a very smooth trend sequence. Therefore, EMD can linearize and smooth a non-stationary data sequence.
The specific steps of the EMD algorithm are as follows:
(1) First, determine all local maxima and minima of the original signal $x(t)$; then use cubic spline interpolation to determine the upper and lower envelopes $e_{\max}(t)$ and $e_{\min}(t)$. Finally, calculate the mean curve of the upper and lower envelopes as:

$$m_1(t) = \frac{e_{\max}(t) + e_{\min}(t)}{2} \tag{1}$$

Find the difference between the original signal and the envelope mean:

$$h_1(t) = x(t) - m_1(t) \tag{2}$$

If $h_1(t)$ does not meet the two conditions of an IMF (the numbers of extrema and zero crossings differ by at most one, and the mean of the upper and lower envelopes is zero everywhere), treat $h_1(t)$ as the original signal and repeat the above steps to get:

$$h_{11}(t) = h_1(t) - m_{11}(t) \tag{3}$$

This step is repeated $k$ times until $h_{1k}(t)$ becomes an IMF, called the first-order IMF, which is recorded as:

$$c_1(t) = h_{1k}(t) \tag{4}$$

(2) Subtracting $c_1(t)$ from the original signal yields the first-order residual signal $r_1(t)$. Since $r_1(t)$ still contains longer-period components, the same sifting is applied to $r_1(t)$. Thus, the second-order IMF, ..., the $n$-th order IMF and the second-order residual signal, ..., the $n$-th order residual signal are obtained in turn. This process can be expressed as:

$$r_1(t) = x(t) - c_1(t), \quad r_2(t) = r_1(t) - c_2(t), \quad \ldots, \quad r_n(t) = r_{n-1}(t) - c_n(t) \tag{5}$$

When $r_n(t)$ becomes a monotonic function, the sifting ends. Then Equation (6) is obtained:

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t) \tag{6}$$

In the formula, $r_n(t)$ represents the average trend of the signal, which means that the initial sequence is equal to the sum of the intrinsic mode functions and the residual term.
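The sifting procedure above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (`local_extrema`, `envelope`, `sift`, `emd`) are hypothetical, the envelopes use piecewise-linear interpolation instead of the cubic splines named in the text, and the IMF test is replaced by a fixed number of sifting passes (`n_sift`), all of which are simplifying assumptions.

```python
import math

def local_extrema(x):
    """Return the indices of interior local maxima and minima of the list x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    Assumption: linear interpolation stands in for the cubic spline,
    and the two end points of the signal are used as extra anchors.
    """
    knots = [0] + idx + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            env[i] = (1 - w) * x[a] + w * x[b]
    return env

def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly subtracting the envelope mean."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]  # envelope mean
        h = [hi - mi for hi, mi in zip(h, mean)]            # signal minus mean
    return h

def emd(x, max_imfs=5):
    """Decompose x into a list of IMF candidates and a residual."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 2:  # residual is (nearly) monotonic
            break
        c = sift(residual)
        imfs.append(c)
        residual = [r - ci for r, ci in zip(residual, c)]
    return imfs, residual

# Two superimposed oscillations plus a slow linear trend.
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) + 0.01 * t for t in range(200)]
imfs, residual = emd(signal)
# Summing all components and the residual recovers the original series.
rebuilt = [sum(c[t] for c in imfs) + residual[t] for t in range(len(signal))]
```

Whatever envelope method is used, the components and the residual always sum back to the original series, which is the property the forecasting model relies on when the component forecasts are added together.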
2.2. Improved Learning Function of the Extreme Learning Machine
ELM is a feedforward neural network learning algorithm. The algorithm has a good global search ability, and once the parameters of the algorithm are confirmed, no adjustment is needed during the training. Compared with other machine learning algorithms, ELM has the advantages of high learning efficiency and good generalization performance.
For the training samples $\{(x_i, t_i)\}_{i=1}^{N}$ in this article, let the ELM have $u$ input nodes, $L$ hidden layer nodes and $q$ output nodes, and let the activation function be $g(x)$; then the network output can be expressed as:

$$o_i = \sum_{j=1}^{L} \beta_j \, g(\omega_j \cdot x_i + b_j), \quad i = 1, 2, \ldots, N$$

In the formula, $\omega_j$ is the weight vector between the input nodes and the $j$-th hidden node, $b_j$ is the threshold of the $j$-th hidden layer node, and $\beta_j$ is the weight vector between the $j$-th hidden node and the output nodes.
The activation function $g(x)$ is a key factor affecting network performance in ELM. An appropriate activation function can improve the accuracy and generalization of ELM. In current research, the Sigmoid function is commonly used as the traditional hidden layer activation function in ELM; it is a discriminant function with two-sided suppression. However, when the generalized Hop-world problem is encountered and the approximated value function is monotonic, two-sided suppression adds wasted computation [40]. In that case, one-sided suppression is sufficient to complete the value discrimination. In addition, the rectified linear function is widely used in the field of deep learning as a newer type of activation function [40]; the rectified linear unit (ReLU) is defined as:

$$g(x) = \max(0, x)$$
The ReLU function is simple in form, fast to compute, and generalizes better than Sigmoid, but its forced sparsity (all negative inputs are mapped to zero) can reduce the predictive ability and the average performance of the network. In this paper, the inverse square root linear unit (ISRLU) is proposed as the activation function of the ELM algorithm. ISRLU is a smooth nonlinear counterpart of ReLU: it is continuous and differentiable, is closer to the biological activation model than the Sigmoid function, avoids the forced sparsity of ReLU, and improves the average performance of the network. The function is defined as:

$$g(x) = \begin{cases} x, & x \ge 0 \\ \dfrac{x}{\sqrt{1 + \alpha x^{2}}}, & x < 0 \end{cases}$$

In the formula, α is the parameter of the ISRLU function.
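To make the difference between the two activation functions concrete, the following sketch evaluates ReLU and ISRLU side by side. It assumes the standard ISRLU form from the literature (identity for non-negative inputs, $x/\sqrt{1+\alpha x^2}$ for negative inputs); the default `alpha=1.0` is an illustrative choice, not a value taken from the paper.

```python
import math

def relu(x):
    """Rectified linear unit: negative inputs are clipped to zero."""
    return max(0.0, x)

def isrlu(x, alpha=1.0):
    """Inverse square root linear unit (standard form, assumed here).

    Identity for x >= 0; for x < 0 the output is smooth and bounded
    below by -1/sqrt(alpha), so negative inputs are not zeroed out.
    """
    if x >= 0:
        return x
    return x / math.sqrt(1.0 + alpha * x * x)

for value in (-5.0, -1.0, 0.0, 2.0):
    print(value, relu(value), isrlu(value))
```

For negative inputs ReLU returns zero, whereas ISRLU still returns a small negative value, so hidden nodes keep a non-zero response and gradient on that side.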
2.3. Bacterial Foraging Algorithm
The bacterial foraging algorithm (BFO, also abbreviated BFOA) was proposed by Passino in 2002 as a bio-inspired algorithm based on the foraging behavior of E. coli in the human large intestine. In the original publication, Passino mentioned that the algorithm can be used in automatic control, for example the adaptive control of autonomous vehicles. After several years of research, the bacterial foraging algorithm is now applied in more fields, such as power systems, control engineering and power forecasting. To solve a specific problem, the algorithm generates an initial population of solutions, calculates the value of the evaluation function, optimizes iteratively through the interaction mechanisms of the group, and applies the three main operators of chemotaxis, reproduction and migration to reach the optimal solution. The general bacterial foraging algorithm is divided into four processes: the chemotaxis operation, the aggregation operation, the copy (reproduction) operation, and the migration operation.
2.3.1. Chemotaxis Operation
The chemotaxis operation consists of two basic actions: tumbling (flipping) and swimming. When a bacterium encounters a favorable, nutrient-rich area, it continues to swim in the same direction. If the new area is worse than the one at the previous step, it tumbles and changes its swimming direction. Each move of a bacterium to a new position represents a new set of optimization parameters. The individual fitness is calculated at this position to obtain the value $J(i, j, k, l)$, which is used as the indicator for the next move. Here $i$ is the index of the individual, $j$ is the index of the chemotaxis operation, $k$ is the index of the copy operation, and $l$ is the index of the migration operation. The chemotaxis operation of the $i$-th bacterium is expressed as follows:

$$\theta^{i}(j+1, k, l) = \theta^{i}(j, k, l) + C(i)\,\frac{\Delta(i)}{\sqrt{\Delta^{T}(i)\,\Delta(i)}}$$

In the formula, $\theta^{i}(j, k, l)$ is the position of the $i$-th bacterium, $C(i)$ is the step size of the bacterium, and $\Delta(i)$ is a random direction vector whose elements are random numbers in [−1, 1].
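The tumble-and-swim rule can be sketched as follows. This is an illustrative reading of the chemotaxis step, not the authors' code: the function names, the cost function `sphere`, the swim limit `max_swim` and the convention that a lower cost means a better position are all assumptions introduced for the example.

```python
import math
import random

def tumble(dim, rng):
    """Draw a random direction with elements in [-1, 1] and normalize it."""
    delta = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in delta))
    return [d / norm for d in delta]

def chemotaxis_step(theta, cost, step=0.1, max_swim=4, rng=random):
    """One chemotaxis step for a single bacterium (minimization assumed).

    The bacterium tumbles to a new unit direction, moves one step, and
    then keeps swimming in that direction while the cost keeps falling,
    for at most max_swim additional steps.
    """
    direction = tumble(len(theta), rng)
    last_cost = cost(theta)
    theta = [t + step * d for t, d in zip(theta, direction)]  # move after tumble
    current = cost(theta)
    swims = 0
    while swims < max_swim and current < last_cost:
        last_cost = current
        theta = [t + step * d for t, d in zip(theta, direction)]  # swim
        current = cost(theta)
        swims += 1
    return theta, current

def sphere(p):
    """Hypothetical test cost: squared distance from the origin."""
    return sum(v * v for v in p)

rng = random.Random(0)
start = [1.0, 1.0]
new_pos, new_cost = chemotaxis_step(start, sphere, step=0.1, max_swim=4, rng=rng)
```

Because the direction is normalized, the distance travelled in one call is always a whole multiple of the step size, between one step (tumble only) and `max_swim + 1` steps.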
2.3.2. Aggregation Operations
While the bacterial colony searches for food, each bacterium interacts with the others through attraction and repulsion. Attraction makes the individuals gather, so that the behavior of the group is maintained, while repulsion lets each individual keep its own position, obtain energy and stay alive. The bacterial foraging algorithm describes this aggregation operation with the following function:

$$J_{cc}\big(\theta, P(j, k, l)\big) = \sum_{i=1}^{S}\left[-d_{attract}\exp\left(-w_{attract}\sum_{m=1}^{p}\left(\theta_m - \theta_m^{i}\right)^{2}\right)\right] + \sum_{i=1}^{S}\left[h_{repellent}\exp\left(-w_{repellent}\sum_{m=1}^{p}\left(\theta_m - \theta_m^{i}\right)^{2}\right)\right]$$

In the formula, $S$ is the number of bacteria, $p$ is the dimension of the search space, $d_{attract}$ is the depth of the attractant released by the bacteria, $w_{attract}$ is a measure of the width of the attractant, $h_{repellent}$ is the height of the repellent, and $w_{repellent}$ is the width of the repellent. These parameters are mainly selected according to the richness of the food. The aggregation behavior only accelerates the convergence of the bacterial foraging algorithm, while making its application more complicated. Passino noted when publishing the algorithm that this step can be omitted, and later researchers who explored the bacterial foraging algorithm rarely applied it, so this paper does not discuss the principles these parameters should follow.
2.3.3. Copy Operation
Biological evolution follows the rule of survival of the fittest: in bacterial foraging, individuals with strong adaptability survive and weak ones are eliminated. After the chemotaxis operations are completed, the health value of each individual (the sum of its fitness function values over all chemotaxis steps) is obtained, and the positions of healthier bacteria represent better optimization parameters. To speed up the search, the bacteria should keep searching around these good positions, while the poor positions are eliminated. This is the copy operation; given $k$ and $l$, for each $i = 1, 2, \ldots, S$:

$$J^{i}_{health} = \sum_{j=1}^{N_c + 1} J(i, j, k, l)$$

In the formula, $N_c$ is the number of chemotaxis steps. The formula gives each individual’s health value. The individuals are ranked by health value; the healthier half is retained and the other half is eliminated, that is, $S_r = S/2$ bacteria survive. Each surviving bacterium splits into two at the same position, so that the total number of bacteria stays constant and nutrient-rich optimal positions are found more quickly, which improves the efficiency of convergence.
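The copy operation can be sketched in a few lines. The sketch assumes a minimization setting in which the health value is an accumulated cost (so a lower value means a healthier bacterium) and an even population size; the function name `reproduce` is hypothetical.

```python
def reproduce(positions, health):
    """Keep the healthier half of the population and split each survivor in two.

    positions: list of position vectors, one per bacterium.
    health:    accumulated cost of each bacterium over its chemotaxis steps
               (assumption: lower accumulated cost = healthier).
    """
    s = len(positions)
    order = sorted(range(s), key=lambda i: health[i])  # healthiest first
    survivors = [positions[i] for i in order[: s // 2]]
    # Each survivor is duplicated at the same position, so the size stays s.
    return [list(p) for p in survivors] + [list(p) for p in survivors]

population = [[0.0], [1.0], [2.0], [3.0]]
accumulated_cost = [5.0, 1.0, 7.0, 2.0]
next_population = reproduce(population, accumulated_cost)
```

In this example the bacteria at positions 1.0 and 3.0 have the lowest accumulated cost, so the next population consists of two copies of each.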
2.3.4. Migration Operation
The migration operation is carried out with a preset probability. Each bacterium generates a random number rand(). If the given migration probability is greater than this random number, the bacterium is eliminated and a new bacterium is generated at a random position in the solution space, so that the total bacterial population remains unchanged. The randomly generated individual may be closer to the global optimum, which helps the search escape the premature convergence and stagnation at local optima that can occur during the chemotaxis operation.
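A minimal sketch of the migration rule is given below. The function name `migrate`, the probability `p_ed` and the box-shaped search region `bounds` are illustrative assumptions rather than settings from the paper.

```python
import random

def migrate(positions, bounds, p_ed=0.25, rng=random):
    """Replace each bacterium, with probability p_ed, by a random new one.

    bounds: list of (low, high) pairs, one per dimension of the search space.
    The population size is unchanged: an eliminated bacterium is replaced
    by a new one drawn uniformly inside the bounds.
    """
    new_positions = []
    for p in positions:
        if rng.random() < p_ed:  # migration probability exceeds rand()
            p = [rng.uniform(lo, hi) for lo, hi in bounds]
        new_positions.append(list(p))
    return new_positions

rng = random.Random(1)
population = [[0.5, 0.5], [1.5, -0.5], [-1.0, 2.0]]
bounds = [(-2.0, 2.0), (-2.0, 2.0)]
moved = migrate(population, bounds, p_ed=0.25, rng=rng)
```

Setting `p_ed` to 0 leaves the population untouched, while setting it to 1 redraws every bacterium, which makes the effect of the parameter easy to check.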
Due to space limitations, only the flow chart of the migration operation in the bacterial foraging algorithm is shown in Figure 1.
The flow chart of the bacterial foraging process is shown in Figure 2.
2.4. Renewable Energy Power Consumption Forecasting Model Design Process
The initial parameters of the traditional ELM network model are random, and the suitability of the activation function is not considered. The bacterial foraging algorithm can mitigate oscillation around the optimum and premature convergence, and determine the optimal weights and thresholds, while the ISRLU function improves the generalization of ELM. In addition, the original data of the influencing factors tend to be homogeneous, and unless the information at different scales is separated, the time-frequency characteristics of the time series data cannot be fully revealed, which affects the performance of the forecasting model. For these reasons, this paper combines the three algorithms of EMD, BFO and IELM to propose a new forecasting model of renewable energy terminal power consumption. The overall forecasting steps are as follows:
(1) Time series data decomposition. Decompose $x(t)$ with EMD to obtain the IMF components and one residual $r_n$.
(2) Construct training and test sample sets. For each component, construct the inputs and outputs of its training sample set and test sample set.
(3) Construct an optimized extreme learning machine training and forecasting model for each component. Calculate the fitness function of the bacterial foraging algorithm; set the initial population size and the maximum number of iterations maxgen; perform the chemotaxis, copy and migration operations of Section 2.3 on the individuals in the population until the globally optimal fitness is found; and use the optimal fitness to obtain the optimal weight $a_{best}$ and threshold $b_{best}$.
(4) Set the activation function of the ELM network to the ISRLU function, and then calculate the hidden-layer output matrix $H$ and the output weight $\beta$ of the ELM with $a_{best}$ and $b_{best}$. Determine the IELM network structure; use the BFO algorithm to iteratively optimize the parameters of each IELM model. An IELM fitting prediction model with the optimal parameters is established for each IMF component and for the remainder $r_n$ to obtain the forecasting results of each component:

$$\hat{\beta} = H^{+} Y, \qquad \hat{y}_t = h(x_t)\,\hat{\beta}$$

In the formula, $Y$ is the vector of actual values $y_t$ at time $t$, $\hat{y}_t$ is the predicted value at time $t$, $h(x_t)$ is the hidden-layer output for the input $x_t$, and $H^{+}$ is the generalized inverse matrix of the output matrix $H$.
(5) Output of prediction results. The predicted results of each IMF component and remainder are summed to obtain the final forecasting result of China’s renewable energy terminal power consumption.
The algorithm flow of the EMD-BFO-IELM forecasting model of renewable energy terminal power consumption proposed in this paper is shown in Figure 3.
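The IELM fitting in step (4) can be sketched in plain Python as follows. This is a sketch under stated assumptions rather than the paper's implementation: the hidden weights and thresholds are drawn at random here instead of being tuned by BFO, the generalized inverse is replaced by a small ridge-regularized normal-equation solve, and the names `solve`, `hidden`, `train_ielm`, `predict`, `n_hidden` and `ridge` are all hypothetical.

```python
import math
import random

def isrlu(x, alpha=1.0):
    """ISRLU activation (standard form): identity for x >= 0."""
    return x if x >= 0 else x / math.sqrt(1.0 + alpha * x * x)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def hidden(xs, w, b):
    """Hidden-layer output matrix H: one row per sample, one column per node."""
    return [[isrlu(sum(wi * xi for wi, xi in zip(wj, x)) + bj)
             for wj, bj in zip(w, b)] for x in xs]

def train_ielm(xs, ts, n_hidden=8, ridge=1e-3, rng=random):
    """Fit output weights beta for random hidden weights w and thresholds b."""
    dim = len(xs[0])
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = hidden(xs, w, b)
    # (H^T H + ridge I) beta = H^T t, a regularized stand-in for beta = H^+ t.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [sum(row[i] * t for row, t in zip(h, ts)) for i in range(n_hidden)]
    return w, b, solve(hth, htt)

def predict(xs, w, b, beta):
    """Network output: hidden-layer response times the output weights."""
    return [sum(hj * bj for hj, bj in zip(row, beta)) for row in hidden(xs, w, b)]

# Toy regression target, used only to exercise the functions.
rng = random.Random(0)
xs = [[i / 10.0] for i in range(-10, 11)]
ts = [x[0] ** 2 for x in xs]
w, b, beta = train_ielm(xs, ts, rng=rng)
pred = predict(xs, w, b, beta)
sse = sum((p - t) ** 2 for p, t in zip(pred, ts))
```

In the combined model, one such network would be fitted per IMF component and for the residual, with BFO searching over `w` and `b` instead of drawing them once at random, and the component forecasts would then be summed as in step (5).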
4. Conclusions
In this paper, grey relational analysis (GRA) is applied to screen the factors influencing China’s renewable energy terminal power consumption. On this basis, a new EMD-BFO-IELM forecasting model of renewable energy terminal power consumption is proposed. First, we use EMD to decompose and denoise the original historical series of renewable energy terminal power consumption and remove the noise sequence, which improves the quality of the original data and increases the amount of data in the training and test sets. The data series therefore meets the data requirements of machine intelligence algorithms, which makes it possible to apply such algorithms to the prediction of China’s renewable energy terminal power consumption. Then, we use the BFO algorithm to optimize the parameters of the ELM algorithm, including the optimal weight $a_{best}$ and threshold $b_{best}$. This novel BFO-IELM forecasting model is applied to predict the sub-sequences obtained after EMD denoising. Finally, we superimpose the predicted values of the sub-sequences to reconstruct the prediction series and obtain the prediction results of renewable energy terminal power consumption. To show the effectiveness of the proposed forecasting model, commonly used statistical indicators are used to compare the accuracy of the IELM, BFO-IELM and EMD-BFO-IELM models. The comparison verifies that the EMD-BFO-IELM forecasting model proposed in this paper is far more accurate than the others, and the empirical analysis supports its generalization ability and robustness. The main reason for the improvement in prediction accuracy is that forecasting China’s renewable energy terminal power consumption is a complex non-linear problem in a new research field, where the lack of historical data makes traditional methods ineffective.
Because the training process of any intelligent algorithm needs a large amount of data, this paper introduces signal decomposition into the forecasting of renewable energy terminal power consumption. The decomposition enlarges the original data volume and increases the reliability of the training process, so the proposed combined model enables machine intelligence algorithms to be applied to China’s renewable energy terminal power consumption, which is the main advantage of the proposed prediction model. Although some computational speed is lost, the novel forecasting model ultimately achieves higher prediction accuracy. Finally, the EMD-BFO-IELM forecasting model proposed in this paper is applied to predict China’s renewable energy terminal power consumption from 2018 to 2030. The results show that China will realize 3.30 billion kWh of renewable energy terminal power consumption in 2030, and that the share of renewable energy in China’s terminal power consumption will exceed 38%. This indicates that China has great potential for renewable energy terminal power consumption, can fulfill its non-fossil energy development goals for 2030, and can achieve the goal of an energy production and consumption revolution. A high proportion of renewable energy in terminal energy consumption can transform China’s current unsustainable mode of energy consumption and supply and end its heavy reliance on fossil energy. It also puts cost pressure on China to a certain extent, since grid-connection infrastructure for renewable energy generation, renewable energy generation itself, and upgrades of energy storage technology all require large investments.
This will lead to an increase in China’s overall average cost of power generation in the short term, but the cost will also bring high external benefits, including the upgrading and transformation of the power and energy industries and the reduction of environmental pollution. From an economic perspective, the shift of investment towards terminal power and energy indicates that a large number of employment opportunities will be created in the future, thus making up for the current loss of employment opportunities in the supply chain of China’s traditional coal industry. In general, all results presented in this paper are in line with China’s current active energy innovation strategy.