1. Introduction
A new round of power system reform is currently deepening, and major changes in the supervision and profit models of power grid enterprises are placing sustained pressure on them. Because the way enterprise income is accounted for affects profits [1], how to further improve the efficiency of capital utilization and the level of capital control under the operating mode of “reducing costs and improving benefits” has become a research priority for power grid enterprises. Power grid construction projects, as important national power supply carriers, are capital-intensive and technology-intensive and therefore require substantial investment from power grid enterprises. Under the new round of power system reform, operating pressure makes it necessary for these enterprises to further improve the lean management of engineering investment and the accuracy of engineering cost analysis. More accurate project cost prediction is therefore needed to reduce investment costs and provide a basis for enterprise investment decisions [2].
As an important part of the power system, substation engineering has pronounced characteristics of integration and uncertainty [3]. Substation project cost refers to the total cost of the investment management activities required to invest in and build a substation. Engineering type, influencing factors, and construction time all directly affect substation project cost. As the planning and construction scale of power transmission and transformation projects grows, most enterprises have accumulated large volumes of engineering construction cost records and data. If these data are merely stored in a database, they cannot be applied to the construction of power grid projects, and enterprise data resources are wasted.
Therefore, accurate prediction of substation project investment helps to improve the economic efficiency of substation construction projects, optimize substation project cost control, make full and effective use of cost data, and provide scientific decision support for the investment management of power grid enterprises [4].
In their research on substation engineering, most scholars have found that an important reason why cost control is difficult to implement is the high-dimensional and dynamic nature of the influencing factors [5]; much of the literature has therefore made the influencing factors the focus of cost control for substation construction projects. To meet the requirements of refined cost management for transmission and substation projects, Chen et al. [6] used multiple regression to find the scale factors that most influence the cost of such projects; by analyzing the factors affecting cost, they distinguished scale factors, market factors, external environment factors, design technology, and construction standards, and clarified the key factors for assessing whether a cost is reasonable. Lu et al. [7] focused on the external influences on transmission and substation projects, but did not validate the proposed theory with examples. Starting from the assembly-type system, Sun et al. [8] analyzed the factors influencing the cost of assembly-type substations. Wang and Ding [9] analyzed the key factors influencing the principal cost of power transmission projects, classified them as linear or nonlinear, and then established a regression model for cost prediction. A review of these studies shows that few have predicted substation engineering investment independently. In addition, the process of establishing the factors that influence substation project cost has been relatively rough, and no systematic approach to determining them has been adopted. This paper therefore adopts a reasonable index selection method, starts from the technical factors, based on cost composition, that strongly influence substation project cost, and further analyzes their influence on total cost.
Because substation engineering has many influencing factors, scholars have also tried to improve prediction accuracy through different prediction methods. Mathematical prediction methods such as the fuzzy mathematics principle [10] and regression analysis [11] were widely used in early work because they are simple to compute and easy to understand. However, given the complexity and uncertainty of the influencing factors of substation engineering, such methods handle complex nonlinear problems poorly. As forecasting methods have developed, modern methods such as the grey forecasting method [12,13,14], the time series method [15], the system dynamics method [16], and the combined prediction method [17] have gradually been applied. With increasingly sophisticated computer technology, combining mathematical principles with computing has made forecasting methods intelligent, and the advantages of intelligent algorithms have gradually become apparent. Intelligent algorithms are a form of “soft computing” that simulates the habits, behavior, and bodily functions of animals to obtain algorithmic models. SVM [18,19], neural networks [20,21], other intelligent algorithms, and their optimized variants [22] have achieved good research results in recent years. In [23,24], the authors optimized the parameters of the SVM prediction model from different perspectives. Peng et al. [23] used an adaptive particle swarm algorithm to optimize the parameter settings of a support vector machine so that the parameters fit the engineering data well, and verified the proposed APSO-SVM prediction model with an example. After optimizing the model with the firefly algorithm, Song et al. [24] validated the prediction model on 220 kV substation engineering data. Lin et al. [25] introduced PCA to reduce the dimensionality of substation project cost data and then used the particle swarm optimization algorithm to optimize the model parameters, providing a reliable basis for substation project investment decisions. Guo et al. [26] used a genetic algorithm to adjust the weights and thresholds of a BP neural network, following the principle of error back propagation, and demonstrated the value of this method for cost prediction using an engineering cost index example. The authors of [27] proposed a combined model based on SVR optimized by particle swarm optimization to predict transmission project cost in a region, and the results showed that the combined model is effective for estimating and reviewing the cost of transmission projects. Based on past data, Duan et al. [28] used a BP neural network to predict project investment accurately from a nonlinear perspective. Although research on intelligent algorithms is relatively extensive, the wide range of factors influencing power grid projects places high demands on the prediction method. Among the many intelligent algorithms, the BP neural network has good organization and self-adaptability. Optimizing the BP neural network can further capture the nonlinear relationship between influencing factors and project cost, thus providing a new idea for substation project cost prediction.
Scholars have thus studied the prediction of substation project cost. However, because internal and external, dynamic and static factors all play a part, the complexity of the influencing factors makes prediction more difficult [29]. Existing research on mining technical indexes of engineering cost is relatively rough. At the same time, although some intelligent prediction methods have improved prediction techniques, some algorithms are still unsuited to nonlinear problems and easily fall into local optima, which limits their applicability to substation projects that must handle a large amount of input data; an intelligent optimization algorithm that improves prediction accuracy is urgently needed. Therefore, this paper proposes using a swarm intelligence optimization algorithm with strong adaptability and good stability, derived from imitating the group behavior of organisms in nature, comprehensively investigates the technical factors that strongly influence project cost, and proposes a new way to solve the optimization problem.
The sparrow search algorithm (SSA), proposed in 2020, simulates the foraging process of a sparrow flock and, compared with classical swarm intelligence optimization algorithms, has the advantages of fast convergence and strong search ability. Compared with other swarm intelligence optimization algorithms, SSA takes all possible factors of group behavior into account so that it can quickly converge near the optimal value. Although improved classical algorithms such as APSO perform better than their originals, they still fall short on large-scale problems because of the increase in computation and time consumption. To overcome the tendency of SSA to fall into local optima, Shi et al. [30] proposed an evaluation model based on an FASSA-BP algorithm and studied the maturity evaluation of intelligent manufacturing capability based on maturity theory and the firefly disturbance strategy. Li et al. [31] proposed an SSA-BP model to predict the stress of reinforced concrete beams; the empirical analysis showed that the proposed model is superior to the traditional BP neural network in accuracy and robustness. To address the model instability caused by the random input weights and thresholds of the extreme learning machine (ELM), Liu et al. [32] used the sparrow search algorithm to optimize a combined ELM prediction model and achieve accurate prediction of wind power. Hu et al. [33] used a tournament algorithm to improve SSA and proposed an ISSA-LSSVM prediction model for short-term power load forecasting; verification on test functions showed that the improved SSA optimizes better and is more stable. Scholars have thus applied the sparrow search algorithm to model construction and optimization in many fields. Unlike these studies, this paper applies this emerging intelligent algorithm to substation project cost prediction, carries out more comprehensive technical index mining for substation engineering, and proposes an intelligent prediction model of substation project cost based on a BP neural network optimized by sparrow search, which provides new ideas for research on substation project cost prediction. The innovations of this paper are as follows:
- (1) Starting from the concept of data space, this paper uses data from the whole life cycle of substation engineering to comprehensively investigate the factors of substation engineering cost, and constructs an index library of substation project cost factors based on technical factors after secondary screening.
- (2) For the first time, the sparrow search algorithm is used to optimize a BP neural network for the prediction of substation project cost. Based on the model input indexes, the SSA optimizes the weights and thresholds of the BP neural network to construct an SSA-BP prediction model that predicts the substation project cost.
3. Intelligent Prediction Model of Substation Project Cost Based on Sparrow Search Algorithm Optimized BP Neural Network
3.1. BP Neural Network
The back propagation neural network (BPNN) is a traditional multi-layer feed-forward neural network trained with the back propagation algorithm. Its basic idea is gradient descent: the thresholds and weights of the network are adjusted continuously through back propagation to minimize the error. It is one of the most popular neural network models.
BPNN has good organization and adaptability, and can solve nonlinear problems through sample learning [37]. The standard BPNN architecture is composed of three layers: the input layer, the hidden layer, and the output layer. There can be more than one hidden layer, and the neurons of adjacent layers are fully connected. The basic structure of the BP neural network is shown in Figure 4. In the network constructed in this paper, the input layer dimension equals the number of screened factors, and the output layer gives the cost of the substation project.
The learning process of BPNN consists of two parts: forward propagation of information and backward propagation of errors. In forward propagation, information passes from the input layer through the hidden layer to the output layer. If the result of the output layer does not match the desired output, back propagation of the error begins. Repeating this process continuously reduces the error until it meets expectations. Because a three-layer neural network can already solve simple nonlinear problems, it is the most widely used.
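The two-phase learning process described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this paper: the sigmoid hidden activation, the linear output, the mean-squared-error loss, the learning rate, and the synthetic data (`X_demo`, `y_demo`) are all assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, y, n_hidden=4, lr=0.1, epochs=2000, seed=0):
    """Train a three-layer network by gradient descent.

    Returns the parameters (W1, b1, W2, b2) and the per-epoch MSE history.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))  # input -> hidden weights
    b1 = np.zeros(n_hidden)                             # hidden thresholds
    W2 = rng.normal(0.0, 0.5, (n_hidden, y.shape[1]))  # hidden -> output weights
    b2 = np.zeros(y.shape[1])                           # output thresholds
    losses = []
    for _ in range(epochs):
        # Forward propagation of information.
        h = sigmoid(X @ W1 + b1)
        out = h @ W2 + b2            # linear output (assumed, for regression)
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backward propagation of the error (gradients of the MSE).
        g_out = 2.0 * err / err.size
        g_W2 = h.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * h * (1.0 - h)  # sigmoid derivative
        g_W1 = X.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Gradient descent update of weights and thresholds.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses

# Synthetic, purely illustrative data: 40 samples, 3 inputs, 1 output.
rng = np.random.default_rng(1)
X_demo = rng.random((40, 3))
y_demo = X_demo[:, :1] * X_demo[:, 1:2] + X_demo[:, 2:3]
params, losses = train_bpnn(X_demo, y_demo)
print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because the starting weights are random, plain gradient descent of this kind can converge slowly or stall in a local minimum, which is the weakness that motivates optimizing the initial weights and thresholds.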
3.2. Basic Principle of Sparrow Search Algorithm
The sparrow search algorithm (SSA) is an optimization algorithm proposed in 2020 based on the foraging and anti-predator behavior of sparrows [38]. In the SSA, each sparrow has three possible behaviors:
- (1) As a discoverer, searching for food.
- (2) As a joiner, following the discoverers to obtain food.
- (3) As a scout, detecting danger to decide whether the group continues to forage.
In the sparrow search algorithm, discoverers with better fitness values have priority access to food during the search. A discoverer usually has high energy reserves and, because it guides the population toward foraging directions and regions, searches for food over a wider range than the joiners. To obtain a higher fitness value, some joiners monitor the discoverers throughout the foraging process, thereby increasing their chances of getting food. Scouts generally account for only a small part of the whole population and are responsible for monitoring and early warning: when danger is detected during foraging, they alert the entire population to take anti-predator action. The basic process of SSA is to initialize the sparrow population; calculate the individual fitness values and determine the best and worst individuals; update the discoverer, joiner, and scout positions in turn; and repeat these updates iteratively until the termination conditions are met.
The discoverer updates its location according to the foraging rules, as described by Equation (10):

$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{\max}}\right), & R_2 < ST \\ X_{i,j}^{t} + Q \cdot L, & R_2 \ge ST \end{cases} \tag{10}$$

where $t$ indicates the current iteration number, $iter_{\max}$ denotes the maximum number of iterations, and $X_{i,j}^{t}$ represents the location of the $i$-th sparrow in the $j$-th dimension. $\alpha$ is a random number, $R_2$ represents the alarm value, and $ST$ represents the safety value. $Q$ is a random number obeying a normal distribution, and $L$ represents a $1 \times d$ matrix whose elements are all 1. When $R_2 < ST$, there are no predators around the foraging area and it is within a safe range, so the discoverer can search for food. When $R_2 \ge ST$, some sparrows in the population have found predators and issued a warning, and all sparrows fly to a safe place to feed.
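As a hedged illustration of the discoverer rule, the sketch below follows the commonly published form of the SSA update: multiplicative shrinkage by $\exp(-i/(\alpha \cdot iter_{\max}))$ when the alarm value is below the safety value, and a shared normal random step otherwise. The function name `update_discoverers`, the convention that rows of `X` are sorted best-first, the ranges $\alpha \in (0, 1]$ and $R_2 \in [0, 1)$, and drawing $R_2$ once per call are assumptions made here, not details stated in this paper.

```python
import numpy as np

def update_discoverers(X, n_disc, iter_max, ST, rng):
    """Return a copy of X with the first n_disc rows (discoverers) updated.

    X is assumed to be sorted so that row 0 has the best fitness.
    """
    X_new = X.copy()
    d = X.shape[1]
    R2 = rng.random()  # alarm value, assumed drawn once per iteration
    for idx in range(n_disc):
        i = idx + 1  # 1-based rank used in the update formula
        if R2 < ST:
            # Safe: search around the current position with a shrinking factor.
            alpha = 1.0 - rng.random()  # uniform in (0, 1]
            X_new[idx] = X[idx] * np.exp(-i / (alpha * iter_max))
        else:
            # Alarm raised: move by one normal random step Q in every dimension.
            Q = rng.standard_normal()
            X_new[idx] = X[idx] + Q * np.ones(d)
    return X_new

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(6, 3))  # 6 sparrows, 3 dimensions (illustrative)
print(update_discoverers(X, n_disc=2, iter_max=100, ST=0.8, rng=rng))
```

Only the discoverer rows change; joiners and scouts are handled by their own rules.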
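The joiner updates its position by following the discoverer or by competing for food, as described by Equation (11):

$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ X_{P}^{t+1} + \left| X_{i,j}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases} \tag{11}$$

where $X_{P}$ is the current optimal location occupied by the discoverer, $X_{worst}$ is the current worst position, and $n$ is the population size. $A$ represents a $1 \times d$ matrix in which each element is randomly assigned the value 1 or −1, and $A^{+} = A^{T}(AA^{T})^{-1}$. When $i > n/2$, the $i$-th joiner does not obtain food and needs to fly elsewhere to forage in order to obtain a higher fitness value.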
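The joiner rule can be sketched in the same way. The code below assumes the commonly published SSA form, in which a joiner ranked in the worse half of the population jumps to a new region and the others move toward the discoverer's best position using the pseudo-inverse $A^{+} = A^{T}(AA^{T})^{-1}$ of a random ±1 row vector. The function name `update_joiners` and the best-first row ordering are illustrative assumptions.

```python
import numpy as np

def update_joiners(X, X_p, X_worst, n_disc, rng):
    """Return a copy of X with rows n_disc..n-1 (joiners) updated.

    X is assumed sorted best-first; X_p is the best discoverer position
    and X_worst the current worst position.
    """
    n, d = X.shape
    X_new = X.copy()
    for idx in range(n_disc, n):
        i = idx + 1  # 1-based rank used in the update formula
        if i > n / 2:
            # Starving joiner: fly elsewhere to forage.
            Q = rng.standard_normal()
            X_new[idx] = Q * np.exp((X_worst - X[idx]) / i ** 2)
        else:
            # Follow the discoverer: A is a random +/-1 row vector, and for a
            # row vector A^T (A A^T)^-1 reduces to A / (A . A).
            A = rng.choice([-1.0, 1.0], size=d)
            A_plus = A / (A @ A)
            step = np.abs(X[idx] - X_p) @ A_plus  # scalar
            X_new[idx] = X_p + step * np.ones(d)
    return X_new

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(6, 3))  # illustrative population
print(update_joiners(X, X_p=X[0], X_worst=X[-1], n_disc=1, rng=rng))
```

Because the step is a signed average of the per-dimension distances to `X_p`, a following joiner lands no farther from the discoverer than its largest coordinate distance.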
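When scouts perceive danger to the population, they engage in anti-predator behavior and update their locations, as shown in Equation (12):

$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \cdot \left( \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon} \right), & f_i = f_g \end{cases} \tag{12}$$

where $X_{best}$ is the current globally optimal location, $\beta$ is the step-size control parameter, a random number obeying a normal distribution with mean 0 and variance 1, $K$ is a random number in the interval [−1, 1], $f_i$ is the current fitness value of the individual sparrow, $f_g$ denotes the current globally optimal fitness value, $f_w$ denotes the current worst fitness value, and $\varepsilon$ is a very small constant that prevents the denominator from being 0. When $f_i > f_g$, the sparrow is at the margin of the population and is vulnerable to predator attacks. When $f_i = f_g$, the sparrow is in the middle of the population, is aware of the danger, and needs to move closer to the others.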
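A minimal sketch of the scout rule follows, again using the commonly published SSA form and assuming a minimization problem, so that the best individual has the lowest fitness. The function name `update_scouts`, the random choice of which sparrows act as scouts, and the value of `eps` are illustrative assumptions.

```python
import numpy as np

def update_scouts(X, fit, n_scouts, rng, eps=1e-50):
    """Return a copy of X in which n_scouts randomly chosen rows are updated.

    fit holds the fitness of each row of X; lower is better (assumed).
    """
    n = X.shape[0]
    best, worst = int(np.argmin(fit)), int(np.argmax(fit))
    f_g, f_w = fit[best], fit[worst]
    X_best, X_worst = X[best].copy(), X[worst].copy()
    X_new = X.copy()
    for idx in rng.choice(n, size=n_scouts, replace=False):
        if fit[idx] > f_g:
            # At the margin of the population: move toward the best position.
            beta = rng.standard_normal()  # step size, N(0, 1)
            X_new[idx] = X_best + beta * np.abs(X[idx] - X_best)
        else:
            # Already at the best fitness: step relative to the worst position.
            # rng.uniform samples [-1, 1), a close stand-in for [-1, 1].
            K = rng.uniform(-1.0, 1.0)
            X_new[idx] = X[idx] + K * np.abs(X[idx] - X_worst) / ((fit[idx] - f_w) + eps)
    return X_new

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5, 3))      # illustrative population
fit = np.array([0.1, 0.4, 0.3, 0.9, 0.5])    # illustrative fitness values
print(update_scouts(X, fit, n_scouts=2, rng=rng))
```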
3.3. SSA-BP Prediction Model
The BP neural network has solved many application problems, but as its scope of application has expanded, some shortcomings have been exposed, such as slow convergence and a tendency to fall into local extrema. The SSA can accelerate convergence and improve solution accuracy, so optimizing BP neural networks with the SSA allows them to be applied to a broader range of problems.
This paper uses the sparrow search algorithm to optimize the weights and thresholds of the BP neural network to obtain more accurate results. The specific process is as follows.
- (1) Data preprocessing: divide the data into training and test sets and normalize them.
- (2) Determine the BPNN topology. The numbers of input-layer and output-layer nodes are obtained with the size function, and the number of hidden nodes is determined by a loop: the candidate that gives the minimum error in the loop is taken as the optimal number of hidden-layer nodes.
- (3) Initialize the BPNN weights and thresholds.
- (4) Use the SSA to search for the optimal weights and thresholds. This includes calculating population fitness, foraging behavior, and anti-predator behavior.
- (5) Output the optimal parameters of the BP neural network.
- (6) Use the optimal parameters of the model for instance prediction.
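The link between the SSA and the BP network in the steps above is the fitness function: each sparrow position is a flat vector holding all weights and thresholds, and its fitness is the network's prediction error. The sketch below shows one plausible encoding, assuming min-max normalization, a tanh hidden layer, and mean squared error as fitness; these choices, the helper names (`min_max_normalize`, `decode`, `fitness`), and the synthetic data are illustrative assumptions. The random population stands in for the SSA position updates only to show how candidates are scored.

```python
import numpy as np

def min_max_normalize(train, test):
    """Scale columns to [0, 1] using statistics of the training set only."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (train - lo) / span, (test - lo) / span

def decode(theta, n_in, n_hid, n_out):
    """Split a flat position vector into BP weights and thresholds."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid)
    i += n_in * n_hid
    b1 = theta[i:i + n_hid]
    i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out)
    i += n_hid * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def fitness(theta, X, y, n_hid):
    """Mean squared error of the network encoded by theta (lower is better)."""
    W1, b1, W2, b2 = decode(theta, X.shape[1], n_hid, y.shape[1])
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred - y) ** 2))

# Synthetic placeholder data: 30 projects, 4 technical indexes, 1 cost value.
rng = np.random.default_rng(0)
X_raw = rng.uniform(10.0, 500.0, size=(30, 4))
y_raw = (X_raw @ np.array([0.5, 1.0, 0.2, 2.0])).reshape(-1, 1)
X_tr, X_te = min_max_normalize(X_raw[:24], X_raw[24:])   # step (1)
y_tr, y_te = min_max_normalize(y_raw[:24], y_raw[24:])

n_hid = 5                                                # step (2), fixed here
dim = 4 * n_hid + n_hid + n_hid * 1 + 1                  # weights + thresholds
population = rng.uniform(-1.0, 1.0, size=(20, dim))      # step (3)
scores = np.array([fitness(p, X_tr, y_tr, n_hid) for p in population])
best_theta = population[np.argmin(scores)]               # best candidate so far
print(dim, scores.min())
```

In a full SSA-BP run, the discoverer, joiner, and scout updates would repeatedly move `population` and re-evaluate `scores`, and the final `best_theta` would initialize the BP network before ordinary training and prediction.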
The specific model framework is shown in Figure 5.