1. Introduction
With the ongoing advancement of modern industry, an increasing variety of mechanical equipment, industrial production lines, and electrical control systems are being utilized [1,2]. However, during long-term operation, these devices and systems inevitably encounter various failures, resulting in significant challenges and losses for both production and maintenance [3]. Consequently, developing effective methods for early warning and diagnosis of equipment and system faults has emerged as a pressing issue [4].
In this context, adopting machine learning-based fault warning methods has become a vital approach to modern industrial fault prevention [5]. Fault warning is a management strategy that enables the early detection of faults and the implementation of appropriate measures to prevent or mitigate their impact on enterprises or consumers. Not only can fault warning enhance operational efficiency and customer satisfaction for businesses, but it can also reduce maintenance expenses and minimize production line downtime caused by failures [6].
BP neural networks are widely used in the field of fault early warning, primarily owing to their strong learning capability, rapid information processing speed, and error adaptivity. However, despite the remarkable results achieved by BP neural networks, they possess a series of non-negligible drawbacks that negatively impact the accuracy and stability of fault early warning systems.
Firstly, BP neural networks are prone to falling into the local minima problem. When the weight update during training becomes stuck at a local optimal solution, the neural network may not be able to find the global optimum solution, ultimately affecting the warning accuracy. Secondly, selecting an appropriate BP neural network configuration is both complex and challenging. Failing to suitably determine the parameters can result in suboptimal network performance. Furthermore, BP neural networks suffer from slow convergence, and the training process often takes a substantial amount of time. This increases the response delay of the early warning system, which may allow a fault to occur before a warning is issued.
The primary objective of this paper is to effectively optimize BP neural networks for developing more efficient fault warning methods. According to the No Free Lunch theorem of optimization problems, current algorithms may not be proficient in addressing this issue. This paper’s key contribution is the introduction of an enhanced metaheuristic algorithm, based on EO, which incorporates a random wandering strategy and the concept of simulated annealing. To the best of our knowledge, based on the existing literature, this metaheuristic algorithm has not been employed to solve any variant of BP neural networks. In this study, we implement it practically. Furthermore, we integrate an adaptive learning strategy into the BP neural network and compare its performance with other cutting-edge fault warning algorithms.
The remainder of this paper is organized as follows: Section 2 provides a comprehensive literature review of the latest advancements in fault warning algorithms. In Section 3, we present our proposed hybrid approach in detail, highlighting the integration of the EO with a backpropagation (BP) neural network. Section 4 presents a case study involving widely used industrial equipment, showcasing the application of our approach for failure warning. To evaluate the effectiveness of our hybrid method, Section 5 compares it with other state-of-the-art techniques. Finally, Section 6 concludes the paper with a discussion of key findings, limitations, and potential directions for future research.
2. Literature Review
The growing interest in early fault detection has led researchers to develop a variety of techniques. Zhao et al. presented a deep learning approach—a deep autoencoder network—that examines sensor data to issue early warnings about component failures in wind turbine systems [7]. Wang et al. devised an enhanced deep learning, multistage fusion LSTM model for predicting future reciprocating compressor valve parameters by studying the spatiotemporal features of operational data, thus achieving fault early warning goals [8]. Gao et al. employed adaptive deep belief networks and charging data analysis to create an early warning method for electric vehicle charging processes, training the network with historical charging data and offering early warnings using real-time and predicted data [9]. Luo et al. introduced a conditional mutual information technique for selecting valuable variables from multiple options for network training, subsequently developing a BP neural network-based wind turbine gearbox fault diagnosis model utilizing real-time data computations [10]. Wang et al. established a power distribution transformer pre-warning model that accounts for extreme weather conditions and various nonlinear situations, integrating weather data into BP neural network training [11]. Chen et al. optimized the BP neural network with genetic algorithms to issue warnings regarding wind turbine pitch system faults, filtering pitch system parameters with strong power correlation based on SCADA system-monitored parameters for network training [12]. Jiang et al. also created a GA-BP model and investigated a state-based baking machine maintenance method using operational data. They determined the weights of the input data through the entropy weighting method, effectively avoiding the influence of subjective factors by selecting reasonable data input samples [13]. Zhang et al. optimized the BP neural network with an improved grey wolf algorithm, examining an electric vehicle charging safety pre-warning model based on charging statistics, and providing early warnings by comparing post-network fitting data with original data [14]. Chen et al. proposed a BP neural network optimization technique employing parallel factor decomposition and GA, efficiently extracting intricate information from equipment operation using the parallel factor decomposition method, achieving data mining, and considerably enhancing centrifugal pump fault detection efficiency [15]. Lin et al. optimized the BP neural network with an improved sparrow search algorithm, applying the model to active phase-change control device fault detection using specific equipment information [16]. Wu et al. suggested a hybrid method that combined a deep local adaptive network, two-stage qualitative trend analysis, and a five-state Bayesian network for extracting trend states from local moving window data, converting continuous data of abnormal variables into trend state information for fault detection, identification, and diagnosis [17]. Zhou et al. introduced an entropy-based sparsity technique, utilizing LSTM networks and envelope analysis data to predict bearing defects and identify issues in complex hydraulic machinery (such as axial piston pumps) [18].
After conducting a comprehensive review of the above literature, we have drawn the following conclusions:
(a) In modern engineering fields, fault warning for equipment and systems is a crucial task. Numerous fault warning technologies are continuously emerging. However, compared to other methods, BP neural networks exhibit several advantages, making them the preferred solution for fault warnings.
(b) BP neural networks face some practical shortcomings, such as local minima and difficulty in manual parameter selection. Slow training speed and the propensity to fall into local minima can negatively impact fault warning systems, leading to prolonged diagnosis times, increased operational costs, and diminished warning accuracy. Addressing this issue is vital, as it can enhance the efficiency and precision of fault warnings, reduce operational expenses, and optimize maintenance strategies. By refining training methods, we can achieve faster and more reliable fault detection in practical applications, ultimately ensuring efficient equipment maintenance and promoting the stability of production processes. Optimizing them using metaheuristic algorithms serves as an essential solution.
(c) The No Free Lunch Theorem suggests that algorithms performing well on some problems may perform poorly on others [19]. Consequently, in accordance with the No Free Lunch Theorem, it is necessary to persistently explore the application of algorithms in new areas to identify the optimal algorithm for specific tasks or scenarios.
EO, proposed in 2020, is a novel optimization algorithm inspired by the physical phenomenon of control-volume mass balance. It is characterized by robust optimization capability and rapid convergence. Results from various case studies indicate that EO outperforms numerous classical and contemporary algorithms, such as the particle swarm algorithm, grey wolf optimizer, genetic algorithm, gravitational search algorithm, and sparrow search algorithm. Motivated by the NFL theorem, we therefore selected this promising algorithm as the basis of our method.
With the above background, compared to previous research, this paper offers the following contributions:
(a) We designed an improved equilibrium optimizer (IEO) by incorporating a simulated annealing algorithm into the main loop process of the conventional equilibrium optimizer and augmenting it with an enhanced local search operator that utilizes a random wandering strategy. Experimental analysis effectively demonstrates that our introduced strategy significantly enhances the search capability of the IEO.
(b) Fixed parameters may result in slow convergence and performance degradation for BP neural networks. In this study, we integrated an adaptive update strategy for the parameterization into the BP neural network, effectively improving its performance.
(c) This paper proposes a novel fault warning model and strategy using BP neural networks and IEO, determines its parameters through Taguchi’s experimental method, and validates the effectiveness of the method via real-world analysis. Comparisons with other state-of-the-art methods further showcase the exceptional performance of our proposed method, providing a new approach to fault warning.
In summary, this study presents a significant contribution to the field by effectively improving the EO through the combination of SA and the incorporation of a random wandering strategy. For the first time, we have integrated this enhanced algorithm with a BP neural network, which substantially expands its application domain. Furthermore, our research introduces a parameter calibration analysis and fault warning strategy based on this model. To validate our proposed hybrid method, we collected data from real-world industrial equipment commonly used in practice, conducted a thorough investigation of its failure modes and fault anomalies, and tested our method using this case study. The results underscored the high accuracy of our technique in identifying faults, demonstrating that our research can effectively elevate the level of failure warning and contribute to sustainable industrial development.
3. Proposed Hybrid Method
In this section, we first describe the improved BP neural network (Section 3.1), followed by a description of the IEO (Section 3.2); finally, the framework of IEO-BP is constructed (Section 3.3).
3.1. Improved BP Model
The general flow of the BP neural network is as follows [20]:
Step 1: Initialization: Randomly initialize the connection weights and thresholds.
Step 2: Forward propagation to calculate the output value: the input samples are passed from the input layer through the hidden layer to the output layer, and the output value is calculated, as depicted in Equations (1) and (2).
Input layer to hidden layer:

$$h_j = f\left(\sum_{i=1}^{n} w_{ij} x_i - a_j\right) \qquad (1)$$

Hidden layer to output layer:

$$y_k = f\left(\sum_{j=1}^{l} w_{jk} h_j - b_k\right) \qquad (2)$$

where $w_{ij}$ is the connection weight from the input layer to the hidden layer, $w_{jk}$ is the connection weight from the hidden layer to the output layer, $a_j$ is the threshold of the hidden layer, $b_k$ is the threshold of the output layer, $x_i$ is the $i$th feature of the input sample, and $h_j$ is the output of the $j$th neuron in the hidden layer.
Step 3: Root-mean-square error (RMSE) calculation: the error is computed from the difference between the network output and the desired output, as in Equation (3).

$$E = \sqrt{\frac{1}{m}\sum_{k=1}^{m}\left(y_k - \hat{y}_k\right)^2} \qquad (3)$$

where $y_k$ is the output of the $k$th output-layer neuron, $\hat{y}_k$ is the desired output value, and $m$ is the number of output-layer neurons.
Step 4: Back propagation to adjust the weights and thresholds: the error is propagated back from the output layer toward the input layer, yielding the error terms in Equations (4) and (5).

$$\delta_k = \left(\hat{y}_k - y_k\right) f'\left(\sum_{j=1}^{l} w_{jk} h_j - b_k\right) \qquad (4)$$

$$\delta_j = f'\left(\sum_{i=1}^{n} w_{ij} x_i - a_j\right) \sum_{k=1}^{m} w_{jk}\,\delta_k \qquad (5)$$

where $f$ is the activation function, and $f'$ is the derivative of the activation function.
Step 5: The weights and thresholds are updated, as shown in Equations (6)–(9).

$$w_{ij} \leftarrow w_{ij} + \eta\,\delta_j x_i \qquad (6)$$

$$w_{jk} \leftarrow w_{jk} + \eta\,\delta_k h_j \qquad (7)$$

$$a_j \leftarrow a_j - \eta\,\delta_j \qquad (8)$$

$$b_k \leftarrow b_k - \eta\,\delta_k \qquad (9)$$

where $l$ is the number of neurons in the hidden layer, and $\eta$ is the learning rate.
Step 6: Repeat 2~4 steps until the error reaches convergence or the training number reaches the limit.
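To make Steps 1–6 concrete, the following is a minimal Python sketch of this training loop for a single hidden layer, assuming a sigmoid activation and batch updates; the layer sizes, learning rate, and random seed are illustrative choices, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, Y, n_hidden=8, eta=0.01, epochs=1000, tol=1e-3):
    n_in, n_out = X.shape[1], Y.shape[1]
    rng = np.random.default_rng(0)
    # Step 1: random initialization of weights and thresholds
    W1, a = rng.uniform(-1, 1, (n_in, n_hidden)), rng.uniform(-1, 1, n_hidden)
    W2, b = rng.uniform(-1, 1, (n_hidden, n_out)), rng.uniform(-1, 1, n_out)
    for epoch in range(epochs):
        # Step 2: forward propagation, Equations (1) and (2)
        H = sigmoid(X @ W1 - a)
        Yhat = sigmoid(H @ W2 - b)
        # Step 3: RMSE, Equation (3)
        err = np.sqrt(np.mean((Y - Yhat) ** 2))
        if err < tol:  # Step 6: stop once the error converges
            break
        # Step 4: backpropagated error terms, Equations (4) and (5)
        delta_out = (Y - Yhat) * Yhat * (1 - Yhat)
        delta_hid = (delta_out @ W2.T) * H * (1 - H)
        # Step 5: weight and threshold updates, Equations (6)-(9)
        W2 += eta * H.T @ delta_out
        b -= eta * delta_out.sum(axis=0)
        W1 += eta * X.T @ delta_hid
        a -= eta * delta_hid.sum(axis=0)
    return W1, a, W2, b
```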
This is followed by our improvement of the traditional BP neural network:
When training a neural network, selecting an appropriate learning rate is crucial to achieving a fast convergence rate and maintaining stability [21]. Utilizing a fixed learning rate may result in issues such as under-learning or overfitting during the training process. As a solution, we employ an adaptive learning rate (exponential decay) to train the BP neural network. In exponential decay, the learning rate decreases as the number of training rounds or iterations increases. This strategy reduces the magnitude of weight updates as the optimal solution is approached, enabling more accurate fine-tuning. Specifically, we adaptively adjust the learning rate according to Equation (10).

$$\eta = \eta_0 \cdot I^{\lfloor E/T \rfloor} \qquad (10)$$

where $\eta_0$ is the initial learning rate; $I$ is a constant between 0 and 1, representing the decrease in the learning rate after each decay step, usually set to 0.5; $E$ denotes the number of current iterations; and $T$ represents the time interval for decaying the learning rate.
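As a small illustration, Equation (10) can be realized in a few lines of Python; the initial rate eta0 and the values of I and T below are placeholders, not the calibrated settings.

```python
def decayed_learning_rate(eta0, E, I=0.5, T=100):
    """Equation (10): eta0 = initial rate, E = current iteration,
    I = decay factor per step, T = decay interval (iterations)."""
    return eta0 * I ** (E // T)

# With eta0 = 0.1 the rate halves every 100 iterations:
# E = 0 -> 0.1, E = 100 -> 0.05, E = 250 -> 0.025
```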
3.2. Enhanced EO Model
EO is a novel intelligent algorithm inspired by the mass balance equation in physics [20]. The mass balance equation describes the processes of mass entering, leaving, and being generated within a control volume, and can be expressed as a first-order differential equation, as shown in Equation (11).

$$V\frac{dC}{dt} = Q\,C_{eq} - Q\,C + G \qquad (11)$$

where $V$ is the control volume; $C$ represents the concentration within the control volume; $Q$ denotes the volumetric flow rate into or out of the control volume; $C_{eq}$ represents the concentration in the control volume at equilibrium; and $G$ signifies the mass generation rate within the control volume.
When the rate of mass change $V\,dC/dt$ equals 0, the control volume has reached a stable equilibrium state.
Letting $\lambda = Q/V$ denote the turnover rate, Equation (11) can be rearranged as $\frac{dC}{\lambda C_{eq} - \lambda C + G/V} = dt$.
Let $t_0$ and $C_0$ be the initial time and concentration values, respectively, and integrate both sides of this expression to obtain Equation (12).

$$\int_{C_0}^{C} \frac{dC}{\lambda C_{eq} - \lambda C + G/V} = \int_{t_0}^{t} dt \qquad (12)$$

Solving Equation (12) yields Equation (13).

$$C = C_{eq} + \left(C_0 - C_{eq}\right)F + \frac{G}{\lambda V}\left(1 - F\right) \qquad (13)$$

where $F = e^{-\lambda\left(t - t_0\right)}$.
3.2.1. Initialization
Similar to most heuristic algorithms, the initialization process of the equilibrium optimizer can be expressed as Equation (14).

$$\vec{C}_i^{\,init} = \vec{C}_{min} + rand_i\left(\vec{C}_{max} - \vec{C}_{min}\right), \quad i = 1, 2, \ldots, n \qquad (14)$$

where $\vec{C}_i^{\,init}$ is the initial concentration vector of the $i$th individual; $\vec{C}_{min}$ and $\vec{C}_{max}$ are the lower and upper limit vectors of the individual; and $rand_i$ is a random vector between [0, 1].
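A one-line Python sketch of Equation (14), with placeholder bounds and population size:

```python
import numpy as np

def init_population(n_pop, dim, c_min, c_max, rng=np.random.default_rng()):
    # C_i = C_min + rand_i * (C_max - C_min), rand_i ~ U[0, 1]^dim
    return c_min + rng.random((n_pop, dim)) * (c_max - c_min)
```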
3.2.2. Establishing the Equilibrium Pool
The equilibrium state represents the ultimate state that the algorithm converges to. During the optimization process, the equilibrium pool serves as a source of candidate solutions for the entire optimization procedure. In our proposed method, we introduce a bootstrap optimization process. Specifically, the EO selects the four best individuals in terms of fitness found so far and calculates their average, creating a "fifth individual". Subsequently, one of these five individuals is randomly selected with equal probability to guide the rest of the optimization process, as demonstrated in Equation (15).

$$\vec{C}_{eq,pool} = \left\{\vec{C}_{eq(1)}, \vec{C}_{eq(2)}, \vec{C}_{eq(3)}, \vec{C}_{eq(4)}, \vec{C}_{eq(ave)}\right\} \qquad (15)$$

where $\vec{C}_{eq(ave)}$ is the average of the four fittest individuals. The probability of each of the five individuals in the equilibrium pool being selected as the solution for the bootstrap optimization process is identical, with all having a 0.2 chance.
The bootstrap optimization process plays a crucial role in enhancing the exploration and exploitation capabilities of the algorithm. Introducing randomness and diversity through the selection of individuals from the equilibrium pool helps to prevent the algorithm from getting stuck in local optima.
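The pool construction and the uniform selection can be sketched in Python as follows; this assumes a minimization problem, and the variable names are ours rather than the original implementation's.

```python
import numpy as np

def pick_equilibrium(population, fitness, rng=np.random.default_rng()):
    best4 = population[np.argsort(fitness)[:4]]  # four fittest individuals
    c_ave = best4.mean(axis=0)                   # the "fifth individual"
    pool = np.vstack([best4, c_ave])             # equilibrium pool, Equation (15)
    return pool[rng.integers(5)]                 # uniform choice, p = 0.2 each
```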
3.2.3. Exponential Terms
The exponential term plays a crucial role in the algorithm's update process and can be represented as Equation (16).

$$\vec{F} = e^{-\vec{\lambda}\left(t - t_0\right)} \qquad (16)$$

where $\vec{\lambda}$ is a random vector between [0, 1].
The variable $t$ is defined as a function that diminishes with an increasing number of iterations, as illustrated in Equation (17).

$$t = \left(1 - \frac{E}{Maxit}\right)^{\left(a_2 \frac{E}{Maxit}\right)} \qquad (17)$$

where $E$ and $Maxit$ are the current iteration number and the maximum iteration number, respectively; $a_2$ is a constant, generally taken as 1.
In order to guarantee the algorithm's convergence while simultaneously enhancing its search and exploitation capabilities, $\vec{t}_0$ is defined as in Equation (18).

$$\vec{t}_0 = \frac{1}{\vec{\lambda}}\ln\left(-a_1\,sign\left(\vec{r} - 0.5\right)\left[1 - e^{-\vec{\lambda} t}\right]\right) + t \qquad (18)$$

where $a_1$ is a constant, generally taken as 2; $sign$ is the mathematical sign function; and $\vec{r}$ is a random vector between [0, 1].
Substituting Equation (18) into Equation (16), we can obtain Equation (19).

$$\vec{F} = a_1\,sign\left(\vec{r} - 0.5\right)\left[e^{-\vec{\lambda} t} - 1\right] \qquad (19)$$
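Putting Equations (17) and (19) together, the exponential term can be sketched as below, with a1 = 2 and a2 = 1 as stated above; the random draws are placeholders.

```python
import numpy as np

def exponential_term(E, max_it, dim, a1=2.0, a2=1.0, rng=np.random.default_rng()):
    t = (1.0 - E / max_it) ** (a2 * E / max_it)           # Equation (17)
    lam = rng.random(dim)                                 # lambda ~ U[0, 1]^dim
    r = rng.random(dim)
    F = a1 * np.sign(r - 0.5) * (np.exp(-lam * t) - 1.0)  # Equation (19)
    return F, lam
```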
3.2.4. Generation Rate
The generation rate is characterized as a first-order exponential decay process, as illustrated in Equation (20).

$$\vec{G} = \vec{G}_0\, e^{-\vec{k}\left(t - t_0\right)} \qquad (20)$$

where $\vec{G}_0$ is the initial generation rate and $\vec{k}$ is a decay constant. In order to achieve a more controllable and systematic search pattern, the algorithm sets $\vec{k} = \vec{\lambda}$ and incorporates the previously derived exponential term to describe the generation rate, as represented in Equation (21).

$$\vec{G} = \vec{G}_0\, e^{-\vec{\lambda}\left(t - t_0\right)} = \vec{G}_0 \vec{F} \qquad (21)$$

where $\vec{G}_0$ and the generation rate control parameter $\vec{GCP}$ are given by Equations (22) and (23).

$$\vec{G}_0 = \vec{GCP}\left(\vec{C}_{eq} - \vec{\lambda}\vec{C}\right) \qquad (22)$$

$$\vec{GCP} = \begin{cases} 0.5\,r_1, & r_2 \geq GP \\ 0, & r_2 < GP \end{cases} \qquad (23)$$

where $r_1$ and $r_2$ are random numbers within the range [0, 1]; $GP$ is the generation probability, which is typically set to 0.5.
In summary, the final update formula of the equilibrium optimizer is defined in Equation (24).

$$\vec{C} = \vec{C}_{eq} + \left(\vec{C} - \vec{C}_{eq}\right)\vec{F} + \frac{\vec{G}}{\vec{\lambda} V}\left(1 - \vec{F}\right) \qquad (24)$$

where the $V$ value is generally taken as a constant 1.
In Equation (24), the first term represents the concentration at equilibrium, while the second and third terms characterize changes in concentration. Specifically, the second term enhances the algorithm’s search capability by inducing significant changes in the individual close to the equilibrium state. Meanwhile, the third term improves the utilization capability by refining the obtained solution through minor adjustments in concentration.
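For concreteness, one concentration update combining Equations (21)–(24) can be sketched as below, with V = 1 and GP = 0.5 as in the text; F and lam are the outputs of the exponential-term computation above, and the variable names are our own.

```python
import numpy as np

def eo_update(C, C_eq, F, lam, V=1.0, GP=0.5, rng=np.random.default_rng()):
    r1, r2 = rng.random(), rng.random()
    GCP = 0.5 * r1 if r2 >= GP else 0.0   # Equation (23)
    G0 = GCP * (C_eq - lam * C)           # Equation (22)
    G = G0 * F                            # Equation (21)
    # Equation (24): equilibrium term + exploration term + exploitation term
    return C_eq + (C - C_eq) * F + (G / (lam * V)) * (1.0 - F)
```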
3.2.5. Individual Memory Storage
Drawing inspiration from the concept of the individual best in particle swarm optimization, the equilibrium optimizer introduces an individual memory storage mechanism [21]. After $E$ iterations (where $E \geq 2$), the fitness value achieved by each individual is compared with the fitness value obtained after the $(E-1)$th iteration. If the fitness value of the individual improves after the $E$th iteration, both the individual's position and fitness value are updated accordingly. Otherwise, no update occurs, and the individual retains the position and fitness value obtained after the $(E-1)$th iteration for the next iteration. This mechanism primarily aims to enhance the algorithm's utilization capacity.
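A compact sketch of this memory mechanism, assuming minimization, is given below; pop_old and fit_old hold the positions and fitness values from the previous iteration.

```python
import numpy as np

def apply_memory(pop, fit, pop_old, fit_old):
    worse = fit > fit_old          # individuals that did not improve
    pop[worse] = pop_old[worse]    # keep the previous position
    fit[worse] = fit_old[worse]    # and the previous fitness value
    return pop, fit
```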
3.2.6. Enhanced Local Search Strategy Based on SA
In the literature, it is pointed out that EO offers advantages such as rapid convergence, but it also has the disadvantage of easily falling into local optima [22,23,24,25]. To address this issue, this section combines SA with EO and designs an enhanced local search.
SA is an optimization algorithm that seeks the global optimal solution in a complex search space. In SA, the temperature (T1) is a crucial parameter that imitates the distribution of energy states of atoms within solid-state physics at various temperatures. This temperature parameter represents the likelihood of accepting suboptimal solutions in the search space. To explore the solution space more extensively, the algorithm begins with a higher initial temperature. As the number of iterations increases, the temperature is progressively reduced, consequently decreasing the probability of accepting an inferior solution. This procedure is known as “cooling” or “annealing”. By suitably adjusting the cooling rate and cooling function, the solution space can be thoroughly explored, ultimately converging to the global optimal solution. The following are the basic steps of the algorithm:
Step 1: Random initialization: determine an initial solution x, which is usually generated randomly, according to the characteristics of the problem.
Initial temperature setting: initialize the parameter T1 (temperature) to a larger value in order to make the search process easier to jump out of the local minima.
Step 2: Iterative loop: for each temperature $T_1$, a certain number of inner iterations ($subit$) are performed; each iteration starts from the current solution $x$ and explores a new solution $x'$.
Step 3: Random perturbation: according to the characteristics of the problem, a random perturbation is applied to the current solution $x$ to obtain a new solution $x'$.
Step 4: Evaluate the function: calculate the quality of $x'$, i.e., evaluate the objective function value $f(x')$.
Step 5: Decision function: decide whether to accept the new solution according to the Metropolis criterion:
If $f(x') < f(x)$, then the new solution is accepted.
If $f(x') \geq f(x)$, then the new solution is accepted with probability $p$, as given in Equation (25).

$$p = e^{-\Delta E / t} \qquad (25)$$

where $\Delta E = f(x') - f(x)$ is the energy difference, $t$ is the current temperature, and $e$ is the natural constant.
Step 6: Temperature update: the temperature is lowered according to the cooling rule in Equation (26).

$$T_1 \leftarrow \alpha\, T_1 \qquad (26)$$

where $\alpha \in (0, 1)$ is the cooling rate, which gradually reduces the temperature.
Step 7: Stopping condition: the algorithm terminates when a stopping condition is reached, i.e., the maximum number of iterations or the minimum temperature.
It should be noted that we incorporate a random wandering strategy in this step to further improve the search performance, calculated as shown in Equation (27).

$$x_{new} = x + \varepsilon\left(x_{r_1} - x_{r_2}\right) \qquad (27)$$

where $x_{new}$ is the updated new solution; $x_{r_1}$ and $x_{r_2}$ are two random solutions; and $\varepsilon$ is a scaling factor, $\varepsilon \sim U(0, 1)$, where $U$ denotes the uniform distribution.
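Combining the Metropolis acceptance of Equations (25) and (26) with the random wandering perturbation of Equation (27), the local search can be sketched as follows; the initial temperature, cooling rate, and inner-iteration count are illustrative values, and f denotes the fitness (RMSE) function.

```python
import numpy as np

def sa_local_search(x, f, population, T1=1.0, alpha=0.9, T_min=1e-3,
                    subit=10, rng=np.random.default_rng()):
    fx = f(x)
    while T1 > T_min:                       # Step 7: minimum-temperature stop
        for _ in range(subit):              # Step 2: inner iterations
            # Equation (27): x' = x + eps * (x_r1 - x_r2), eps ~ U(0, 1)
            r1, r2 = rng.integers(len(population), size=2)
            x_new = x + rng.random() * (population[r1] - population[r2])
            fx_new = f(x_new)               # Step 4: evaluate the new solution
            dE = fx_new - fx
            # Step 5: Metropolis criterion -- accept improvements always,
            # otherwise accept with probability p = exp(-dE / T1)
            if dE < 0 or rng.random() < np.exp(-dE / T1):
                x, fx = x_new, fx_new
        T1 *= alpha                         # Step 6: cooling, Equation (26)
    return x, fx
```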
3.3. IEO-BP Model
To address the issues of weak self-adaptation and local minima in BP neural networks, we first employ IEO to globally pre-optimize the weights and thresholds of the BP neural network. Next, we assign these optimal weights and thresholds as initial values for the BP neural network and use the optimized parameters for training. This approach yields the final BP neural network structure for fault early warning. The specific IEO-BP process includes the following steps:
Step 1: Input neural network parameters, such as the number of hidden layer neurons, activation function, training times, training rate, and target error to be achieved during training.
Step 2: Input IEO algorithm parameters and use the RMSE of neural network prediction as the IEO fitness function. Execute the IEO algorithm process.
Step 3: Train the constructed BP neural network using the weights and thresholds obtained from IEO optimization, resulting in the optimized BP neural network structure.
Step 4: Input test data into the trained BP neural network to obtain output data. Perform data analysis on the output.
IEO-BP encompasses several crucial stages. Firstly, the neural network is set up by randomly allocating weights and biases to its neurons. Following this, input data traverses through the network, resulting in output via a combination of weighted and non-linear activation functions. The generated output is then compared to the actual labels to determine the error.
At this juncture, the IEO component is introduced, creating an initial solution for the optimization process. This solution experiences an iterative search procedure that assesses fitness values and selects novel candidate solutions. The derived solution is subsequently integrated into the BP neural network, serving as weights and thresholds to begin iterations.
The iteration cycle carries on until a pre-specified number of iterations are executed or the error is minimized to a satisfactory level. By employing this method, IEO-BP effectively merges the benefits of both BP neural networks and IEO algorithms, offering a powerful and efficient solution for equipment fault warning detection.
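The coupling between IEO and the BP network hinges on encoding all weights and thresholds as one flat solution vector whose fitness is the network's prediction RMSE (Step 2). A Python sketch of this encoding and fitness evaluation is given below; unpack and the other names are our own illustrative constructions, not the authors' MATLAB routines.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, n_in, n_hid, n_out):
    """Split a flat IEO solution vector into BP weights and thresholds.
    Search-space dimension: n_in*n_hid + n_hid + n_hid*n_out + n_out."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    a = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b = theta[i:i + n_out]
    return W1, a, W2, b

def rmse_fitness(theta, X, Y, n_hid):
    # IEO fitness (Step 2): RMSE of the BP network encoded by theta
    W1, a, W2, b = unpack(theta, X.shape[1], n_hid, Y.shape[1])
    H = sigmoid(X @ W1 - a)
    Yhat = sigmoid(H @ W2 - b)
    return np.sqrt(np.mean((Y - Yhat) ** 2))
```

The vector minimizing this fitness then serves as the initial weights and thresholds for BP training (Step 3).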
4. Case Study
In this section, we first describe the case study (Section 4.1 and Section 4.2), followed by the selection of the IEO-BP parameters (Section 4.3), and finally the training and testing of the IEO-BP network (Section 4.4).
4.1. Case Description
The vapor feed pump is a widely used type of pump in various industries, primarily for supplying water to boilers and other equipment. It offers:
High Efficiency: The pump quickly delivers water to the target equipment, ensuring a consistent and stable flow. This significantly improves water usage efficiency.
Reliability: Allows for continuous operation, even under considerable loads and extended periods of use.
Continuity: Equipped with a dual water supply system (electric or steam), the pump can continue functioning if one system encounters issues, thus guaranteeing continuity in the production process.
Due to the widespread application of vapor feed pumps, we have chosen this pump as a test example to evaluate the effectiveness of IEO-BP.
Following our investigation and analysis, we identified five primary fault types for steam feed pumps and their corresponding data anomalies.
Furthermore, we gathered 2800 sets of regular operation data from steam feed pumps and 30 sets of failure data for each fault type. The data collection relied on sensors, and the standard operation data encompassed a variety of pump performance metrics under different operating conditions, including pressure, flow rate, and temperature. We utilized this data to train an IEO-BP network model to recognize normal operating characteristics. The fault data encompass five primary fault types, among others, which are employed to assess the IEO-BP’s ability to detect and provide early warning for these faults. By leveraging this comprehensive data for training and testing purposes, we could effectively evaluate the practicality and efficiency of the IEO-BP approach in real-world applications.
Our data collection methods are as follows:
1. On-site temperature and vibration sensors complete the data acquisition.
2. Collected data are sent to the control system.
3. The control system sends the data to the SIS system via the OPC data interface.
4. The data are retrieved from the SIS system for analysis and processing.
Regarding the sensors we utilized, temperature sensors typically transform temperature variations into electrical signals through the change in resistance of the sensing element, which enables us to acquire temperature data. In this case, we employed the widely used voltage divider circuit. Once the supply voltage and the voltage divider resistor R1 are established, we can determine the relationship between output voltage and temperature. We then choose an appropriate voltage divider resistor and, based on the Resistance-Temperature (R-T) table, calculate the corresponding divider output voltage for each temperature.
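As an illustration only, and assuming the temperature-dependent resistance R_T forms the lower leg of the divider (Vout = Vs·R_T/(R1 + R_T)), the conversion from a measured voltage back to temperature via the R-T table could be sketched as follows; the table values are placeholders, not the sensor's actual calibration.

```python
import numpy as np

def temperature_from_voltage(v_out, v_supply=5.0, r1=10_000.0):
    # Invert the divider equation to recover the sensing resistance
    r_t = r1 * v_out / (v_supply - v_out)
    # Placeholder R-T table: resistance (ohms) vs. temperature (deg C);
    # resistance decreases as temperature rises
    r_table = np.array([32650.0, 10000.0, 3603.0])
    t_table = np.array([0.0, 25.0, 60.0])
    # Interpolate on ascending resistance
    return np.interp(r_t, r_table[::-1], t_table[::-1])
```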
Vibration sensors primarily consist of three types: acceleration sensors, velocity sensors, and displacement sensors. These sensors measure vibration acceleration, vibration velocity, and vibration displacement, respectively.
As for pressure sensors, they convert pressure fluctuations into changes in resistance. However, since directly capturing resistance as a signal is challenging, we need to transform the resistance change into a voltage or current change. This conversion allows the acquisition card to collect data efficiently.
Appendix A shows photographs of the collected faults.
4.2. Description of Fault Types and Characteristics
We have gathered and identified five primary types of faults, as depicted in Table 1. The table presents the fault type in the left column, the anomalous measurement points during the fault in the middle column, and the type of abnormality at the fault measurement points in the right column. Table 2 showcases a few examples of measurement points exhibiting anomalous data.
4.3. IEO-BP Parameter Calibration
Appropriate parameters have a significant impact on algorithms [26,27]. In machine learning, many algorithms require certain parameters to be set in order to tune their behavior. These parameters can affect the performance of the model training process and the final performance of the model. If the parameters are not set properly, the algorithm's performance may degrade or even fail, preventing the model from converging correctly. Therefore, proper parameter selection and tuning are required to ensure the best performance of the algorithm.
Before using IEO-BP for the training of the steam feed pump network, we first adjusted its parameters, setting the number of BP network training epochs to 1000, the training error to 0.02, and the learning rate to 0.001. We provided three reference values for each of the remaining parameters, as shown in Table 3, based on pre-experiments and literature analysis [12,13,14]. It is important to note that we chose Softsign, Tanh, and ReLU as candidate activation functions. These three activation functions perform well in dealing with nonlinear problems; they are widely used in deep learning and neural networks and help to improve the performance of models on various tasks. Here is a brief description of these three functions:
(1) Softsign:
Softsign functions are simpler and more efficient to compute than other S-shaped curves, such as Sigmoid and Tanh.
The output range (−1, 1) is useful to avoid excessively large or small output values in some scenarios.
(2) Tanh:
Compared to the Sigmoid function, the Tanh function is symmetric about the origin, which may yield better performance in some applications.
Its output range of (−1, 1) alleviates the gradient vanishing problem.
(3) ReLU:
The ReLU function has good stability and low computational complexity when training deep neural networks.
ReLU combines linear and nonlinear properties, which helps improve the expressiveness of the model and enables it to learn more complex functions.
It mitigates the gradient vanishing problem and helps the network converge faster.
It should be noted that all the code was written in MATLAB 2018b on a system with an Intel(R) Core(TM) i7-10850H CPU @ 2.70 GHz (6 cores and 12 logical processors).
Given the parameter levels in Table 3, conducting a full factorial test would require a significant amount of resources. Therefore, we used the Taguchi test method to form an orthogonal array and conduct a reasonable number of tests. This method is based on the design principle of the "orthogonal table", where multiple variables are combined and arranged so that each variable is tested at different levels. This maximizes the amount of useful data obtained and minimizes possible confounding factors.
We use the relative percentage deviation (RPD) to measure the performance of IEO-BP for each combination of parameters, which is calculated by Equation (28).

$$RPD = \frac{RMSE_{c} - RMSE_{min}}{RMSE_{min}} \times 100\% \qquad (28)$$

where $RMSE_{c}$ is the RMSE under the current parameter combination, and $RMSE_{min}$ is the minimum RMSE among all experimental runs.
To effectively evaluate the performance of models with varying parameters, we employ K-fold Cross Validation, a widespread method for assessing machine learning model performance. This technique involves using a subset of the dataset for multiple training and validation iterations, ensuring the stability of evaluation results. The procedure includes the following steps:
Randomly divide the dataset into K disjoint subsets.
For each subset, execute the following steps:
Set the current subset as the validation set and merge the remaining K-1 subsets to form the training set.
Train the model with the training set.
Evaluate the model performance using the validation set and record the evaluation results.
Calculate the average of the K evaluation results to obtain the final performance evaluation metric of the model.
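A Python sketch of this procedure is given below; train_model and rmse_of are hypothetical stand-ins for IEO-BP training and RMSE evaluation.

```python
import numpy as np

def k_fold_rmse(X, Y, train_model, rmse_of, K=5, rng=np.random.default_rng(0)):
    idx = rng.permutation(len(X))               # random disjoint subsets
    folds = np.array_split(idx, K)
    scores = []
    for k in range(K):
        val = folds[k]                          # current fold = validation set
        tr = np.concatenate([folds[j] for j in range(K) if j != k])
        model = train_model(X[tr], Y[tr])       # train on the K-1 other folds
        scores.append(rmse_of(model, X[val], Y[val]))  # evaluate held-out fold
    return float(np.mean(scores))               # average of the K results
```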
We select K = 5 and use the root-mean-square error (RMSE) as the evaluation metric, recording the average result over five runs for each combination of parameters. As per the Taguchi method's recommendation, we utilize the L27 orthogonal array. The outcomes of the 27 experiments are presented in Table 4.
We then used Equation (28) to calculate the RPD for each group of experiments and selected the mean RPD value of each parameter across all experiments to determine its optimal level. After calibration, the final parameter settings are displayed in Table 5.
4.4. Network Training and Early Warning Testing
After the cross-validation experiments and the Taguchi method determined the optimal parameter levels, we first performed the network training of the IEO-BP model using the healthy sample data collected in Section 4.1. All data were divided into training and test sets according to an 8:2 ratio. The true values of the measurement points related to the five main fault types under normal operation, together with their predicted values, are shown in Figure 1. Based on the true and predicted values, we can derive our fault warning method, which is described in detail below. In addition, the RMSE convergence results for the training set and the test set are depicted in Figure 2 and Figure 3, respectively, with the test set evaluated every 20 iterations.
According to the results in Figure 1, our IEO-BP model demonstrates good prediction performance. In addition, according to the results in Figure 2 and Figure 3, IEO-BP converges quickly and remains stable on the test set. On this basis, we can propose a fault warning strategy built on the difference between the predicted and true values. The specific steps for implementing the fault warning strategy are as follows:
Step 1: Set the threshold value: Based on historical data analysis, establish a reasonable threshold value for the difference between predicted and actual observed values. This threshold should account for both normal equipment fluctuations and abnormal fluctuations that occur during faults.
Step 2: Real-time warning: Input real-time equipment data into the trained prediction model to obtain predicted values. Calculate the difference between the predicted and actual observed values. If the difference exceeds the predetermined threshold, issue a fault warning signal.
Step 3: Fault diagnosis and processing: Upon receiving the fault warning signal, conduct further inspection and diagnosis of the equipment. Depending on the diagnostic results, take appropriate measures to prevent or mitigate losses caused by equipment failure.
Step 4: Continuous optimization: Consistently collect equipment operation data and update the prediction model to maintain accuracy. Regularly evaluate the effectiveness of the warning strategy, adjust threshold settings, and make other necessary improvements.
By employing this fault warning strategy based on the difference between predicted and actual values, abnormal equipment conditions can be detected and addressed promptly, thereby enhancing the operational efficiency, safety, and service life of the equipment.
This failure warning strategy can also be applied to other devices. The trained network can output point data, and if the output data deviates from the value set by the decision maker, a fault warning judgment can be triggered.
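The warning rule itself reduces to a residual check against the decision maker's threshold, as in the following sketch; the threshold value is a placeholder to be calibrated from historical data (Step 1).

```python
import numpy as np

def fault_warning(y_pred, y_actual, threshold):
    """Return True (issue a warning) when the difference between the
    prediction and the observation exceeds the predetermined limit."""
    residual = np.abs(np.asarray(y_pred) - np.asarray(y_actual))
    return bool(np.any(residual > threshold))
```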
Subsequently, using the trained network and the proposed fault warning strategy, we conducted fault warning tests for the five fault modes. The final test results are displayed in Table 6, demonstrating that IEO-BP can effectively achieve the purpose of fault warning. The corresponding histogram is presented in Figure 4.
It should be noted that the warning strategy we adopt involves issuing a warning immediately when the error between the predicted and true values exceeds the limit set by the decision maker during operation. Considering the randomness, the decision maker can either conduct troubleshooting immediately on this occasion or wait until the next warning.
Based on the above experiments, IEO-BP can effectively achieve the purpose of fault warning, with warning success rates for H1, H5, and H3 reaching 93%, and rates for the other fault types exceeding 85%, which is within a reliable confidence range. However, we also note that its early warning accuracy for H4 is only 80%; thus, we need to strengthen the training of IEO-BP in this respect. This issue arises from the setting of the fault warning threshold: some fault data points may not be sensitive enough, causing deviations that do not reach the threshold. If the threshold were set lower, accuracy would increase, but false alarms might also result. In practice, the threshold must be chosen based on specific requirements and conditions.
6. Conclusions and Future Work
Fault warning is a reliable method for promoting the sustainable development of industrial equipment. Among various fault warning techniques, the BP neural network stands out as the most common and efficient approach. However, it has certain shortcomings. To enhance the efficiency of fault warning, this paper introduces a hybrid algorithm called IEO-BP. In IEO, we incorporate an SA-based random perturbation local search operator to effectively boost the exploration ability of the algorithm. For the BP neural network, we add an adaptive learning rate to improve its prediction performance. Subsequently, we combine IEO with the improved BP neural network for fault warning analysis. Experimental results demonstrate that IEO-BP effectively achieves the fault warning objective, displaying notable advantages in comparison with other algorithms. In terms of performance comparison, our method achieved the best values for RMSE and R2, with its solution efficiency in the middle of the range, thus striking a balance between efficiency and quality. In the fault warning test, its effectiveness improved by 11% compared to GA-BP, 8.5% compared to SVM-BP, and 6% compared to AFSA-BP, resulting in an average effectiveness improvement of 8.5%. Additionally, our proposed algorithm enhancement strategy demonstrates its effectiveness by exhibiting a faster convergence speed and higher solution accuracy compared to conventional EO.
Our research not only addresses the limitations of EO, but also expands its application area by integrating it with the improved BP neural network to propose a novel solution for fault warning issues, thereby fostering enhanced industrial development to meet contemporary demands.
Despite the successful investigation of a hybrid approach for fault warning, there remains ample room for future research. Firstly, fuzzy languages can be employed to represent uncertainties in the operation of realistic industrial equipment [2,3]. Secondly, IEO can be combined with other metaheuristics [30,31]. Lastly, our IEO-BP framework can be applied to various equipment according to practical requirements, or further improved to propose more sophisticated warning strategies; examples include the use of more adaptive learning rate expressions and hybrid metaheuristics, as well as extensions to other neural network structures [32,33,34,35,36].