1. Introduction
Prognostics and System Health Management (PHM) generally provides capabilities such as fault detection, fault prediction, and component life tracking to assess product reliability. PHM technologies include sensing, anomaly detection, diagnosis, prediction and decision support for intelligent machinery maintenance and healthy operation. Taking advantage of advances in sensor technologies, PHM enables a proactive fault prevention strategy by continuously monitoring the health of complex systems. A power transformer is a piece of equipment of great importance to the power system, so its performance can have a great impact on the power grid [1,2,3]. Power transformer aging is an important factor leading to grid failure, and it can cause three main fault types in transformers: electrical, mechanical, and thermal failure. Among them, mechanical failure ranks first [4,5]. Therefore, it is critical to improve the accuracy of fault diagnosis of power transformers [6,7].
Some traditional methods for fault diagnosis of transformers, such as dissolved gas analysis (DGA) [8,9,10], short circuit reactance (SCR) [11], and frequency response analysis (FRA) [12], have been widely used in industry. Nevertheless, these methods are limited by low diagnostic accuracy when the composition of the gas dissolved in oil is complicated. High-dimensional fault data on power transformers can make the whole system nonlinear; under this condition, FRA and SCR cannot find the real locations of faults and cannot provide information about the types of transformers [11].
The conventional methods of power transformer fault diagnosis [13] mainly include the International Electrotechnical Commission (IEC) four-ratio and three-ratio methods, the characteristic gas method and so on. However, these methods generate large errors in the diagnosis of power transformers, and their accuracy is greatly reduced when the sample set is too small or contains outliers. Therefore, artificial intelligence technology with excellent performance is desirable for transformer fault diagnosis. Intelligent algorithms based on DGA data are widely used in transformer fault diagnosis, especially the back-propagation (BP) neural network [14,15]. The BP neural network can be used to find the connection weights and biases that implement accurate diagnostic models for DGA. The parameters of the BP neural network are updated by gradient descent, so that suboptimal parameters are not mistaken for the optimal ones.
Nowadays, many intelligent optimization and machine learning algorithms have been applied in different domains, including power transformers, because of their strong fault diagnosis performance. There is a large body of research on power transformer fault diagnosis and on related equipment, in which various intelligent and machine learning methods are used to detect the state of the transformer.
As for power transformer fault diagnosis, Khmais et al. [16] developed a fault classification method for power transformers based on the support vector machine (SVM), using training data to build a multi-layer SVM classifier. This classifier has superior performance in identifying transformer fault types. Li et al. [17] presented an intelligent method for power transformer fault diagnosis based on selected gas ratios and SVM. They used a genetic algorithm (GA) to obtain the optimal dissolved gas ratio (ODGR) set, covering both DGA ratio selection and SVM parameter optimization. Hooshmand et al. [18] used three- and four-digit coding with fault information and fuzzy logic to improve the diagnosis result. The method has been applied to the diagnosis of gas dissolved in transformer oil. Wang [19] developed a new transformer fault diagnosis method based on a probabilistic neural network (PNN) and dissolved gas analysis. A hybrid evolutionary algorithm based on particle swarm optimization (PSO) and BP is used to optimize the parameters of the PNN. To address the problem of power transformer accidents, Trappey et al. [20] developed an intelligent engineering asset management system, in which data-driven models are used to detect potential faults in transformers. Principal component analysis (PCA) and a BP artificial neural network (BP-ANN) are used as prediction models to carry out this task. Zheng et al. [21] proposed a transformer solubility prediction method based on PSO and the least squares support vector machine (LS-SVM). The results demonstrated that the method is superior to the BP neural network (BPNN), Generalized Regression Neural Network (GRNN), Radial Basis Function Neural Network (RBFNN) and Support Vector Regression (SVR) methods.
With regard to other equipment diagnosed by novel methods, Zhou et al. [22] presented an intelligent fault diagnosis method based on ontology and FMECA (Failure Mode, Effects and Criticality Analysis) for wind turbines. This method realizes knowledge sharing between deep knowledge and shallow knowledge, improves the fault diagnosis ability and supports better decisions in the diagnosis system. To improve the efficiency and accuracy of transient probability analysis of flexible mechanisms, Song et al. [23] proposed a dynamic neural network method (DNNM) based on improved PSO and Bayesian regularization (BR). The results show that the method improves the computational efficiency and provides meaningful insight for flexible mechanisms. Rolling bearings under complex working conditions are often affected by mechanical and electrical system faults. To address this problem, Lu et al. [24] proposed a deep learning method based on a convolutional neural network (CNN). Evssukoff and Gentil [25] proposed a recursive neural fuzzy system for fault detection and isolation in nuclear reactors. It performs well in detecting and isolating various security-related faults.
This paper is extended from the DPDC 2008 conference paper entitled “Cuckoo Search Optimized NN-based Fault Diagnosis Approach for Power Transformer PHM” [26]. Based on the recommendation from the SPSC 2008 committee, we extensively rewrote the paper by extending the experiments and providing more validation results obtained from real transformer sensor data collected in a smart grid. The main contributions are as follows: (1) To develop machine learning-based models for transformer PHM, we propose a novel method that enhances the cuckoo search algorithm for optimizing the parameters of a multi-layer back-propagation neural network for fault diagnosis of a power transformer. (2) We introduce important factors such as the improvement rate (IR) to drive the update of the cuckoos (solutions). (3) To account for mutation in the process of finding the optimal solution, we consider the mutation of a solution x, which is controlled by a mutation probability. (4) We evaluate the developed machine learning-based PHM models using real operational data collected from power transformers in a smart grid. The results demonstrate the high performance of the PHM models for transformer fault diagnosis.
The paper is organized as follows. After the Introduction, Section 2 presents the machine learning-based method, which uses a cuckoo search algorithm to optimize the BP neural network for power transformer fault diagnosis. Section 3 introduces the developed machine learning-based model for power transformer fault diagnosis. Section 4 presents the experiments and the results. Section 5 discusses the results and draws the conclusions.
2. Methods
2.1. Modified Cuckoo Search (MCS) Algorithm
The Cuckoo Search (CS) algorithm is a nature-inspired metaheuristic algorithm that imitates the brood parasitism of cuckoos [27]. To simulate cuckoo nesting behavior, the CS algorithm sets three rules. Each cuckoo produces one egg at a time, which represents a solution to the problem, and randomly places the egg in a nest for hatching. In addition, the number of nests is fixed, and a value $p_a$ describes the probability that the nest owner discovers that an egg is foreign. CS is enhanced by Levy flights, which let it explore the solution space both globally and locally, and the combination of local and global search mechanisms makes it efficient [28]. In addition, the discovery probability $p_a$ and the step size $\alpha$ are important parameters of the CS algorithm for fine-tuning solution vectors, and they can be used to adjust the convergence rate of the algorithm. However, the standard CS algorithm sets these parameters to constant values chosen by experience. Unsuitable parameter settings and parameters held constant during the iterations degrade the performance of the CS algorithm [29].
Thus, to improve the search ability and overcome these disadvantages, a modified Cuckoo Search (MCS) algorithm was proposed in [30], whose main idea is an iterative process in which the parameters $p_a$ and $\alpha$ are updated via a function within an appropriate range.
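In order to use feedback information during evolution, the parameters $p_a$ and $\alpha$ are set as proportional to the improvement rate ($IR$). The $IR$ can be computed by

$$IR = \frac{NI}{NN}, \tag{1}$$

where

$$NI = \sum_{i=1}^{NN} \delta_i, \qquad \delta_i = \begin{cases} 1, & f(x_i^{t+1}) < f(x_i^{t}), \\ 0, & \text{otherwise,} \end{cases} \tag{2}$$

where $NI$ is the number of improved solutions, $f$ is the fitness function we set, and $NN$ is the total population size.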
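The improvement rate can be illustrated with a minimal sketch. It assumes that IR is the fraction NI/NN of nests whose fitness decreased in the current iteration (lower fitness is better, as with an MSE fitness). The function name `improvement_rate` and the sample values are hypothetical and used only for illustration.

```python
def improvement_rate(old_fitness, new_fitness):
    """Return NI / NN, the fraction of nests whose fitness improved.

    Assumption: a nest counts as improved when its new fitness is
    strictly lower than its old one (minimization).
    """
    if not old_fitness or len(old_fitness) != len(new_fitness):
        raise ValueError("fitness lists must be non-empty and of equal length")
    improved = sum(1 for old, new in zip(old_fitness, new_fitness) if new < old)
    return improved / len(old_fitness)


# Hypothetical fitness values for four nests before and after one iteration.
old = [0.40, 0.25, 0.31, 0.18]
new = [0.35, 0.25, 0.33, 0.10]
print(improvement_rate(old, new))  # 2 of 4 nests improved -> 0.5
```

A value close to 1 indicates that most nests are still improving, while a value close to 0 indicates stagnation; this is the feedback signal that the control parameters react to.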
The discovery probability $p_a$ and step size $\alpha$ are dynamically updated as follows:
where $\alpha_{\max}$ and $\alpha_{\min}$ are the maximum and minimum values of the step size $\alpha$, respectively; $p_{a,\max}$ and $p_{a,\min}$ are the maximum and minimum values of the discovery probability $p_a$, respectively; and $m$ and $n$ are nonlinear factors for adjusting the speed of change of the control parameters.
There are two different strategies in MCS for exploration and exploitation. The first strategy uses Mantegna’s algorithm [
31] as follows:
where
Here,
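$L$ is the characteristic scale of the data set. $\alpha$ in Equation (5) is the step size. $\beta$ is the Levy exponent, which controls the scale of the distribution. $s$ is the step length, which can be computed as follows:

$$s = \frac{u}{|v|^{1/\beta}}, \tag{7}$$

where $u \sim N(0, \sigma_u^2)$, $v \sim N(0, 1)$. In addition, $\sigma_u$ can be calculated as follows:

$$\sigma_u = \left[ \frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\!\left(\frac{1+\beta}{2}\right)\beta\,2^{(\beta-1)/2}} \right]^{1/\beta}. \tag{8}$$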
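As an illustration of the step length computation, the following sketch implements the standard form of Mantegna's algorithm, in which $s = u / |v|^{1/\beta}$ with $u$ drawn from a normal distribution of standard deviation $\sigma_u$ and $v$ from a standard normal distribution. The function names and the default $\beta = 1.5$ are assumptions made for this example, not values taken from the paper.

```python
import math
import random


def mantegna_sigma_u(beta):
    """Standard deviation of u in Mantegna's algorithm for Levy exponent beta."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one Levy-distributed step length s = u / |v|**(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma_u(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (practically never happens)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


rng = random.Random(42)  # fixed seed so the run is repeatable
steps = [levy_step(1.5, rng) for _ in range(5)]
print(steps)
```

Most draws are small, with occasional large jumps; this heavy-tailed behavior is what lets the search escape local regions.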
The second strategy is to attract the closest individuals of the current solution and conduct a global random walk.
These two strategies are randomly selected according to the switching probability, and the second strategy is described as follows:
where the three integer indices are mutually different, are randomly selected in the range [1, 2, …, $NN$], and differ from the integer $i$. $T$ is the scaling factor, and the random coefficient is a number within the interval [0, 1].
In addition, to strengthen the global search capability of MCS, a mutation strategy is introduced, which is controlled by a mutation probability. The mutation is as follows:
where the random coefficient is a number in the interval [−1, 1], $i$ and $j$ are different integers selected within the range [1, 2, …, $NN$], $k$ is an integer within the range [1, 2, …, $D$], and $D$ is the dimension of the solution space.
In addition, the parameter $p_a$ is the probability that a host discovers a foreign egg. It determines whether a new nest is generated. In this case, the location update equation is given by
where the three integer indices denote three different integers, and the two random coefficients are randomly generated numbers in [0, 1].
The initial location of MCS can be expressed as:
where the two limits in the expression are the upper and lower bounds of the search space, respectively.
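The initialization can be sketched as follows, assuming the common uniform scheme in which each coordinate is drawn as lower + r · (upper − lower) with r uniform in [0, 1]. This form is an assumption for illustration; the function name `init_nests` and the bounds in the example are hypothetical.

```python
import random


def init_nests(n_nests, lower, upper, rng=None):
    """Create n_nests random solutions inside the box [lower, upper].

    Assumption: each coordinate is sampled uniformly between its bounds.
    """
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same dimension")
    rng = rng or random.Random()
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(n_nests)
    ]


# Five nests in a two-dimensional search space with hypothetical bounds.
nests = init_nests(5, lower=[-1.0, 0.0], upper=[1.0, 2.0], rng=random.Random(0))
for nest in nests:
    print(nest)
```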
The pseudo-code of modified CS algorithm is shown as Algorithm 1.
Algorithm 1: Pseudo-code of the modified CS algorithm
2.2. Back-Propagation (BP) Neural Network
The back-propagation (BP) neural network is a multi-layer feed-forward neural network, which belongs to the class of uncertain nonlinear mathematical models [32,33,34]. The BP network consists of an input layer, a hidden layer and an output layer. Two processes, forward propagation and back propagation, are central to the BP neural network [35,36], and their combination gives the BP network good performance in classification and prediction. In forward propagation, the data pass through the input layer and are combined with the hidden-layer weights and thresholds, layer by layer, until they reach the output layer, which yields the classification result. In back propagation, when the output of the output layer does not match the expected output, the error signal is propagated backwards. An error gradient descent algorithm reduces the mean square error (MSE) between the network output and the actual output value, and the network adjusts the weights and thresholds layer by layer from the output layer back through all hidden layers. Finally, the corrected result is produced at the output layer.
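The two processes can be illustrated with a minimal sketch of a network with one hidden layer. It assumes sigmoid activations, a squared-error loss and plain per-sample gradient descent; the layer sizes, learning rate and sample values are hypothetical choices for this example, not the configuration used in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward propagation: input -> hidden -> output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b) for row, b in zip(W2, b2)]
    return h, y


def train_step(x, target, W1, b1, W2, b2, lr=0.5):
    """One back-propagation update; returns the MSE measured before the update."""
    h, y = forward(x, W1, b1, W2, b2)
    # Error signal at the output layer (derivative of squared error times sigmoid slope).
    d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
    # Error signal propagated back to the hidden layer (uses W2 before it is changed).
    d_hid = [
        hj * (1 - hj) * sum(d_out[k] * W2[k][j] for k in range(len(W2)))
        for j, hj in enumerate(h)
    ]
    # Gradient descent on output-layer weights and thresholds.
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] -= lr * d_out[k] * h[j]
        b2[k] -= lr * d_out[k]
    # Gradient descent on hidden-layer weights and thresholds.
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] -= lr * d_hid[j] * x[i]
        b1[j] -= lr * d_hid[j]
    return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / len(y)


# Hypothetical 3-2-2 network trained repeatedly on a single sample.
W1 = [[0.1, -0.2, 0.3], [0.2, 0.1, -0.1]]
b1 = [0.0, 0.0]
W2 = [[0.3, -0.1], [-0.2, 0.2]]
b2 = [0.0, 0.0]
errors = [train_step([0.5, 1.2, 0.3], [1.0, 0.0], W1, b1, W2, b2) for _ in range(200)]
print(errors[0], errors[-1])
```

The printed pair shows the error before the first update and before the last one; the second value is smaller, which is the effect of repeatedly propagating the error signal backwards.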
2.3. MCS Optimized BP Neural Network (MCS-BP)
The fault diagnosis of a power transformer based on an MCS optimized BP neural network can be used as a comprehensive diagnosis platform, which combines the data of gas in oil with the detection system, and then obtains good results by supervised learning methods.
Figure 2 shows the block diagram of the MCS-optimized BP neural network. The following are the main steps:
Step 1: First, use the IEC three-ratio method to extract features from the DGA data of the power transformer.
Step 2: Randomly select samples of the different power transformer fault types as input to the neural network.
Step 3: Initialize the parameters of the BP neural network.
Step 4: Initialize the modified cuckoo search: the search size, the population size $N$, the switching probability, the mutation probability, the step size $\alpha$ with its maximum value $\alpha_{\max}$ and minimum value $\alpha_{\min}$, the maximum value $p_{a,\max}$ and minimum value $p_{a,\min}$ of the discovery probability, the nonlinear factors $m$ and $n$, the scaling factor $F$ and the fitness function $f$. The fitness function used in this paper is the mean square error (MSE), as follows:

$$MSE = \frac{1}{K}\sum_{k=1}^{K} \left(y_k - \hat{y}_k\right)^2,$$

where $y_k$ is the measured value, $\hat{y}_k$ is the predicted result, and $K$ is the number of samples.
Step 5: Calculate the fitness value of the initial nests via the fitness function, and then select the current optimal solution in the solution space.
Step 6: Generate a random number and compare it with the switching probability. If the condition on the comparison holds, update the nests via Equation (5); otherwise, update them via Equation (9).
Step 7: Generate a random number and compare it with the mutation probability of MCS. If the mutation condition is met, perform the mutation via Equation (10); otherwise, the solution is unchanged.
Step 8: Calculate the fitness value of the updated solution and update the discovery probability $p_a$ and the step size $\alpha$ via Equation (3) and Equation (4).
Step 9: Generate a random number and compare it with the discovery probability $p_a$. If the discovery condition is met, update the nests via Equation (11); otherwise, leave them unchanged. Compare the fitness values of the new nests with the previous ones, and keep the best nest as the current best nest.
Step 10: If the maximum number of iterations has been reached, proceed to the next step; otherwise, return to Step 6.
Step 11: Substitute the optimized weights and biases into the BP neural network.
Step 12: Input the test set into the trained BP neural network to get the classification output.
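The link between the optimizer and the network in the steps above can be sketched as follows: each nest is treated as a flat vector of all BP weights and biases, and its fitness is the MSE of the resulting network on the training samples. The flat layout used by `decode`, the sigmoid activation and the 3-2-2 layer sizes are assumptions made for this example; the paper does not specify them here.

```python
import math


def decode(vec, n_in, n_hid, n_out):
    """Split a flat nest vector into (W1, b1, W2, b2). The layout is an assumption."""
    expected = n_hid * (n_in + 1) + n_out * (n_hid + 1)
    if len(vec) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(vec)}")
    it = iter(vec)
    W1 = [[next(it) for _ in range(n_in)] for _ in range(n_hid)]
    b1 = [next(it) for _ in range(n_hid)]
    W2 = [[next(it) for _ in range(n_hid)] for _ in range(n_out)]
    b2 = [next(it) for _ in range(n_out)]
    return W1, b1, W2, b2


def predict(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer network with sigmoid activations."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    h = [sig(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return [sig(sum(w * hi for w, hi in zip(row, h)) + b) for row, b in zip(W2, b2)]


def mse_fitness(vec, samples, n_in, n_hid, n_out):
    """Fitness of one nest: MSE between targets and network outputs."""
    params = decode(vec, n_in, n_hid, n_out)
    total, count = 0.0, 0
    for x, target in samples:
        y = predict(x, *params)
        total += sum((yk - tk) ** 2 for yk, tk in zip(y, target))
        count += len(target)
    return total / count


# Hypothetical sample: three input features, two output classes (one-hot target).
samples = [([1.0, 2.0, 3.0], [1.0, 0.0])]
zero_nest = [0.0] * 14  # 2*(3+1) + 2*(2+1) parameters
print(mse_fitness(zero_nest, samples, n_in=3, n_hid=2, n_out=2))  # all outputs are 0.5 -> 0.25
```

With this encoding, Step 11 amounts to calling `decode` on the best nest and using the returned weights and biases in the network.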
3. MCS-BP for Power Transformer Fault Diagnosis Platform
In this paper, power transformer fault diagnosis is divided into four main parts: data collection and preprocessing, segmentation of the data set, neural network model training, and comparison between the test set output and the train set output, as shown in Figure 3.
In Figure 3, the power transformer DGA data are first processed for feature selection via the IEC three-ratio method. This processing is shown in Table 1. Then, one part of the data, sorted randomly to ensure that the train set and the test set contain all types of faults, is used to train the model. The remaining part of the data is used to test the optimized model. In this study, we test five types of power transformer faults: thermal faults with T > 700 °C, thermal faults with T < 300 °C, high energy discharge, low energy discharge and partial discharge. They are listed in Table 2, and each group of data is balanced. There are 109 sets of data.
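The preprocessing and segmentation stages can be sketched as follows. The sketch assumes the usual IEC three gas ratios (C2H2/C2H4, CH4/H2, C2H4/C2H6) as features and a per-class random split so that every fault type appears in both sets. The split fraction of 0.8, the small `eps` guard against division by zero, the function names and the gas concentrations in the example are hypothetical choices, not values taken from the paper.

```python
import random


def three_ratios(h2, ch4, c2h6, c2h4, c2h2, eps=1e-9):
    """Return the three IEC gas ratios as a feature tuple.

    eps avoids division by zero when a gas concentration is 0 (an assumption).
    """
    return (c2h2 / (c2h4 + eps), ch4 / (h2 + eps), c2h4 / (c2h6 + eps))


def stratified_split(labels, train_fraction, rng):
    """Split sample indices per class so both sets contain every fault type."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Keep at least one sample on each side when the class has two or more.
        cut = max(1, min(len(idxs) - 1, round(train_fraction * len(idxs))))
        train += idxs[:cut]
        test += idxs[cut:]
    return train, test


# Hypothetical gas concentrations (ppm) for one sample.
print(three_ratios(h2=100.0, ch4=50.0, c2h6=20.0, c2h4=40.0, c2h2=10.0))

# Hypothetical labels: two fault types with ten samples each.
labels = [0] * 10 + [1] * 10
train_idx, test_idx = stratified_split(labels, train_fraction=0.8, rng=random.Random(0))
print(len(train_idx), len(test_idx))  # 16 4
```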
Through this optimization model, the potential faults of power transformers can be predicted and classified.
5. Conclusions
In this paper, we propose a machine learning-based method, a CS-optimized BP neural network model, for power transformer fault diagnosis. The algorithm adaptively adjusts the search step in the solution space to find a better global solution, and the fitness value of each solution is used to build the mutation probability to avoid local convergence. In addition, the MCS enhances the exploitation capacity and convergence rate. We conducted experiments to validate the developed models using 109 sets of real-world data collected from power transformers. The experimental results show that the MCS method we developed outperforms the other compared algorithms and converges to the optimal solution for most test cases.
To validate machine learning-based models or methods for fault diagnosis, more extensive experiments and more advanced metrics and evaluation tools are needed, which will be our future work. We will continue to enhance the performance of the algorithms and models and evaluate them under different circumstances in terms of error rate and operating efficiency, using other evaluation tools and metrics.