1. Introduction
Fault-warning technology based on the online monitoring of vibration signals can detect early transformer faults in a timely and effective way. It is an important function in substation and power inspection, and it helps prevent sudden, large-scale power outages caused by poor transformer operating conditions, which can even lead to accidents such as explosions and fires [1,2,3,4,5].
Fault warning tasks usually consist of two phases: offline modeling and online monitoring. In the offline modeling phase, a model of the transformer operating mechanism is trained on a section of normal data to determine the boundaries of the normal data; during online monitoring, if the data under test exceed the established boundaries, there is a high probability that a fault has occurred in the transformer under that operating condition [6]. With the concept of the ubiquitous power IoT, various sensing and monitoring technologies have developed rapidly, and transformer vibration signal data increasingly show big data characteristics such as large volume, high dimensionality, and fast growth, whereas traditional diagnosis technology suffers from low efficiency and high cost. Therefore, mining and analyzing data from actual operating conditions and constructing low-cost, generalized models for transformer condition identification and early fault warning are of great importance for the stable operation of power systems and a high-quality power supply. The laboratory-based vibration test platform is a common tool used by scholars at home and abroad to study transformer fault early warning [7]. In [8], a 110 kV power transformer that was producing GDR II warnings underwent a number of electrical and chemical tests to be examined for defects. In [9], a single-phase, 600 VA, 220/110 V, 50 Hz transformer was used in the laboratory for the modeling and real-time application of internal transformer fault diagnosis. However, in actual engineering, a change in transformer state is often the result of an accumulation of weak faults and is therefore a gradual process. Because transformers come in many models with complex structures, most studies can simulate the special vibration situations of transformers only under limited conditions, so data-driven models trained on laboratory-established fault samples do not adapt well to some field sites.
On the other hand, international researchers in the field of power transformer defect warning have paid considerable attention to data-driven models, such as artificial neural networks (ANNs), support vector machines (SVMs), random forests, and principal component analysis, to address these problems [10,11,12,13]. In [14], to implement fault warning and defect diagnosis in transformers with ANNs, a training technique is proposed for deriving rules from a functionally approximated ANN that takes the concentration of dissolved gases in transformer oil as its input. However, for the neural network approach, the synchronization of the model parameters is difficult to control and the method runs slowly; in contrast, the defect warning strategy based on the SVM algorithm runs quickly and accurately. In [15], in the SVM-BA model, an optimized SVM model for oil-immersed transformers, the kernel function and penalty margin are integrated with the Gaussian classifier. However, supervised learning algorithms such as these rely strongly on the completeness of sample information, and monitoring data in real industrial settings often lack appropriate labels. Unsupervised and semi-supervised learning can be applied to condition monitoring data in which fault data are scarce or unlabeled [16]. In [17], to handle the challenging cases that are largely unclassifiable by Duval’s triangles, a novel dissolved gas analysis (DGA) diagnostic method was developed based on the K-means method with an enhanced KNN cumulative voting mechanism. This method is suitable for the early warning of transformer defects. In [18], an improved fuzzy C-means (FCM) algorithm is used to analyze and process transformer DGA data, which largely resolves the problem of classic clustering methods underperforming in transformer defect warning and diagnosis tasks.
Fault samples are very scarce, which in turn leads to an uneven distribution in data sets collected from actual equipment. When machine learning methods are applied to such unbalanced samples, the trained model is biased towards the majority class and performs poorly on the minority class. Therefore, a sufficient number of samples is a necessary condition for the above machine learning algorithms to produce models that work well in practice.
Based on the above analysis, this paper uses the data mining concept to study the vibration characteristic values of transformers under different operating conditions, drawing on field measurement data of transformer vibration signals and the existing transformer vibration mechanism model of power quality. First, a spectral clustering algorithm is used to cluster the transformer vibration signal data set, which divides the field-measured transformer vibration signals by working condition. A decision tree model is then constructed to analyze the vibration characteristics of the transformer during harmonic current, light load, heavy load, and three-phase unbalanced current operation. The method establishes a direct link between the transformer’s operating state and each of its amplitude and frequency characteristic quantities. It also provides a way to track the transformer’s state and to warn of problems from vibration signals during gradual state changes.
4. Decision Trees
The decision tree is a top-down supervised learning classification algorithm. Its inherent characteristics make it insensitive to the true or nonlinear characteristics of the data, and it takes the interactions between variables into account. It also represents the relationship between logical labels and feature vectors clearly and intuitively in the form of a tree diagram, which strengthens the mapping between vibration features and transformer operating conditions. Using decision trees to build a classification model for transformer operating conditions can therefore make the analysis of the data after spectral clustering clearer and more accurate [22].
In this paper, a decision tree algorithm with information entropy and information gain as splitting rules is used [23]. The type of working condition is used as the category attribute, and other relevant factors, such as the amplitudes of the vibration harmonic components from 50 Hz to 2400 Hz, are used as non-category attributes to construct a transformer fault warning decision tree. The specific steps are as follows:
Step 1: Determine the sample set $D$. The collected amplitudes of the transformer vibration harmonic components from 50 Hz to 2400 Hz, together with the working condition category, make up one complete sample, so that a large amount of actual data forms the sample set.
Step 2: Calculate the sample information expectation. The sample information entropy is calculated using Equation (6):
$$\mathrm{Ent}(D) = -\sum_{k=1}^{n} p_k \log_2 p_k \qquad (6)$$
In Equation (6), $D$ is the sample data set; $n$ is the total number of transformer operating condition types; each sample consists of the amplitudes of the harmonic components of the transformer vibration from 50 Hz to 2400 Hz for working condition type $k$; and $p_k$ is the number of samples with condition type $k$ as a proportion of the total number of samples.
Step 3: Calculate the information gain for each non-category attribute. In this sample data set, there are 440 values of the non-categorical attribute $a$. According to the value of this attribute, the sample set of transformer condition categories can be divided into 440 parts, and the following formula is used to calculate the information gain of each non-categorical attribute:
$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v) \qquad (8)$$
In Equation (8), $V$ is the number of values of the attribute $a$; $D^v$ is the branch containing all samples in the sample data set $D$ whose value on $a$ is $a^v$; and $|D^v|/|D|$ is the weight of the branch node.
Step 4: Select the split attribute node. For each non-class attribute, i.e., the amplitude of each vibration harmonic component of the transformer from 50 Hz to 2400 Hz, the information gain is calculated as in Step 3, and the non-class attribute with the greatest gain is selected as the split node.
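Steps 2 to 4 can be illustrated with a minimal Python sketch. It assumes the standard ID3 definitions of information entropy and information gain and uses a hypothetical toy data set with two discretized amplitude attributes; the function names, the attribute keys ("400Hz", "50Hz"), and the data are illustrative assumptions, not the implementation or data used in this paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels (Step 2)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(samples, labels, attribute):
    """Entropy of the full set minus the weighted entropy of each branch (Step 3)."""
    branches = {}
    for sample, label in zip(samples, labels):
        branches.setdefault(sample[attribute], []).append(label)
    weighted = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - weighted


def best_split(samples, labels):
    """Return the attribute with the largest information gain (Step 4)."""
    attributes = samples[0].keys()
    return max(attributes, key=lambda a: information_gain(samples, labels, a))


# Hypothetical toy data: discretized amplitudes at two frequencies.
samples = [
    {"400Hz": "low", "50Hz": "low"},
    {"400Hz": "low", "50Hz": "high"},
    {"400Hz": "high", "50Hz": "high"},
    {"400Hz": "high", "50Hz": "low"},
]
labels = ["light load", "light load", "heavy load", "heavy load"]

root = best_split(samples, labels)
```

In this toy set the "400Hz" attribute separates the two conditions perfectly, so it carries the largest gain and is chosen as the split node, while "50Hz" carries no information about the label.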
5. Experimental Analysis
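To ensure the homogeneity of the test samples, the working condition labels were removed after FFT, the samples were merged, and the spectral amplitudes were normalized using a linear transformation:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the original spectral amplitude, $x'$ is the normalized amplitude, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the spectral amplitude in the sample, respectively.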
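The linear normalization described above can be sketched in Python as follows. The sketch assumes a min-max rescaling to the range [0, 1] applied to one sample at a time; the function name, the handling of a flat spectrum, and the example amplitudes are hypothetical.

```python
def min_max_normalize(amplitudes):
    """Linearly rescale the spectral amplitudes of one sample to [0, 1]."""
    lo, hi = min(amplitudes), max(amplitudes)
    if hi == lo:
        # Flat spectrum: avoid division by zero (assumed convention).
        return [0.0 for _ in amplitudes]
    return [(a - lo) / (hi - lo) for a in amplitudes]


# Hypothetical spectral amplitudes of one sample.
spectrum = [2.0, 10.0, 6.0, 4.0]
normalized = min_max_normalize(spectrum)
```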
The vibration spectrum data set obtained after the normalization transformation was spectrally clustered to obtain four different types of vibration clusters. The results are shown in Table 2.
The experimental results show that spectral clustering divides the transformer vibration signal spectrum data set into four vibration clusters. Based on the composition of the original data set, the four clusters are labeled light load, heavy load, harmonic current, and three-phase current unbalance.
According to the transformer vibration signal clustering heat map in Figure 6, the spectrum at frequencies such as 50 Hz, 150 Hz, 250 Hz, and 400 Hz is stronger under the unbalanced three-phase current condition; the odd frequency components are caused by the intrusion of zero-sequence current during asymmetric operation of the system. Under harmonic current conditions, the spectrum is higher at 100 Hz, 300 Hz, 400 Hz, and in the high-frequency harmonic components above 1000 Hz, because core magnetostriction generates vibration harmonics at 100 Hz, 200 Hz, and so on, and harmonic currents often superimpose harmonic components on the winding current. Under light load and heavy load conditions, the vibration signal consists mostly of even frequency components between 100 Hz and 500 Hz. Under heavy load conditions, however, the combination of winding vibration and core vibration caused by the winding current produces, through nonlinear propagation, a high-frequency vibration component at 1000 Hz or above.
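The contrast between odd and even frequency components can be made concrete with a small Python sketch. It computes the share of total amplitude carried by odd and even multiples of 50 Hz; this indicator, the function name, and the example spectrum are illustrative assumptions and are not part of the method used in this paper.

```python
def harmonic_shares(spectrum, fundamental=50):
    """Return the (odd, even) shares of total amplitude.

    `spectrum` maps a frequency in Hz (a multiple of `fundamental`)
    to its amplitude.
    """
    odd = sum(a for f, a in spectrum.items() if (f // fundamental) % 2 == 1)
    even = sum(a for f, a in spectrum.items() if (f // fundamental) % 2 == 0)
    total = odd + even
    if total == 0:
        # No vibration energy at all (assumed convention).
        return 0.0, 0.0
    return odd / total, even / total


# Hypothetical spectrum dominated by odd multiples of 50 Hz.
spectrum = {50: 3.0, 100: 1.0, 150: 2.0, 200: 1.0, 250: 1.0}
odd_share, even_share = harmonic_shares(spectrum)
```

Under these assumptions, a large odd share would point towards the three-phase unbalance pattern described above, and a large even share towards the load-dominated patterns.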
On the basis of the vibration clusters of each transformer operating condition, the vibration spectrum amplitude is converted into a discrete variable with the levels “low”, “medium”, and “high” by using the trilateration method. The information gain of the vibration amplitude at 400 Hz is calculated to be the highest among the 48 vibration features, so 400 Hz is used as the root node of the transformer vibration feature decision tree model. The information gains at 50 Hz and 350 Hz in the second layer are 0.854 and 0.730, respectively, while the information gain at 50 Hz in the third layer is 0.222. The transformer vibration feature decision tree model is shown in Figure 7.
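The conversion of amplitudes into “low”, “medium”, and “high” levels can be sketched as follows. The sketch assumes that the discretization is an equal-width three-way split of the normalized amplitude; the cut points 1/3 and 2/3 and the function name are hypothetical, since the text does not state the thresholds.

```python
def discretize(amplitude, low_cut=1 / 3, high_cut=2 / 3):
    """Map a normalized amplitude in [0, 1] to one of three levels.

    The equal-width cut points are an assumption for illustration.
    """
    if amplitude < low_cut:
        return "low"
    if amplitude < high_cut:
        return "medium"
    return "high"


# Hypothetical normalized amplitudes at three frequencies.
levels = {f: discretize(a) for f, a in {"50Hz": 0.1, "350Hz": 0.5, "400Hz": 0.9}.items()}
```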
In the diagram, the node variables represent the amplitude of the vibration spectrum at 400 Hz, 50 Hz, 350 Hz, and 50 Hz, respectively. Among the four transformer operating conditions reflected in Figure 7 (light load, heavy load, harmonic current, and three-phase current unbalance), the 400 Hz vibration signal amplitude in the light load condition is relatively low compared with the other three conditions, so light load is split out first at the root node of the decision tree. When the transformer operating condition satisfies the root-node condition on the 400 Hz amplitude shown in Figure 7, the transformer is in the light load operating condition.
In the decision tree, the three-phase unbalance is separated by high spectral components at 50 Hz and 350 Hz. Combining the clustering heat map with the mechanism analysis, the spectral components of odd harmonics such as 50 Hz, 150 Hz, and 250 Hz can be used as the vibration characteristics of the transformer’s three-phase current unbalance, i.e., the transformer is in three-phase unbalanced operation when its operating state meets any of the corresponding branch conditions at the 50 Hz and 350 Hz nodes in Figure 7.
Although there is a certain 50 Hz vibration component in the heavy load condition, its amplitude differs considerably from that in the three-phase unbalance, so the heavy load and three-phase current unbalance conditions can be distinguished at the 50 Hz leaf node. When the transformer operating state meets either of the corresponding branch conditions at this node in Figure 7, the transformer is in the heavy load operating state.
This establishes the state classification of the four operating conditions of the transformer, as well as the safety thresholds of the 50 Hz, 350 Hz, and 400 Hz spectral components for each operating condition. If, in a decision tree model built from newly collected vibration signals, the range of the relevant state parameters or the state identification nodes change, a fault warning is issued for the transformer.
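The warning rule described above can be sketched in Python. The sketch assumes that a warning is raised when the root node of a newly built tree differs from the 400 Hz root reported here, or when an amplitude leaves its normal range; the numeric ranges, the names, and the single-condition setup are hypothetical placeholders, not thresholds from this paper.

```python
NORMAL_ROOT = "400Hz"  # root node of the decision tree reported in the text

# Hypothetical normalized amplitude ranges for one operating condition.
NORMAL_RANGES = {"50Hz": (0.0, 0.3), "350Hz": (0.0, 0.3), "400Hz": (0.0, 0.2)}


def fault_warning(amplitudes, root_attribute):
    """Return the reasons for a warning; an empty list means no warning."""
    reasons = []
    if root_attribute != NORMAL_ROOT:
        reasons.append(f"root node changed to {root_attribute}")
    for freq, (lo, hi) in NORMAL_RANGES.items():
        value = amplitudes[freq]
        if not lo <= value <= hi:
            reasons.append(f"{freq} amplitude {value} outside [{lo}, {hi}]")
    return reasons


# Normal case: same root node, all amplitudes within range.
ok = fault_warning({"50Hz": 0.1, "350Hz": 0.1, "400Hz": 0.1}, "400Hz")

# Abnormal case: root node moved to 150 Hz and 400 Hz amplitude too high.
alarm = fault_warning({"50Hz": 0.1, "350Hz": 0.1, "400Hz": 0.5}, "150Hz")
```

Returning the list of reasons rather than a single flag keeps the two triggers (a changed node and an exceeded range) distinguishable in the warning message.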
The transformer vibration signals in two unknown states are acquired for characterization and combined with the transformer vibration mechanism analysis to diagnose the type of fault. In the fault state, the voltage, sound field, magnetic field, and other relevant parameters are measured. The transformer is then modeled in finite element simulation software to examine the magnetic field, coil force, and other properties in this state and thus to verify the above conclusion [24].
Figure 8a,b demonstrate that when the transformer is operated in state 1, the vibration signal in the original light load state has a significantly higher spectral component at frequencies between 100 Hz and 500 Hz, whereas when the transformer is operated in state 2, the vibration signal has a significantly higher spectral component only at frequencies between 100 Hz and 200 Hz. Compared with the vibration characteristics of the different operating conditions in the transformer’s normal operating state, the vibration signals produced in these two unknown states reduce the variability of the vibration characteristics between the clusters in the vibration data set.
Figure 8c shows that the ranges of the node amplitudes change to varying degrees in both unknown states of the transformer, especially in the light load condition, where the spectral components are beyond the normal range. Therefore, the light load condition in state 1 cannot be classified directly from the vibration component characteristics at 400 Hz; it must be further determined from the vibration component characteristics at 50 Hz and 100 Hz to distinguish it from the three-phase unbalance and harmonic current conditions, respectively. In state 2, the root node of the constructed decision tree becomes 150 Hz, and the light load condition is classified directly from the vibration component characteristics at 150 Hz, which differs markedly from the normal operating condition.
In summary, the two unknown states of the transformer are judged to be some kind of fault in the light load condition, which calls for a fault alarm. Abnormal vibration of the transformer’s internal silicon steel sheets due to loose screws causes a relative increase in the magnetostrictive strain, and the vibration component generated by the core in the 100 Hz to 500 Hz frequency range may increase accordingly. If the transformer winding is loose, the vibration component at 100 Hz increases relatively because of the dynamic Lorentz force on the winding, and the components at 200 Hz, 300 Hz, and other frequencies also increase because of the nonlinear mechanics of the pad. Therefore, combining the change in vibration characteristics with the analysis of the vibration mechanism, it is inferred that the transformer has a mechanical fault in both states.
The magnetic flux density and the magnetostriction of the transformer silicon steel sheet are strongly correlated; the higher the flux density, the stronger the magnetostriction. The transformer flux density diagram for both states is shown in Figure 9a. The maximum value of the main flux density of the transformer in both states is more than 2.0 T, which is outside the normal working range of the transformer; the main flux density of this transformer is approximately 1.7 T under normal operating conditions. The transformer’s stress distribution is shown in Figure 9b, and the transformer winding exhibits varying degrees of deformation in both states. Taken together, Figure 9a,b show that the transformer’s abnormal vibration characteristics in both states are caused by internal mechanical defects.
The three fault warning models, SVM, KNN, and K-means, were trained with empirical parameters, and their test results were compared with those of the SC-ID3 model; the comparison results are shown in Figure 10. Figure 10 shows that the SC-ID3 model has the highest recognition accuracy among the four models and that it also achieves good recognition accuracy under the light load and heavy load conditions, whose parameters are more similar.