In the event of a transformer fault, large amounts of H2, CH4, C2H6, and other gases are present in the insulating oil, and the composition of these gases has a strong non-linear relationship with the type of fault [4]. Therefore, dissolved gas analysis (DGA) techniques are widely used in transformer fault monitoring and diagnosis [5]. Traditional fault diagnosis methods mainly include the ratio method [6], the key gas method, the triangle method [7], and the pentagon method [8]. Although simple and effective, these methods still suffer from problems such as inconsistent diagnostic results and low accuracy, which reduce the reliability of fault analysis.
In recent years, artificial intelligence techniques based on neural networks [9], support vector machines (SVM) [10,11], extreme learning machines (ELM), etc., combined with DGA have become a research hotspot at home and abroad. Although these methods can improve the accuracy of fault diagnosis to a certain extent, they still have shortcomings. For example, the SVM-based fault diagnostic model [12] is constrained by the SVM's inherent limitations on multiclass problems and therefore does not function effectively on complex high-dimensional data. Bazan et al. [13] proposed a two-stage approach for three-phase induction motor diagnosis based on mutual information measures of the current signals, principal component analysis, and intelligent systems. This approach can reduce the amount of information required, but at the cost of reduced accuracy. Although the KNN fault diagnosis model developed by Yu [14] improves the efficiency of the KNN algorithm, it does not address KNN's poor tolerance of faults in the training data, and it remains prone to the curse of dimensionality, which weakens the generalization of the model. The fault diagnosis method based on the hidden Markov model (HMM) presented by Jiang [15] cannot guarantee accuracy and stability in the medium- and long-term prediction of fault data.
Compared with other artificial intelligence methods, artificial neural networks (ANN) can significantly improve the accuracy of fault diagnosis. In ANN-based fault diagnosis models for power transformers, the connection weights and biases of the network are continuously adjusted during training to establish the mapping between specific fault features and fault types [16]. Researchers are integrating neural network-based deep learning methods with transformer fault diagnosis techniques. Huang et al. [17] proposed an evolving neural network method for power transformer defect diagnostics, in which the network parameters (connection weights and bias terms) are modified automatically by the suggested evolutionary strategy to produce the optimal model. Meng, Dong, et al. [18] proposed a radial basis function neural network (RBFNN) based on a hybrid adaptive training method for the fault diagnosis of power transformers. This method generates RBFNN models using fuzzy c-means (FCM) and quantum-inspired particle swarm optimization (QPSO), which allows the network structure to be configured and the model parameters to be obtained automatically. Compared with conventional neural networks, the number of neurons, the centers and radii of the hidden-layer activation functions, and the output connection weights are calculated automatically, and the classification accuracy of the RBFNN is significantly improved. Burriel et al. [19] proposed an automatic system based on neural networks for generating optimized expert diagnostic systems for fault detection when the machine works under transient conditions. Dai et al. [20] proposed a deep belief network (DBN)-based transformer fault diagnosis method. By analyzing the relationship between the dissolved gases in transformer oil and the fault type, non-coded gas ratios are chosen as the feature parameters of the DBN model. The DBN adopts a multilayer multidimensional mapping to extract more detailed differences between fault types, and experiments show that this method can effectively improve the accuracy of fault diagnosis. To improve the hybrid kernel extreme learning machine (KELM), Huang et al. [21] proposed a transformer fault diagnosis method based on the gray wolf optimization (GWO) algorithm. The GWO algorithm optimizes the parameters of the hybrid kernel function, and logistic chaos mapping generates the initial population of the GWO algorithm, which prevents overly fast convergence from degrading the optimization results and effectively improves classifier performance.
These methods nevertheless have limitations. Although the evolutionary neural network model of Huang et al. [17] can automatically update the network parameters, the evolutionary algorithm's capacity to converge is limited and it easily falls into local optima, which reduces the accuracy of the classification model. The QPSO proposed by Meng, Dong, et al. [18] can address the slow convergence of PSO; however, the complex structure and heavy computation of the RBFNN are disadvantages when the data sample is large. The classification accuracy of the DBN-based fault diagnosis model is very high [20], but the model needs a large amount of fault data for network training, and its classification performance is not stable when little data is available. The method proposed in [21] is very effective for KELM optimization, but its efficiency and accuracy need to be improved.
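As a rough illustration of the ratio-style features mentioned above, the following minimal sketch turns gas concentrations into three ratios. The choice of ratios (the classical C2H2/C2H4, CH4/H2, and C2H4/C2H6 set), the inclusion of C2H4 and C2H2 among the "other gases", the helper name `gas_ratios`, and the sample values are all assumptions made for illustration; they are not the feature set of any method cited here.

```python
from typing import Dict


def gas_ratios(ppm: Dict[str, float], eps: float = 1e-9) -> Dict[str, float]:
    """Return three classical DGA gas ratios from concentrations in ppm.

    Hypothetical helper: the ratio set is an assumption, not taken from the paper.
    """
    def ratio(numerator: str, denominator: str) -> float:
        # Guard against division by zero when a gas is not detected.
        return ppm[numerator] / max(ppm[denominator], eps)

    return {
        "C2H2/C2H4": ratio("C2H2", "C2H4"),
        "CH4/H2": ratio("CH4", "H2"),
        "C2H4/C2H6": ratio("C2H4", "C2H6"),
    }


# Made-up concentrations, for illustration only.
sample = {"H2": 100.0, "CH4": 50.0, "C2H6": 20.0, "C2H4": 40.0, "C2H2": 4.0}
features = gas_ratios(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

A traditional ratio method maps such values to fault types through fixed thresholds, whereas the learning-based methods discussed above feed them to a classifier.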
As a non-linear dimensionality reduction algorithm, Isomap handles non-linear structure in data well. However, its computational complexity grows sharply as the number of samples increases. Therefore, Silva and Tenenbaum proposed the landmark isometric mapping (L-Isomap) algorithm. Compared with the Isomap algorithm, L-Isomap computes faster, has a wider range of application, and represents the low-dimensional features of high-dimensional data well.
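The source of the speed-up can be sketched as follows: instead of computing geodesic (shortest-path) distances between all pairs of samples, L-Isomap computes them only from a small set of landmark points. The sketch below covers only this landmark-distance step on a k-nearest-neighbor graph and omits the subsequent landmark MDS embedding. The function names, the neighborhood size, and the semicircle test data are assumptions chosen for illustration.

```python
import heapq
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def knn_graph(points: List[Point], k: int) -> List[Dict[int, float]]:
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    adj: List[Dict[int, float]] = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def dijkstra(adj: List[Dict[int, float]], source: int) -> List[float]:
    """Shortest-path distances from one source; unreachable nodes stay at inf."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def landmark_geodesics(
    points: List[Point], landmarks: List[int], k: int
) -> List[List[float]]:
    """Geodesic distances from each landmark to every point (one row per landmark)."""
    adj = knn_graph(points, k)
    return [dijkstra(adj, lm) for lm in landmarks]


# Illustrative data: 50 points on a unit semicircle (a 1-D manifold in 2-D).
points = [(math.cos(math.pi * i / 49), math.sin(math.pi * i / 49)) for i in range(50)]
geo = landmark_geodesics(points, landmarks=[0, 25], k=2)
print(f"Euclidean end-to-end: {math.dist(points[0], points[-1]):.3f}")
print(f"Geodesic end-to-end:  {geo[0][-1]:.3f}")
```

With n landmarks and N samples, only n shortest-path runs are needed rather than N, which is where the saving over plain Isomap comes from. The geodesic distance along the semicircle is close to the arc length, while the Euclidean distance cuts across it.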
Three shortcomings of transformer fault diagnosis based on the bidirectional gated recurrent unit (BiGRU) can be summarized: (1) a single fault diagnosis model cannot greatly improve fault diagnosis performance; (2) noise in the transformer fault data reduces the stability of the model; (3) research on optimization algorithms is not targeted and cannot significantly improve optimization performance. Thus, a transformer fault diagnosis method based on L-Isomap and ISCSO-BiGRU is proposed in this paper. The innovations and contributions of this paper consist of the following five improvements. First, L-Isomap is used to extract features from the DGA data to reduce the influence of noise on the diagnosis results. The remaining four improvements are applied to the sand cat swarm optimization (SCSO) algorithm to obtain the improved SCSO (ISCSO): a logistic chaotic map is used to improve the initial diversity of the sand cat population, and an improved water wave dynamics factor, adaptive weights, and a golden sine strategy are introduced into the search. Benchmark functions are then used to compare the optimization performance of ISCSO with that of other algorithms, and the results show that ISCSO performs best among them. Next, ISCSO is used to optimize the relevant hyperparameters of BiGRU. The important features selected by the L-Isomap algorithm are input to the ISCSO-optimized BiGRU for transformer fault identification, and the results are compared with the conventional DGA method to verify the enhancement effect of L-Isomap on model performance. Finally, comparison with other transformer fault diagnosis models verifies that the model in this paper achieves a higher accuracy rate.
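The population initialization step can be sketched as follows, assuming that "logistic" refers to the standard logistic chaotic map x_{n+1} = mu * x_n * (1 - x_n). The control parameter mu = 4, the seed value x0 = 0.7, the search bounds, and the function name are assumptions for illustration; this section does not specify them.

```python
from typing import List


def logistic_population(
    n_agents: int,
    dim: int,
    lower: float,
    upper: float,
    x0: float = 0.7,   # assumed seed in (0, 1), avoiding 0.25, 0.5, and 0.75
    mu: float = 4.0,   # assumed control parameter (fully chaotic regime)
) -> List[List[float]]:
    """Generate initial positions by iterating the logistic map.

    Each chaotic value in (0, 1) is scaled linearly into [lower, upper].
    """
    population: List[List[float]] = []
    x = x0
    for _ in range(n_agents):
        agent: List[float] = []
        for _ in range(dim):
            x = mu * x * (1.0 - x)          # logistic map iteration
            agent.append(lower + x * (upper - lower))
        population.append(agent)
    return population


# Illustrative use: 5 search agents in a 3-dimensional search space.
population = logistic_population(n_agents=5, dim=3, lower=-5.0, upper=5.0)
for agent in population:
    print([round(v, 3) for v in agent])
```

The sequence is deterministic for a given seed, so a run can be reproduced exactly, yet its non-repeating orbit spreads the agents over the search interval without a random number generator.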