Power Grid Faults Diagnosis Based on Improved Synchrosqueezing Wavelet Transform and ConvNeXt-v2 Network

Liu, Zhizhong; Zhao, Zhuo; Huang, Guangyu; Wang, Fei; Wang, Peng; Liang, Jiayue

doi:10.3390/electronics14020388


by Zhizhong Liu ¹, Zhuo Zhao ¹,*, Guangyu Huang ², Fei Wang ², Peng Wang ² and Jiayue Liang ¹

¹ College of Computer and Control Engineering, Yantai University, Yantai 264005, China

² Dongfang Electronics Co., Ltd., Yantai 264000, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(2), 388; https://doi.org/10.3390/electronics14020388

Submission received: 26 December 2024 / Revised: 13 January 2025 / Accepted: 16 January 2025 / Published: 20 January 2025


Abstract

Rising worldwide demand for electrical power makes stable and reliable power grids indispensable. Power grid fault diagnosis based on fault recording data is an important technology for ensuring the normal operation of the grid. Although dozens of studies have been put forward to detect electrical faults, they still suffer from several drawbacks, such as the fuzzy characteristics of complex fault samples, which show small inter-class differences and large intra-class differences across the topology structures of distribution networks. To tackle these issues, this work proposes a power grid fault diagnosis method based on an improved Synchrosqueezing Wavelet Transform (SWT) and a ConvNeXt-v2 network (named PGFDSC). First, PGFDSC extracts fault features from the fault recording data with an improved SWT method and outputs a vector signal with enhanced instantaneous frequency. Then, PGFDSC inputs the extracted feature vectors into the improved ConvNeXt-v2 network for power grid fault recognition. The improved ConvNeXt-v2 network is a self-supervised learning model that is fast and accurate, and it effectively mitigates the inaccurate judgments caused by the high dimensionality of the data samples. Finally, extensive experiments show that PGFDSC improves the accuracy of fault diagnosis by two percentage points compared to the baseline models.

Keywords:

power grid faults diagnosis; supervised learning; signal processing; synchronized compressed wavelet transform

1. Introduction

Currently, power grids are considered to be an important component of infrastructure upon which modern society depends. The primary objective of power system operation is to supply uninterrupted power to customers [1]. Due to factors such as the aging of equipment materials, defects in manufacturing, and uncontrollable natural climate conditions, various faults are inevitably encountered in power systems [2]. When a fault occurs in the power grid, the monitoring system collects massive amounts of fault recording data and automatically sends it from local devices to the dispatch center. The power grid fault diagnosis technology must be able to rapidly analyze fault-related data from this extensive dataset, identify fault causes, and assist operators in timely accident analysis and handling, thereby quickly restoring power supply and ensuring the safe and reliable operation of the power grid. Thus, fault diagnosis technology in power systems has become a significant research issue [3].

Recently, extensive research has been conducted on power grid fault diagnosis, and many effective methods have been proposed. Existing methods for power grid fault diagnosis are mainly based on expert systems [4], Petri nets [5], artificial neural networks [6], and fuzzy set theory [7]. From the perspective of the fault information source, these methods can be divided into two categories. The first category employs information from protective devices and circuit breakers, especially the number of switches, to identify grid faults using various intelligent technologies based on Supervisory Control And Data Acquisition (SCADA) systems [8]. The second category applies fault recording data and phasor measurement unit (PMU) data to diagnose grid faults through machine learning and deep learning models [9]. It is noteworthy that diagnostics based on fault recorders are typically more effective due to the richness of the information [10]. Among these methods, those that analyze data generated by fault recorders are particularly valuable for aiding staff in decision-making during fault situations. Traditional learning methods have seen limited application in power grid fault diagnosis due to small sample sizes, suggesting significant potential for further research. While these studies have made progress in classifying specific types of power grid faults, most have focused on a single type of fault. In the actual operation of power grid equipment, however, a model must distinguish multiple types of faults.

Currently, power grid fault diagnosis methods based on fault record data can be broadly categorized into two types: traditional machine learning-based methods and deep learning-based methods. Traditional machine learning methods exhibit poor robustness, with a significant drop in recognition rate when parameters change. In contrast, deep learning methods can extract deeper features, allowing for effective judgments even in the presence of missing fault information, especially when optimized information sources are utilized. Traditional models typically involve manually extracting features such as fault waveforms and current signals from images. Among these, methods based on artificial neural networks (ANNs) have shown excellent performance in diagnosing power grid fault recording information. The work presented in [11] proposed a wavelet-based deep neural network approach aimed at intelligent fault detection within microgrids. While this approach demonstrates certain effectiveness in detecting and recognizing faults, it has notable limitations, particularly regarding its adaptability. When implemented in other microgrid systems with varying parameters, the recognition rate may experience significant declines, especially in scenarios characterized by limited or missing data, which ultimately affects diagnostic accuracy. In another study, reference [12] tackled uncertain fault challenges related to protection mechanisms and circuit breaker malfunctions, including refusals and information loss within large power grids, by employing a particle swarm optimization method. Meanwhile, the research highlighted in source [13] concentrated on line loss analysis and utilized clustering techniques in conjunction with neural networks to diagnose abrupt increases in grid fault losses.

With ongoing advances in technology, convolutional neural networks (CNNs) have also proved effective in diagnosing power grid faults and have shown significant performance improvements. To address the large data volumes and long diagnostic response times in microgrids, reference [14] applied CNNs to fault diagnosis. Additionally, reference [15] introduced a method grounded in intelligent optimization that uses a CNN-BiLSTM framework to diagnose complex faults in power grids. This approach involved hierarchical modeling and the optimization of information sources, enhancing the model’s fault tolerance and generalizability in cases where fault information might be missing. Furthermore, reference [16] employed two-dimensional variational modal decomposition to extract multimodal representations from raw SAR images, capturing both global and detailed information about targets, and achieved fault diagnosis by combining variational modal decomposition with CNNs.

Although existing research on power grid fault diagnosis based on fault recording data has achieved good results, some shortcomings remain:

(1): Existing works generally input signals directly into fault diagnosis models without preprocessing; this leaves a large amount of irrelevant information in the data and makes the feature information more ambiguous, resulting in poor noise resistance in the trained models.
(2): Existing research typically employs basic CNN models for power grid fault diagnosis, but these models struggle with high-dimensional data classification tasks, and cannot handle multidimensional data effectively, which leads to feature loss and consequently lower diagnosis accuracy.

To tackle the above issues, this work proposes a complex power grid fault diagnosis method based on an improved SWT and ConvNeXt-v2 network (named PGFDSC). PGFDSC first extracts fault features from the fault recording data with an improved SWT method, and outputs the vector signal to enhance the instantaneous frequency. Then, PGFDSC inputs the extracted feature vectors into the improved ConvNeXt-v2 network for power grid faults recognition. In the ConvNeXt-v2 model, the fault information undergoes a layer-by-layer separable convolution operation through the stage convolution modules composed of blocks from the ConvNeXt-v2 model, allowing for the extraction of feature information from the input signals for feature classification. Utilizing frequency domain analysis enables a more comprehensive examination of data features, facilitating the extraction of deep underlying characteristics from the data, thereby enhancing both the accuracy of judgments and the robustness of the model. Furthermore, to enhance the generalization capability of the ConvNeXt-v2 model and reduce overfitting, we improved the loss function of the model, thereby further optimizing the fault diagnosis performance. The main contributions of this work are as follows:

(1): To address the issue of unclear characteristics in fault recording data, we propose an improved SWT method for preprocessing the raw data and extracting frequency domain features, which better resolves the issues related to the high dimensionality of the raw data and the ambiguity of features.
(2): To effectively handle the high-dimensional data, we propose the improved ConvNeXt-v2 for grid fault diagnosis. We introduce residual modules between each stage of the ConvNeXt-v2 model, which is useful for enhancing the model’s ability for feature extraction and enabling it to possess strong capabilities for processing high-dimensional data, thereby improving the accuracy of power grid fault diagnosis.

The remainder of this paper is organized as follows: Section 2 discusses related work, Section 3 describes the proposed PGFDSC method, and Section 4 presents the experimental analysis and performance evaluation. Finally, Section 5 summarizes this work and outlines future research.

2. Related Work

With the development of artificial intelligence and big data technologies, more and more research has begun to apply AI technologies to power grid fault diagnosis, such as fault diagnosis methods based on machine learning algorithms and deep learning models. Researchers in power grid fault diagnosis have developed various types of fault diagnosis systems, including online monitoring systems and offline analysis systems. These systems help power personnel detect and address grid faults promptly and accurately. However, the diagnostic methods of these systems vary greatly, and their effectiveness differs across different types of power grids. Among them, the fault diagnosis method that combines signal processing with CNN has significant advantages in terms of practicality and adaptability.

2.1. Fault Diagnosis Methods Based on Signal Feature Processing

Currently, deep learning has been widely applied in power systems due to its outstanding feature extraction capabilities. For instance, reference [17] combines the safety correction control problem of the power system with deep reinforcement learning, proposing a two-stage training method based on deep deterministic policy gradient to determine safety correction control strategies, and validates the effectiveness of the method using simulations of provincial power grids in China. Reference [18] integrates the attention mechanism with one-dimensional convolutional neural networks for the assessment of voltage stability and power angle stability, directly analyzing measurement data to save data processing time and further enhance model prediction accuracy.

Reference [19] combines deep learning with kernel ridge regression to achieve robust state estimation for auxiliary forecasting in power systems. Despite the presence of non-Gaussian noise in the measured data and the relatively low accuracy and timeliness of the estimation results, satisfactory outcomes are still obtained. Some scholars have established a transient stability assessment model based on a bidirectional Long Short-Term Memory (LSTM) network, taking into account the temporal characteristics of transient process data from power systems. The analysis results of the assessment model, using visualization methods and network prediction scores, indicate that this network model has strong capability in extracting transient process features, which is beneficial for evaluating transient stability. Reference [20] addresses the issue of poor training performance caused by insufficient target domain samples by combining transfer learning with CNNs. Additionally, the dimensionality reduction of time-series data is performed based on Principal Component Analysis (PCA), which enhances operational speed and enables rapid prediction of fault types, even in small sample scenarios.

2.2. Fault Diagnosis Method Based on Convolutional Neural Networks

To address the issues of speed and accuracy in the fault diagnosis of existing DC microgrids, a fault diagnosis method that integrates CNNs with an attention mechanism and Bidirectional Long Short-Term Memory (BiLSTM) networks is proposed. Specifically, the method first employs a CNN to extract the longitudinal detail features of fault data at a specific moment, compressing the data length to reduce the number of training parameters for subsequent networks, thereby enhancing the speed of fault diagnosis. Furthermore, a cascaded network centered on BiLSTM is constructed to achieve fault diagnosis for the fault data.

This model consists of a longitudinal feature extraction module based on two convolutional layers and a horizontal feature extraction module centered on the BiLSTM network. These two modules are cascaded, enabling thorough extraction of data features from both longitudinal time points and horizontal time series perspectives. By incorporating the attention mechanism, the model is better able to focus on the feature variation patterns of the data surrounding the moments of fault occurrence [21]. Compared to more complex deep learning network structures, this model has a simpler fault diagnosis structure and operates faster, allowing for timely and accurate fault diagnosis.

If random matrix eigenvalues are studied without considering the eigenvectors, there will be issues with insufficient mining of data information and the loss of useful information contained in the eigenvectors. Currently, in the field of power systems, research is ongoing to combine random matrix theory with artificial neural networks or deep learning algorithms in engineering case studies. Wei Wenbing et al. have utilized high-dimensional eigenvalues extracted from random matrices to quickly assess transient voltage characteristics in power systems leveraging the autonomous learning capabilities of neural networks [22]. Cheng Yingying et al. constructed clustering levels based on the statistical characteristics of eigenvalues derived from state monitoring data, using clustering results to assess the operational status of energy meters [23]. Example analyses have verified the effectiveness and timeliness of this method in detecting abnormal states, and it has been found that the clustering algorithm exhibits good robustness and strong anti-interference capabilities.

The above studies have achieved certain outcomes in identifying abnormalities in the power grid. The reasonableness of the input feature set constructed from samples within the dataset is a critical factor affecting the accuracy of neural network anomaly judgment models. Therefore, to automate the judgment of fault types in the power grid quickly and accurately, this paper employs the synchrosqueezing wavelet transform (SWT) as a data processing method to construct the input feature set from the collected samples. The ConvNeXt-v2 model is then used to perform convolution operations on the data at the input layer, establishing a mapping relationship to facilitate learning from new input layers, thereby achieving power grid fault diagnosis.

3. Complex Power Grid Fault Diagnosis Method Based on SWT and ConvNeXt-v2

The power grid fault diagnosis method based on SWT and ConvNeXt-v2 mainly consists of data feature extraction and processing, along with a fault judgment model, as illustrated in Figure 1. The process begins with preprocessing the original fault recording data through operations such as cleaning and normalization to remove invalid data from the dataset. Next, the real-time fault waveform data undergoes the synchrosqueezing wavelet transform (SWT). The processed fault data is then input into the fault discrimination model ConvNeXt-v2 for pre-training and formal training. After these steps, the fault waveform is input into the judgment model for classification, and the model outputs the fault type to identify the faulty equipment. Compared to traditional convolutional neural networks, the fault diagnosis structure proposed in this paper has fewer parameters and can achieve timely and accurate fault diagnosis. It improved the accuracy of the four-class fault classification task in the power grid by nine percentage points.

3.1. Feature Extraction of Fault Recording Data Based on SWT

When power grid fault signals are analyzed and processed in the frequency domain using conventional methods, feature information is often lost. To achieve an ideal representation of time–frequency information, energy rearrangement is performed based on the original time–frequency spectrum, making the current components more prominent. As shown in Figure 2, the time–frequency analysis method based on the synchrosqueezing transform compresses and rearranges the time–frequency coefficients, allowing for a high-resolution representation of complex multi-component current signals within the power grid.

Wavelet transform, as an outstanding signal analysis technique, can accurately capture the local detail features of signals, allowing for simultaneous analysis of local characteristics in both time and frequency domains. Unlike linear filters, wavelet transform provides multi-resolution analysis. Particularly, when integrated with the attention mechanism discussed below, it can more intelligently guide the model to focus on the key aspects of the data. This not only leverages the advantages of wavelet transform in multi-scale and multi-resolution feature extraction but also enables a deeper exploration of data characteristics, thereby enhancing the model’s ability to process complex data.

Based on the direction of rearrangement, synchrosqueezing transforms can generally be divided into two types: those that squeeze along the frequency direction and those that squeeze along the time direction. The traditional signal processing method EMD (Empirical Mode Decomposition) and its improved variants lack mathematical theoretical support and suffer from problems such as endpoint effects and mode mixing. To avoid these issues, this article employs the SWT method to achieve feature enhancement.

The SWT algorithm employed in this paper rearranges the time–frequency coefficients of power grid signals through synchrosqueezing operators, shifting the time–frequency distribution at any point in the time–frequency plane to the centroid of energy, thereby enhancing the energy concentration of the instantaneous frequency and effectively addressing the time–frequency ambiguity present in traditional time–frequency analysis methods. From a mathematical perspective, the SWT method improves the concentration of the time–frequency distribution in the scale domain (frequency domain), thereby reducing the distortion of instantaneous frequency curves. The time–frequency coefficients are rearranged only along the frequency axis, without any rearrangement along the time axis. For convenience, the STFT used in SWT is referred to as the Modified Short-Time Fourier Transform (MSTFT), which can be expressed as shown in Equation (1):

$$VM_{x}^{g}(t, \omega) = \int_{-\infty}^{\infty} x(\tau)\, g^{*}(\tau - t)\, e^{-j\omega(\tau - t)}\, d\tau = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(v)\, G^{*}(v - \omega)\, e^{jvt}\, dv \tag{1}$$

where $X(\omega)$ is the frequency domain representation of the signal $x(t)$; $G(\omega)$ is the frequency domain representation of the window function $g(t)$; and $VM_{x}^{g}(t, \omega)$ represents the calculation result of the MSTFT (Modified Short-Time Fourier Transform). The SWT is performed by compressing and rearranging the MSTFT coefficients along the frequency axis based on the IFO (Instantaneous Frequency Operator) estimates, making the computation of the IFO a crucial step in SWT. The first-order expression of the IFO, $\hat{\omega}(t, \omega)$, can be written as:

$$\hat{\omega}(t, \omega) = \frac{1}{2\pi}\, \partial_{t} \arg VM_{x}^{g}(t, \omega) = \Re\left(\frac{1}{2\pi j}\, \frac{\partial_{t} VM_{x}^{g}(t, \omega)}{VM_{x}^{g}(t, \omega)}\right) \tag{2}$$

where $\hat{\omega}(t, \omega)$ is the IFO estimate and $\Re$ denotes the real part of a complex number.

The inverse transform formula corresponding to the MSTFT is:

$$x(t) = \frac{1}{2\pi g^{*}(0)} \int VM_{x}^{g}(t, \omega)\, d\omega \tag{3}$$

where $x(t)$ is the reconstructed time-varying signal; $g$ is the window function, with $g^{*}(0)$ the complex conjugate of $g(t)$ at $t = 0$; and $VM_{x}^{g}(t, \omega)$ denotes the complex-valued representation of the input signal over time $t$ and frequency $\omega$.

From Equation (3), it can be observed that the inverse transformation of MSTFT involves integration only along the frequency direction. Thus, if the MSTFT coefficients are rearranged solely along the frequency direction, it will not affect the reconstruction of the signal. The SWT rearrangement formula can be expressed as:

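$$T_{f}(t, w) = \frac{1}{2\pi g^{*}(0)} \int VM_{x}^{g}(t, \omega)\, \delta\big(w - \hat{\omega}(t, \omega)\big)\, d\omega \tag{4}$$

where $T_{f}(t, w)$ is a bivariate function that represents the response of the fault recording signal to the filter $g$ after rearrangement, $\delta$ is the Dirac delta function, which moves each coefficient to the frequency estimated by the IFO $\hat{\omega}(t, \omega)$ of Equation (2), and $g^{*}(0)$ is the conjugate value of the filter at $t = 0$.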

Therefore, by performing one more frequency integration on the coefficients that have moved and concentrated around the ridge, the original power grid fault signal can be obtained. The SWT reconstruction formula is expressed as:

$$x_{r}(t) = \int T_{f}(t, w)\, dw \tag{5}$$

where $x_{r}(t)$ is the reconstructed signal, obtained by integrating the function $T_{f}(t, w)$ over the frequency $w$.

By combining fault information, the feature information obtained after SWT processing is more manageable for the ConvNeXt-v2 model.
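The steps in Equations (1)–(5) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `synchrosqueeze_stft`, the Gaussian window, the unit hop, the bin count, and the finite-difference approximation of the phase derivative are all assumptions made for illustration, and the discrete normalization `n_freq * g(0)` stands in for the continuous factor $2\pi g^{*}(0)$.

```python
import numpy as np

def synchrosqueeze_stft(x, fs, n_freq=200, win_len=101, sigma=15.0, rel_thresh=1e-8):
    """Illustrative frequency-direction synchrosqueezing (hypothetical helper).

    Returns (T, freqs, x_r): squeezed coefficients, bin centres in Hz, and the
    signal reconstructed by summing T over frequency.
    """
    assert win_len % 2 == 1 and win_len <= n_freq
    x = np.asarray(x, dtype=complex)
    half = win_len // 2
    m = np.arange(-half, half + 1)               # sample offsets (tau - t)
    g = np.exp(-0.5 * (m / sigma) ** 2)          # assumed Gaussian window, g(0) = 1
    freqs = np.arange(n_freq) * fs / n_freq

    # Equation (1): modified STFT, phase referenced to the window centre.
    kernel = g * np.exp(-2j * np.pi * np.outer(freqs, m) / fs)   # (n_freq, win_len)
    frames = np.lib.stride_tricks.sliding_window_view(np.pad(x, half), win_len)
    V = kernel @ frames.T                                        # (n_freq, N)

    # Equation (2): first-order IFO in Hz, with the time derivative of the
    # phase approximated by the phase advance between neighbouring frames.
    ifo = np.angle(V[:, 1:] * np.conj(V[:, :-1])) * fs / (2 * np.pi)
    V = V[:, :-1]

    # Equation (4): move each coefficient to the bin nearest its IFO estimate,
    # along the frequency axis only (negative estimates wrap to the upper bins).
    target = np.rint(ifo * n_freq / fs).astype(int) % n_freq
    cols = np.broadcast_to(np.arange(V.shape[1]), V.shape)
    keep = np.abs(V) > rel_thresh * np.abs(V).max()
    T = np.zeros_like(V)
    np.add.at(T, (target[keep], cols[keep]), V[keep])

    # Equation (5): summing over frequency recovers the signal.
    x_r = T.sum(axis=0) / (n_freq * g[half])
    return T, freqs, x_r
```

For a single complex tone, every coefficient of an interior frame shares the same phase advance, so the rearranged energy collapses into the bin nearest the tone frequency, while the sum over frequency still returns the original samples because the rearrangement never moves coefficients along the time axis.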

3.2. Fault Diagnosis Model for Complex Power Grids Based on Improved ConvNeXt-V2

To enhance the model’s ability to focus on the feature areas of fault recording images, this paper introduces a Simple, Parameter-Free Attention Module (simAM), which improves the local feature extraction capabilities of the model. Subsequently, a Global Context Multi-scale Feature Fusion (GC-MFF) module is incorporated between two adjacent stage modules to merge shallow and deep multi-scale contextual information, thereby obtaining more representative features to enhance the model’s classification ability across various fault scenarios. Furthermore, to address the significant intra-class variation present in fault images, this study employs a hybrid approach combining the commonly used cross-entropy loss function with center loss, optimizing the feature space by minimizing intra-class distances, promoting intra-class compactness, and ultimately improving classification accuracy. The fault diagnosis framework based on ConvNeXt-V2 is illustrated in Figure 3, with the input data comprising fault feature vectors processed through SWT from the fault matrix extracted from COMTRADE files.

This research adopts a self-supervised learning approach using a Fully Convolutional Masked Autoencoder (FCMAE) framework. The training process begins with a preliminary phase utilizing a substantial volume of unlabeled data, aimed at enabling the model to learn generalizable features and knowledge, thus laying a foundation for subsequent fine-tuning or task-specific training. In this study, a large-scale unlabeled dataset is initially collected, and the processed data is cleaned to ensure quality. Utilizing the unlabeled data, model parameters are updated through a backpropagation algorithm. During this phase, the model learns the statistical properties and general knowledge inherent in the data. After the completion of pre-training, the model can be fine-tuned according to the power grid recording discrimination task, employing a smaller labeled dataset to optimize the model and thereby enhance its performance in power grid fault diagnosis tasks. Notably, the framework employs a fully convolutional structure rather than fully connected layers for mask generation and image reconstruction, reducing the number of parameters and computational burden while preserving spatial information. Additionally, a multi-scale masking strategy is implemented instead of using fixed-size masks, which enhances the model’s capability to perceive features at different scales.

This research proposes a robust framework for power grid fault diagnosis by integrating an attention mechanism, a hybrid loss function, and a self-supervised FCMAE approach. These innovations collectively enhance the model’s ability to extract relevant features from fault data while maintaining computational efficiency and improving classification accuracy. The careful combination of methods aims to tackle the challenges posed by intra-class variations, thus promoting effective fault diagnosis across various scenarios.
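The hybrid loss described above, cross-entropy combined with center loss, could be sketched as follows. This is a minimal NumPy illustration under stated assumptions rather than the paper's training code: the function name `hybrid_loss`, the weighting factor `lam`, and the fixed (non-learned) class centers are hypothetical.

```python
import numpy as np

def hybrid_loss(logits, features, labels, centers, lam=0.01):
    """Cross-entropy plus a weighted center loss (illustrative sketch).

    logits:   (batch, n_classes) raw classifier scores
    features: (batch, dim) embeddings fed to the classifier
    labels:   (batch,) integer class indices
    centers:  (n_classes, dim) one center per class (assumed given here)
    lam:      assumed weight of the center-loss term
    """
    # Numerically stable log-softmax for the cross-entropy term.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()

    # Center loss: half the squared distance of each feature to its class center.
    center = 0.5 * np.sum((features - centers[labels]) ** 2, axis=1).mean()
    return ce + lam * center, ce, center
```

The center term is zero when every feature sits exactly on its class center, so minimizing it pulls samples of the same class together, which is the intra-class compactness the text refers to.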

3.2.1. Self-Attention Mechanism Improvement

The large convolution kernels used in the depthwise separable convolution operations can cause local feature loss during feature extraction. To address this issue, and to further enhance the model’s attention to subtle features in power grid fault waveforms while reducing interference from irrelevant information, a SimAM attention layer is added to the model. The fault feature images are then input into the trained ConvNeXt-V2 model for fault type recognition.

Once the fault waveform images enter the model, they first pass through the stem layer for convolution operations. The stem layer is an important convolutional layer in convolutional neural networks (CNNs) used to process two-dimensional data and extract features from the input data through convolution. Next, the images progress to the stage module, which is composed of stacked ConvNeXt-V2 Block modules. The structure is illustrated within the blue dashed box in Figure 3 and features an inverted bottleneck structure that is narrow at both ends and wide in the middle. This design uses a 7 × 7 depthwise separable convolution (DSC) to replace standard convolution operations, thereby reducing the number of parameters.
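The parameter saving from replacing a standard convolution with a depthwise separable one can be checked with simple arithmetic. The sketch below ignores bias terms, and the channel width of 96 is an assumed example value, not a figure taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

channels = 96  # assumed channel width, for illustration only
print(standard_conv_params(channels, channels, 7))        # 451584
print(depthwise_separable_params(channels, channels, 7))  # 13920
```

With these assumed sizes, the separable form needs roughly 3% of the weights of the standard 7 × 7 convolution.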

Subsequently, the input enters the SimAM layer. Unlike traditional one-dimensional channel attention or two-dimensional spatial attention, SimAM focuses on the differences across each feature dimension. It calculates the local self-similarity of the input features $X$ to generate attention weights, which are then fused with $X$ to produce weighted features $\tilde{X}$, enabling the model to pay closer attention to abnormal information in the fault waveforms.

Following the processing through SimAM, the information undergoes down sampling calculations and is subsequently re-input into the stage layer, repeating this process four times. Down sampling can reduce computational complexity: by lowering the resolution of feature maps, the computational burden on subsequent layers is decreased, thereby improving the efficiency of the model. Moreover, down sampling can also extract important features: in image classification tasks, lower resolution helps the model to concentrate on more macroscopic features, avoiding excessive attention to detail noise. This paper utilizes pooling operations to reduce the size of feature maps, scaling down the width and height of the feature maps to half of their original size while preserving the average of the features. The main module architecture is shown in Figure 4, where the convolution layer strides and kernel sizes are indicated by the coefficients s and k, respectively. Black arrows represent the flow of the process, while blue arrows indicate detailed information.
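The pooling-based downsampling described above, which halves the width and height while keeping the average of the features, can be illustrated with a short NumPy sketch. The helper name `avg_pool_2x2` and the `(channels, height, width)` layout are assumptions, and the sketch requires even height and width.

```python
import numpy as np

def avg_pool_2x2(x):
    """2 x 2 average pooling with stride 2 on a (C, H, W) array (H, W even)."""
    c, h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0
    # Split each spatial axis into (blocks, 2) and average inside every block.
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
```

Because every output value is the mean of an equally sized block, the overall mean of each feature map is unchanged by the operation.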

SimAM (Simple, Parameter-Free Attention Module) is a lightweight attention mechanism designed to enhance the performance of models when processing complex data. As illustrated in Figure 4, it reinforces feature representation in a simple yet effective manner, enabling the model to better focus on important information. SimAM emphasizes the differences among the feature dimensions by calculating the local self-similarity of the input features $X$, generating 3D attention weights. These weights are then fused with $X$ to obtain the weighted features $\tilde{X}$ (Equation (7)), allowing the model to pay greater attention to information in faulty regions.

The SimAM module uses the minimum energy function to estimate the importance of individual neurons. The minimum energy of the target neuron $t$ is given in Equation (6):

$$e_{t}^{*} = \frac{4(\tilde{\sigma}^{2} + \beta)}{(t - \tilde{\mu})^{2} + 2\tilde{\sigma}^{2} + 2\beta} \tag{6}$$

where $\beta$ is the regularization term, $t$ is the target neuron of the input feature $X$, and $\tilde{\mu}$ and $\tilde{\sigma}^{2}$ are the mean and variance of all neurons except $t$. This formula indicates that the smaller the neuron energy $e_{t}^{*}$, the more the neuron differs from the surrounding neurons and the more important it is for the vision task. Thus, the importance of the neuron can be obtained as $1/e_{t}^{*}$. Finally, the feature map is optimized based on the importance of each neuron, as shown in Equation (7):

$$\tilde{X} = \mathrm{sigmoid}\left(\frac{1}{E}\right) \odot X \tag{7}$$

where $\tilde{X}$ is the enhanced feature, $X$ is the input feature, $E$ groups all $e_{t}^{*}$ across the channel and spatial dimensions, and $\odot$ denotes the Hadamard (element-wise) product. The sigmoid is added to restrict overly large values in $E$. Because the sigmoid is a monotonic function, it does not change the relative importance of the neurons.
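Equations (6) and (7) can be sketched in NumPy as below. This is an illustrative version under assumptions: the function name `simam` and the default `beta` are hypothetical, and, as in common SimAM implementations, the mean and variance are computed over all neurons of a channel rather than excluding the target neuron. It uses the identity $1/e_{t}^{*} = (t - \tilde{\mu})^{2} / (4(\tilde{\sigma}^{2} + \beta)) + 0.5$, which follows directly from Equation (6).

```python
import numpy as np

def simam(x, beta=1e-4):
    """Parameter-free SimAM weighting for a (C, H, W) feature map (sketch).

    Returns the weighted features and the attention weights.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[1] * x.shape[2] - 1
    # Squared deviation of every neuron from its channel mean.
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2
    var = d.sum(axis=(1, 2), keepdims=True) / n
    # Inverse minimum energy 1 / e_t*, rearranged from Equation (6).
    inv_energy = d / (4.0 * (var + beta)) + 0.5
    # Equation (7): sigmoid of the importance, applied element-wise.
    weights = 1.0 / (1.0 + np.exp(-inv_energy))
    return weights * x, weights
```

A neuron that deviates strongly from the rest of its channel receives the largest weight, which matches the intent of emphasizing abnormal regions of a fault waveform.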

Layer Normalization (LN) is a normalization technique used in deep learning models, primarily aimed at reducing internal covariate shift. Unlike Batch Normalization, which normalizes each feature across the batch, Layer Normalization normalizes across all neurons of a layer for each training sample. It is particularly effective in sequence models, such as RNNs and Transformers. The fundamental idea is to standardize the activation values of a particular layer, which helps to accelerate training and improve the stability of the model.
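The difference between the two normalization axes can be made concrete with a small NumPy sketch. The learnable scale and shift parameters are omitted for brevity, and the function names and `eps` value are assumptions for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each sample over its feature axis; x has shape (batch, features)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def batch_norm(x, eps=1e-6):
    """Normalize each feature over the batch axis (training-mode statistics)."""
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)
```

Layer normalization gives each sample zero mean and unit variance regardless of the other samples in the batch, so its output does not depend on the batch size.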

Gaussian Error Linear Unit (GELU) is an activation function with smooth nonlinear characteristics. Compared to ReLU and other activation functions, it demonstrates superior performance in certain tasks. The core idea behind GELU is to weight each input by the value of the standard Gaussian cumulative distribution function at that input, thereby retaining the outputs of low-value neurons to some extent instead of setting them to zero.
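For reference, the exact form GELU(x) = x Φ(x), where Φ is the standard normal cumulative distribution function, can be written with the error function from the Python standard library. The helper name `gelu` is chosen for this sketch.

```python
import math


def gelu(x):
    """Exact GELU: the input scaled by the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For a negative input such as -1.0, GELU returns a small negative value, whereas ReLU would return exactly zero; this is the sense in which low-value activations are partly retained.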

In the context of power grid fault diagnosis, employing a pre-training approach can effectively improve fault recognition accuracy. The performance of a power grid fault diagnosis system is primarily limited by three factors: the chosen neural network architecture, the training method, and the training data. Improvements in any of these aspects can raise the overall efficiency of the fault discrimination system. The combination of the ConvNeXt-V2 model with the Fully Convolutional Masked Autoencoder (FCMAE) has demonstrated superior performance. This model uses self-supervised learning by combining the fully convolutional masked autoencoder framework with a new Global Response Normalization (GRN) layer, which strengthens inter-channel feature competition and improves the robustness of the model.

Additionally, this paper applies the Global Response Normalization (GRN) technique to the power grid data, in conjunction with the ConvNeXt architecture, to enhance the effectiveness of extracting features from power grid waveform data. This approach effectively addresses the feature collapse issue that may occur when directly training the ConvNeXt-v2 model. The GRN layer can be divided into three steps: global feature aggregation, feature normalization, and feature calibration.

In the global feature aggregation step, we employ the L2 norm to aggregate the feature map of each channel, resulting in an aggregated vector. In the feature normalization step, we apply a standard divisive normalization function to the aggregated vector. In the feature calibration step, we use the normalized vector to calibrate the original feature maps. The computational load of the entire GRN layer is very small, making it easy to integrate into convolutional neural networks, thereby enhancing feature competition and improving model performance. The specific formulas are as follows:

First, we aggregate a spatial feature map

X_{i}

into a vector

gx

using a global function

g (⋅)

:

g (X) := X \in R^{H \times W \times C} \to gx \in R^{C}

(8)

where

H

represents the height of the image,

W

represents the width, and

C

represents the number of channels.

R^{H \times W \times C}

denotes the spatial feature map values, and through this operation, we obtain the vector gx.

Next, we apply the response normalization function

N (\cdot)

to the aggregated value. Specifically, we employ a standard divisive normalization as follows:

N (∥ X_{i} ∥) := ∥ X_{i} ∥ \in R \to \frac{∥ X_{i} ∥}{\sum_{j = 1}^{C} ∥ X_{j} ∥} \in R

(9)

where

∥ X_{i} ∥

is the L2 norm of the i-th channel.

Finally, the original input response

X_{i}

is calibrated with the normalized score, as shown in Equation (10). The calibrated features are then fed into a fully connected layer that converts them into a probability distribution over the fault categories, which yields the diagnosis result. This process enhances the model’s generalization capability and reduces feature entanglement.

X_{i} = X_{i} * N ({g (X)}_{i}) \in R^{H \times W}

(10)
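The three GRN steps in Equations (8) to (10) can be sketched in plain Python as below. The feature map is assumed to be a list of C channels, each an H × W nested list; the name `grn` and the small constant `eps` are assumptions for this example. The learnable scale, bias, and residual connection used in the ConvNeXt V2 reference implementation are left out, so this is an illustration of the formulas rather than a drop-in layer.

```python
import math


def grn(feature_map, eps=1e-6):
    """Global Response Normalization following Equations (8)-(10).

    feature_map: list of C channels, each a list of H rows of W floats.
    """
    # Equation (8): aggregate each channel into one value with the L2 norm.
    gx = [math.sqrt(sum(v * v for row in ch for v in row)) for ch in feature_map]

    # Equation (9): divisive normalization across channels.
    total = sum(gx) + eps  # eps avoids division by zero for an all-zero input
    nx = [g / total for g in gx]

    # Equation (10): calibrate every channel with its normalized score.
    return [[[v * n for v in row] for row in ch] for ch, n in zip(feature_map, nx)]
```

With two channels whose L2 norms are 5 and 10, the scores are roughly 1/3 and 2/3, so the stronger channel is attenuated less; this is the inter-channel competition that GRN is intended to promote.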

3.2.2. Loss Function

In the optimization of the loss function, we adopt the Cross Entropy (CE) loss function, which is commonly used for multi-class classification and effectively guides the model to learn the correct class distributions, as shown in Equation (11):

L_{C E} = - \sum_{i = 1}^{m} p (x_{i}) \ln q (x_{i})

(11)

where

m

represents the batch size,

i

indicates the

i

-th sample in the batch,

x_{i}

denotes the feature map of the sample,

p (x_{i})

is the true distribution probability of the four categories, and

q (x_{i})

is the predicted distribution probability of the four categories.
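A small numerical sketch of Equation (11) is given below, written with the conventional leading minus sign so that the loss is non-negative. The distributions p(x_i) and q(x_i) are passed as lists of per-class probabilities; the function name and the constant `eps` are assumptions for illustration.

```python
import math


def cross_entropy(true_dists, pred_dists, eps=1e-12):
    """Sum over the batch of -p(x_i) * ln q(x_i), accumulated over the classes."""
    loss = 0.0
    for p, q in zip(true_dists, pred_dists):
        # eps guards against log(0) when a predicted probability is exactly zero.
        loss -= sum(p_k * math.log(q_k + eps) for p_k, q_k in zip(p, q))
    return loss
```

For a one-hot target on the first of four classes and a predicted probability of 0.5 for that class, the loss equals -ln 0.5, about 0.693; it approaches zero as the prediction approaches the target.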

However, the fault waveform images exhibit small inter-class differences and large intra-class differences: complex fault samples of different categories can have similar, fuzzy characteristics, while samples of the same category can vary across different distribution network topologies. Under these conditions, the Cross Entropy loss function alone has difficulty optimizing the model to an ideal state. We therefore incorporate a Center Loss function to reduce intra-class distances, thereby improving performance in the fault waveform image classification task. The Center Loss function is given in Equation (12).

L_{C} = \frac{1}{2} \sum_{i = 1}^{n} {‖x_{i} - C_{y_{i}}‖}^{2}

(12)

where

n

denotes the batch size,

x_{i}

refers to the feature map of the i-th sample, and

C_{y_{i}}

represents the feature center corresponding to that sample’s class, matching the dimensionality of

x_{i}

features. This formula aims to minimize the squared distance of each sample’s features from the class center, meaning a smaller intra-class distance is preferable.
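Equation (12) can be illustrated with the short sketch below. Features are flat lists of floats, and the class centers are supplied in a dictionary keyed by class label; how the centers are initialized and updated during training is not described here and is outside the scope of this example. The names are hypothetical.

```python
def center_loss(features, labels, centers):
    """Half the summed squared distance between each feature and its class center."""
    loss = 0.0
    for x, y in zip(features, labels):
        c = centers[y]  # center of the class that sample i belongs to
        loss += sum((x_k - c_k) ** 2 for x_k, c_k in zip(x, c))
    return 0.5 * loss
```

The loss is zero only when every feature coincides with its class center, so minimizing it pulls samples of the same class together.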

The Center Loss function reduces intra-class variance but does not by itself predict the class of an image. To handle the characteristics of fault waveform images, we combine the Cross Entropy loss and the Center Loss function to form the loss function used in this paper, as indicated in Equation (13).

L_{C E C} = L_{C E} + λ L_{C}

(13)

where

L_{C E}

denotes the Cross Entropy loss function,

L_{C}

is the Center Loss function, and λ is a weighting coefficient in the range 0 to 1. Reference [22] analyzes how the category feature distributions change under various values of λ on a handwritten digit recognition dataset (10 categories). When λ is set to 0.001, there is a clear boundary between the feature distributions of different categories, which alleviates the problems of large intra-class variance and small inter-class variance. Larger values of λ can interfere with model optimization and reduce classification performance. Therefore, we set λ to 0.001, allowing the Cross Entropy loss function to dominate.
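The effect of the small weighting coefficient in Equation (13) can be seen in a short worked example. The function name and the sample loss values are illustrative only; the default weight of 0.001 is the value stated in the text.

```python
def combined_loss(ce_loss, center_loss_value, weight=0.001):
    """Equation (13): Cross Entropy loss plus the weighted Center Loss."""
    return ce_loss + weight * center_loss_value


# Illustrative values: a Cross Entropy loss of 0.7 and a Center Loss of 3.0.
total = combined_loss(0.7, 3.0)
center_share = (0.001 * 3.0) / total  # fraction contributed by the Center Loss term
```

With these sample values the Center Loss term contributes well under one percent of the total, which is what allowing the Cross Entropy loss to dominate means in practice.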

4. Experiments and Analysis

This paper utilizes MULTISIM 14.0 and employs a Python 3.9 interface to control DIGSILENT for randomized simulation, generating batches of fault recording waveform data for various fault types. A distribution network node model is constructed to obtain three-phase current and three-phase voltage data. The simulation yields four types of faults, including phase A, B, and C grounding. The simulated data are then combined with real grid fault recording data.

4.1. Fault Data Processing

The fault recording waveforms from substations are generally classified into two types: one type is the waveform generated by protection and automatic devices, while the other is from dedicated fault recorders. Although there are slight differences in the graphic format of the waveforms produced by different devices, the basic format of all fault recording waveforms is the same. The fault recording waveform from dedicated recorders, as shown in Figure 5, mainly consists of five components: ① textual information; ② scale bar section; ③ channel annotation section; ④ time scale section; and ⑤ waveform section.

4.1.1. Characteristics of Fault Recorded Samples

The variation in the fault location of the same circuit cannot be accurately reflected in the waveform amplitude changes. Therefore, relying solely on the amplitude changes of the fault recording signal cannot accurately determine the fault location. To address this, we utilize time–frequency analysis methods to extract time–frequency features, using both amplitude features and time–frequency features as inputs to explore fault characteristics from different perspectives. Different types of faults correspond to different current characteristics, and by analyzing the amplitude variations of the electrical waveforms, we can effectively distinguish between fault types, as detailed in Table 1.

The electrical quantities of some faults are shown in Figure 6. The specific numerical characteristics under different fault conditions are as follows:

(1): Single-phase Ground Fault: The single-phase current increases, while the single-phase voltage decreases; zero-sequence current and zero-sequence voltage appear; the increase of current and decrease of voltage belong to the same group. The phase of the zero-sequence current is in the same direction as the fault phase current, while the zero-sequence voltage is in the opposite direction to the fault phase voltage.
(2): Two-phase Short Circuit Fault: The currents of the two phases increase, while their voltages decrease; because the fault does not involve ground, no zero-sequence current or zero-sequence voltage appears; the increase of current and decrease of voltage belong to two identical groups; the inter-phase fault voltage leads the inter-phase current by approximately 80 degrees.
(3): Two-phase Ground Short Circuit Fault: The phase currents of the two phases increase, while their voltages decrease; zero-sequence current and zero-sequence voltage appear; the increase of current and decrease of voltage belong to two identical groups. The zero-sequence current vector is situated between the currents of the two faulted phases; the inter-phase fault voltage leads the inter-phase current by approximately 80 degrees; the zero-sequence current leads the zero-sequence voltage by about 110 degrees.
(4): Three-phase Short Circuit Fault: The currents of all three phases increase, while their voltages decrease; there is no zero-sequence current or zero-sequence voltage; the fault phase voltage leads the fault phase current by approximately 80 degrees, and the inter-phase fault voltage leads the inter-phase fault current by around 80 degrees.
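The list above uses the presence or absence of zero-sequence quantities as a distinguishing feature. As a hedged illustration, the sketch below computes the zero-sequence component with the standard symmetrical-component definition, one third of the sum of the three phase phasors. This definition is not given in the paper, and the helper names and sample magnitudes are hypothetical.

```python
import cmath
import math


def phasor(magnitude, angle_deg):
    """Build a complex phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def zero_sequence(a, b, c):
    """Zero-sequence component of three phase phasors (standard definition)."""
    return (a + b + c) / 3


# Balanced three-phase currents: the zero-sequence component vanishes.
balanced = zero_sequence(phasor(1, 0), phasor(1, -120), phasor(1, 120))

# Phase A current raised to 5 (illustrative value): a zero-sequence current appears.
unbalanced = zero_sequence(phasor(5, 0), phasor(1, -120), phasor(1, 120))
```

The same function applies to voltages. A symmetric three-phase short circuit keeps the phasor sum at zero, which is consistent with item (4).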

4.1.2. Source Data Processing

In this paper, we construct a dataset that contains a mixture of real fault samples and simulated samples. The real samples are obtained from field data collected by Dongfang Electronics, while the simulated samples are obtained through simulations. The signals processed in this paper are extracted from the COMTRADE files of the grid operation. COMTRADE is the IEEE standard format for the exchange of transient data in power systems. This standard defines a format for transient waveforms and fault data collected from power system models, aiming to provide an easy-to-interpret data exchange format. The COMTRADE format file mainly consists of the following four types of files:

(1): Header File (HDR): The header file is an optional ASCII text file created by the original authors of the COMTRADE data, which can contain any information in any order desired by the creator. The format of the header file is ASCII.
(2): Configuration File (CFG): This is an ASCII text file used to accurately describe the format of the data (.DAT) file, thus it must be saved in a specific format. This file explains the information contained in the data (.DAT) file, including sampling rates, number of channels, frequencies, channel information, etc.
(3): Data File (DAT): The data file contains values for all input channels for each sample in the record. The data file includes a sequence number and a time stamp for each sample. In addition to recording the analog input data, it also records status values, which represent on/off input signals.
(4): Information File (INF): This file contains special information that the creator hopes will be useful to the user, in addition to any other information.
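To make the data file layout concrete, the sketch below parses one line of a data file, assuming the ASCII variant in which each line holds the sample number, the time stamp, the analog channel values, and then the status channel values, separated by commas. The function name is hypothetical, the channel counts are assumed to have been read from the configuration file beforehand, and the values are returned raw, without applying the conversion factors stored in the configuration file.

```python
def parse_dat_line(line, n_analog, n_status):
    """Split one ASCII data-file line into sample number, time stamp, and channels.

    n_analog and n_status are the channel counts taken from the CFG file.
    """
    fields = [f.strip() for f in line.strip().split(",")]
    expected = 2 + n_analog + n_status
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")

    return {
        "sample": int(fields[0]),
        # The time stamp field may be left empty in some records.
        "timestamp": int(fields[1]) if fields[1] else None,
        "analog": [float(f) for f in fields[2:2 + n_analog]],
        "status": [int(f) for f in fields[2 + n_analog:]],
    }
```

For example, `parse_dat_line("5, 667, -760, 1274, 72, 0, 1", 3, 2)` yields sample 5 at time stamp 667 with three analog values and two status values.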

4.2. Evaluation Criteria

To evaluate the performance of the fault diagnosis model, this paper adopts accuracy, precision, recall, and F1 score as evaluation metrics, all of which are standard for image classification tasks. For the multi-class classification of fault waveform images, macro-averaging is used to assess the overall performance of the model: the metric values are computed for each category separately, and their arithmetic mean gives the average precision, average recall, and average F1 score. Accuracy (Acc) is the proportion of correctly classified samples among all samples. Precision (P), also known as positive predictive value, is the ratio of true positive samples to all samples predicted as positive, as shown in Equation (14). Recall (R), also known as sensitivity, is the ratio of true positive samples to all actual positive samples, as shown in Equation (15); a larger value indicates a lower miss rate. The F1 score is the harmonic mean of precision and recall, with a higher value indicating better performance, as shown in Equation (16):

precision = \frac{TP}{TP + FP}

(14)

recall = \frac{TP}{TP + FN}

(15)

F_{1} = {(\frac{{precision}^{- 1} + {recall}^{- 1}}{2})}^{- 1} = 2 \cdot \frac{precision \cdot recall}{precision + recall}

(16)

where:

TP—True Positive;

FN—False Negative;

FP—False Positive;

TN—True Negative.

For a binary classification task,

X

and

\bar{X}

represent the two classes, in both the actual and the predicted values. The relationships among TP, FN, FP, and TN are shown in Table 2.
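The macro-averaged metrics described in this section can be computed as in the following sketch. Recall uses the false negatives in its denominator, in line with its definition as the ratio of true positives to all actual positives. The function name and the returned dictionary layout are choices made for this example.

```python
def macro_metrics(y_true, y_pred, classes):
    """Accuracy plus macro-averaged precision, recall, and F1 over the given classes."""
    precisions, recalls, f1s = [], [], []
    for c in classes:
        # Treat class c as positive and every other class as negative.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Note that the macro F1 score is the mean of the per-class F1 scores, which in general differs from the harmonic mean of the macro precision and macro recall.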

4.3. Comparative Test

This paper conducts experiments using LSTM, CNN, ResNet, Inception-ResNet, and ConvNeXt-v2 models for the classification of various types of faults, including binary classification of simple and complex faults, binary classification of instantaneous vs. permanent faults, three-class classification of two-phase short circuit faults, three-class classification of two-phase ground short circuit faults, and three-class classification of single-phase short circuit faults. For each network, the size of the hidden layers is approximately the same, and the training is carried out over 50 epochs. Evaluation metrics such as accuracy, precision, recall, and F1 score are used for a comprehensive assessment, with the test set evaluation results presented in Tables 3 and 4.

Among the models, those based on convolutional neural networks (CNNs) exhibit significant advantages in terms of F1 score and training time over the 50 epochs. However, within the convolutional neural network family, both the CNN and ResNet models yield only moderate F1 scores during testing. This is attributed to the limited depth and richness of the features extracted during the training process.

In contrast, the deeper and wider architecture of ConvNeXt-v2 achieves a high level of accuracy. The optimized ConvNeXt-v2 model demonstrates a clear advantage in training time. This improvement is due to the model’s Block layer, which effectively integrates deep features across channels while compressing the feature space, thereby enhancing computational efficiency. As a result, ConvNeXt-v2 can shorten computation time and deliver superior performance under the same number of training epochs and computational scale.

4.3.1. Experimental Parameter Setting and Analysis: Comparative Test

To verify the impact of the number of stages in the ConvNeXt-v2 model and the input channel count on the accuracy of the fault diagnosis model, experiments were conducted while keeping other conditions constant. Using the complexity of faults as a criterion, models with stage counts of 4, 6, 8, and 10, and channel counts of 1, 8, 16, and 32 were tested. The F1 scores from these models were compared, with test set evaluation results illustrated in Figure 7.

From the analysis of the number of stages in the ConvNeXt-v2 model, it can generally be observed that the F1 score increases gradually with the number of stages. Notably, the upward trend is quite pronounced between 4 and 8 stages, while the increase from 8 to 10 stages becomes less significant. This indicates that beyond 8 stages, the impact of adding more layers on the model’s performance begins to diminish.

In terms of channel count analysis, the overall influence of a smaller compression rate on the F1 score is less pronounced. Specifically, when the compression rates are set to 1, 8, and 16, there is no significant downward trend in the F1 scores. However, when the compression rate reaches 32, the F1 score shows a marked decrease. This suggests that as the compression rate increases beyond a certain threshold, it negatively impacts the model’s accuracy, highlighting the importance of maintaining an optimal channel configuration to achieve better performance in fault diagnosis tasks.

4.3.2. Comparative Experiments

This experiment primarily introduces comparative studies involving the improved ConvNeXt-v2 model and other classification models. Based on this approach, binary classification models for fault presence or absence (as shown in Table 3), a four-class classification model for four types of short circuit faults, three-class phase classification models for two-phase short circuit faults, two-phase grounded short circuit faults, and single-phase short circuit faults were constructed (as shown in Table 4). Through simulation experiments, the impact of experimental parameters on the diagnostic models was analyzed. Various evaluation metrics, such as accuracy, precision, recall, and F1-score, were used to conduct a comprehensive analysis of the experimental performance of different networks.

4.4. Results Analysis

The learning rate of the deep residual network was set to 0.0075. To determine the optimal number of epochs, the loss and accuracy for each iteration were recorded during training, and the parameters were optimized using Adam. The fault recognition results are presented in Table 3. When the number of epochs exceeds 20, the fault localization and recognition accuracy of the model approach 100%. Additionally, the loss value gradually decreases as the number of epochs increases, confirming the model’s feasibility.

To verify the performance of the ConvNeXt-v2 diagnostic model, comparative tests were conducted using the Deep Residual Network (DRN), DRSN, and Fusion Deep Residual Network (FDRN). A total of 1200 test samples were input into the aforementioned models, and the comparative results are shown in Table 3. It is evident from Table 3 that when the output power is relatively stable, all six models perform well in fault diagnosis.

5. Conclusions

To effectively address the issue of fault diagnosis in complex power grids under varying operating conditions, this paper proposes a fault diagnosis model that combines SWT with ConvNeXt-V2. This method combines feature extraction with convolutional neural networks through data processing, thus providing a more effective approach to diagnosing complex power grid faults. A wavelet transform feature extraction method was put forward, which effectively reduces the impact of noise or redundant information in the network. Furthermore, improvements were made to the original convolutional model by incorporating the SimAM module and optimizing the loss function, addressing the decline in the generalization ability of the original model and enhancing its stability. Experimental results indicate that this model demonstrates high accuracy, strong adaptability, and good applicability in fault recognition within distribution networks. This model integrates fault features from different perspectives, achieving advantages in recognizing fault waveform signal amplitude variation features and time–frequency characteristics. The integrated features provide a more comprehensive representation of different fault status information. Compared to Inception-ResNet and ResNet models, the fault accuracy rate improved by an average of 1.45% and 3.45%, and the fault recognition accuracy improved by an average of 3.47% and 5.50%, respectively. Compared to CNN and other traditional convolutional models, the improvement of the attention mechanism not only enhances the model’s capability to extract important features but also enables the entire model to filter out noise and redundant features. Even when the fault waveform signal is subjected to 5% noise interference, the fault localization and recognition accuracy remain above 96.75%, indicating that the model achieves high precision in fault localization.
The model exhibits significant improvements in fault recognition accuracy under different types of interference noise. Even in situations where DG power fluctuates, the diagnostic accuracy remains no lower than 97.17%, demonstrating strong adaptability and robustness in fault localization and recognition. The model is capable of effectively adapting to fault record images and can distinguish among four types of faults, thus aiding professionals in fault type diagnosis. However, there are still some shortcomings; relying solely on the samples from this dataset does not adequately address the diversity of faults encountered in actual power grid operations. In the future, we will collect and organize more real-world sample sets to increase sample diversity, thereby enhancing the model’s stability and generalization capability.

Author Contributions

Conceptualization, Z.Z. and Z.L.; methodology, Z.Z. and Z.L.; software, Z.L. and Z.Z.; validation, Z.Z., Z.L., and G.H.; formal analysis, Z.Z., Z.L., and G.H.; investigation, G.H. and F.W.; resources, F.W. and J.L.; data curation, F.W. and Z.L.; writing—original draft preparation, Z.Z., F.W., and Z.L.; writing—review and editing, Z.L., P.W., and Z.Z.; visualization, P.W.; supervision, Z.L. and Z.Z.; project administration, Z.L.; funding acquisition, F.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant nos. 62273290, 61872126).

Data Availability Statement

The data that support the research results can be obtained by contacting the corresponding author.

Conflicts of Interest

Authors Guangyu Huang, Fei Wang and Peng Wang were employed by the company Dongfang Electronics Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Bhattacharya, B.; Sinha, A. Intelligent fault analysis in electrical power grids. In Proceedings of the IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Boston, MA, USA, 6–8 November 2017; pp. 1–6.
2. Hu, T.; Wang, J.; Pan, Y. Analysis of the Cooperation Mode between Distribution Automation and Relay Protection in Power Grid Fault Handling of Electric Power Systems. China Pulp Pap. Ind. 2023, 56–58.
3. Zhang, Y. Research on Improving the Transfer Ability of Knowledge Model for Power Grid Fault Analysis. Master’s Thesis, North China Electric Power University, Beijing, China, 2023.
4. Wagner, W.P. Trends in expert system development: A longitudinal content analysis of over thirty years of expert system case studies. Expert Syst. Appl. 2017, 76, 85–96.
5. Zhang, X.; Yue, S.; Zha, X. Method of power grid fault diagnosis using intuitionistic fuzzy Petri nets. IET Gener. Transm. Distrib. 2018, 12, 295–302.
6. Zhu, Y.; Huo, L.; Lu, J. Bayesian Networks-Based Approach for Power Systems Fault Diagnosis. IEEE Trans. Power Deliv. 2006, 21, 634–639.
7. Xiao, F. A Novel Evidence Theory and Fuzzy Preference Approach-Based Multi-Sensor Data Fusion Technique for Fault Diagnosis. Sensors 2017, 17, 2504.
8. Wu, J.; Li, Y.; Liu, L.; Luo, Y.; Dai, W. A Method for Judging Power Grid Fault Types Based on SCADA Alarm Data. Microcomput. Appl. 2024, 40, 319–340.
9. Huang, S.; Zhou, Y. Power System Fault Detection in Noisy Environment Based on PMU Measurement Data. Electrotech. Appl. 2023, 8, 38–44.
10. Li, W.; Fu, X.; Feng, B.; Zhou, Y.; Wei, X. Research on Substation Fault Analysis System Based on Fault Recording Data. Electr. Equip. Econ. 2024, 4, 148–153.
11. Liu, X.; Wang, D.; Zhang, C.; Jiang, X.; Ning, Y. Research on Power Grid Fault Diagnosis Method Based on Genetic Wavelet Neural Network. J. Petrochem. Univ. 2019, 26, 78–82.
12. Jiao, D.; Fan, X.; Wang, H. Particle Swarm Optimization Algorithm for Active Distribution Network Fault Reconfiguration. J. Chang. Univ. Technol. 2020, 6, 587–590.
13. Wu, J.; Gao, P.; Tan, H. Power Grid Fault Diagnosis Based on Improved Differential Evolution Clustering Algorithm and Electrical Quantities. Electr. Eng. 2022, 8, 176–178.
14. Li, P.; Cheng, B.; Jie, L.; Li, L.; Li, X.; Ding, K. A Power Grid Fault Identification Method Based on Convolutional Neural Network. Electron. Test 2023, 6, 57–61.
15. Meng, H.; Zhang, J.; Cai, Z.; Li, C. Research on DC Micro-grid Fault Diagnosis Based on CNN-BiLSTM-Attention. Proc. CSEE 2023. Available online: https://link.cnki.net/urlid/11.2107.TM.20231213.1111.002 (accessed on 15 January 2025).
16. Zhang, Q.; Ma, W.; Li, G.; Ding, J.; Xie, M. Power grid fault diagnosis based on variable-mode decomposition and convolutional neural network. Electr. Power Syst. Res. 2022, 208, 107871.
17. Peng, Y. Research on Power System Fault Diagnosis and Location Methods Based on Deep Learning. Electr. Eng. 2024, S1, 469–471.
18. Zheng, Y. Power Grid Fault Diagnosis Based on Multi-Head Self-Attention Mechanism. Master’s Thesis, North China Electric Power University, Beijing, China, 2022.
19. Wang, L. Research on Power Grid Fault Diagnosis Based on Deep Learning Algorithms. Master’s Thesis, Taiyuan University of Technology, Taiyuan, China, 2023.
20. Ding, J.; Shao, Q.; Qi, Z.; Xie, M.; Gao, B.; Yu, Y. Power Grid Fault Diagnosis of Convolutional Neural Network Based on Transfer Learning. Sci. Technol. Eng. 2022, 14, 22.
21. Zhang, Z.; Du, M.; Wang, Z.; Zhang, X. Power Grid Fault Diagnosis Method Based on Inception Network. J. Phys. Conf. Ser. 2023, 2527, 012052.
22. Yao, X.; Xing, L.; Xin, P. Microgrid Fault Diagnosis and Classification Method Based on Wavelet Feature Extraction and Deep Learning. J. Phys. Smart Power 2023, 2021, 49.
23. Gao, Y.W.; Su, X.N.; Zhang, H.; Jiang, S.Y.; Gao, H.J. A Fault Diagnosis Method for Highly Fault-Tolerant Distribution Networks Based on Data Verification and Graph Convolutional Neural Networks. Electr. Eng. New Technol. 2024, 43, 95–104.

Figure 1. Schematic diagram of electrical quantity data transmission in the SWT-ConvNeXt model.

Figure 2. SWT signal decomposition process.

Figure 3. Schematic diagram of the ConvNeXt-V2 structure.

Figure 4. Schematic diagram of feature fusion under SimAM.

Figure 5. Schematic diagram of a fault recording waveform.

Figure 6. Some fault characteristics are shown, where (a) represents the voltage signal of the three-phase short circuit fault recording, (b) represents the current signal of the three-phase short circuit fault recording, (c) represents the voltage signal of the two-phase short circuit fault recording, and (d) represents the current signal of the two-phase short circuit fault recording.

Figure 7. Model accuracy comparison with varying compressibility coefficients.

Table 1. Characteristics of different types of faults.

Fault Type	$I_{a}$	$I_{b}$	$I_{c}$	$U_{a}$	$U_{b}$	$U_{c}$
Single-phase ground fault (A-phase)	√
Single-phase ground fault (B-phase)		√
Single-phase ground fault (C-phase)			√
Short-circuit between AB-phases		√	√
Short-circuit between BC-phases			√
Short-circuit between CA-phases	√
Short-circuit between AB-phases and ground	√	√		Decrease	Decrease	Increase
Short-circuit between BC-phases and ground	√	√		Increase	Decrease	Decrease
Short-circuit between CA-phases and ground	√	√		Decrease	Increase	Decrease
Three-phase short-circuit	√	√	√

Table 2. Relationship between TP, FN, FP, and TN.

		Predicted value
		$X$	$\bar{X}$
Actual	$X$	$T P$	$F N$
	$\bar{X}$	$F P$	$T N$

Table 3. Binary classification (fault presence or absence).

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 Score (%)	Training Duration (h)
LSTM	92.5	92.5	92.5	92.5	8.64
CNN	94.3	92	92.1	92	5.51
ResNet	96.8	96.8	96.8	96.8	5.23
Inception-ResNet	97.4	97.4	97.4	97.4	6.95
ConvNext (sign)	94.2	93	90.2	93	5.08
ConvNext-v2	99.3	99.3	99.3	99.3	5.12

Table 4. Four-class classification.

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 Score (%)	Training Duration (h)
LSTM	82.0	74.6	76.4	75.2	15.17
CNN	90.2	85.9	90.2	87.0	12.56
ResNet	90.8	92.7	91.6	11.49	11.49
Inception-ResNet	97.0	95.4	96.4	95.9	13.6
ConvNext (sign)	92.1	91.5	90.1	90.4	10.5
ConvNext-v2	99.2	99.5	99.7	99.1	11.51


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
