1. Introduction
Three-phase permanent-magnet synchronous motor (PMSM) drive systems are widely used in manufacturing, electric vehicles, the metallurgical industry, and other fields due to their high reliability and efficiency [1, 2]. PMSM faults degrade device performance and affect the safe operation of equipment. In severe cases, faults can even cause serious accidents and casualties. Therefore, timely and accurate fault diagnosis of such machinery is attracting the attention of scholars as a way to reduce unexpected downtime, economic losses, and personal injuries [3]. The fault diagnosis model is typically established through data feature extraction technology [4, 5, 6]. Fault diagnosis is then completed using machine learning techniques [7] or deep learning methods.
As is well known, the fault diagnosis model is established through data feature extraction technology [8]. Fault signals of many mechanical systems are typically non-stationary time series because the fault is caused by the accumulation of degradation over a long period of time. They are also long-term-dependent in the temporal domain and interrelated in the spatial dimension. In reference [9], a deep learning-based observer combining a CNN and an LSTM was employed for fault detection in a nonlinear driving control system. Yixuan Mao et al. [10] proposed a novel hybrid approach based on an LSTM neural network and a support vector machine (LSTM-SVM), which performed well in mooring failure detection. In reference [11], a dual deep learning reference classifier utilizing CNN and LSTM was designed specifically for the classification of synchronous motor electrical faults. Haitao Zhao et al. [12] used the LSTM neural network to directly classify raw processed data, which provided good fault diagnosis performance. In reference [13], an LSTM-regulated deep residual network was proposed for data-driven fault detection and achieved good detection accuracy. Hence, the fault diagnosis problem can be converted into a time series identification problem. LSTM has received considerable scholarly attention since it was proposed, owing to its strong recognition and identification capabilities for time series. Reference [14] introduced a data-driven fault diagnosis method that leveraged long short-term memory (LSTM) networks to detect multiple open-circuit switch faults in the back-to-back converter of a doubly fed induction generator-based wind turbine system. Ping Zou et al. [15] used LSTM to adaptively fuse IMF component information and extract features from rotating electrical machines in order to intelligently classify and recognize bearing status. Admittedly, these existing methods achieve high accuracy, but their results depend on large amounts of balanced data. In practice, because of the high-reliability design of the PMSM drive system, most of the samples are normal and only a few are faulty. That is, the fault samples follow a long-tailed distribution and are both scarce and imbalanced.
With the rapid development of data-driven artificial intelligence, generative adversarial networks (GANs) were proposed in 2014 [16]. GANs have been widely used in various fields because of their powerful ability to learn the original sample distribution and generate data with a similar distribution [17]. Hence, many scholars have employed GANs to solve the problem of imbalanced and small samples. Reference [18] proposed a fault diagnosis method based on LSTM and GAN for wind turbines. In that method, the generator of the GAN addresses the shortage of labeled data, and Bayesian optimization improves the prediction accuracy of the LSTM. To meet the numerous requirements of intermittent fault diagnosis and degradation assessment, an LSTM-GAN-based method is presented in reference [19]. Although these methods have obtained good fault diagnosis results across diverse systems, the authors of this paper found that the LSTM-GAN-based method performs suboptimally in fault diagnosis for PMSM, with an accuracy of only about 94%. The main issue addressed in this paper is therefore how to improve the accuracy of fault diagnosis.
It is well known that the performance of fault diagnosis depends on both feature extraction and pattern recognition [20, 21, 22]. References [23, 24, 25, 26] show that combining multiple intelligent feature extraction methods over larger feature sets can improve the accuracy of the classification model. Consequently, effective feature extraction may contribute to GAN-LSTM-based fault diagnosis. In reference [27], the autoencoder (AE) is used to perform critical temporal feature extraction and dimension reduction, which improves the fault diagnosis performance of the LSTM-GAN-based method. However, the AE can only approximately copy inputs similar to the training data; that is, it learns a compressed representation of the data and is prone to overfitting. To solve this problem, the denoising autoencoder (DAE) [28] was proposed to enhance the robustness of the trained encoder by adding noise to the input data. On this basis, the stacked denoising autoencoder (SDAE) [29] was presented to obtain a better data representation with a deep neural network. Recently, DAE and SDAE have been combined with other deep learning methods for diagnosis and have made good progress in applications [30, 31]. Unfortunately, fault diagnosis designs that combine GAN-LSTM with SDAE have not yet been widely applied in practice.
Considering the above shortcomings and advantages, this paper proposes a fault diagnosis method based on SDAE-GAN-LSTM. Among the three neural network frameworks combined in this paper, SDAE is used for its strength in extracting deep features from nonlinear data, while LSTM is used for its ability to handle dependencies in time series information, enabling effective classification and recognition. Furthermore, the proposed method introduces a new generator that generates higher-quality samples for fault identification. The training of the generator is optimized using both the deep fault features extracted by SDAE and the fault diagnosis error of LSTM. As a result, the fault diagnosis accuracy for the three-phase PMSM inverter is improved.
The main innovations of this paper are as follows:
- We designed a new generator to generate fault features rather than fault data. First, stacked denoising autoencoders are pre-trained to obtain optimal parameters. Then, the optimized parameters are migrated to the generator layer of the GAN, where the generated fault features are decoded to obtain fake samples.
- We designed a new discriminator. The LSTM fault diagnosis model is added to the real/fake discriminator of the traditional GAN, which overcomes the inability of the traditional GAN to filter out low-quality samples. The new discriminator must identify the authenticity of a sample while also considering the fault diagnosis result for the generated sample.
- The model demonstrates enhanced capabilities in both sample generation and fault diagnosis. Improvements are achieved in the sample generation ability of the generator, the discriminative performance of the discriminator, and the fault diagnosis capability of the LSTM.
The rest of this article is arranged as follows.
Section 2 introduces the basic theory of SDAE, GAN, and LSTM;
Section 3 elaborates on the imbalanced sample data fault diagnosis algorithm based on SDAE-GAN-LSTM;
Section 4 illustrates the simulation results;
Section 5 presents the conclusion.
3. Fault Diagnosis Algorithm Based on SDAE-GAN-LSTM
Due to the high reliability and stringent design principles of PMSM, motor drive systems operate under normal conditions for most of their service life. As a result, the fault signals follow a long-tailed distribution, and the imbalanced sample distribution poses a challenge to data-driven intelligent fault diagnosis technology. In this paper, the advantages of three neural network frameworks are combined to address this issue. First, the deep feature extraction capability of SDAE is used to extract nonlinear features from the data. Second, GAN is used to tackle the problem of imbalanced samples. Finally, LSTM networks are employed to process time series data and perform classification and recognition. The basic principles of the model and the diagnostic process are described in this section.
3.1. Design of Generator
In this paper, a novel generator is designed to generate features for imbalanced sample datasets. The generator network is optimized using real features so that it can generate the features of the imbalanced samples. The generated features are then decoded to generate samples. To initiate this process, a random noise distribution, denoted as z, is used as the input of the generator to obtain the generated features. This process is expressed by Equations (13) and (14), whose parameters are the weight matrices and biases of the generator input layer and hidden layer, respectively, together with two activation functions.
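The mapping from noise to generated features described above can be sketched as a two-layer forward pass. This is a minimal illustration rather than the paper's implementation: the layer sizes, the ReLU and tanh activations, the random initialisation, and all function names are assumptions chosen for the example.

```python
import math
import random


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def relu(v):
    return max(0.0, v)


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def generate_features(z, params):
    """Map a noise vector z to a generated feature vector through two layers."""
    (w1, b1), (w2, b2) = params
    hidden = dense(z, w1, b1, relu)          # input layer -> hidden layer
    return dense(hidden, w2, b2, math.tanh)  # hidden layer -> feature vector


rng = random.Random(0)
noise_dim, hidden_dim, feature_dim = 4, 8, 3   # assumed sizes
params = [init_layer(hidden_dim, noise_dim, rng),
          init_layer(feature_dim, hidden_dim, rng)]
z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
fake_features = generate_features(z, params)
```

Because the generator outputs features rather than raw samples, its output dimension in this sketch matches the size of the deep feature vector, not the length of the signal.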
To ensure that the generated samples meet the requirements of fault diagnosis, this paper uses SDAE to decode the generated features. The SDAE is trained in this study using the imbalanced samples. The detailed training process is outlined as follows:
- Step 1:
Noise is added to the original signal to obtain the noisy input, which makes the decoder network robust.
- Step 2:
According to Equation (15), the noisy input is encoded to obtain the first-layer feature, using the weight matrix and bias of the first layer.
- Step 3:
The first-layer feature is then used as the input of the second layer. Step 2 and Step 3 are repeated until the N-th layer is reached. According to Equation (16), the resulting feature is decoded to obtain the output.
Therefore, the process of decoding the generated features into the generated samples can be represented by Equation (17), which uses the weight matrix and bias of the decoder layer.
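The three steps above can be sketched as a forward pass through a small stacked denoising autoencoder. The sketch only shows the data flow (corrupt, encode layer by layer, decode); the weight updates that minimise the reconstruction error are omitted. The layer sizes, sigmoid activation, Gaussian noise level, and helper names are assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(x, weights, bias, activation):
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    return ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)


def corrupt(x, noise_std, rng):
    """Step 1: add Gaussian noise to the original signal."""
    return [xi + rng.gauss(0.0, noise_std) for xi in x]


def encode(x, encoder_layers):
    """Steps 2-3: the feature of each layer is the input of the next layer."""
    feature = x
    for weights, bias in encoder_layers:
        feature = dense(feature, weights, bias, sigmoid)
    return feature


def decode(feature, decoder_layers):
    """Map a deep feature back to sample space."""
    out = feature
    for weights, bias in decoder_layers:
        out = dense(out, weights, bias, sigmoid)
    return out


rng = random.Random(1)
sizes = [6, 4, 2]   # assumed: 6-point signal, features of size 4 and then 2
encoder_layers = [init_layer(sizes[i + 1], sizes[i], rng)
                  for i in range(len(sizes) - 1)]
decoder_layers = [init_layer(sizes[i - 1], sizes[i], rng)
                  for i in range(len(sizes) - 1, 0, -1)]

signal = [0.1, 0.4, 0.9, 0.4, 0.1, 0.0]
noisy = corrupt(signal, 0.05, rng)
deep_feature = encode(noisy, encoder_layers)
reconstruction = decode(deep_feature, decoder_layers)
# Training would adjust the weights to reduce this error against the clean signal.
reconstruction_error = sum((r - s) ** 2
                           for r, s in zip(reconstruction, signal)) / len(signal)
```

Once trained, only `decode` is needed on the generator side: it turns a generated feature vector into a generated sample.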
3.2. Design of Discriminator
In this section, the improved discriminator is described. The proposed discriminator adds a fault diagnosis layer composed of an LSTM network, while the traditional discriminator is retained unaltered. The specific designs of both discriminators are as follows.
3.2.1. Preserve the Traditional Discriminator, Which Can Discern Authenticity
The purpose of the traditional discriminator is to discern authenticity. In this paper, the traditional discriminator is composed of a three-layer BP neural network. Since the output layer of the traditional discriminator has only one neuron, labels must be set for the real and generated samples. The proposed discriminator is therefore a supervised model. The labels used are given by Equation (18).
Then, the parameters of the discriminator network are updated by backpropagation to improve its ability to discern authenticity. The cross-entropy loss function is shown in Equation (19), where k is the number of samples and the two discriminator outputs represent the traditional discriminant results for the real and generated samples, respectively.
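A cross-entropy loss of this kind can be sketched as follows. The label convention (1 for real samples, 0 for generated samples) and the averaging over all scored samples are assumptions for the example and may differ in detail from Equations (18) and (19); `discriminator_loss` is a hypothetical name.

```python
import math


def discriminator_loss(real_scores, fake_scores, eps=1e-12):
    """Binary cross-entropy with label 1 for real and 0 for generated samples.

    Each score is the single-neuron discriminator output in (0, 1).
    """
    real_term = sum(-math.log(max(s, eps)) for s in real_scores)
    fake_term = sum(-math.log(max(1.0 - s, eps)) for s in fake_scores)
    return (real_term + fake_term) / (len(real_scores) + len(fake_scores))


# A discriminator that separates the two classes well has a low loss ...
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
# ... while one that outputs 0.5 everywhere has a loss of ln 2.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The second case is the value the discriminator loss approaches when it can no longer tell generated samples from real ones.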
3.2.2. Add a Fault Diagnosis Discriminator Based on LSTM
The LSTM is pre-trained with the given real dataset, which contains the fault samples of each fault type together with the normal, fault-free samples; the two groups differ greatly in sample size.
According to the basic principle of LSTM in Section 2, the neural network based on LSTM can be constructed by Equation (20), whose four parameter sets stand for the parameters of the forget gate, input gate, input candidate information, and output gate, respectively. The loss function is the same as Equation (12). We optimize the Softmax classifier by minimizing the cost function of Equation (21), where t is the number of states.
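For orientation, one time step of a standard LSTM cell followed by a Softmax classifier can be sketched as below. The gate equations are the textbook LSTM form; the sizes, the constant weights, the toy input sequence, and the helper names are assumptions, and the classifier here is evaluated on the last hidden state only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step; p maps a gate name to its (weights, bias)."""
    xh = h_prev + x   # concatenate previous hidden state and current input
    f = [sigmoid(v) for v in affine(*p["forget"], xh)]        # forget gate
    i = [sigmoid(v) for v in affine(*p["input"], xh)]         # input gate
    g = [math.tanh(v) for v in affine(*p["candidate"], xh)]   # candidate info
    o = [sigmoid(v) for v in affine(*p["output"], xh)]        # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    return -math.log(probs[label])


hidden_size, input_size, n_classes = 2, 1, 3   # assumed sizes
gate = ([[0.1] * (hidden_size + input_size)] * hidden_size, [0.0] * hidden_size)
params = {name: gate for name in ("forget", "input", "candidate", "output")}

h, c = [0.0] * hidden_size, [0.0] * hidden_size
for sample in [0.2, 0.5, -0.1]:                # toy time series
    h, c = lstm_step([sample], h, c, params)

class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
probs = softmax(affine(class_weights, [0.0] * n_classes, h))
loss = cross_entropy(probs, 0)                 # label 0 assumed as the true class
```

The per-sample cross-entropy returned here is the quantity that, in the proposed method, is also fed back to the generator as the fault diagnosis error.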
3.3. Loss Function Description
For a traditional GAN, it is only necessary to optimize the generator by minimizing the cross-entropy loss function shown in Equation (22), so that the generated samples become consistent with the original sample distribution.
To generate more qualified fault samples for fault diagnosis classification, this paper optimizes the training of the generator based on the fault diagnosis results of LSTM and the reconstruction error between the real and generated features. Therefore, the parameters of the generator are optimized by minimizing the new loss function defined in Equation (23), which ensures that the generated samples are consistent with the original sample distribution and meet the requirements of fault diagnosis. In Equation (23), k is the number of samples and J is the number of deep features extracted from SDAE. The SDAE-GAN-LSTM network structure is shown in Figure 4.
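One way to combine the three ingredients named above (the adversarial term, the LSTM fault diagnosis error, and the feature reconstruction error) is sketched below. The exact form and weighting in Equation (23) may differ: the non-saturating adversarial term, the unit weights, the mean-squared reconstruction error over J features, and the name `generator_loss` are all assumptions for illustration.

```python
import math


def generator_loss(fake_scores, diag_probs, fake_labels,
                   real_features, fake_features,
                   diag_weight=1.0, recon_weight=1.0, eps=1e-12):
    """Adversarial + diagnosis + reconstruction terms, averaged over k samples."""
    k = len(fake_scores)
    # Adversarial term: generated samples should be scored as real.
    adversarial = sum(-math.log(max(s, eps)) for s in fake_scores) / k
    # Diagnosis term: the LSTM should assign each generated sample its fault class.
    diagnosis = sum(-math.log(max(p[y], eps))
                    for p, y in zip(diag_probs, fake_labels)) / k
    # Reconstruction term: mean squared error over the J deep features.
    reconstruction = sum(
        sum((r - f) ** 2 for r, f in zip(real, fake)) / len(real)
        for real, fake in zip(real_features, fake_features)
    ) / k
    return adversarial + diag_weight * diagnosis + recon_weight * reconstruction


# A generated sample that fools the discriminator, is classified correctly,
# and matches the real features gives a small loss.
good = generator_loss([0.9], [[0.8, 0.1, 0.1]], [0], [[1.0, 2.0]], [[1.0, 2.0]])
# A sample that fails on all three counts gives a large loss.
bad = generator_loss([0.1], [[0.1, 0.8, 0.1]], [0], [[1.0, 2.0]], [[0.0, 0.0]])
```

Adding the diagnosis term is what lets low-quality generated samples be penalised even when they pass the real/fake test.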
3.4. Diagnosis Process Based on SDAE-GAN-LSTM
The specific iteration steps are as follows:
- Step 1:
Pre-train the stacked denoising autoencoder.
Given the original fault samples as the input, the optimal decoder network is obtained by following Step 1 to Step 3 in Section 3.1.
- Step 2:
Expand the fault sample dataset.
According to Equation (14), the fault features are obtained, and these features are decoded by Equation (17) to obtain a new fault sample dataset. Details are shown in Equation (24).
- Step 3:
Train the traditional discriminator.
Taking the real and generated samples as the input of the traditional discriminator, the parameters of the discriminator are optimized by minimizing Equation (19).
- Step 4:
Pre-train the LSTM-based fault diagnosis discriminator.
Using the real dataset described in Section 3.2.2 as the input, the fault diagnosis discriminator based on LSTM is trained. The parameters of Equation (20) are optimized by minimizing Equation (21) to improve the fault identification capability.
- Step 5:
Optimize the generator.
The traditional discriminant results and the fault diagnosis results are obtained from the two optimized discriminators using the generated samples and the original fault samples. Equation (23) is then used to optimize the parameters of the generator.
Step 1 to Step 5 are repeated for several iterations, and the generator and discriminator loss functions are recorded for each iteration. Training is considered to have converged when the Nash equilibrium is reached. At that point, the generator can produce fake samples that are consistent with the distribution of the real fault samples and are therefore difficult to distinguish from them. At the same time, the generated samples satisfy the requirement of high-accuracy fault identification by the LSTM-based fault diagnosis discriminator.
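The iteration described above can be outlined as a simple control loop. This is a schematic only: every entry of `steps` is a hypothetical callable standing in for the corresponding training code, and the stopping rule (both losses changing by less than a tolerance between iterations) is an assumed practical proxy for reaching the Nash equilibrium.

```python
def train(steps, max_iters, tol):
    """Repeat Step 1 to Step 5 and record (generator, discriminator) losses."""
    history = []
    for _ in range(max_iters):
        decoder = steps["pretrain_sdae"]()                     # Step 1
        fake_samples = steps["expand_dataset"](decoder)        # Step 2
        d_loss = steps["train_discriminator"](fake_samples)    # Step 3
        steps["train_lstm"]()                                  # Step 4
        g_loss = steps["optimize_generator"](fake_samples)     # Step 5
        history.append((g_loss, d_loss))
        if len(history) > 1 and all(
            abs(a - b) < tol for a, b in zip(history[-1], history[-2])
        ):
            break   # losses have stopped changing; treated here as convergence
    return history


# Stub callables so the loop can be run end to end; the real ones would
# wrap the SDAE, discriminator, LSTM and generator updates.
state = {"iteration": 0}


def stub_generator_step(_samples):
    state["iteration"] += 1
    return 0.7 + 1.0 / (2 ** state["iteration"])   # loss that settles near 0.7


stub_steps = {
    "pretrain_sdae": lambda: "decoder",
    "expand_dataset": lambda decoder: ["generated sample"],
    "train_discriminator": lambda samples: 0.69,
    "train_lstm": lambda: None,
    "optimize_generator": stub_generator_step,
}
history = train(stub_steps, max_iters=50, tol=1e-3)
```

In practice the two pre-training steps would usually be cheap after the first pass, or could be run once before the loop; the sketch keeps all five steps inside the loop to mirror the description.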
The flow chart of the fault diagnosis algorithm based on SDAE-GAN-LSTM is shown in Figure 5.
5. Conclusions
An improved fault diagnosis method is proposed that can diagnose open-circuit and short-circuit faults of the inverter in a three-phase permanent-magnet synchronous motor drive system. The main advantages of this method are as follows. First, the pre-trained SDAE network decodes the fake features, which enhances the robustness of the model and ensures that the generated samples meet the requirements of fault identification. Second, the sample generation and data feature extraction capabilities of the GAN are used to generate high-quality samples that follow a similar distribution and satisfy fault discrimination. Finally, the adaptive learning capability of the LSTM network in processing time series data is used to further integrate feature information and predict fault diagnosis outcomes. In a series of comparative experiments on open-circuit fault diagnosis in the inverter of a three-phase PMSM drive system, the classification accuracy of the proposed method reaches 98.92%, which shows its effectiveness. For inverter short-circuit faults in the same drive system, the classification and identification accuracy reaches 99.96%. In contrast, the accuracies of SAE, GAN-SAE, LSTM, GAN-LSTM, and BPNN are 94.83%, 91.18%, 93.25%, 94.68%, and 94.79%, respectively. Consequently, this method has significant reference value for open-circuit and short-circuit fault diagnosis in three-phase permanent-magnet synchronous motor drive systems.
This research has certain limitations. The fault experiments on the permanent-magnet synchronous motor drive system were completed under idealized conditions in simulation software, ignoring external factors such as temperature and load that influence motor operation. Future research will address these gaps. Specifically, we intend to investigate the operation and failure modes of the permanent-magnet synchronous motor drive system in more complex and realistic environments, conducting experiments that account for temperature fluctuations and varying loads in order to gain a more comprehensive understanding of system behavior. In terms of algorithms, we plan to study the problem of adaptive optimization in more depth and to develop algorithms that further enhance the adaptability and performance of the proposed method in diverse operational scenarios. The experimental procedure should also be improved continuously to strengthen the generalization and robustness of the method and ensure its applicability in real environments. In conclusion, while this study has produced promising results, further exploration and refinement are needed to make the proposed method more resilient and adaptable in practical applications.