Article

Intelligent Fault Diagnosis of Inter-Turn Short Circuit Faults in PMSMs for Agricultural Machinery Based on Data Fusion and Bayesian Optimization

1
College of Mechanical Electrification Engineering, Tarim University, Alar 843300, China
2
National Engineering Laboratory for Electric Vehicles, Beijing Institute of Technology (BIT), Beijing 100081, China
*
Authors to whom correspondence should be addressed.
Agriculture 2024, 14(12), 2139; https://doi.org/10.3390/agriculture14122139
Submission received: 31 October 2024 / Revised: 23 November 2024 / Accepted: 24 November 2024 / Published: 25 November 2024
(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)

Abstract

The permanent magnet synchronous motor (PMSM) plays an important role in the power system of agricultural machinery. Inter-turn short circuit (ITSC) faults are among the most common failures in PMSMs, and early diagnosis of these faults is crucial for enhancing the safety and reliability of motor operation. In this article, a multi-source data-fusion algorithm based on convolutional neural networks (CNNs) has been proposed for the early fault diagnosis of ITSCs. The contributions of this paper can be summarized in three main aspects. Firstly, synchronizing data from different signals extracted by different devices presents a significant challenge. To address this, a signal synchronization method based on maximum cross-correlation is proposed to construct a synchronized dataset of current and vibration signals. Secondly, applying a traditional CNN to the data fusion of different signals is challenging. To solve this problem, a multi-stream high-level feature fusion algorithm based on a channel attention mechanism is proposed. Thirdly, to tackle the issue of hyperparameter tuning in deep learning models, a hyperparameter optimization method based on Bayesian optimization is proposed. Experiments are conducted based on the derived early-stage ITSC fault-severity indicator, validating the effectiveness of the proposed fault-diagnosis algorithm.

1. Introduction

As an agricultural powerhouse, China feeds nearly 20% of the world’s population with only 9% of the world’s arable land [1]. Against this backdrop, the level of agricultural mechanization in China has continuously improved alongside the rapid development of the national industrial level [2]. Agricultural mechanization plays a crucial role in promoting agricultural modernization and sustainable development, making intelligent fault-diagnosis research in agricultural mechanization especially important [3]. Agricultural machinery is widely used in various stages of modern agricultural production, including sowing, fertilization, tillage, and harvesting [4]. Permanent magnet synchronous motors (PMSMs) are key power components in agricultural machinery. Due to their advantages of high power density, high efficiency, and excellent control performance, they are widely applied in sowing machines, harvesters, seeders, electric tractors, spraying equipment, and tillage [5]. However, the harsh working environments faced by agricultural machinery—such as extreme temperatures, humidity, corrosion, and dust—along with complex and variable operating conditions—including multimodal vibrations and shocks, fluctuating loads, high loads, frequent starts and stops, and overload operations—pose significant challenges to the durability and safe operation of PMSMs, leading to faults [6]. Common faults in PMSMs include short circuit faults, mechanical faults, and permanent magnet failures. Among these, an ITSC fault is one of the most common short circuit faults in PMSMs [7]. These faults not only pose significant hazards but are also difficult to detect. The occurrence of ITSC faults creates a new closed loop at the short circuit point, generating a large fault current. This not only disrupts the magnetic field distribution in the air gap, increasing motor vibrations but also produces a considerable amount of heat, further threatening the insulation of nearby windings. 
If not detected and addressed promptly, this can lead to further deterioration of the fault severity, potentially causing the motor and agricultural machinery to lose control, resulting in serious losses [8]. Early diagnosis of ITSC faults not only helps protect equipment and improve production efficiency but also reduces costs and ensures safety, making it highly significant.
Existing fault-diagnosis approaches mainly fall into model-based methods and data-driven methods [9]. Model-based methods are developed based on the analysis of signal features in different domains, primarily including time domain features, frequency domain features, and time-frequency domain features. These methods demonstrate high accuracy when agricultural machinery operates under stable conditions; however, their effectiveness is often limited in dynamic operating conditions. Data-driven methods are typically used for more complex agricultural machinery fault-diagnosis problems. They rely on machine learning (ML) algorithms to achieve fault recognition and classification. Commonly used algorithms include artificial neural networks (ANN), random forests (RF), extreme learning machines (ELM), and support vector machines (SVM). Nevertheless, these methods share a common drawback: their effectiveness largely depends on the quality of the extracted fault features, which are typically extracted manually. This process requires strong domain expertise, involves a degree of subjectivity, and can be time-consuming.
Deep learning models possess the self-learning capability of distributed features, enabling them to automatically abstract and extract the relationships and hierarchical structures among vast amounts of data [10]. In the application of fault diagnosis, deep learning can achieve end-to-end feature extraction and fault classification, effectively overcoming the drawbacks of the aforementioned methods [11]. Xu et al. addressed identifying and diagnosing faults in a tractor’s transmission system [12]. They proposed a fault-diagnosis model that combines transformer networks and time generative adversarial networks (Time GANs), achieving high accuracy in fault diagnosis. Xie et al. applied deep learning to the fault diagnosis of rolling bearings in agricultural machinery [1]. Their method employs energy spectrum and singular value decomposition for noise reduction of the vibration signals and then combines ResNet and Vision Transformer to achieve fault diagnosis of the bearings. Lee et al. proposed a method for diagnosing ITSC faults using current signals and rotational speed signals. This approach employs a recurrent neural network (RNN) for fault-feature extraction and incorporates an attention mechanism to assess the severity of ITSC faults [13]. Zhu et al. proposed an intelligent fault-diagnosis method based on principal component analysis (PCA) and deep belief networks (DBNs), which, according to experimental results, is more effective and easier to implement compared to other approaches [14]. Zhu et al. applied a novel capsule network for bearing fault diagnosis, integrating deep learning and short-time Fourier transform (STFT) to convert one-dimensional signals into time-frequency maps, with validation results demonstrating that this model outperforms traditional methods in terms of generalization ability [15]. Husari et al. proposed a hybrid model architecture for diagnosing early ITSC faults, which uses current as input, employs a CNN for feature extraction, and utilizes long short-term memory (LSTM) networks and gated recurrent units (GRUs) for fault diagnosis and severity identification, achieving an accuracy exceeding 97%, thereby outperforming single deep learning models [16,17].
From the above analysis, it is evident that although deep learning-based methods for ITSC fault diagnosis have shown promising results, such as fault recognition accuracy exceeding 97%, an examination of the confusion matrix on the validation set reveals that the model still faces issues related to false alarms and missed detections [18]. This indicates that there is still significant room for improvement in the model’s feature-extraction capability. In motor fault diagnosis, a single type of signal often fails to capture all the characteristic information about the motor’s operating condition. Sufficiently rich features are a necessary condition for achieving high recognition accuracy in fault diagnosis [19]. Recent studies have shown that diagnostic systems using multi-sensor resources and sensor fusion technologies can provide superior and more robust diagnostic results. Based on this analysis, to further improve the accuracy of ITSC fault diagnosis and reduce the model’s false alarm and missed detection rates, this paper proposes a CNN model based on multi-source data fusion for identifying the severity of ITSC faults. This method employs both current and vibration signals for feature learning to enhance the richness of the fault-feature space and uses the fused features for subsequent fault diagnosis to improve the accuracy of the model’s fault classification. The main contributions of this paper are summarized as follows:
(1)
An indicator suitable for early-stage ITSC fault-severity analysis is derived from the equivalent circuit. This indicator cannot be directly used for fault diagnosis of ITSC, but it can serve as a guide for setting the fault severity during the experimental process.
(2)
A feature-level multi-source data-fusion algorithm based on CNNs is proposed. This algorithm employs Bayesian optimization for model hyperparameter tuning, fusing current and vibration signal features to enhance the richness of the fault-feature space, thereby improving the accuracy of ITSC fault-severity identification.
(3)
A signal synchronization method is proposed to construct a synchronized signal dataset for current and vibration signals. By calculating the maximum cross-correlation of synchronized signals collected by two devices, synchronization of the current and vibration signals acquired by both devices is achieved.
(4)
The effectiveness of the proposed multi-source data-fusion algorithm is validated through experiments. The experimental results indicate that compared to three other methods, the proposed algorithm not only demonstrates higher training efficiency but also superior model performance, highlighting its advantages.
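The synchronization idea in contribution (3) can be illustrated with a minimal sketch. It assumes that both acquisition devices record the same reference waveform, and that the offset between the two records is the lag at which their cross-correlation peaks. The function names `best_lag` and `align`, and the search window `max_lag`, are hypothetical and are not taken from the paper.

```python
def best_lag(ref, sig, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `sig` is delayed relative to `ref`.
    """
    def corr_at(lag):
        # Sum ref[n] * sig[n + lag] over the indices where both samples exist.
        total = 0.0
        for n in range(len(ref)):
            m = n + lag
            if 0 <= m < len(sig):
                total += ref[n] * sig[m]
        return total

    return max(range(-max_lag, max_lag + 1), key=corr_at)


def align(ref, sig, lag):
    """Trim both records so that sample k of each refers to the same instant."""
    if lag >= 0:
        ref_out, sig_out = ref, sig[lag:]
    else:
        ref_out, sig_out = ref[-lag:], sig
    n = min(len(ref_out), len(sig_out))
    return ref_out[:n], sig_out[:n]


# Toy example: the second device starts recording three samples early,
# so the shared waveform appears three samples later in its record.
reference = [0, 0, 1, 3, -2, 0.5, 0, 0, 0, 0, 0, 0]
delayed = [0, 0, 0] + reference[:-3]
lag = best_lag(reference, delayed, max_lag=5)
ref_sync, sig_sync = align(reference, delayed, lag)
```

Once the lag is known for the shared reference channel, the same trimming could be applied to the current and vibration channels recorded by each device. For long records, an FFT-based correlation would be faster than this direct sum.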
The remainder of this paper is arranged as follows. Section 2 analyzes the mechanism of ITSC faults. Section 3 introduces the details of the proposed algorithm. Section 4 describes the experimental equipment used and the settings required for simulating fault tests. In Section 5, the experiment results are presented to demonstrate the superiority of the proposed algorithm. Finally, Section 6 summarizes this article.

2. ITSC Fault in PMSMs

Estimating ITSC faults is crucial to ensure the safe operation of PMSMs, primarily because it enhances the safety and reliability of the motor, reduces potential safety hazards such as fires, and effectively protects the equipment from damage, thus avoiding costly repairs [20]. Previous research has lacked effective indicators specifically designed for the early diagnosis of ITSC faults. This paper presents an equivalent circuit model for ITSC faults based on the winding coil structure, and on this basis, derives an indicator to guide the setting of early ITSC fault severity in experiments.
The winding of a PMSM is typically composed of multiple coils arranged in series or parallel. This paper primarily studies the winding structure of multiple coils in series. To ensure the uniformity of the air gap magnetic field, reduce harmonic content, and improve the efficiency and performance of the motor, the winding of the PMSM generally adopts a distributed winding structure. The coils are wound into appropriate shapes and installed in two stator slots with a certain spacing between them. When a coil in a certain slot experiences an ITSC fault, the related turns in the corresponding slot are also affected by the ITSC fault, as shown in Figure 1. Figure 1 shows a cross-sectional view illustrating the winding structure of an 8-pole, 36-slot PMSM with an ITSC fault. Each turn of wire within the slot is uniquely marked in the form of Pc-t. For example, A1–4 indicates that the fourth turn of the first coil in the A winding has experienced an ITSC fault. The red section in Figure 1 indicates the location of the ITSC fault, while the corresponding enlarged view shows the position and number of the shorted wire turns within the slot.
An ITSC fault changes the structure of the faulty phase winding in a PMSM. At the shorted point, the faulty phase winding is divided into a faulty part and a healthy part. Parameters related to the faulty phase winding, such as resistance, inductance, and magnetic flux, change accordingly. Additionally, the shorted point creates a new closed loop; if the resistance at that point is low enough, the fault current can increase significantly and generate a large amount of heat. If this heat is not dissipated in time, it may further damage the adjacent insulation, exacerbating the severity of the fault. Changes in the winding structure and the presence of fault current can lead to an imbalance in the air-gap magnetic field and introduce higher harmonic components, thereby exacerbating the unbalanced magnetic pull between the stator and rotor, which in turn increases motor vibrations [21,22,23]. Furthermore, the disruption of the current balance among the windings results in larger torque fluctuations, further intensifying the motor’s vibration [24].
Therefore, ITSC faults not only cause changes in the three-phase current but also exacerbate motor vibrations and alter their vibration characteristics [25]. This means that the fault features are reflected in both the current and vibration signals. The characteristic information contained in different signals varies, and their sensitivity to the motor’s operating conditions is also different [26,27]. To enrich the extracted feature space and improve the accuracy of ITSC fault identification, this study employs both current and vibration signals for feature learning, thereby increasing the diversity of the feature space. The fused features are then used for subsequent fault diagnosis to enhance the accuracy of the model’s fault classification.
Assuming that the first coil of winding A experiences an ITSC fault, the schematic of the equivalent circuit model is shown in Figure 2. From the figure, it can be observed that after the fault occurs, the faulty phase winding is divided into two parts: the yellow part represents the shorted section, while the green part represents the remaining healthy section. The red part indicates the newly formed closed loop at the short circuit point and its fault current. In the fault current loop, the fault phase current ia is divided into two components: one is the current if flowing through the fault resistance Rf, and the other is the current ia − if flowing through the shorted winding.
Let Ns be the number of turns that are shorted in the A-phase winding, Nt be the number of turns in each coil, and Nc be the number of coils in each phase winding. The proportion of the shorted turns relative to a single coil and relative to the whole phase winding can be expressed as follows:
$$
\eta = \frac{N_s}{N_t}, \qquad \mu = \frac{N_s}{N_c N_t} \tag{1}
$$
where η represents the proportion of the shorted turns in the coil relative to the total number of turns in that coil, and μ represents the proportion of the shorted turns in the faulty phase relative to the total number of turns in that phase winding. Based on the above analysis, the equivalent circuit model is derived as follows:
$$
V_{abcn} = R_{abcf}\, I_{abcf} + \frac{d}{dt}\left( L_{abcf}\, I_{abcf} \right) + e_{abcf} \tag{2}
$$
where
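$$
\begin{aligned}
V_{abcn} &= \begin{bmatrix} v_{an} & v_{bn} & v_{cn} & 0 \end{bmatrix}^{T} \\
R_{abcf} &= \begin{bmatrix}
R_{ah} + R_{af} & 0 & 0 & R_{af} \\
0 & R_{b} & 0 & 0 \\
0 & 0 & R_{c} & 0 \\
R_{af} & 0 & 0 & R_{af} + R_{f}
\end{bmatrix} \\
I_{abcf} &= \begin{bmatrix} i_{a} & i_{b} & i_{c} & -i_{f} \end{bmatrix}^{T} \\
L_{abcf} &= \begin{bmatrix}
L_{ah} + L_{af} + 2M_{ahf} & M_{ahb} + M_{afb} & M_{ahc} + M_{afc} & L_{af} + M_{ahf} \\
M_{ahb} + M_{afb} & L_{bb} & M_{bc} & M_{afb} \\
M_{ahc} + M_{afc} & M_{bc} & L_{cc} & M_{afc} \\
L_{af} + M_{ahf} & M_{afb} & M_{afc} & L_{af}
\end{bmatrix} \\
e_{abcf} &= \begin{bmatrix} e_{ah} + e_{af} & e_{b} & e_{c} & e_{af} \end{bmatrix}^{T}
= \frac{d}{dt} \begin{bmatrix} \psi_{fah} + \psi_{faf} & \psi_{fb} & \psi_{fc} & \psi_{faf} \end{bmatrix}^{T}
\end{aligned}
$$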
In Equation (2), Raf, Rah, Rf, Rb, and Rc indicate the resistance of the shorted part of the faulty phase winding A, the resistance of the remaining healthy part, and the fault resistance at the shorted point, the resistance of the phase winding B, and the resistance of the phase winding C, respectively. The ia, ib, ic, and if represent the currents flowing through the A, B, and C phase windings, as well as the fault current at the shorted point of the fault resistance, respectively. Lah, Laf, Mahf, Mahb, Mafb, Mahc, and Mafc denote the self-inductance of the healthy portion and the shorted portion of faulted phase winding A, the mutual inductance between these two portions, the mutual inductance between these two portions with phase B, and the mutual inductance between these two portions with phase C, respectively. Ψfah, Ψfaf, Ψfb, and Ψfc represent the permanent magnet flux linkage of the healthy portion of the faulted phase winding A, the permanent magnet flux linkage of the faulted portion, the permanent magnet flux linkage of phase B winding, and the permanent magnet flux linkage of phase C winding, respectively.
The resistance and permanent magnet flux linkage are directly proportional to the number of turns in the winding; therefore, the various parts of the faulted phase winding can be expressed as:
$$
\begin{aligned}
R_{af} &= \mu R_{a}, & R_{ah} &= (1 - \mu) R_{a} \\
\psi_{faf} &= \mu \psi_{fa}, & \psi_{fah} &= (1 - \mu) \psi_{fa}
\end{aligned} \tag{3}
$$
where R_a and ψ_fa stand for the phase resistance and permanent magnet flux linkage of the phase A winding in the healthy state.
Since the focus of the study is on the early fault diagnosis of ITSCs, only the case of faults occurring within a single coil is considered. To streamline the analysis, it is also assumed that the ITSC fault affects the other phase windings symmetrically. The mutual inductance between coils within the same phase winding is ignored [28]. The relationship between the inductance of different parts of the winding and the degree of the shorted ratio can be expressed as [29]:
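$$
\begin{aligned}
L_{af} &= \mu^{2} L_{aa}, & L_{ah} &= (1 - \mu)^{2} L_{aa}, & M_{ahf} &= \mu (1 - \mu) L_{aa} \\
L_{af} &+ 2M_{ahf} + L_{ah} = L_{aa} \\
M_{ahb} &= (1 - \mu) M_{ab}, & M_{afb} &= \mu M_{ab} \\
M_{ahc} &= (1 - \mu) M_{ac}, & M_{afc} &= \mu M_{ac}
\end{aligned} \tag{4}
$$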
According to Kirchhoff’s current law, the sum of currents flowing into the same node is zero; thus, we can conclude the following:
$$
i_{a} + i_{b} + i_{c} = 0 \tag{5}
$$
By substituting Equations (3) and (4) into (2) and combining with Equation (5), the expression for the fault current can be derived as follows:
$$
i_{f} = \frac{\mu (v_{a} - v_{n}) + \left( \mu (L_{af} + M_{ahf}) - L_{af} \right) \dfrac{d i_{f}}{dt}}{\mu R_{a} + R_{f} - \mu^{2} R_{a}} \tag{6}
$$
From Equation (4), Laf + Mahf = μ²Laa + μ(1 − μ)Laa = μLaa, so μ(Laf + Mahf) − Laf = μ²Laa − μ²Laa = 0 and the derivative term in Equation (6) vanishes. In the early stages of an ITSC fault, the amplitude of vn is negligible compared with that of va, so va − vn ≈ va. Letting va = Va sin(ωt), an approximate expression for the fault current amplitude can be derived as follows:
$$
I_{f} \approx \frac{\mu V_{a}}{\mu R_{a} + R_{f} - \mu^{2} R_{a}} \tag{7}
$$
It is widely recognized that the voltage amplitude of the stator winding in a PMSM has a positive correlation with the motor speed [30]. Then, Equation (7) can be represented as follows:
$$
I_{f} \approx \frac{K \mu \omega_{r}}{\mu R_{a} + R_{f} - \mu^{2} R_{a}} \tag{8}
$$
where K represents a constant coefficient that can be considered a known quantity, while ωr denotes the mechanical speed of the rotor.
In Equation (8), K is a known constant and Ra is an inherent parameter of the motor, so it can also be regarded as known. The remaining parameters directly affect the fault current amplitude If. The parameters μ and Rf are both related to the severity of the ITSC fault, whereas ωr is independent of the fault severity. Dividing both sides of Equation (8) by ωr leaves only the parameters related to the severity of the ITSC fault on the right side, expressed as follows:
$$
\frac{I_{f}}{\omega_{r}} \approx \frac{K \mu}{\mu R_{a} + R_{f} - \mu^{2} R_{a}} := FI \tag{9}
$$
The right side of the equation contains only known quantities and parameters related to the severity of the ITSC fault. We define the right side expression as a representation of the fault severity, denoted by the symbol FI. When the motor is in a healthy state (μ = 0), this fault-severity indicator is 0. Conversely, when the faulted phase winding of the motor is completely shorted (μ = 1) and the fault resistance at the shorted point is 0, the denominator vanishes and the indicator becomes infinite. The left side of the equation represents the ratio of the magnitude of the fault current to the mechanical speed of the rotor. From the derivation process, it can be seen that this indicator is only applicable for analyzing the fault severity in the early stages of ITSC faults, during which the indicator is generally unaffected by speed. It increases as the fault resistance Rf decreases or the shorted ratio μ increases, and vice versa. In practice, it is difficult to directly measure Rf and μ during the motor’s operation. Therefore, this indicator is not suitable for the direct estimation of ITSC fault severity. However, it can be used as a fault-severity indicator in experiments, guiding the setting of fault levels for ITSC faults.
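As a rough illustration of how the indicator in Equation (9) could guide the choice of fault settings, the sketch below evaluates FI for a few combinations of shorted ratio and fault resistance. The function name `fault_indicator` and the numerical values of Ra, Rf, and K are placeholders chosen for this example only; they are not the parameters of the motor used in the experiments.

```python
import math


def fault_indicator(mu, r_f, r_a, k=1.0):
    """Evaluate FI = K*mu / (mu*Ra + Rf - mu^2*Ra) from Equation (9).

    mu  : shorted-turn ratio of the faulty phase (0 = healthy, 1 = fully shorted)
    r_f : fault resistance at the shorted point, in ohms
    r_a : healthy phase resistance, in ohms
    k   : constant coefficient K (placeholder value by default)
    """
    if mu == 0:
        return 0.0  # healthy motor: no shorted turns
    denom = mu * r_a + r_f - mu ** 2 * r_a
    if denom == 0:
        return math.inf  # fully shorted phase with zero fault resistance
    return k * mu / denom


# Placeholder settings: Ra = 0.5 ohm, three shorted ratios, two fault resistances.
R_A = 0.5
table = {
    (mu, r_f): fault_indicator(mu, r_f, R_A)
    for mu in (0.02, 0.05, 0.10)
    for r_f in (2.0, 1.0)
}
for (mu, r_f), fi in sorted(table.items()):
    print(f"mu={mu:.2f}  Rf={r_f:.1f} ohm  FI={fi:.4f}")
```

Within each fault resistance, FI grows with μ, and for a fixed μ it grows as Rf falls, which matches the trend described above.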

3. Proposed Algorithm

3.1. Fault-Diagnosis Methods Based on Multi-Source Data Fusion

Multi-source data fusion, also known as information fusion, is a technology that enables the automated processing of integrated information. This technology originated in the military domain and has gradually been widely applied in the civilian sector after years of development. Today, it has achieved rapid advancements in various fields, including robot control, autonomous driving, and fault diagnosis [31]. The application of data fusion in fault diagnosis relies on the research object’s ability to collect information from multiple types of sensors. Additionally, to achieve a comprehensive analysis and accurate assessment of the fault status, it is necessary to utilize various signal processing techniques to obtain a rich feature space. By combining different intelligent algorithms, multi-level data fusion can be achieved, leading to the final assessment results.
According to the different levels of data fusion, data-fusion methods can be classified into three categories: data-level fusion, feature-level fusion, and decision-level fusion. When selecting different fusion levels, it is necessary to consider the balance between fusion performance and implementation cost [11].
Data-level fusion, also known as pixel-level fusion, refers to the direct integration of raw data, which maximally preserves the original information and exhibits superior fusion performance. Xia et al. proposed a CNN-based fault-diagnosis method for rotating machinery that combines sensor fusion with spatiotemporal information to achieve automatic feature extraction [32]. Chen et al. proposed a fault-diagnosis method for gearboxes based on DCNN, which integrates the raw data of vertical and horizontal vibration signals to achieve automatic feature extraction [33]. However, data-level fusion models do not possess the ability to correct errors, and their performance is poor when the sensor types are different or when there are significant differences in magnitude.
Feature-level fusion can achieve a more refined integration of information through the dimensionality reduction of the data. Compared to data-level fusion, feature-level fusion offers higher robustness, less information redundancy, greater flexibility, and better real-time performance, allowing for a more comprehensive representation of the data’s characteristics. Azamfar et al. proposed a novel two-dimensional CNN architecture that integrates features extracted from multiple current sensors to monitor gearbox faults under different operating conditions and speeds [34]. Parai et al. proposed a feature-level fusion method for circuit fault diagnosis, which employs wavelet analysis for fault-feature extraction from multiple signals and uses PCA for feature fusion. Ultimately, circuit fault-type diagnosis is achieved based on a support vector machine [9]. In recent years, the combination of feature-level fusion and deep learning models has gradually become a common approach to achieve better diagnostic results.
Decision-level fusion typically occurs after each model or sensor completes its processing independently, followed by a decision merger. This approach has a high degree of information integration and lower computational complexity, making it suitable for various signal sources or data types. Common decision-level fusion methods include the Dempster–Shafer (D-S) theory, decision tree fusion, and weighted voting. However, the complexity of decision-level fusion methods is relatively high, with significant loss of original data and various challenges in choosing fusion strategies, which is why they are less frequently used in conjunction with deep learning models.
Based on the advantages and disadvantages of different levels of multi-source data fusion described earlier, and in conjunction with the content of this study, a multi-source data-fusion algorithm based on a CNN model has been proposed. This algorithm is applied to the identification of the severity of ITSC faults, aiming to enhance feature learning from various types of sensor information. By enriching the feature space and integrating multi-faceted fault characteristics, the algorithm seeks to improve the diagnostic accuracy of ITSC fault identification under various complex operating conditions.

3.2. Convolutional Neural Networks

A CNN is a significant type of deep neural network that can achieve end-to-end fault classification by automatically extracting local features [35]. Compared to traditional artificial neural networks, a CNN has convolutional layers that feature “local connectivity” and “weight sharing”, which substantially decreases the number of parameters in the model and lowers the training difficulty [36]. CNN models typically consist of multiple hidden layers that can automatically extract various features from the input signals. The lower hidden layers focus on learning the basic characteristics of the input signals, while the higher layers abstract and re-extract these basic features to form more complex high-dimensional features, resulting in more accurate classification [37].
The convolutional layer typically needs to work in conjunction with functional layers such as pooling layers, normalization layers, activation layers, and dropout layers to enhance the feature-extraction capability of the convolutional module. A typical structure of a convolutional module is shown in Figure 3a. In this module, the convolutional layer filters out redundant information from the input signals through convolution operations, reinforcing important features related to fault classification. The pooling layer usually follows the convolutional layer to perform feature dimensionality reduction while maintaining the translation invariance of the features. The activation function accelerates the convergence of the model and helps mitigate the vanishing gradient problem to some extent. The dropout layer is primarily used to prevent overfitting, thereby improving the model’s generalization ability.
In this paper, to maintain the high resolution of the extracted fault features, reduce downsampling operations, and enhance the model’s ability to capture multi-level complex information, a dilated CNN is employed. Its structural diagram is illustrated in Figure 3b, and the expression for dilated convolution is shown in Equation (10) [38].
$$
F(x) = (S *_{d} f)(x) = \sum_{i=0}^{k-1} f(i)\, S_{x - d \cdot i} \tag{10}
$$
where F(x) denotes the output of the dilated convolution at position x. The input signal S ∈ ℝ^n is convolved with the filter f using the dilated convolution operator *d, where d is the dilation factor. The filter f: {0, 1, …, k − 1} → ℝ gives the weight values applied during the convolution, and the parameter k defines the size of the weight matrix. The subscript x − d·i selects the element of the input signal that is multiplied by the i-th weight.
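Equation (10) can be made concrete with a short one-dimensional sketch. It assumes that input positions before the start of the signal contribute zero (causal zero padding), which is one common convention and is not stated in the paper. The function name `dilated_conv1d` is hypothetical.

```python
def dilated_conv1d(signal, weights, dilation):
    """Compute F(x) = sum_i weights[i] * signal[x - dilation*i] for every x.

    Positions with x - dilation*i < 0 are treated as zero (assumed padding).
    """
    out = []
    for x in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = x - dilation * i
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


s = [1, 2, 3, 4, 5, 6]
f = [1, 1, 1]  # k = 3 weights

dense = dilated_conv1d(s, f, dilation=1)    # each output covers 3 adjacent samples
dilated = dilated_conv1d(s, f, dilation=2)  # each output covers samples x, x-2, x-4
```

With k = 3 and d = 2, each output spans d·(k − 1) + 1 = 5 input samples while still using only three weights, which is how dilation widens the receptive field without downsampling.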

3.3. The Attention Mechanism

Due to their significant impact on deep learning models, attention mechanisms have garnered widespread attention in recent years. The core of this mechanism lies in adjusting weights to guide the model in filtering out redundant information that is irrelevant to the task, thereby focusing attention on the features that are more critical for achieving the task objectives [35]. In CNN models, common attention mechanisms include channel attention, spatial attention, and hybrid attention. This paper employs a channel attention mechanism to adjust the weights of input features extracted from different signals under varying fault levels and operating conditions, reallocating the model’s attention and enhancing the contribution of each channel feature to improve the performance of the ITSC fault-diagnosis model.
The typical channel attention mechanism is known as SENet (Squeeze and Excitation Network), and its core structure mainly includes three steps: squeeze, excitation, and scaling, as illustrated in Figure 4 [39]. During the training process of the fault-diagnosis model, the feature weights extracted by the model are adjusted and redistributed after undergoing the squeeze and excitation operations of SENet. The weights of features that are relevant and sensitive to fault level classification are enhanced, while irrelevant features are suppressed or weakened. The core three steps of SENet can be expressed as follows:
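$$
\begin{aligned}
z_{c} &= F_{sq}(u_{c}) = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} u_{c}(i, j) \\
s &= F_{ex}(z_{c}, W_{i}) = \delta\left( W_{2}\, \sigma( W_{1} z_{c} ) \right) \\
\tilde{x}_{c} &= F_{scale}(u_{c}, s_{c}) = s_{c} \cdot u_{c}
\end{aligned} \tag{11}
$$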
In SENet, the first step applies a squeeze operation to the information from each input channel, which sets the stage for adjusting the weights of the different channels. For an input feature map with dimensions W × H × C, this operation converts the input features into global features of size 1 × 1 × C through global average pooling. In Equation (11), Fsq denotes the squeeze operation, W × H is the scalar product of the width and height of the feature map, uc represents the c-th channel of the input feature map, and zc refers to the global feature obtained for that channel after the squeeze operation.
The second step is the excitation operation, which aims to capture the relationships between the features input from different channels. The key step of the excitation operation is to input the global features of size 1 × 1 × C obtained from the squeeze operation into a fully connected layer of dimension (C/r) × C, where r represents the scaling factor, primarily used to reduce the number of channels, which in turn lowers the computational complexity of the model. The output of this layer is passed through a ReLU activation layer to a second fully connected layer, where the number of feature channels is restored to C. Subsequently, the output feature information is normalized to the range of (0, 1) through a Sigmoid activation layer, completing the readjustment of the weights for the relevant fault features across different channels. In Equation (11), s represents the adjusted weights of the fault features for each channel, Fex denotes the excitation operation, while W1 and W2 represent the operations of the two fully connected layers. σ and δ correspond to the ReLU activation layer and the Sigmoid activation layer, respectively.
The third step is the scaling operation, which multiplies the fault features in each channel by the redistributed weights to recalibrate the fault features and achieve the overall adjustment of the attention mechanism. As shown in Equation (11), $\tilde{x}_c$ denotes the features after channel attention adjustment, and $F_{scale}$ denotes the scaling operation.
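The three steps can be illustrated with a small, dependency-free sketch. It is only an illustration under stated assumptions, not the implementation used in this study: the feature map is stored as nested lists of shape C × H × W, the fully connected layers have no bias terms, and the function name `se_block` and the toy weights are hypothetical. It follows the notation above, with ReLU after the first fully connected layer and Sigmoid after the second.

```python
import math

def se_block(u, w1, w2):
    """Channel attention on a feature map u of shape C x H x W (nested lists).

    w1: (C/r) x C weights of the first fully connected layer (channel reduction).
    w2: C x (C/r) weights of the second fully connected layer (channel restoration).
    Bias terms are omitted (an assumption of this sketch).
    """
    # Squeeze: global average pooling gives one descriptor z_c per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]
    # Excitation: FC -> ReLU -> FC -> Sigmoid gives one weight s_c in (0, 1) per channel.
    hidden = [max(0.0, sum(w * zc for w, zc in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: multiply every element of channel c by its weight s_c.
    x = [[[s_c * v for v in row] for row in ch] for ch, s_c in zip(u, s)]
    return z, s, x

# Toy example: C = 2 channels, 2 x 2 maps, scaling factor r = 2.
u = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
w1 = [[0.5, -0.5]]       # shape (C/r) x C = 1 x 2
w2 = [[1.0], [-1.0]]     # shape C x (C/r) = 2 x 1
z, s, x = se_block(u, w1, w2)
```

In this toy case the first channel receives a weight above 0.5 and the second a weight below 0.5, so the first channel is emphasized and the second is suppressed.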

3.4. Bayesian Optimization Algorithm

The CNN-based ITSC fault-diagnosis model features a flexible and variable structure, requiring numerous hyperparameters. Different combinations of hyperparameters can significantly impact the model’s training efficiency and final validation accuracy. To achieve a robust ITSC fault-diagnosis model, it is essential to optimize various hyperparameter combinations to identify the best set. Common optimization algorithms include grid search, random search, and Bayesian optimization. Random search and grid search are methods of random enumeration and exhaustive search, respectively, which can lead to a degree of blindness in the optimization process [40]. This may result in a significant waste of computational resources on unsuitable hyperparameter combinations. Consequently, under limited computational resources, these two methods often struggle to yield satisfactory results without extensive prior experience. In contrast to these two algorithms, Bayesian optimization is a global optimization technique that employs a sequential search process [41]. It effectively utilizes the prior information of known data points to autonomously adjust its optimization strategy. Therefore, this paper adopts Bayesian optimization to optimize the hyperparameters of the ITSC fault-diagnosis model.
Bayesian optimization is a global optimization strategy named after the famous Bayes’ theorem (as shown in Equation (12)) used in its framework. This algorithm constructs a probabilistic model of the objective function to effectively select the most promising evaluation points in each iteration, making it particularly suitable for scenarios where the objective function is expensive or difficult to evaluate.
$$
p(f \mid D_{1:t}) = \frac{p(D_{1:t} \mid f)\, p(f)}{p(D_{1:t})} \tag{12}
$$
where f represents the objective function, which is typically difficult to express directly in functional form. In the fault-diagnosis model, it reflects the overall performance of the model. D1:t denotes the sample points of the hyperparameter combinations to be optimized, with the number of sample points being t. The process of hyperparameter tuning for the ITSC fault-diagnosis model based on Bayesian optimization mainly includes the following key steps:
(1) Initialization. Select several initial values as the starting points for Bayesian optimization based on the given hyperparameter combinations and the value ranges for each hyperparameter.
(2) Constructing the probabilistic model. Establish a probabilistic model for the objective function using a Gaussian process model based on the evaluated sample points. This model can predict the objective function values and their corresponding uncertainties for untested points.
(3) Selecting evaluation points. Consider both the predicted values and uncertainties of the model and select the next set of promising evaluation points from the probabilistic model using the expected improvement acquisition function.
(4) Objective function evaluation. Evaluate the objective function at the newly obtained sample points and compute the function values.
(5) Updating the probabilistic model. Combine the results from the new evaluation points with previous data to update the probabilistic model, thereby enhancing its accuracy.
Repeat (3) to (5) until the specified termination criteria are met. Finally, select the hyperparameter combination corresponding to the model with the best performance as the output result.
The hyperparameter tuning process of the ITSC fault-diagnosis model is illustrated in Figure 5. As shown in the figure, the hyperparameter adjustment for the entire fault-diagnosis model is primarily divided into two parts. One part is the constructing and training process of the fault-diagnosis model, as indicated in the black box. When the model meets the termination criteria for training, the final testing accuracy will be passed to the Bayesian optimization process. The other part involves hyperparameter optimization based on the Bayesian optimization algorithm, as depicted in the green box. In this entire process, Bayesian optimization is responsible for finding the optimal hyperparameter combinations, which mainly includes the initialization and optimization of the hyperparameter combinations. Throughout the process, the model training and hyperparameter optimization cycle back and forth until the termination criteria for optimization are met. Ultimately, the optimal hyperparameter combination and its validation accuracy are selected as the output results.
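The loop described in this subsection can be sketched in a few dozen lines. The sketch below is a simplified illustration, not the code used in this study. It makes several assumptions: a single hyperparameter (the base-10 logarithm of the learning rate), a Gaussian process with a fixed squared-exponential kernel, candidates taken from a fixed grid, and a hypothetical stand-in objective `toy_accuracy` in place of actually training the CNN. All function names are illustrative.

```python
import math
import random

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, jitter=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = 1.0 - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def bayes_opt(objective, low, high, n_init=3, n_iter=10, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n_init)]      # step (1)
    ys = [objective(x) for x in xs]
    grid = [low + (high - low) * i / 200 for i in range(201)]
    for _ in range(n_iter):
        # Standardize observations so the unit-variance GP prior is reasonable.
        mu = sum(ys) / len(ys)
        sd = math.sqrt(sum((y - mu) ** 2 for y in ys) / len(ys)) or 1.0
        ys_n = [(y - mu) / sd for y in ys]
        best = max(ys_n)
        # Steps (2)-(3): fit the GP and pick the grid point with the largest EI.
        scores = [expected_improvement(*gp_posterior(xs, ys_n, g), best) for g in grid]
        x_next = grid[scores.index(max(scores))]
        xs.append(x_next)                                      # step (4)
        ys.append(objective(x_next))                           # step (5): refit next pass
    return xs, ys

def toy_accuracy(log_lr):
    """Hypothetical stand-in for 'train the model and return its accuracy'."""
    return 0.99 - 0.05 * (log_lr + 3.5) ** 2

xs, ys = bayes_opt(toy_accuracy, low=-5.0, high=-2.0)
best_y = max(ys)
best_x = xs[ys.index(best_y)]
```

In a real tuning run, `toy_accuracy` would be replaced by a function that trains the fault-diagnosis model with the proposed hyperparameters and returns its accuracy, and the search would cover all hyperparameters jointly rather than a single one.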

3.5. Multi-Source Data-Fusion Algorithm Based on Bayesian Optimization and a CNN

The above analysis indicates that the occurrence of ITSC faults not only affects the three-phase currents of the windings but also alters the distribution of the air gap magnetic field. This leads to the generation of unbalanced radial magnetic forces within the motor, which exacerbates the production of vibration signals and alters the vibration characteristics of the motor. Therefore, this paper proposes a fault-diagnosis method for ITSCs based on the fusion of three-phase current signals and vibration signals. This method utilizes both three-phase current signals and vibration signals as sources for fault diagnosis. Using a CNN model, it extracts features from the current and vibration signals separately and then performs a feature-level fusion of the two types of signals. The channel attention mechanism is employed to adjust the weights of the fault features from different signals and channels. Finally, to improve the training efficiency of the model and enhance its performance, Bayesian optimization is used to fine-tune the training hyperparameters of the model.
The flowchart of the entire process is shown in Figure 6, which can be divided into five specific steps.
(1) Data collection. Experiments are conducted with varying degrees of ITSC faults, collecting three-phase current signals and vibration signals simultaneously under different operating conditions.
(2) Dataset preparation. The collected current and vibration signals undergo data synchronization, followed by a series of preprocessing steps including filtering, normalization, downsampling, data slicing, and grouping. Ultimately, the processed current and vibration signals are organized into a dataset suitable for a multi-source data-fusion model.
(3) Model construction and initialization. This study employs a multi-stream high-level feature fusion model based on CNNs. The two types of signals are fed into different CNN branches for feature extraction, and then the extracted features are fused at the high level of the network. Fault-severity recognition is achieved through fully connected layers and classification layers. The model’s training hyperparameters are initialized using Bayesian optimization.
(4) Model training and optimization. The dataset constructed in Step 2 and the multi-source data-fusion model built in Step 3 are used for training. The performance of the trained model is evaluated using a test set, and the Bayesian optimization algorithm updates the hyperparameter combinations based on the results, repeating the process until the optimization iterations or model performance reaches the termination criteria.
(5) Output results. When the model reaches the optimization termination condition, the best ITSC model is selected for output, which includes the optimal hyperparameter combinations and the corresponding accuracy results for ITSC fault diagnosis.
Figure 6. Flowchart of ITSC fault-diagnosis algorithm based on multi-source data fusion.

4. Experimental Setup and Data Description

To validate the effectiveness of the ITSC fault-diagnosis algorithm based on the fusion of current and vibration signal fault characteristics, as well as the Bayesian-optimized ITSC fault-diagnosis model, a simulation test for ITSC faults in PMSM was conducted. The experiment was carried out on the test bench for the dual-drive motor, as shown in Figure 7. The main test equipment included the motor under test and its controller, a dynamometer, torque and speed sensors, vibration sensors, current sensors, and a data acquisition system. The motor under test was an 8-pole, 36-slot PMSM with a star-connected winding structure. The specific parameters are listed in Table 1.
The torque sensor used in the experiment was the HCNJ-101, with a measurement accuracy of ±0.1%; the current sensor was the ETA-5301B, with a measurement accuracy of 3% RD; the vibration sensor was the KS78B100 from MMF Germany, with an IEPE interface, a sensitivity of 100 mV/g and an accuracy of 2% RD. To prevent low-frequency interference during data acquisition, the sampling frequency for the current signal was set to 1 MHz. Since the maximum sampling frequency for the vibration signal acquisition using the NI cRIO9068 was also 1 MHz, different devices were used for data collection. The data acquisition device for the current signal was an oscilloscope with a sampling frequency of 1 MHz, while the vibration signal was acquired using the NI9401 card within the NI cRIO9068, at a sampling frequency of 10.24 kHz.
To ensure synchronization between the data acquisition devices, a signal generator was used to create a sweep frequency signal with a period of 8 s, sweeping from 20 Hz to 2 kHz, with a voltage signal amplitude of ±2 V. The waveform of the synchronization signal is shown in Figure 8. During the data acquisition process, both the current and vibration data acquisition devices received the synchronization signal from the signal generator simultaneously. The oscilloscope had a sampling frequency of 1 MHz, while the NI9223 acquisition card used by the NI cRIO9068 had a sampling frequency of 100 kHz.
In conducting simulation tests of ITSC faults in motors, two key factors need to be fully considered: on one hand, the establishment of early ITSC faults with varying degrees of severity; on the other hand, the operational conditions of the motor should be as comprehensive as possible. Based on the previous analysis, the severity of ITSC faults is determined by the number of shorted turns and the fault resistance. To simulate different degrees of ITSC faults, the test motor has been modified accordingly, as shown in Figure 9.
Figure 9a displays the tested faulty motor, with terminals on both sides representing the lead wires of windings with different turn counts, allowing for varying degrees of short circuit simulation through paired connections. Figure 9b shows the fault resistance and its heat dissipation device; the fault resistance can be replaced to simulate different levels of insulation damage. The combination of these two parameters facilitates the simulation of varying degrees of ITSC faults. Figure 9c illustrates the temperature measurement device, which continuously monitors the temperature of the faulty motor and its fault resistance throughout the experiment to prevent damage due to excessive temperature.
The operational conditions for the motor during the fault simulation tests are presented in Table 2. To simulate the operating conditions of agricultural machinery during acceleration, deceleration, and constant speed, a total of 8 constant speed scenarios and 2 acceleration and deceleration scenarios are included. The settings for variable speed conditions are shown in Figure 10.
After completing the signal acquisition, a series of data preprocessing operations were required, including signal synchronization, filtering, normalization, downsampling, slicing, and grouping, to organize the current and vibration signals into a dataset suitable for multi-source data-fusion models.
The oscilloscope can record signal sampling for 10 s at a time, while the cRIO9068 has a longer signal sampling duration. Therefore, during the data acquisition phase, the cRIO9068 must be turned on first for data collection, followed by the activation of the oscilloscope. In the end, the oscilloscope is stopped first, followed by the cessation of recording on the cRIO9068. The data synchronization process between the two signals involves using the time segment occupied by the data recorded by the oscilloscope to slice the corresponding data recorded by the cRIO9068 within that time frame. Throughout this process, the synchronized signals recorded by both devices serve as timestamps to determine the start and end times of the signals. To maximize the length of the synchronized signal, the synchronization signal from the oscilloscope is downsampled to 100 kHz, denoted as y, while the synchronization signal from the cRIO9068 is denoted as x. The expression for the cross-correlation of the two signals is as follows:
$$
\hat{R}_{xy}(m) = \sum_{n=0}^{N-m-1} x_{n+m}\, y_n, \qquad m \ge 0
$$
where $\hat{R}_{xy}$ represents the cross-correlation index between the two signals, m denotes the offset of the signal recorded by the NI cRIO9068, with a range of 0 to N − 1, and N is the length of signal x. As m takes on different values, a series of cross-correlation indices is obtained for signal x relative to signal y after shifting by m points. The maximum value in this sequence is the highest cross-correlation index of the two signals, and it occurs at the offset where they are best aligned. Since both signals are derived from the same synchronous signal generated by the signal generator, the offset at which the two signals coincide represents the starting point for truncating signal x using signal y, while the endpoint is determined by the number of sampling points contained in signal y.
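The truncation procedure can be sketched as follows. This is a minimal illustration rather than the processing code used in the experiment. It assumes that both synchronization signals are already at the same sampling rate, that the shorter signal y lies entirely inside x, and that samples of y beyond its length count as zero in the sum. The function names and the toy chirp are hypothetical.

```python
import math

def cross_correlation(x, y, m):
    """R_xy(m) = sum over n of x[n + m] * y[n]; y is treated as zero beyond its length."""
    n_terms = min(len(x) - m, len(y))
    return sum(x[n + m] * y[n] for n in range(n_terms))

def synchronize(x, y):
    """Return the offset with maximum cross-correlation and the matching segment of x."""
    # Only offsets at which y fits entirely inside x are searched (an assumption).
    offsets = range(len(x) - len(y) + 1)
    best = max(offsets, key=lambda m: cross_correlation(x, y, m))
    return best, x[best:best + len(y)]

# Toy data: y is a short chirp-like reference; x is a longer record that
# contains the same chirp starting 17 samples in.
y = [math.sin(0.05 * n * n) for n in range(50)]
x = [0.0] * 17 + y + [0.0] * 30

offset, x_sync = synchronize(x, y)
```

Here `offset` is the starting index used to cut the longer record, and `x_sync` has the same number of points as y. For long recordings, an FFT-based correlation would be much faster than this direct sum.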
Figure 11 illustrates the results of signal synchronization achieved through cross-correlation analysis of the synchronous signals collected by the two devices. To make the comparison easier to see, the synchronous signals, which have equal amplitude, are slightly offset from one another in the diagram. Signal x is the synchronous signal recorded by the cRIO9068, while signal y is the downsampled synchronous signal captured by the oscilloscope. Signal x′ is the segment truncated from signal x, and signal y′ is identical to signal y. It can be observed from the figure that signal x is significantly longer than signal y, which is consistent with the experimental setup, in which the cRIO9068 was started earlier and stopped later than the oscilloscope. The starting point of signal x′ is determined by the maximum cross-correlation offset between signals x and y, while the endpoint of signal x′ is determined by the length of signal y. The locally enlarged area in the figure indicates the positions of the frequency sweep cycle transitions for the four synchronous signals. The transition positions of these four signals are consistent, and the synchronization error is within one sampling period, indicating that the synchronization algorithm is effective.
From the above analysis, it can be seen that during the signal synchronization process of the two devices, the synchronization signal collected by the oscilloscope has not changed. Therefore, the fundamental purpose of synchronizing the signals from these two devices is to use the synchronized signal as a timestamp to extract the corresponding time range data from the cRIO9068, thereby eliminating data that is out of sync. Due to the different sampling frequencies used by the two devices during signal acquisition, there will still be time synchronization errors when using the synchronized signal as a timestamp for data extraction. A schematic of this error is shown in Figure 12.
In Figure 12, the horizontal axis (X) represents the sampling time, and the vertical axis (Y) represents the signal amplitude. Assuming the point (6.7 × 10⁻⁵, 3) in the figure is the calculated starting point of the synchronized signals from the two devices, the synchronization error between the synchronization signal and the vibration signal is t1, while the synchronization error between the synchronization signal and the current signal is t2. Because the algorithm ensures that the synchronization error between the two signals is within one sampling period, the upper limit of the synchronization error between the current signal and the vibration signal is the sampling period of the vibration signal, which is 9.77 × 10⁻⁵ s, just under 0.1 ms.
During data acquisition, each group of data is collected for 10 s, and the time length of each data slice when constructing the dataset is 0.2 s. Therefore, this maximum time error represents a very small proportion of each data slice’s time length, and its impact on the sample set constructed for signal synchronization in this paper can be considered negligible.
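The size of this error relative to a data slice can be checked with a short calculation. The values are taken from the text (vibration sampling frequency of 10.24 kHz and slice length of 0.2 s), and the variable names are illustrative.

```python
fs_vibration = 10_240.0   # Hz, vibration sampling frequency stated in the text
slice_duration = 0.2      # s, length of each data slice

# Upper bound on the synchronization error: one vibration sampling period.
max_sync_error = 1.0 / fs_vibration
# Share of a single data slice that this worst-case error represents.
error_fraction = max_sync_error / slice_duration

print(f"max error = {max_sync_error:.3e} s")        # max error = 9.766e-05 s
print(f"share of a slice = {error_fraction:.4%}")   # share of a slice = 0.0488%
```

The worst-case error is therefore about 0.05% of a slice, which supports treating it as negligible.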
After synchronizing the current and vibration signals, the subsequent data preprocessing operations include filtering, normalization, downsampling, slicing, and grouping. The description of the resulting synchronized signal dataset is shown in Table 3. From the table, it can be observed that the dataset consists of synchronized current and vibration signals, containing 17 fault-severity labels. Among these, “HL” represents data collected under normal motor conditions, while “A*R*” indicates data collected under different ITSC fault severities. The severity of an ITSC fault is determined by the combination of shorted turn ratio and fault resistance. Different shorted turn ratios are created by connecting two specific points from the multiple lead wires, each corresponding to different turns of the first coil in phase A. Fault resistance, on the other hand, can take any value within the range of 0 to 1 MΩ when an ITSC fault occurs [15]. If the fault resistance is extremely high, the fault current flowing through it will be minimal, effectively resulting in a healthy state. Conversely, if the fault resistance is too low, the fault current will be large enough to potentially cause irreversible damage to the test rig [7]. Since this paper focuses on diagnosing early-stage ITSC faults, fault resistance is classified into three cases based on the above analysis. In the first case, the fault resistance is significantly greater than the impedance of the shorted wire, so the fault current is small. In the second case, the fault resistance is slightly larger than the impedance of the shorted wire, resulting in a higher fault current and a more noticeable impact on the motor. In the third case, the fault resistance is close to the impedance of the shorted wire, leading to a large fault current that could potentially damage the experimental setup if the test is prolonged.
Based on these considerations and experimental experience, only the ITSC scenario involving a single coil is considered when determining the number of shorted turns. The fault resistances are then set within the range of 0.1 Ω to 5 Ω, ensuring that the fault currents are clearly detectable while preventing any irreversible damage to the experimental setup. The fault severities are arranged in ascending order based on the value calculated from Equation (9).
Under each fault-severity label, both types of signals contain 1200 data samples, with 840 samples designated for training and 360 for testing. Figure 13 displays the waveform of the current signal from a single sample in the constructed dataset, along with the corresponding vibration signal for that period. The horizontal axis represents the number of sampling points, and the vertical axis represents the normalized waveform amplitude. It is evident from the figure that each set of current signals has 3000 sampling points, while the vibration signals consist of 2048 sampling points. The time duration for both sets of signals matches the specified data slicing duration, which is 0.2 s.
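The slice sizes quoted above can be reproduced with a simple slicing routine. This is a sketch under assumptions: slices are taken without overlap, and the current signal is assumed to have been downsampled to 15 kHz, a rate that is not stated explicitly but is implied by 3000 points per 0.2 s. The function name `slice_signal` is hypothetical.

```python
def slice_signal(signal, fs, duration=0.2):
    """Cut a signal into consecutive, non-overlapping slices of `duration` seconds."""
    n = round(fs * duration)  # samples per slice
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]

record_seconds = 10        # each group of data is collected for 10 s
fs_vibration = 10_240      # Hz, stated in the text
fs_current = 15_000        # Hz, assumed: implied by 3000 points per 0.2 s

# Placeholder recordings: sample indices stand in for measured amplitudes.
vibration = list(range(record_seconds * fs_vibration))
current = list(range(record_seconds * fs_current))

vib_slices = slice_signal(vibration, fs_vibration)
cur_slices = slice_signal(current, fs_current)
```

Under these assumptions, one 10 s recording yields 50 slices per signal, with 2048 points per vibration slice and 3000 points per current slice.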

5. Discussion

5.1. Results and Comparisons

After completing the simulation tests for ITSC faults and constructing the dataset, the proposed multi-source data-fusion fault-diagnosis model is applied to analyze the synchronized current signal dataset and the vibration signal dataset. The hyperparameters that need optimization in the model include the initial learning rate, gradient optimization parameters, L2 regularization coefficient, and the dropout rate in the dropout layer, represented by the symbols Linit, G1, L2R, and P, respectively. During the model training process, the number of Bayesian optimization iterations is set to 60, and the number of model training epochs in each optimization is set to 40. The corresponding parameter search space, data types, and the optimal hyperparameter combinations for the model are shown in Table 4.
The proposed multi-source data-fusion model is a multi-stream high-level feature fusion (MS) model, as illustrated in Figure 14a. The synchronized current and vibration signals are processed through different CNN branches for fault-feature extraction, with feature fusion occurring at a higher level. To compare the impact of different fusion methods on the diagnosis results of ITSC faults, a multi-channel information fusion (MC) model was also constructed, with its structure illustrated in Figure 14b.
The MS model is composed of three main components. The first part is the input layer. The second part handles feature extraction, which is implemented through two parallel CNN branches. Each branch consists of several stacked convolutional blocks, with each block containing a dilated convolution layer, a batch normalization layer, a ReLU layer, and a dropout layer. The third part is the classification and output layer, which includes a fully connected layer, a softmax layer, and the output layer with severity labels as outputs. In this architecture, three-phase current segments and vibration segments serve as the input signals. Each current segment has a fixed size of 1 × 3000 × 3, while each vibration segment has a fixed size of 1 × 2048 × 3. The feature extraction part consists of three convolutional layers, with depths of 5, 9, and 6, and widths of 18, 66, and 38, respectively. The kernel size is set to 1 × 3, with a dilation factor (d) of 2. The initial learning rate (Linit) is 2.7586 × 10⁻⁴, the momentum (M) is 0.9037, the batch size is set to 32, the L2 regularization coefficient (L2R) is 0.0097, and the dropout probability (P) is 2.488 × 10⁻⁵.
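To make the role of the dilation factor concrete, the following sketch applies a single dilated one-dimensional convolution with a kernel of length 3 and d = 2, as in the blocks described above. It is an illustration only: it assumes one input channel, stride 1, and no padding, and the kernel values are arbitrary. The function name `dilated_conv1d` is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=2):
    """Valid (unpadded) 1-D convolution in the cross-correlation form used by CNNs."""
    # Distance between the first and last input sample covered by the kernel.
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

signal = [float(v) for v in range(10)]   # toy input: 0, 1, ..., 9
kernel = [1.0, 0.0, -1.0]                # arbitrary illustrative weights
out = dilated_conv1d(signal, kernel, dilation=2)
```

With a kernel of length 3 and d = 2, each output value covers 5 consecutive input samples, compared with 3 for an ordinary convolution, so the receptive field grows without adding weights.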
In the multi-channel (MC) model, the synchronized current and vibration signals are concatenated directly and passed through the same CNN for fault feature extraction, where the network structure parameters of the feature-extraction layer are consistent with those of the individual branches in the MS model. Furthermore, to compare the effects of fused signals versus single signals, the results of the MS model were compared with those from a residual network (Res) model using three-phase current signals, as well as an improved model that integrates a multi-scale network structure with an attention mechanism (MK-SE-Res model). The comparison results of the four models are shown in Figure 15.
Figure 15a compares the validation accuracy of the four models as training progresses. The MS model exhibits the fastest rise in accuracy, ultimately achieving a validation accuracy of 98.99%. The MK-SE-Res model follows closely, with a relatively rapid rise and a final validation accuracy of 98.25%. The Res model rises faster than the MC model and achieves a final validation accuracy of 97.35%, while the MC model reaches 96.32%. Notably, the MC model’s final validation accuracy is 1.03 percentage points lower than that of the Res model, which was trained solely on current signal data. In contrast, the MS model improves on the MK-SE-Res model, the best performer on current signal data alone, by 0.74 percentage points. For the MC model, the introduction of multi-source fusion data not only lowers the final validation accuracy but also slows training, and its curve fluctuates significantly. Conversely, for the MS model, the introduction of multi-source data raises the final validation accuracy, speeds up the rise of validation accuracy over the training epochs, and notably improves stability during training. Figure 15b illustrates the trend of error loss for the four models as training epochs progress. In the early stages of training, both the MC and MK-SE-Res models exhibit significant error loss. As the training epochs increase, the error loss of all four models decreases rapidly, with the MS model achieving the lowest final error loss, which suggests better generalization ability.
To comprehensively analyze the performance of the MS fault-diagnosis model and the MC model on samples with varying fault levels in the validation set, the confusion matrices of these models are plotted and compared with those of the Res model and the MK-SE-Res model. The results are shown in Figure 16, Figure 17, Figure 18 and Figure 19, where Figure 16 represents the confusion matrix for the Res model, Figure 17 for the MK-SE-Res model, Figure 18 for the MC model, and Figure 19 for the MS model. In the confusion matrices, the left side lists the actual fault level labels of the test set data, while the bottom displays the predicted fault level labels according to the respective fault-diagnosis models used. The values along the diagonal of the confusion matrix indicate the number of samples where the actual and predicted labels match, while the other positions represent the number of samples where the predicted labels do not correspond to the actual labels.
The performance of the models across different fault labels is related to two parameters: precision (p) and recall (r). Precision refers to the proportion of correctly predicted labels among the predicted labels in the model’s column vector, as indicated by the bottom row of the confusion matrix; recall, on the other hand, indicates the proportion of accurately predicted samples to the total number of samples in that row. There is an inherent trade-off between precision and recall. To balance these two metrics, the F1 score was introduced as a parameter to evaluate the model’s performance across various classification labels. The F1 score is the harmonic mean of precision and recall, which allows for a comprehensive consideration of the impact of both parameters, expressed as follows:
$$
F_1 = \frac{2\, p \times r}{p + r}
$$
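As an illustration of how these per-label metrics follow from a confusion matrix, the sketch below computes precision, recall, and the F1 score for each label. It uses the same layout as the confusion matrices above, with rows as actual labels and columns as predicted labels. The 3 × 3 matrix is made-up toy data, not a result from this study, and the function name `f1_scores` is hypothetical.

```python
def f1_scores(cm):
    """Per-label F1 scores; cm[i][j] counts samples of actual label i predicted as j."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]                                 # correctly classified samples
        predicted = sum(cm[i][k] for i in range(n))   # column sum: predicted as label k
        actual = sum(cm[k])                           # row sum: truly label k
        p = tp / predicted if predicted else 0.0      # precision
        r = tp / actual if actual else 0.0            # recall
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return scores

# Toy confusion matrix for three labels (illustrative values only).
cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
scores = f1_scores(cm)
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))
```

For the first label, precision is 8/9 and recall is 8/10, giving an F1 score of 16/19, or about 0.842. The overall accuracy of this toy matrix is 27/30 = 0.9.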
To further compare the performance of the four models across different fault-severity classification labels, the F1 scores and overall validation accuracy are calculated based on the precision and recall for each fault-severity label in the confusion matrix of each model on the validation set. The relevant results are shown in Table 5.
By combining the confusion matrix and Table 5, it is clear that the MS model, having the highest overall validation accuracy, has the fewest misclassified samples. Compared to the other three models, the MS model shows significant improvements in both the “false positive rate” and “false negative rate”, with misclassified samples primarily concentrated in the fault-severity labels “HL”, “A4R1”, and “A4R0.5”. In contrast, the MC model has the highest number of misclassified samples among the four models, mainly in the lighter fault-severity labels, and the number of misclassified samples decreases as fault severity increases. As shown in Table 5, the proposed model performs best in F1 scores across 14 out of 17 different fault types.
The four deep learning models mentioned above are all black box models. After training, these models are capable of extracting the necessary features for identifying the degree of ITSC faults from the given dataset. To visually illustrate the features learned by these deep learning models from the ITSC dataset samples, this paper compares the features of the input raw signal dataset with those learned at the final output layer. Since the features at each layer of the models consist of high-dimensional vectors, dimensionality reduction is necessary before visualization. This paper employs the t-SNE (t-distributed stochastic neighbor embedding) algorithm to map high-dimensional features or data samples into two-dimensional space for visualization purposes.
Figure 20 presents a comparison of the two-dimensional feature distributions of the input layer and the output layers of the four models. The feature map uses 17 colors, with each color representing a different fault degree label, and each point corresponding to a data sample. Figure 20a shows the feature distribution of the input layers for each model, revealing a chaotic distribution where sample points of different colors overlap significantly, making it difficult to discern the fault degree based solely on the input data. Figure 20b–e depict the feature distributions of the final output layers for the four ITSC fault-diagnosis models. A comparison indicates that the feature distributions in the output layers of the four models exhibit marked intra-class clustering compared to the input layer’s feature distribution, with samples of the same fault degree label being more concentrated. Notably, in the feature distribution of the MS model, there are fewer misclassified sample points, and the boundaries between different fault degree labels are clearer, with greater distances between samples of different labels, demonstrating good class separation and indicating a stronger feature learning ability.
Through the analysis of the results, it is evident that compared to the fault-diagnosis model that uses only current signals, the MS model employing multi-stream high-level feature fusion of current and vibration signals demonstrates superior performance in terms of final validation accuracy, error loss, F1 score across various fault-severity classifications, and the model’s feature learning capability. This indicates that combining current and vibration signals for multi-stream high-level feature fusion can more effectively leverage the unique characteristics and fault-feature information of different signals, facilitating comprehensive collection of the motor’s operating status and thereby enhancing the overall performance of the ITSC fault-diagnosis model.

5.2. Limitation Discussion

From the above analysis, it can be concluded that the algorithm focuses primarily on improving the accuracy of identifying the severity of early ITSC faults under simulated operating conditions that reflect those likely to occur in actual motors. The algorithm assumes that the training data used for the model is sufficiently large and evenly distributed. However, in real-world applications, insufficient or imbalanced data is a common issue, which poses a challenge to the practical deployment of the proposed algorithm. Although the literature suggests that transfer learning has the potential to address this problem, further research is needed to determine how it can be effectively applied in this context. Another practical challenge is that signals may experience significant interference during the data-acquisition process, including electromagnetic interference, inherent vibration noise from the equipment, vibration disturbances, and transmission noise. These factors can limit the performance of the algorithm in real-world applications. Therefore, further research is needed to improve the algorithm’s robustness under these interference conditions.

6. Conclusions

In this article, a novel multi-source data-fusion algorithm was proposed for ITSC fault diagnosis. The results indicate that, compared with a model using only current signals, the ITSC fault-diagnosis model that fuses current and vibration signal features performs better in terms of validation accuracy, error loss, F1 score, and feature learning capability. The following conclusions can be drawn. First, an indicator suitable for early-stage ITSC fault-severity analysis was derived from the equivalent circuit. Second, a feature-level multi-source data-fusion algorithm based on CNN was introduced. This algorithm uses Bayesian optimization for hyperparameter tuning and integrates current and vibration signal features, which enriches the fault-feature space and thereby improves the accuracy of ITSC fault-severity identification. Third, to synchronize the multi-source signals during experiments, a signal synchronization method was proposed to construct a synchronized dataset of current and vibration signals: the current and vibration signals were aligned by maximizing the cross-correlation of the synchronous signals collected from the two devices. Finally, the experimental results indicate that, among the four comparative models, the proposed MS model achieved the highest final validation accuracy, 98.99%, and the smallest error loss, below 0.04. In the F1 scores for the 17 classification tasks, the MS model outperformed the others in 14 cases and demonstrated the strongest feature learning capability, which validates the effectiveness of the proposed multi-source data-fusion model.
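The synchronization step summarized above can be illustrated with a minimal sketch. It assumes that both devices record the same synchronous signal at the same sampling rate, and it estimates the offset as the lag that maximizes the cross-correlation. The helper names `best_lag` and `align`, the pulse shape, and the 7-sample delay are hypothetical; the paper's actual implementation may differ.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `other` is delayed relative to `reference`,
    i.e. other[n + lag] lines up with reference[n].
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(reference):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def align(reference, other, lag):
    """Trim both signals so that index i refers to the same instant in each."""
    if lag >= 0:
        other = other[lag:]
    else:
        reference = reference[-lag:]
    n = min(len(reference), len(other))
    return reference[:n], other[:n]


# Hypothetical synchronous pulse train seen by both devices (illustrative values).
pulse = [0.0] * 20 + [1.0] * 5 + [0.0] * 40 + [1.0] * 3 + [0.0] * 20
delay = 7                                        # device B starts 7 samples late
device_a = pulse
device_b = [0.0] * delay + pulse[:-delay]

lag = best_lag(device_a, device_b, max_lag=20)
a_sync, b_sync = align(device_a, device_b, lag)
```

Once the lag is known from the shared synchronous signal, the same trimming can be applied to the current and vibration channels recorded by the respective devices. For long recordings an FFT-based correlation would be faster than this direct double loop.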
A limitation of this work is that the proposed algorithm focuses on improving the accuracy of identifying the severity of early ITSC faults under experimental conditions. It does not address issues that may arise in real-world applications, such as signal interference and insufficient or imbalanced samples. To enhance the algorithm's performance in practical scenarios, the next step will be to investigate methods for overcoming these challenges.

Author Contributions

Conceptualization, M.W., Q.S. and W.L.; methodology, M.W. and W.L.; software, M.W. and W.L.; validation, M.W.; formal analysis, M.W. and W.L.; investigation, M.W., Q.S. and W.L.; resources, Q.S.; data curation, M.W. and W.L.; writing—original draft preparation, M.W.; writing—review and editing, M.W., H.Z., Y.L. and W.L.; visualization, M.W.; supervision, Q.S.; project administration, M.W. and Q.S.; funding acquisition, Q.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this study are available from the corresponding author upon reasonable request ([email protected]).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Xie, F.; Wang, Y.; Wang, G.; Sun, E.; Fan, Q.; Song, M. Fault Diagnosis of Rolling Bearings in Agricultural Machines Using SVD-EDS-GST and ResViT. Agriculture 2024, 14, 1286. [Google Scholar] [CrossRef]
  2. Xie, F.; Sun, E.; Wang, L.; Wang, G.; Xiao, Q. Rolling Bearing Fault Diagnosis in Agricultural Machinery Based on Multi-Source Locally Adaptive Graph Convolution. Agriculture 2024, 14, 1333. [Google Scholar] [CrossRef]
  3. Wang, J.; Lu, Z.; Wang, G.; Hussain, G.; Zhao, S.; Zhang, H.; Xiao, M. Research on Fault Diagnosis of HMCVT Shift Hydraulic System Based on Optimized BPNN and CNN. Agriculture 2023, 13, 461. [Google Scholar] [CrossRef]
  4. Lagarde, Q.; Beillard, B.; Marcuzzi, D.; Mazen, S.; Leylavergne, J. Stray Currents in Livestock Farming: Electrical Diagnosis in Farms. Agriculture 2023, 13, 2010. [Google Scholar] [CrossRef]
  5. Wen, C.; Zhang, J.; Zheng, K.; Li, H.; Ling, L.; Meng, Z.; Fu, W.; Yan, B. Accelerated Verification Method for the Reliability of the Motor Drive Mechanism of the Corn Precision Seed-Metering Device. Comput. Electron. Agric. 2023, 212, 108163. [Google Scholar] [CrossRef]
  6. Sewioło, M.; Mystkowski, A. Agriculture Rotary Tedder Fault Diagnosis Based on Evolutionary Convolutional Neural Network with Genetic Algorithm Optimization. In Proceedings of the 2023 27th International Conference on Methods and Models in Automation and Robotics (MMAR), Międzyzdroje, Poland, 22–25 August 2023; IEEE: New York, NY, USA, 2023; pp. 87–92. [Google Scholar]
  7. Zafarani, M.; Bostanci, E.; Qi, Y.; Goktas, T.; Akin, B. Interturn Short-Circuit Faults in Permanent Magnet Synchronous Machines: An Extended Review and Comprehensive Analysis. IEEE J. Emerg. Sel. Top. Power Electron. 2018, 6, 2173–2191. [Google Scholar] [CrossRef]
  8. Zorig, A.; Hedayati Kia, S.; Chouder, A.; Rabhi, A. A Comparative Study for Stator Winding Inter-Turn Short-Circuit Fault Detection Based on Harmonic Analysis of Induction Machine Signatures. Math. Comput. Simul. 2022, 196, 273–288. [Google Scholar] [CrossRef]
  9. Parai, M.; Srimani, S.; Ghosh, K.; Rahaman, H. Multi-Source Data Fusion Technique for Parametric Fault Diagnosis in Analog Circuits. Integration 2022, 84, 92–101. [Google Scholar] [CrossRef]
  10. Singh, A.; Nawayseh, N.; Singh, H.; Dhabi, Y.K.; Samuel, S. Internet of Agriculture: Analyzing and Predicting Tractor Ride Comfort through Supervised Machine Learning. Eng. Appl. Artif. Intell. 2023, 125, 106720. [Google Scholar] [CrossRef]
  11. He, Y.; Tang, H.; Ren, Y.; Kumar, A. A Deep Multi-Signal Fusion Adversarial Model Based Transfer Learning and Residual Network for Axial Piston Pump Fault Diagnosis. Measurement 2022, 192, 110889. [Google Scholar] [CrossRef]
  12. Xu, L.; Zhang, G.; Zhao, S.; Wu, Y.; Xi, Z. Fault Diagnosis of Tractor Transmission System Based on Time GAN and Transformer. IEEE Access 2024, 12, 107153–107169. [Google Scholar] [CrossRef]
  13. Lee, H.; Jeong, H.; Koo, G.; Ban, J.; Kim, S.W. Attention Recurrent Neural Network-Based Severity Estimation Method for Interturn Short-Circuit Fault in Permanent Magnet Synchronous Machines. IEEE Trans. Ind. Electron. 2021, 68, 3445–3453. [Google Scholar] [CrossRef]
  14. Zhu, J.; Hu, T.; Jiang, B.; Yang, X. Intelligent Bearing Fault Diagnosis Using PCA–DBN Framework. Neural Comput. Appl. 2020, 32, 10773–10781. [Google Scholar] [CrossRef]
  15. Zhu, Z.; Peng, G.; Chen, Y.; Gao, H. A Convolutional Neural Network Based on a Capsule Network with Strong Generalization for Bearing Fault Diagnosis. Neurocomputing 2019, 323, 62–75. [Google Scholar] [CrossRef]
  16. Husari, F.; Seshadrinath, J. Stator Turn Fault Diagnosis and Severity Assessment in Converter-Fed Induction Motor Using Flat Diagnosis Structure Based on Deep Learning Approach. IEEE J. Emerg. Sel. Top. Power Electron. 2023, 11, 5649–5657. [Google Scholar] [CrossRef]
  17. Husari, F.; Seshadrinath, J. Early Stator Fault Detection and Condition Identification in Induction Motor Using Novel Deep Network. IEEE Trans. Artif. Intell. 2022, 3, 809–818. [Google Scholar] [CrossRef]
  18. Ortego, P.; Diez-Olivan, A.; Ser, J.D.; Veiga, F.; Penalva, M.; Sierra, B. Evolutionary LSTM-FCN Networks for Pattern Classification in Industrial Processes. Swarm Evol. Comput. 2020, 54, 100650. [Google Scholar] [CrossRef]
  19. Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep Learning for Smart Manufacturing: Methods and Applications. J. Manuf. Syst. 2018, 48, 144–156. [Google Scholar] [CrossRef]
  20. Akhmetov, Y.; Nurmanova, V.; Bagheri, M.; Zollanvari, A.; Gharehpetian, G.B. A New Diagnostic Technique for Reliable Decision-Making on Transformer FRA Data in Interturn Short-Circuit Condition. IEEE Trans. Ind. Inf. 2021, 17, 3020–3031. [Google Scholar] [CrossRef]
  21. Yuan, X.-H.; He, Y.-L.; Liu, M.-Y.; Wang, H.; Wan, S.-T.; Vakil, G. Impact of the Field Winding Interturn Short-Circuit Position on Rotor Vibration Properties in Synchronous Generators. Math. Probl. Eng. 2021, 2021, 9236726. [Google Scholar] [CrossRef]
  22. Lanciotti, N.; Ojeda, J.; Gabsi, M.; Boukhobza, T. Detection and Localization of Interturn Short-Circuit Fault by Analysis of Stator Accelerations Spectrum in Five-Phase Flux Switching Machine for HEV Application. In Proceedings of the 2020 Fifteenth International Conference on Ecological Vehicles and Renewable Energies (EVER), Monte-Carlo, Monaco, 10–12 September 2020; IEEE: New York, NY, USA, 2020; pp. 1–10. [Google Scholar]
  23. He, Y.-L.; Wang, T.; Sun, K.; Wang, X.-L.; Peng, B.; Wan, S.-T. Enhanced Characteristic Vibration Signal Detection of Generator Based on Time-Wavelet Energy Spectrum and Multipoint Optimal Minimum Entropy Deconvolution Adjusted Method. Math. Probl. Eng. 2020, 2020, 6916289. [Google Scholar] [CrossRef]
  24. He, Y.-L.; Xu, M.-X.; Zhang, W.; Wang, X.-L.; Lu, P.; Gerada, C.; Gerada, D. Impact of Stator Interturn Short Circuit Position on End Winding Vibration in Synchronous Generators. IEEE Trans. Energy Convers. 2021, 36, 713–724. [Google Scholar] [CrossRef]
  25. He, Y.-L.; Deng, W.-Q.; Peng, B.; Ke, M.-Q.; Tang, G.-J.; Wan, S.-T.; Liu, X.-Y. Stator Vibration Characteristic Identification of Turbogenerator among Single and Composite Faults Composed of Static Air-Gap Eccentricity and Rotor Interturn Short Circuit. Shock. Vib. 2016, 2016, 5971081. [Google Scholar] [CrossRef]
  26. Obeid, N.H.; Boileau, T.; Nahid-Mobarakeh, B. Modeling and Diagnostic of Incipient Interturn Faults for a Three-Phase Permanent Magnet Synchronous Motor. IEEE Trans. Ind. Appl. 2016, 52, 4426–4434. [Google Scholar] [CrossRef]
  27. Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Vibration Analysis Based Interturn Fault Diagnosis in Induction Machines. IEEE Trans. Ind. Inf. 2014, 10, 340–350. [Google Scholar] [CrossRef]
  28. Qi, Y.; Bostanci, E.; Zafarani, M.; Akin, B. Severity Estimation of Interturn Short Circuit Fault for PMSM. IEEE Trans. Ind. Electron. 2019, 66, 7260–7269. [Google Scholar] [CrossRef]
  29. Qi, Y.; Bostanci, E.; Gurusamy, V.; Akin, B. A Comprehensive Analysis of Short-Circuit Current Behavior in PMSM Interturn Short-Circuit Faults. IEEE Trans. Power Electron. 2018, 33, 10784–10793. [Google Scholar] [CrossRef]
  30. Hang, J.; Zhang, J.; Cheng, M.; Huang, J. Online Interturn Fault Diagnosis of Permanent Magnet Synchronous Machine Using Zero-Sequence Components. IEEE Trans. Power Electron. 2015, 30, 6731–6741. [Google Scholar] [CrossRef]
  31. Xie, T.; Huang, X.; Choi, S.-K. Intelligent Mechanical Fault Diagnosis Using Multisensor Fusion and Convolution Neural Network. IEEE Trans. Ind. Inf. 2022, 18, 3213–3223. [Google Scholar] [CrossRef]
  32. Xia, M.; Li, T.; Xu, L.; Liu, L.; de Silva, C.W. Fault Diagnosis for Rotating Machinery Using Multiple Sensors and Convolutional Neural Networks. IEEE/ASME Trans. Mechatron. 2018, 23, 101–110. [Google Scholar] [CrossRef]
  33. Chen, H.; Hu, N.; Cheng, Z.; Zhang, L.; Zhang, Y. A Deep Convolutional Neural Network Based Fusion Method of Two-Direction Vibration Signal Data for Health State Identification of Planetary Gearboxes. Measurement 2019, 146, 268–278. [Google Scholar] [CrossRef]
  34. Azamfar, M.; Jia, X.; Pandhare, V.; Singh, J.; Davari, H.; Lee, J. Detection and Diagnosis of Bottle Capping Failures Based on Motor Current Signature Analysis. Procedia Manuf. 2019, 34, 840–846. [Google Scholar] [CrossRef]
  35. Xiao, M.; Yang, B.; Wang, S.; Zhang, Z.; Tang, X.; Kang, L. A Feature Fusion Enhanced Multiscale CNN with Attention Mechanism for Spot-Welding Surface Appearance Recognition. Comput. Ind. 2022, 135, 103583. [Google Scholar] [CrossRef]
  36. Jin, Y.; Chen, C.; Zhao, S. Multisource Data Fusion Diagnosis Method of Rolling Bearings Based on Improved Multiscale CNN. J. Sens. 2021, 2021, 2251530. [Google Scholar] [CrossRef]
  37. Li, Z.; Wang, Y.; Ma, J. Fault Diagnosis of Motor Bearings Based on a Convolutional Long Short-Term Memory Network of Bayesian Optimization. IEEE Access 2021, 9, 97546–97556. [Google Scholar] [CrossRef]
  38. Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv 2018. [Google Scholar] [CrossRef]
  39. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 2011–2023. [Google Scholar] [CrossRef]
  40. Zhang, Y.; Qu, J.; Fang, X.; Luo, G. Motor Bearing Fault Diagnosis Based on Multi-Feature Fusion and PSO-BP. In Proceedings of the 2021 IEEE 4th Student Conference on Electric Machines and Systems (SCEMS), Huzhou, China, 1–3 December 2021; IEEE: New York, NY, USA, 2021; pp. 1–5. [Google Scholar]
  41. Gelbart, M.A.; Snoek, J.; Adams, R.P. Bayesian Optimization with Unknown Constraints. arXiv 2014. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of a PMSM with ITSC fault.
Figure 2. Equivalent circuit diagram of the PMSM with an ITSC fault.
Figure 3. (a) Schematic diagram of a conventional convolutional module. (b) Schematic diagram of a dilated convolutional module.
Figure 4. Principle of channel attention mechanism.
Figure 5. Schematic of hyperparameter optimization for ITSC fault-diagnosis model.
Figure 7. Fault-diagnosis test bench for PMSM.
Figure 8. Synchronous signal waveform diagram.
Figure 9. (a) The tested faulty motor. (b) The fault resistance and its heat dissipation device. (c) The temperature measurement device.
Figure 10. Schematic diagram of speed variation.
Figure 11. Schematic diagram of signal synchronization results.
Figure 12. Schematic diagram of synchronous signal error.
Figure 13. Waveform diagram of current signals and vibration signals over the same period in the dataset.
Figure 14. Schematic diagram of the multi-source data-fusion model structure.
Figure 15. Trends of validation accuracy and error loss for four models as training epochs change.
Figure 16. The confusion matrix of the Res model.
Figure 17. The confusion matrix of the MK-SE-Res model.
Figure 18. The confusion matrix of the MC model.
Figure 19. The confusion matrix of the MS model.
Figure 20. Comparison chart of features extracted by the four models. (a) Feature map of the validation dataset. (b) Feature map extracted by the Res model. (c) Feature map extracted by the MK-SE-Res model. (d) Feature map extracted by the MC model. (e) Feature map extracted by the MS model.
Table 1. Specifications of the PMSM.

| Parameters | Values | Parameters | Values |
|---|---|---|---|
| Rated power | 2.3 kW | Line-line resistance | 1.1 Ω |
| Rated torque | 15 Nm | Line-line inductance | 4.45 mH |
| Rated current | 9.5 A | Number of turns per phase | 108 |
| Rated speed | 1500 rpm | Number of coils per phase | 12 |
| Pole pairs | 4 | Voltage constant | 114 V/1000 r/min |
Table 2. Operating conditions of the PMSM to be tested.

| Case | Constant (8 cases) | | | | Dynamic (2 cases) |
|---|---|---|---|---|---|
| Speed (rpm) | 150 | 450 | 900 | 1350 | 850~1550~850 |
| Torque (Nm) | 3.0/7.5 | 3.0/7.5 | 3.0/7.5 | 3.0/7.5 | 3.0/7.5 |
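Table 3. Dataset description.

| Label | Fault Resistance (Ω) | Shorted Turn Ratio (%) | Current: Training | Current: Testing | Vibration: Training | Vibration: Testing | Total |
|---|---|---|---|---|---|---|---|
| HL | Inf | 0 | 840 | 360 | 840 | 360 | 2400 |
| A2R5 | 5 | 4.6 | 840 | 360 | 840 | 360 | 2400 |
| A4R5 | 5 | 8.3 | 840 | 360 | 840 | 360 | 2400 |
| A5R5 | 5 | 10.2 | 840 | 360 | 840 | 360 | 2400 |
| A6R5 | 5 | 13.9 | 840 | 360 | 840 | 360 | 2400 |
| A2R1 | 1 | 4.6 | 840 | 360 | 840 | 360 | 2400 |
| A4R1 | 1 | 8.3 | 840 | 360 | 840 | 360 | 2400 |
| A2R0.5 | 0.5 | 4.6 | 840 | 360 | 840 | 360 | 2400 |
| A5R1 | 1 | 10.2 | 840 | 360 | 840 | 360 | 2400 |
| A6R1 | 1 | 13.9 | 840 | 360 | 840 | 360 | 2400 |
| A4R0.5 | 0.5 | 8.3 | 840 | 360 | 840 | 360 | 2400 |
| A5R0.5 | 0.5 | 10.2 | 840 | 360 | 840 | 360 | 2400 |
| A6R0.5 | 0.5 | 13.9 | 840 | 360 | 840 | 360 | 2400 |
| A2R0.1 | 0.1 | 4.6 | 840 | 360 | 840 | 360 | 2400 |
| A4R0.1 | 0.1 | 8.3 | 840 | 360 | 840 | 360 | 2400 |
| A5R0.1 | 0.1 | 10.2 | 840 | 360 | 840 | 360 | 2400 |
| A6R0.1 | 0.1 | 13.9 | 840 | 360 | 840 | 360 | 2400 |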
Table 4. Hyperparameters to be optimized.

| Hyperparameters | Search Intervals | Data Types | Transform | MC | MS |
|---|---|---|---|---|---|
| Linit | [1 × 10⁻⁵, 1] | real | log | 2.7586 × 10⁻⁴ | 1.0001 × 10⁻⁴ |
| G1 | [0.5, 1] | real | log | 0.9037 | 0.8948 |
| L2R | [1 × 10⁻¹⁰, 1 × 10⁻²] | real | log | 0.0097 | 0.0091 |
| P | [1 × 10⁻⁵, 1] | real | log | 2.488 × 10⁻⁵ | 4.7663 × 10⁻⁵ |
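Table 4 lists a log transform for every search interval, which means that candidates are spread evenly across orders of magnitude rather than across raw values. The sketch below shows only that transform, not the Bayesian optimizer itself (the surrogate model and acquisition function are omitted). The dictionary layout and function names are hypothetical, and the L2R interval is read here as 1 × 10⁻¹⁰ to 1 × 10⁻², an assumption based on the reported optima of roughly 0.009.

```python
import math
import random

# Search intervals taken from Table 4; the dictionary layout is illustrative.
# The L2R bounds are an assumed reading of the table (see the note above).
SEARCH_SPACE = {
    "Linit": (1e-5, 1.0),
    "G1": (0.5, 1.0),
    "L2R": (1e-10, 1e-2),
    "P": (1e-5, 1.0),
}


def to_unit(value, low, high):
    """Map a value in [low, high] to [0, 1] on a logarithmic scale."""
    return (math.log(value) - math.log(low)) / (math.log(high) - math.log(low))


def from_unit(u, low, high):
    """Inverse of to_unit: map u in [0, 1] back to [low, high]."""
    return math.exp(math.log(low) + u * (math.log(high) - math.log(low)))


def sample(space, rng):
    """Draw one candidate that is uniform in log space for every hyperparameter."""
    return {name: from_unit(rng.random(), lo, hi) for name, (lo, hi) in space.items()}


rng = random.Random(0)          # fixed seed so the draw is repeatable
candidate = sample(SEARCH_SPACE, rng)
```

An optimizer working in the unit interval would propose a value `u`, convert it with `from_unit` before training, and record the resulting validation loss against `u`. On this scale the midpoint of the Linit interval is 10⁻²·⁵ (about 0.0032) rather than 0.5.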
Table 5. Validation accuracy (Acc) and per-label F1 scores of the four models.
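
| Label | Res (%) | MK-SE-Res (%) | MC (%) | MS (%) |
|---|---|---|---|---|
| Acc | 97.35 | 98.25 | 96.32 | 98.99 |
| HL | 86.81 | 91.38 | 81.96 | 92.72 |
| A2R5 | 97.95 | 97.69 | 94.44 | 99.59 |
| A4R5 | 97.14 | 97.67 | 93.80 | 98.77 |
| A5R5 | 98.61 | 99.31 | 98.63 | 99.17 |
| A6R5 | 95.45 | 97.22 | 98.49 | 99.17 |
| A2R1 | 97.37 | 99.31 | 92.56 | 98.76 |
| A4R1 | 97.19 | 98.61 | 94.28 | 98.34 |
| A2R0.5 | 98.06 | 98.75 | 95.71 | 98.90 |
| A5R1 | 98.47 | 98.74 | 97.80 | 99.86 |
| A6R1 | 98.89 | 99.44 | 98.20 | 99.86 |
| A4R0.5 | 96.73 | 97.91 | 95.99 | 98.47 |
| A5R0.5 | 97.78 | 99.17 | 98.47 | 99.72 |
| A6R0.5 | 98.20 | 99.58 | 98.06 | 99.72 |
| A2R0.1 | 98.18 | 98.04 | 98.47 | 99.86 |
| A4R0.1 | 99.30 | 99.44 | 99.45 | 100 |
| A5R0.1 | 98.62 | 98.62 | 99.72 | 99.59 |
| A6R0.1 | 99.58 | 99.44 | 99.72 | 99.86 |
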
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, M.; Lai, W.; Zhang, H.; Liu, Y.; Song, Q. Intelligent Fault Diagnosis of Inter-Turn Short Circuit Faults in PMSMs for Agricultural Machinery Based on Data Fusion and Bayesian Optimization. Agriculture 2024, 14, 2139. https://doi.org/10.3390/agriculture14122139
