Article

Fault Diagnosis for Rolling Bearings Under Complex Working Conditions Based on Domain-Conditioned Adaptation

1 Chongqing Tsingshan Industrial, Chongqing 402760, China
2 College of Naval Architecture and Ocean Engineering, Naval University of Engineering, Wuhan 430033, China
* Author to whom correspondence should be addressed.
Machines 2024, 12(11), 787; https://doi.org/10.3390/machines12110787
Submission received: 11 October 2024 / Revised: 3 November 2024 / Accepted: 4 November 2024 / Published: 7 November 2024
(This article belongs to the Section Machines Testing and Maintenance)

Abstract:
To address the issue of low diagnostic accuracy caused by noise interference and varying rotational speeds in rolling bearings, a fault diagnosis method based on domain-conditioned feature correction is proposed for rolling bearings under complex working conditions. The approach first constructs a multi-scale self-calibrating convolutional neural network to aggregate input signals across different scales, adaptively establishing long-range spatial and inter-channel dependencies at each spatial location, thereby enhancing feature modeling under noisy conditions. Subsequently, a domain-conditioned adaptation strategy is introduced to dynamically adjust the activation of self-calibrating convolution channels in response to the differences between source and target domain inputs, generating correction terms for target domain features to facilitate effective domain-specific knowledge extraction. The method then aligns source and target domain features by minimizing inter-domain feature distribution discrepancies, explicitly mitigating the distribution variations induced by changes in working conditions. Finally, within a structural risk minimization framework, model parameters are iteratively optimized to achieve minimal distribution discrepancy, resulting in an optimal coefficient matrix for fault diagnosis. Experimental results using variable working condition datasets demonstrate that the proposed method consistently achieves diagnostic accuracies exceeding 95%, substantiating its feasibility and effectiveness.

1. Introduction

As one of the key supporting components in rotating machinery, rolling bearings are widely used in various types of mechanical equipment [1,2,3]. However, to meet the demands of industrial production, rolling bearings must operate for extended periods under complex and variable working conditions, which inevitably leads to faults. Once a rolling bearing fails, it can potentially cause damage to machinery and even pose a threat to human safety [4]. Therefore, fault diagnosis for rolling bearings is of great significance to ensure the operational reliability of production systems.
With advancements in high-performance computing hardware, deep learning now enables efficient training on large-scale datasets and precise inference for complex models. Leveraging its powerful feature extraction capabilities and proficiency in modeling intricate nonlinear relationships, deep learning has driven a transition in fault diagnosis from traditional handcrafted feature engineering to intelligent diagnostic techniques [5,6]. Compared to traditional methods [7,8,9,10], deep learning can automatically extract complex fault features from massive datasets, significantly improving diagnostic accuracy while reducing reliance on expert knowledge. Yu et al. [11] represented CNN-extracted feature vectors as graph structures and further extracted deep feature representations using GNN; Fu et al. [12] constructed a parallel encoder architecture to extract key information from both time-domain and time-frequency graphs, enhancing diagnostic precision; Song et al. [13] optimized model hyperparameters using a particle swarm optimization algorithm, improving diagnostic performance with limited training samples; He et al. [14] synthesized multi-sensor data at the data layer via pixel matrix fusion and built a multi-scale structure to capture information across different scales; Zheng et al. [15] proposed a two-stage fault diagnosis framework to address the issue of data imbalance; and Wu et al. [16] employed a U-Net for early fault diagnosis for rolling bearings. To ensure that models accurately capture deep feature representations in the data and achieve high diagnostic precision, input data must have low noise levels and similar distribution characteristics. However, in practical engineering applications, rolling bearings operate under complex and variable conditions, resulting in significant differences in the distribution of fault samples across different working conditions. Moreover, factors such as vibrations and wear in the operating environment introduce additional noise, exacerbating signal aliasing and blurring, thereby increasing the difficulty and uncertainty of fault diagnosis.
Transfer learning extends models trained on source domain data to target domain data via transfer strategies, demonstrating notable advantages in fault diagnosis tasks under varying working conditions [17,18,19,20]. To address the impact of noise and varying working conditions on the fault diagnosis of rolling bearings, Chen et al. [21] employed convolutional neural networks with two different kernel sizes to automatically extract multi-scale signal features from raw data, demonstrating strong performance under noisy conditions. Ghorvei et al. [22] proposed the Deep Subdomain Adaptive Convolutional Neural Network (DSACNN), which exhibited good results in noise resistance and reducing domain distribution discrepancies. Peng et al. [23] combined multi-scale concepts by integrating information from multiple components and time scales of vibration signals, improving the model's noise resistance and domain adaptability. Su et al. [24] developed a hierarchical convolutional neural network that utilizes multiple output layers to predict the hierarchical structure of bearing faults, showing robust performance in noise-affected and variable-working-condition fault diagnosis tasks. Huang et al. [25] enhanced the representation of multi-scale features through a deep convolutional neural network based on multi-scale features and mutual information, further improving the model's generalization ability under complex working conditions. Qian et al. [26] reduced the negative impact of noise on the model by reconstructing and learning features from the input signal via a convolutional autoencoder, while introducing a new domain adaptation loss based on Coral loss and domain classification loss, effectively enhancing diagnostic performance under complex working conditions. Yu et al. [27] used wavelet packet transform in the data preprocessing stage to reduce noise interference and utilized MK-MMD to minimize the feature distribution discrepancy between the source and target domains, achieving excellent fault diagnosis performance and noise suppression across different working conditions.
While the aforementioned methods provide valuable insights into rolling bearing fault diagnosis, they face limitations under complex working conditions due to the ambiguity of fault signals. Current transfer learning approaches struggle to explicitly model inter-domain feature discrepancies, often leading models to focus on non-discriminative information, thereby reducing diagnostic accuracy and generalization capability. To address these limitations, this paper proposes a domain-conditioned feature correction method for fault diagnosis for rolling bearings in complex operating environments. By incorporating a domain-conditioned adaptation strategy into a multi-scale self-calibrating convolutional neural network, we develop a noise-resistant and adaptive transfer learning model capable of end-to-end fault diagnosis under challenging conditions. The primary contributions of this study are summarized as follows:
(1)
An end-to-end fault diagnosis method for rolling bearings is proposed, which effectively enhances the model’s adaptability to noise interference and varying working conditions;
(2)
A multi-scale self-calibrating convolutional neural network is constructed, which significantly expands the receptive field at each spatial location and aggregates features across different scales, thereby improving the network’s nonlinear expressive capability;
(3)
A domain-conditioned adaptive strategy is proposed, which activates the convolutional channels of the source and target domains differently. This allows the model to recalibrate features within the convolutional layers for each domain, capturing more domain-specific information.

2. Proposed Method

In this study, we propose a fault diagnosis method for rolling bearings under complex working conditions. The framework of the proposed method is illustrated in Figure 1 and primarily consists of data acquisition, a self-calibrating convolutional neural network, and domain-conditioned adaptation. The self-calibrating convolutional neural network serves as an encoder to extract discriminative feature representations, while the domain-conditioned adaptation aims to reduce the distribution discrepancies in rolling bearing fault datasets under varying working conditions.

2.1. Self-Calibrated Convolutions

Self-calibrated convolution [28] is a variant of the standard convolution model, which adaptively integrates richer contextual information by establishing long-range spatial and channel dependencies around each spatial location, thereby generating more discriminative feature representations. This characteristic is particularly effective for addressing noise interference, as self-calibrated convolution enhances the model’s sensitivity to and discriminative ability for fault features by expanding the receptive field and strengthening internal communication between features. Additionally, the self-calibration mechanism enables the model to adaptively construct long-range spatial and inter-channel correlations during training, further enhancing feature representation. The self-calibrated convolution is illustrated in Figure 2.
For a given input feature vector X, a split operation is first applied to obtain two feature vectors X1 and X2, each with the same number of channels. Simultaneously, the set of convolutional kernels is divided into four parts of equal dimensions, denoted K1, K2, K3, and K4, respectively. For the first feature vector X1, average pooling with a kernel size of r is applied to down-sample it, yielding the feature vector T1:
$T_1 = \mathrm{avgpool}_r(X_1)$
Next, a feature transformation is performed on T1 using the convolutional kernel set K2, and the result is up-sampled to the size of the input feature vector using bilinear interpolation, yielding a mapping from the small-scale space back to the original feature space, denoted $X_1'$. The self-calibration operation can be expressed as follows:
$X_1' = \mathrm{Up}_r(F_2(T_1)) = \mathrm{Up}_r(T_1 * K_2)$
$Y_1' = F_3(X_1) \cdot \sigma(X_1 + X_1')$
$Y_1 = F_4(Y_1') = Y_1' * K_4$
For the second feature vector X2, a standard convolution operation is directly applied to obtain the feature vector Y2. The output features from the two scale spaces, Y1 and Y2, are then fused to produce the final output feature Y.
$Y_2 = F_1(X_2) = X_2 * K_1$
$Y = \mathrm{concat}(Y_1, Y_2)$
In the above equations, $\mathrm{Up}_r(\cdot)$ represents the bilinear interpolation (up-sampling) function, $*$ denotes the convolution operation, and $\sigma$ is the sigmoid activation function. In the self-calibrated convolution operation, each spatial location adaptively considers its surrounding information as an embedding from the latent space, which guides the original scale space and models inter-channel correlations. Additionally, the self-calibration operation only considers relevant information within an appropriate range for each spatial location, thereby avoiding redundant information from unrelated regions.
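To make the calibration pipeline concrete, here is a minimal 1-D NumPy sketch of one self-calibration branch (the avgpool → filter → upsample → sigmoid-gate sequence above). It is an illustration, not the authors' implementation: the kernels are plain vectors rather than learned multi-channel filters, bilinear interpolation reduces to linear interpolation in 1-D, and the function names are ours.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_same(x, k):
    # 'same'-padded 1-D cross-correlation, standing in for a conv layer
    pad = len(k) // 2
    xp = np.pad(x, pad)
    return np.array([xp[i:i + len(k)] @ k for i in range(len(x))])

def self_calibrated_branch(x1, k2, k3, k4, r=4):
    """One calibration branch of SC-Conv on a 1-D signal (T1, X1', Y1', Y1)."""
    # T1 = avgpool_r(X1): average-pool with kernel size and stride r
    n = len(x1) // r * r
    t1 = x1[:n].reshape(-1, r).mean(axis=1)
    # X1' = Up_r(F2(T1)): filter in the small-scale space, then upsample
    f2 = conv1d_same(t1, k2)
    x1_up = np.interp(np.linspace(0, len(f2) - 1, len(x1)),
                      np.arange(len(f2)), f2)
    # Y1' = F3(X1) * sigma(X1 + X1'): gate F3's output with calibration weights
    y1p = conv1d_same(x1, k3) * sigmoid(x1 + x1_up)
    # Y1 = F4(Y1')
    return conv1d_same(y1p, k4)
```

The gating term sigma(X1 + X1') is what restricts each location's response to its own neighborhood, which is the mechanism the text credits with suppressing irrelevant (noisy) regions.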

2.2. Inception Module

Due to the influence of mechanical structure and signal transmission paths, the vibration signals of equipment exhibit diversity, and a single scale is insufficient to fully characterize their complex dynamic response characteristics. Therefore, the proposed method constructs a multi-scale module in the model’s input layer, consisting of self-calibrated convolutions with kernel sizes of 1 × 3, 1 × 5, and 1 × 7, as shown in Figure 3. The outputs from each path are concatenated along the channel dimension, allowing the model to capture information at different scales.
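A minimal sketch of the multi-scale input module under the same simplified 1-D setting: three paths with kernel sizes 3, 5, and 7 whose outputs are concatenated along the channel axis. In the actual model each path is a learned self-calibrated convolution; here each path is a fixed smoothing filter purely for illustration.

```python
import numpy as np

def conv1d_same(x, k):
    # 'same'-padded 1-D cross-correlation
    pad = len(k) // 2
    xp = np.pad(x, pad)
    return np.array([xp[i:i + len(k)] @ k for i in range(len(x))])

def multiscale_block(x, kernel_sizes=(3, 5, 7)):
    # One path per kernel size; outputs stacked along a new channel axis
    paths = [conv1d_same(x, np.ones(ks) / ks) for ks in kernel_sizes]
    return np.stack(paths, axis=0)
```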

2.3. Domain-Conditioned Adaptation

In unsupervised transfer learning, the source domain is denoted as $S = \{(x_s^i, y_s^i)\}_{i=1}^{n_s}$, where $x_s$ and $y_s$ represent the source domain samples and labels, respectively, and $n_s$ is the number of source domain samples. The unlabeled target domain is defined as $T = \{x_t^j\}_{j=1}^{n_t}$, where $x_t$ represents the target domain samples and $n_t$ is the number of target domain samples. Both the source and target domains share the same $C_n$ fault categories, but there exists a feature distribution discrepancy $P_s(x) \neq P_t(x)$. Therefore, the goal of Domain-Conditioned Adaptation is to find domain-invariant features that reduce the inter-domain distribution discrepancy [29]. The Domain-Conditioned Adaptation in the proposed method mainly includes a domain-conditioned channel attention mechanism and domain-conditioned feature correction.

2.3.1. Domain-Conditioned Channel Attention Mechanism

Most current domain adaptation methods assume that the source and target domains can share the same convolutional layer parameters, under the belief that convolutional layers can extract universal low-level features. However, in complex cross-domain tasks, the feature distribution differences between the source and target domains can be significant, making this assumption of fully shared convolutional layer parameters invalid. As a result, the encoder may extract features that are sensitive only to the source domain samples, negatively affecting transfer performance. To address this, this paper proposes a domain-conditioned channel attention mechanism (DCCAM), enabling the encoder to select different channel activations for the source and target domains, thereby capturing domain-specific information for each.
For the given source domain and target domain input features $X_s = [X_s^1, \ldots, X_s^C]$ and $X_t = [X_t^1, \ldots, X_t^C]$, a feature transformation is first applied to generate domain-specific feature representations:
$F_{\mathrm{conv}}: X_s \to \tilde{X}_s, \quad X_t \to \tilde{X}_t$
In the equation, $\tilde{X}_s$ and $\tilde{X}_t$ represent the transformed source domain and target domain feature vectors, respectively, and C denotes the number of channels. To generate global context information for each channel, the model performs global average pooling over each channel, producing a vector g.
$g_k = \frac{1}{L} \sum_{i=1}^{L} X_k(i)$
where $g_k$ is the result of global average pooling for the k-th channel, representing the global feature of that channel, and $X_k(i)$ is the feature value of the k-th channel at position i.
After generating the global information, the domain-conditioned channel attention mechanism produces the source domain representation $f_s$ and the target domain representation $f_t$ through different nonlinear layers. These representations are then passed through a ReLU activation function and another nonlinear layer to compute the channel attention weights $v_s$ for the source domain and $v_t$ for the target domain, respectively.
$v_s = \sigma(W(\mathrm{ReLU}(f_s))), \quad v_t = \sigma(W(\mathrm{ReLU}(f_t)))$
$\tilde{X}_s = v_s \otimes X_s, \quad \tilde{X}_t = v_t \otimes X_t$
where W is the transformation matrix, $\sigma$ represents the sigmoid activation function, and $\otimes$ denotes channel-wise multiplication, meaning that the features of each channel are multiplied by the corresponding attention weight. Through this process, the network can select different channels for the source and target domains, allowing each domain to extract more relevant domain-specific features from the convolutional layers. This not only improves the adaptability of the features to each domain but also enables the network to more effectively handle the differences between the source and target domains.
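The attention pathway above can be sketched as follows, assuming for simplicity that pooling, the domain-private layer, the shared layer W, ReLU, and the sigmoid all operate on (channels × length) features; the random weight matrices and the function name `dccam` are our own illustrative choices, not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dccam(x, w_domain, w_shared):
    """x: (C, L) features for ONE domain; w_domain is that domain's private layer."""
    g = x.mean(axis=1)                        # global average pooling -> (C,)
    f = w_domain @ g                          # domain-specific nonlinear layer
    v = sigmoid(w_shared @ np.maximum(f, 0))  # shared layer W + ReLU + sigmoid
    return v[:, None] * x                     # channel-wise recalibration

rng = np.random.default_rng(0)
C, L = 8, 32
w_s, w_t, w = (rng.normal(size=(C, C)) for _ in range(3))
xs, xt = rng.normal(size=(C, L)), rng.normal(size=(C, L))
xs_tilde = dccam(xs, w_s, w)  # source channels weighted by v_s
xt_tilde = dccam(xt, w_t, w)  # target channels weighted by v_t
```

Because w_s and w_t differ while w is shared, the two domains receive different channel activations from the same convolutional backbone, which is the point of the mechanism.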

2.3.2. Domain-Conditioned Feature Correction

The domain-conditioned feature correction strategy primarily reduces the feature distribution discrepancy between the source and target domains by explicitly inserting a feature correction module after task-specific layers. This allows the model to better align features across different domains while preserving class-discriminative information within each domain. This process involves two main steps: feature correction and inter-domain feature alignment. For the source domain features H l ( x s ) and target domain features H l ( x t ) at the output of the l-th layer of the model, after feature correction, the target domain features are modified as follows:
$\hat{H}_l(x_t) = H_l(x_t) + \Delta H_l(x_t)$
where $\Delta H_l(x_t)$ is the output of the feature correction module, which learns to adjust the target domain features, bringing them closer to the source domain. To align the features of the source and target domains, the Maximum Mean Discrepancy (MMD) criterion is employed to measure the difference between the feature distributions of the two domains, and this difference is minimized. The loss function is
$L_M^l = \left\| \frac{1}{n_s} \sum_{i=1}^{n_s} \phi(H_l(x_s^i)) - \frac{1}{n_t} \sum_{j=1}^{n_t} \phi(\hat{H}_l(x_t^j)) \right\|_{\mathcal{H}_k}^2$
In this context, $\phi(\cdot)$ is the mapping function into the Reproducing Kernel Hilbert Space (RKHS), and $\mathcal{H}_k$ refers to the RKHS with characteristic kernel k. Using the above equation, the feature distributions of the two domains are aligned during training, reducing the impact of changing working conditions. To prevent the feature correction module from excessively modifying the source domain features, the correction applied to the source domain should be minimal, i.e., $\hat{H}_l(x_s) \approx H_l(x_s)$. To achieve this, a regularization term is designed to ensure that the corrected source domain features remain similar to the original ones, thus avoiding overfitting. The regularization loss is defined as
$L_{\mathrm{reg}}^l = \sum_{k=1}^{C_n} \left\| \frac{1}{n_s^k} \sum_{x_s^i \in S_k} \phi(H_l(x_s^i)) - \frac{1}{|R|} \sum_{x_s^j \in R} \phi(\hat{H}_l(x_s^j)) \right\|_{\mathcal{H}_k}^2$
where R represents a random subset of the source domain and $|R|$ is its size. The probability of selecting each data point is defined as $P/C_n$, with P as a controlling factor. This regularization term ensures that the source domain features are not excessively adjusted by the correction module, thereby improving the stability of the feature correction module.
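The RKHS norms in the alignment and regularization losses above are computed in practice with the kernel trick rather than an explicit $\phi(\cdot)$. A small NumPy sketch of the squared MMD with a Gaussian RBF kernel (a common characteristic-kernel choice; the bandwidth gamma is our assumption, not a value from the paper):

```python
import numpy as np

def mmd2_rbf(xs, xt, gamma=0.5):
    """Squared MMD between two sample sets under a Gaussian RBF kernel."""
    def gram(a, b):
        # pairwise squared distances, then k(a, b) = exp(-gamma * ||a - b||^2)
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    # ||mean phi(xs) - mean phi(xt)||^2 expanded via the kernel trick
    return gram(xs, xs).mean() + gram(xt, xt).mean() - 2.0 * gram(xs, xt).mean()
```

Identical distributions give a value near zero, and the value grows as the two feature clouds drift apart, which is exactly what the training objective minimizes.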
The overall optimization objective of the proposed method is to train a model for unsupervised domain adaptation tasks by combining labeled data from the source domain and unlabeled data from the target domain, enabling accurate classification in the target domain. To achieve this, a source classifier is constructed by minimizing the cross-entropy function:
$\min_G L_s = \frac{1}{n_s} \sum_{i=1}^{n_s} E(G(x_s^i), y_s^i)$
In the equation, $G(\cdot)$ is the predictive model, $x_s^i$ is the i-th sample from the source domain, $y_s^i$ is the corresponding label, and $E(\cdot, \cdot)$ is the cross-entropy loss function. The goal of this term is to ensure that the model performs accurate classification on the source domain. Since the target domain lacks labels, an entropy minimization loss is introduced to enhance the discriminability of the target domain data. The loss function is as follows:
$\min_G L_e = -\frac{1}{n_t} \sum_{j=1}^{n_t} \sum_{k=1}^{C_n} G^{(k)}(x_t^j) \log G^{(k)}(x_t^j)$
In summary, the total loss of the proposed method can be expressed as
$\min_G L = L_s + \alpha \sum_{l=1}^{L} \left( L_M^l + L_{\mathrm{reg}}^l \right) + \beta L_e$
where $L_s$ and $L_e$ represent the source domain classification loss and target domain entropy loss, respectively, $L_M^l$ and $L_{\mathrm{reg}}^l$ represent the feature alignment loss and regularization loss at the l-th layer, respectively, and $\alpha$ and $\beta$ are the weighting factors.
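A schematic NumPy version of the target-domain entropy term and the combined objective follows; the weighting values in the usage line are placeholders, not the paper's settings.

```python
import numpy as np

def entropy_loss(probs):
    """L_e: mean Shannon entropy of target-domain softmax outputs, shape (n, C_n)."""
    eps = 1e-12  # guards log(0) for confident (near one-hot) predictions
    return -np.mean(np.sum(probs * np.log(probs + eps), axis=1))

def total_loss(l_s, l_m_layers, l_reg_layers, l_e, alpha=1.0, beta=0.1):
    # L = L_s + alpha * sum_l (L_M^l + L_reg^l) + beta * L_e
    return l_s + alpha * (sum(l_m_layers) + sum(l_reg_layers)) + beta * l_e
```

Minimizing the entropy term pushes target predictions toward one class per sample (entropy near zero), sharpening decision boundaries in the unlabeled domain.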

3. Experimental Validation and Analysis

3.1. Experimental Setup

The experimental data were sourced from the open-access HUST bearing dataset provided by Huazhong University of Science and Technology [30]. The rolling bearing fault data were collected using the Spectra Quest Mechanical Fault Simulator, as shown in Figure 4. The test rig primarily consists of a frequency converter, motor, shaft, acceleration sensor, and bearing. In this experiment, multiple sets of rolling bearing fault data were collected at a sampling frequency of 25.6 kHz with rotational speeds of 20 Hz (Condition A), 25 Hz (Condition B), and 30 Hz (Condition C), forming the dataset. The operating states of the rolling bearing included the following: normal, moderate inner ring fault, moderate outer ring fault, and moderate compound fault of both inner and outer rings. All faults were pre-set manually using wire-cutting methods. The collected vibration signals were obtained from the Z direction of a triaxial sensor. The rolling bearing model used was ER-16K, with a shaft diameter of 38.52 mm, a ball diameter of 7.94 mm, and nine rolling elements. The sensor model was TREA331, and related information is detailed in Table 1.
The samples were segmented using a sliding window method with an 80% overlap, and the number of sample points was fixed at 2048, resulting in 635 samples per class. The vibration signals from each condition (A, B, C) were randomly divided into training and testing sets in an 8:2 ratio. The training set consists of 508 samples per class, totaling 2540 samples, while the testing set consists of 127 samples per class, totaling 635 samples. The dataset construction is shown in Table 2.
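The segmentation step can be sketched directly: with a 2048-point window and 80% overlap, the stride is about 410 points. This is a minimal sketch; the function name and the rounding of the stride are our choices, and the exact sample counts in the paper depend on the raw record lengths.

```python
import numpy as np

def sliding_windows(signal, win=2048, overlap=0.8):
    """Segment a 1-D signal into overlapping windows of length `win`."""
    # stride = win * (1 - overlap); 80% overlap -> ~410-point steps
    step = max(1, int(round(win * (1.0 - overlap))))
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])
```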

3.2. Parameter Settings

The model optimizer used is Adam, with an initial learning rate of 0.0001. A step learning rate schedule with a step size of 10 is applied, and the batch size is set to 64, with a total of 50 epochs. The parameters for each layer of the encoder are shown in Table 3. The hardware environment for the experiments includes an Intel Core i7-11800H CPU, an NVIDIA GeForce RTX 3060 GPU with 6 GB of memory, 16 GB of RAM, and the PyTorch 1.11 deep learning framework. To validate the effectiveness of the proposed method, comparative experiments were conducted between the proposed method and EMD + SVM, MCNN [23], MMDCNN [25], and DAN [31] on rolling bearing fault datasets with noise under both the same rotational speed and varying rotational speeds.

3.3. Results Analysis

3.3.1. Fault Diagnosis Results and Analysis with Noise at the Same Rotational Speed

Due to the complex and variable operating conditions of rolling bearings, the collected vibration signals are subject to a certain degree of noise interference. Therefore, Gaussian noise at a signal-to-noise ratio of 3 dB, simulating environmental noise, was added to the experimental data to validate the noise robustness of the proposed method. The experimental results are shown in Table 4 and Figure 5 and Figure 6. Table 4 presents the diagnostic results of different methods, Figure 5 shows the accuracy comparison curves of various methods using the A→A diagnostic task as an example, and Figure 6 displays the feature distribution of different methods for the A→A diagnostic task, with the output features from the fully connected layer visualized using t-SNE. Since EMD + SVM does not contain a fully connected layer, we performed dimensionality reduction and visualization on the signals processed by EMD.
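A common way to inject noise at a prescribed signal-to-noise ratio in dB is to scale white Gaussian noise to the required power relative to the signal. The sketch below assumes the dB figures in this section refer to SNR (consistent with the −3 dB to 3 dB range used later); the function name and fixed seed are ours.

```python
import numpy as np

def add_awgn(signal, snr_db, seed=0):
    """Add white Gaussian noise to `signal` at a target SNR (in dB)."""
    p_signal = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  solve for P_noise
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise
```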
As indicated in Table 4, in diagnostic tasks under identical conditions, EMD + SVM, MCNN, MMDCNN, and DAN all underperform compared to the proposed method. Specifically, EMD + SVM and DAN demonstrate limited noise resistance, with average diagnostic accuracies of only 79.33% and 84.65%, respectively, and exhibit the slowest convergence rates. This is primarily due to SVM’s reliance on identifying an optimal hyperplane for classification; in the presence of noise, the positioning of the hyperplane can be adversely impacted by noisy data points, hindering the model’s ability to accurately distinguish between class features. Additionally, the shallow network structure of the AlexNet encoder utilized in DAN restricts its ability to represent complex fault features. As illustrated in Figure 6d, the classification boundaries for the inner ring faults, outer ring faults, and rolling element faults in DAN are indistinct, making it challenging to extract discriminative features. In contrast, MCNN and MMDCNN improve accuracy by 9.73% and 9.79%, respectively, compared to DAN. Both methods enhance the encoder’s noise resistance by using multi-scale feature extraction. Figure 5 shows that both methods converge faster than DAN, and Figure 6b,c shows that the feature distributions of each class are more clearly separated, although some overlap remains. The proposed method achieves the highest average diagnostic accuracy of 97.43%, with tightly clustered intra-class features and clear inter-class separation. This is due to the proposed method’s ability to, on the one hand, build multi-scale layers in the encoder’s input layer to extract more discriminative feature representations, and on the other hand, establish long-range spatial and channel dependencies through self-calibrating convolutions, adaptively integrating richer contextual information. These results demonstrate the proposed method’s strong diagnostic performance under noise interference.

3.3.2. Fault Diagnosis Results and Analysis with Noise at Different Rotational Speeds

In practical engineering applications, the rotational speed of rolling bearings can vary. Based on this, the experiments in this section were conducted under conditions of varying rotational speeds with 3 dB noise to verify that the proposed method maintains strong generalization performance when the speed changes. The results are shown in Table 5 and Figure 7 and Figure 8. Table 5 presents the diagnostic results of each method for different transfer tasks, while Figure 7 shows the accuracy comparison curves for each method using the B→A diagnostic task as an example. Figure 8 illustrates the feature distributions of each method for the B→A diagnostic task. As shown in Table 5, the proposed method achieves an average diagnostic accuracy of 95.50% across different transfer tasks, maintaining a high level of accuracy. In contrast, the accuracy of the other comparison methods declines to varying degrees compared to the results under constant working conditions. This is because, under different rotational speeds, the feature distribution differences between the source domain and the target domain are too large, affecting the model's ability to extract and recognize features. Additionally, the introduction of noise further exacerbates this complexity, as noise can obscure or distort key information in the fault signals, reducing the diagnostic accuracy of the model. The performance of EMD + SVM is clearly impacted by complex working conditions, with an average diagnostic accuracy of 71.76%, which is significantly lower than that of the proposed method. This is due to EMD + SVM's lack of adaptability to feature distribution discrepancies across different domains. In particular, under varying rotational speeds and noise interference, it struggles to effectively align features, resulting in reduced diagnostic accuracy.
The accuracy of the DAN method, for example, drops to 79.51%, which is 5.14% lower than its performance under constant working conditions, highlighting the challenge of achieving high diagnostic accuracy in complex conditions. MCNN and MMDCNN, due to their noise resistance, show improved accuracy compared to DAN. Moreover, MMDCNN incorporates domain adaptation techniques, which help reduce the distribution discrepancy between the two domains to some extent, resulting in a 3.03% accuracy improvement compared to MCNN. The proposed method, through Domain-Conditioned Adaptation, activates different convolution channels for the source and target domains, allowing the model to recalibrate the features in the convolutional layers for each domain, thereby enabling rolling bearing fault diagnosis under complex working conditions. This demonstrates the superiority of the proposed method.
As shown in Figure 7 and Figure 8, all the methods exhibit varying degrees of fluctuation in the early stages of training. This is because, at the beginning of training, the models have not yet fully extracted consistent features between the source and target domains. At this stage, the models mainly rely on features learned from the source domain, but there is a significant difference in the data distribution between the source and target domains, making it difficult for the models to effectively predict on the target domain. However, as the number of iterations increases, all the methods gradually converge. The proposed method exhibits the least fluctuation and the most stable convergence, as it directly corrects the target domain features during training. Similarly, by comparing the feature distributions in the fully connected layer in Figure 8, the feature extraction capabilities of different methods under complex working conditions can be visually observed. The EMD + SVM and DAN methods show extensive overlap, making it difficult to effectively distinguish between different class features. MCNN and MMDCNN exhibit clearer classification boundaries compared to DAN, but some overlap still exists. The proposed method, on the other hand, has only a small amount of misclassified features, further proving its effectiveness.
The probability density plot of the proposed method is shown in Figure 9 using the A → B transfer task as an example to demonstrate that the method can reduce the distribution discrepancy between the source and target domains. As shown in Figure 9a, the original data from the source and target domains exhibit significant differences, with completely inconsistent probability density plots, reflecting the distribution discrepancy caused by changes in working conditions. The proposed method models the inter-domain differences through the domain-conditioned adaptation strategy and explicitly adjusts the data distribution of the target domain. As shown in Figure 9b, the probability density curves of the source and target domains show a high degree of overlap, proving the effectiveness of the domain-conditioned adaptation strategy in the proposed method.

3.3.3. Fault Diagnosis Under Different Rotational Speeds and Noise Levels

In practical operations, noise is not limited to a single level. Based on this, the experiments in this section were conducted under varying noise levels and rotational speeds. Taking the C → A transfer task as an example, Gaussian white noise ranging from −3 dB to 3 dB was added to the original vibration signals. Compared to 3 dB, −3 dB introduces more significant noise interference, having a greater impact on the model. The results are shown in Table 6 and Figure 10. As can be seen from Table 6 and Figure 10, as the level of noise interference increases, all the methods show a notable decline in accuracy. EMD + SVM, in particular, shows the greatest sensitivity to noise, with significant accuracy drops at lower dB levels. Its reliance on finding an optimal hyperplane for classification in SVM makes it highly susceptible to noise, as noisy data points interfere with hyperplane positioning, resulting in a lower capacity to distinguish between fault types. The accuracy of EMD + SVM, in fact, remains below 73% even at the mildest tested noise level of 3 dB, illustrating its limited robustness. The accuracy of the DAN method drops by 7.76%. MCNN and MMDCNN demonstrate some robustness, but their diagnostic accuracy falls below 90% under −3 dB noise. The proposed method not only adaptively mitigates noise interference during the training process but also reduces the distribution discrepancy caused by changes in rotational speed, proving its noise robustness.

3.3.4. Ablation Experiments

To validate the effectiveness of the self-calibrating convolution (SCConv) and the domain-conditioned adaptation (DCAN) in the proposed method, an ablation study was conducted on the C → A transfer task with a noise level of 3 dB, removing or adding these two components and observing the impact on model performance. Removing the self-calibrating convolution means replacing it with standard convolution for feature extraction. The results are shown in Table 7, where the symbol ↑ indicates an improvement. With the addition of the self-calibrating convolution and the domain-conditioned adaptation individually, accuracy increased by 10.03% and 15.03%, respectively; with both modules added, accuracy improved by 21.42%. The improvement stems from the self-calibrating convolution's ability to adaptively build long-range spatial and inter-channel correlations, enhancing feature representation, while the domain-conditioned adaptation strategy reduces domain distribution discrepancies, thereby improving the model's diagnostic performance under different working conditions.
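The paper aligns source and target features by minimizing an inter-domain distribution discrepancy; its exact loss is not reproduced here, but a maximum mean discrepancy (MMD) with an RBF kernel, as used in DAN-style methods, conveys the idea. The feature shapes, `gamma`, and sample data below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=0.5):
    """Biased estimate of squared MMD between samples X and Y (RBF kernel)."""
    def k(A, B):
        # Pairwise squared Euclidean distances via the |a|^2 + |b|^2 - 2ab expansion
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return np.exp(-gamma * np.maximum(d2, 0.0))
    return float(k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean())

rng = np.random.default_rng(1)
src = rng.normal(0.0, 1.0, (256, 2))      # source-domain features (synthetic)
tgt_far = rng.normal(2.0, 1.0, (256, 2))  # strong working-condition shift
tgt_near = rng.normal(0.1, 1.0, (256, 2)) # nearly aligned with the source

print(f"MMD to shifted target: {mmd_rbf(src, tgt_far):.3f}")
print(f"MMD to aligned target: {mmd_rbf(src, tgt_near):.3f}")
```

Minimizing such a term during training pulls the target features toward the source distribution, which is the role the domain-conditioned adaptation strategy plays here.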

3.3.5. Generalization Experiment

To further validate the generalization capability of the proposed method, we conducted experiments using the open-access rolling bearing fault dataset from Southeast University [32]. This dataset includes healthy bearings, inner ring faults, outer ring faults, rolling element faults, and inner–outer ring composite faults. Vibration data were collected from each fault type under condition D (1200 rpm) and condition E (1800 rpm) with a sampling frequency of 5120 Hz. A sliding window with a 50% overlap was used to segment the samples, with a fixed sample size of 2048 points, yielding 635 samples per class. The dataset was divided into training and testing sets at an 8:2 ratio, resulting in 508 samples per class in the training set and 127 samples per class in the testing set. To simulate real-world noise interference, 3 dB of Gaussian white noise was added to the original signals. The diagnostic results of each method are shown in Table 8 and Figure 11, with Table 8 presenting the diagnostic results under complex working conditions and Figure 11 illustrating the feature distributions of each method’s fully connected layer for the E → D transfer task.
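The sliding-window segmentation described above can be sketched as follows. This is a minimal illustration with a synthetic signal: the 2048-point window, 50% overlap, and 8:2 split follow the text, while the sequential train/test split is a simplification of the actual per-class sampling.

```python
import numpy as np

def segment(signal, win=2048, overlap=0.5):
    """Slice a 1-D signal into fixed-length windows with the given overlap."""
    step = int(win * (1 - overlap))
    n = (len(signal) - win) // step + 1
    return np.stack([signal[i * step : i * step + win] for i in range(n)])

# 635 windows of 2048 points at 50% overlap need (635 - 1) * 1024 + 2048 raw points
raw = np.random.default_rng(0).normal(size=634 * 1024 + 2048)
samples = segment(raw)
train, test = samples[:508], samples[508:]  # 8:2 split, as in the paper
print(samples.shape)
```

With these settings the synthetic record yields exactly 635 samples, matching the per-class counts reported for the SEU data.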
As shown in Table 8 and Figure 11, the transfer learning performance of each method differs across the different tasks (D → E and E → D). The proposed method demonstrates the highest average accuracy across both tasks, reaching 96.36%, significantly higher than the other comparison methods. This indicates that it has stronger adaptability and generalization capabilities in handling feature discrepancies between the source and target domains. In contrast, the average accuracy of EMD + SVM is 75.47%, which is significantly lower than that of the proposed method; the DAN method achieves an average accuracy of 79.70%, indicating inadequate feature alignment when faced with large distribution differences between the source and target domains, resulting in suboptimal transfer performance. Moreover, as shown in Figure 11d, DAN struggles to clearly classify each fault type, leading to a substantial amount of overlap. Although MCNN and MMDCNN show some improvement over DAN, they still fall short of the proposed method. In the fully connected layer feature distribution plot of the proposed method, the classification boundaries of each fault type are clear, with minimal overlap, demonstrating the superior generalization capability of the proposed method.

3.3.6. Computational Cost Analysis

To further validate the feasibility of deploying the proposed method in real-world industrial scenarios, the training and testing times per epoch of the proposed method were compared with those of the benchmark methods. All the experiments were performed on the same hardware platform equipped with an RTX 3060 GPU; the results are shown in Table 9. The experimental results indicate minimal differences in training and testing times between the HUST and SEU datasets for each method, owing to the consistent dataset sizes. Since the EMD + SVM method involves separate feature extraction and pattern recognition stages, a fair comparison with the other methods is difficult, so it is excluded from this analysis. On the HUST dataset, the proposed method's training time per epoch was 0.25 s, slightly higher than MCNN's 0.21 s and DAN's 0.12 s, yet shorter than MMDCNN's 0.30 s. Similarly, on the SEU dataset, the proposed method exhibited a training time of 0.27 s, exceeding MCNN's 0.22 s and DAN's 0.16 s. This gap is attributed to the relatively shallow encoder structures of the MCNN and DAN methods, which also result in lower diagnostic accuracy. Notably, the testing time for all the methods was 0.01 s, as the testing phase involves no backpropagation or parameter updates. In practical industrial applications, model training is typically carried out on cloud servers with high-performance computing resources to accommodate complex models and large-scale data processing; on site, only the pre-trained model needs to be deployed to enable real-time fault diagnosis and rapid response. Therefore, for the same testing time, the proposed method demonstrates superior diagnostic performance.
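Per-epoch times of this kind can be measured with a simple wall-clock wrapper. This is a generic sketch, not the benchmarking harness used in the paper; the lambda workloads are placeholders standing in for one training or testing pass.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Placeholder workloads standing in for one training / testing epoch
_, train_s = timed(lambda: sum(i * i for i in range(200_000)))
_, test_s = timed(lambda: sum(range(20_000)))
print(f"train {train_s:.4f}s / test {test_s:.4f}s per epoch")
```

`time.perf_counter` is monotonic and high-resolution, which makes it the appropriate clock for short intervals like the 0.01 s testing times in Table 9.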

4. Conclusions

To address the issue of low fault diagnosis accuracy caused by noise interference and varying rotational speeds in rolling bearings, this study proposes a domain-conditioned adaptation-based fault diagnosis method for rolling bearings under complex working conditions, integrating noise reduction and adaptation within a unified framework. A multi-scale self-calibrating convolutional neural network was constructed to aggregate input signals at different scales, mitigating the negative impact of noise on the model's ability to extract discriminative features. A domain-conditioned adaptation strategy was introduced to recalibrate the features of the source and target domains and generate correction terms for the target domain features. By aligning the source and target domain features through minimizing inter-domain feature distribution discrepancies, the method explicitly reduces distribution differences caused by changes in working conditions. Comparative and ablation experiments on rolling bearing fault datasets showed that the proposed method achieves an accuracy exceeding 95%, with improvements of 3.25% to 15.99% over the other frameworks. The ablation analysis further revealed a 21.42% accuracy improvement over the baseline without either proposed module, demonstrating strong robustness. As working conditions become more complex, the cost of obtaining labeled samples in the source domain increases, making diagnostic performance more susceptible to label scarcity in cross-domain diagnosis. Future work could explore integrating the proposed method with semi-supervised learning to leverage the latent feature information and structural patterns in large amounts of unlabeled data, further enhancing the model's diagnostic performance and generalization capability.

Author Contributions

Conceptualization, X.Z. and G.G.; methodology, X.Z.; software, G.G.; validation, X.Z. and G.G.; formal analysis, X.Z.; investigation, G.G.; resources, G.G.; data curation, X.Z.; writing—original draft preparation, X.Z.; writing—review and editing, X.Z. and G.G.; visualization, G.G.; supervision, X.Z. and G.G.; project administration, G.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the General Program of the Natural Science Foundation of Hubei Province—Research on Key Technologies for Technical Condition Assessment of Rotating Machinery Based on Multi-Modal Data Fusion (project number: 2022CFB405).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Author Xu Zhang was employed by the company Chongqing Tsingshan Industrial. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Chen, X.; Yang, R.; Xue, Y.; Huang, M.; Ferrero, R.; Wang, Z. Deep transfer learning for bearing fault diagnosis: A systematic review since 2016. IEEE Trans. Instrum. Meas. 2023, 72, 1–21.
2. Ni, Q.; Ji, J.C.; Halkon, B.; Feng, K.; Nandi, A.K. Physics-Informed Residual Network (PIResNet) for rolling element bearing fault diagnostics. Mech. Syst. Signal Process. 2023, 200, 110544.
3. Tao, H.; Qiu, J.; Chen, Y.; Stojanovic, V.; Cheng, L. Unsupervised cross-domain rolling bearing fault diagnosis based on time-frequency information fusion. J. Frankl. Inst. 2023, 360, 1454–1477.
4. Hou, Y.; Wang, J.; Chen, Z.; Ma, J.; Li, T. Diagnosisformer: An efficient rolling bearing fault diagnosis method based on improved Transformer. Eng. Appl. Artif. Intell. 2023, 124, 106507.
5. Lin, J.; Shao, H.; Zhou, X.; Cai, B.; Liu, B. Generalized MAML for few-shot cross-domain fault diagnosis of bearing driven by heterogeneous signals. Expert Syst. Appl. 2023, 230, 120696.
6. Xiong, J.; Liu, M.; Li, C.; Cen, J.; Zhang, Q.; Liu, Q. A bearing fault diagnosis method based on improved mutual dimensionless and deep learning. IEEE Sens. J. 2023, 23, 18338–18348.
7. Wang, J.; Du, G.; Zhu, Z.; Shen, C.; He, Q. Fault diagnosis of rotating machines based on the EMD manifold. Mech. Syst. Signal Process. 2020, 135, 106443.
8. Ye, X.; Hu, Y.; Shen, J.; Chen, C.; Zhai, G. An adaptive optimized TVF-EMD based on a sparsity-impact measure index for bearing incipient fault diagnosis. IEEE Trans. Instrum. Meas. 2020, 70, 1–11.
9. Li, H.; Liu, T.; Wu, X.; Chen, Q. An optimized VMD method and its applications in bearing fault diagnosis. Measurement 2020, 166, 108185.
10. Li, H.; Wu, X.; Liu, T.; Li, S.; Zhang, B.; Zhou, G.; Huang, T. Composite fault diagnosis for rolling bearing based on parameter-optimized VMD. Measurement 2022, 201, 111637.
11. Yu, Z.; Zhang, C.; Deng, C. An improved GNN using dynamic graph embedding mechanism: A novel end-to-end framework for rolling bearing fault diagnosis under variable working conditions. Mech. Syst. Signal Process. 2023, 200, 110534.
12. Fu, G.; Wei, Q.; Yang, Y.; Li, C. Bearing fault diagnosis based on CNN-BiLSTM and residual module. Meas. Sci. Technol. 2023, 34, 125050.
13. Song, B.; Liu, Y.; Fang, J.; Liu, W.; Zhong, M.; Liu, X. An optimized CNN-BiLSTM network for bearing fault diagnosis under multiple working conditions with limited training samples. Neurocomputing 2024, 574, 127284.
14. He, D.; Lao, Z.; Jin, Z.; He, C.; Shan, S.; Miao, J. Train bearing fault diagnosis based on multi-sensor data fusion and dual-scale residual network. Nonlinear Dyn. 2023, 111, 14901–14924.
15. Zheng, M.; Chang, Q.; Man, J.; Liu, Y.; Shen, Y. Two-stage multi-scale fault diagnosis method for rolling bearings with imbalanced data. Machines 2022, 10, 336.
16. Wu, D.; Chen, D.; Yu, G. New Health Indicator Construction and Fault Detection Network for Rolling Bearings via Convolutional Auto-Encoder and Contrast Learning. Machines 2024, 12, 362.
17. Huo, C.; Jiang, Q.; Shen, Y.; Zhu, Q.; Zhang, Q. Enhanced transfer learning method for rolling bearing fault diagnosis based on linear superposition network. Eng. Appl. Artif. Intell. 2023, 121, 105970.
18. Tang, G.; Yi, C.; Liu, L.; Yang, X.; Xu, D.; Zhou, Q.; Lin, J. A novel transfer learning network with adaptive input length selection and lightweight structure for bearing fault diagnosis. Eng. Appl. Artif. Intell. 2023, 123, 106395.
19. Ding, Y.; Jia, M.; Zhuang, J.; Cao, Y.; Zhao, X.; Lee, C.G. Deep imbalanced domain adaptation for transfer learning fault diagnosis of bearings under multiple working conditions. Reliab. Eng. Syst. Saf. 2023, 230, 108890.
20. Zhang, R.; Gu, Y. A transfer learning framework with a one-dimensional deep subdomain adaptation network for bearing fault diagnosis under different working conditions. Sensors 2022, 22, 1624.
21. Chen, X.; Zhang, B.; Gao, D. Bearing fault diagnosis base on multi-scale CNN and LSTM model. J. Intell. Manuf. 2021, 32, 971–987.
22. Ghorvei, M.; Kavianpour, M.; Beheshti, M.T.; Ramezani, A. An unsupervised bearing fault diagnosis based on deep subdomain adaptation under noise and variable load condition. Meas. Sci. Technol. 2021, 33, 025901.
23. Peng, D.; Wang, H.; Liu, Z.; Zhang, W.; Zuo, M.J.; Chen, J. Multibranch and multiscale CNN for fault diagnosis of wheelset bearings under strong noise and variable load condition. IEEE Trans. Ind. Inform. 2020, 16, 4949–4960.
24. Su, K.; Liu, J.; Xiong, H. Hierarchical diagnosis of bearing faults using branch convolutional neural network considering noise interference and variable working conditions. Knowl.-Based Syst. 2021, 230, 107386.
25. Huang, K.; Zhu, L.; Ren, Z.; Lin, T.; Zeng, L.; Wan, J.; Zhu, Y. An Improved Fault Diagnosis Method for Rolling Bearings Based on 1D_CNN Considering Noise and Working Condition Interference. Machines 2024, 12, 383.
26. Qian, Q.; Qin, Y.; Wang, Y.; Liu, F. A new deep transfer learning network based on convolutional auto-encoder for mechanical fault diagnosis. Measurement 2021, 178, 109352.
27. Yu, X.; Liang, Z.; Wang, Y.; Yin, H.; Liu, X.; Yu, W.; Huang, Y. A wavelet packet transform-based deep feature transfer learning method for bearing fault diagnosis under different working conditions. Measurement 2022, 201, 111597.
28. Liu, J.J.; Hou, Q.; Cheng, M.M.; Wang, C.; Feng, J. Improving convolutional networks with self-calibrated convolutions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10096–10105.
29. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542.
30. Zhao, C.; Zio, E.; Shen, W. Domain generalization for cross-domain fault diagnosis: An application-oriented perspective and a benchmark study. Reliab. Eng. Syst. Saf. 2024, 245, 109964.
31. Long, M.; Cao, Y.; Wang, J.; Jordan, M. Learning Transferable Features with Deep Adaptation Networks. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 97–105.
32. Shao, S.; McAleer, S.; Yan, R.; Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. 2018, 15, 2446–2455.
Figure 1. Overall structure.
Figure 2. Self-calibrated convolutions.
Figure 3. Inception module.
Figure 4. Test rig.
Figure 5. The accuracy comparison curve of various methods for the A → A diagnosis task.
Figure 6. Feature distribution of different methods at the same rotational speed (a) Feature distribution of EMD + SVM; (b) feature distribution of MCNN; (c) feature distribution of MMDCNN; (d) feature distribution of DAN; (e) feature distribution of proposed method.
Figure 7. The accuracy comparison curve of various methods for the B → A diagnosis task.
Figure 8. Feature distribution of different methods at the varying rotational speed (a) Feature distribution of EMD + SVM; (b) feature distribution of MCNN; (c) feature distribution of MMDCNN; (d) feature distribution of DAN; (e) feature distribution of proposed method.
Figure 9. (a) Probability density plot of the original data; (b) probability density plot of the proposed method.
Figure 10. Fault diagnosis results of different methods.
Figure 11. Feature distribution of different methods on the SEU dataset at varying rotational speeds (a) Feature distribution of EMD + SVM; (b) feature distribution of MCNN; (c) feature distribution of MMDCNN; (d) feature distribution of DAN; (e) feature distribution of proposed method.
Table 1. Information about TREA331.
Specifications | Value
Sensitivity (±15%) | 100 mV/g
Voltage Source | 18–30 VDC
Constant Current Excitation | 2–10 mA
Frequency Response (±3 dB) | 30–600,000 CPM
Table 2. Experimental data set construction.
Label | Fault Type | Training Set | Test Set
0 | Healthy | 508 | 127
1 | Inner ring fault | 508 | 127
2 | Outer ring fault | 508 | 127
3 | Rolling element fault | 508 | 127
4 | Combined inner and outer ring fault | 508 | 127
Total | | 2540 | 635
Table 3. Model parameter settings.
Layers | Number of Filters | Kernel Size | Stride
Inception Module | 16 | 1 × 3/1 × 5/1 × 7 | 1
SCConv-BN-ReLU | 64 | 1 × 3 | 2
Maxpool | 64 | 1 × 2 | 2
DCCAM | 64 | - | -
SCConv-BN-ReLU | 128 | 1 × 3 | 2
Maxpool | 128 | 1 × 2 | 2
SCConv-BN-ReLU | 256 | 1 × 3 | 2
AvgPool | 256 | - | -
Fully-connected layer | 5 | - | -
Table 4. Diagnosis accuracies of different methods (%).
Methods/Task | A → A | B → B | C → C | Average
EMD + SVM | 75.32 | 83.45 | 79.22 | 79.33
MCNN | 96.22 | 92.44 | 95.12 | 94.38
MMDCNN | 94.80 | 93.86 | 94.65 | 94.44
DAN | 82.83 | 82.31 | 88.82 | 84.65
Proposed method | 97.48 | 97.17 | 97.64 | 97.43
Table 5. Comparison of experimental results of different methods under noisy conditions at different rotational speeds (%).
Methods/Task | A → B | A → C | B → A | B → C | C → A | C → B | Average
EMD + SVM | 70.45 | 73.44 | 62.20 | 71.89 | 72.01 | 73.24 | 71.76
MCNN | 88.25 | 90.55 | 91.97 | 80.90 | 92.28 | 91.34 | 89.22
MMDCNN | 89.76 | 93.54 | 93.23 | 91.50 | 92.86 | 92.60 | 92.25
DAN | 78.26 | 80.31 | 79.06 | 77.69 | 80.63 | 81.11 | 79.51
Proposed method | 96.81 | 94.65 | 96.85 | 95.91 | 96.54 | 94.21 | 95.50
Table 6. Fault diagnosis results of different methods under different rotational speeds and noise levels (%).
Methods/Noise Levels | −3 dB | −1 dB | 0 dB | 1 dB | 3 dB | Average
EMD + SVM | 63.21 | 65.78 | 69.40 | 70.08 | 72.21 | 68.13
MCNN | 87.65 | 89.45 | 91.22 | 91.66 | 92.28 | 90.45
MMDCNN | 89.15 | 90.14 | 92.02 | 93.40 | 93.86 | 91.71
DAN | 72.87 | 75.63 | 77.28 | 79.54 | 80.63 | 77.19
Proposed method | 92.74 | 93.81 | 95.84 | 96.12 | 96.54 | 94.61
Table 7. The ablation study for task C → A (%).
SCConv | DCAN | Accuracy/% | Improvement/%
× | × | 75.12 | -
✓ | × | 85.15 | ↑ 10.03
× | ✓ | 90.15 | ↑ 15.03
✓ | ✓ | 96.54 | ↑ 21.42
Table 8. Comparison of experimental results of different methods under noisy conditions at varying rotational speeds on the SEU dataset (%).
Methods/Task | D → E | E → D | Average
EMD + SVM | 73.15 | 77.80 | 75.47
MCNN | 84.32 | 88.68 | 86.50
MMDCNN | 90.18 | 92.81 | 91.49
DAN | 80.44 | 78.97 | 79.70
Proposed method | 96.50 | 96.23 | 96.36
Table 9. The training and testing time per epoch for each method on the two datasets.
Methods | HUST Dataset (Training/Testing, s) | SEU Dataset (Training/Testing, s)
MCNN | 0.21/0.01 | 0.22/0.01
MMDCNN | 0.30/0.01 | 0.38/0.01
DAN | 0.12/0.01 | 0.16/0.01
Proposed method | 0.25/0.01 | 0.27/0.01