1. Introduction
As one of the most important parts of rotating machinery, rolling bearings play a decisive role in the smooth operation of equipment [1]. Because most rolling bearings work in harsh environments, faults occur often; if they are not detected and resolved in time, they can cause economic losses and may even endanger lives [2]. Accurate fault diagnosis methods for monitoring the operational status of bearings are therefore a hot topic in current research [3].
As technology and computing power develop, data-driven deep learning (DL) methods are increasingly favoured by scholars because they can automatically extract fault features with little human involvement [4,5,6,7]. Sun et al. proposed a CNN-LSTM-based model for bearing fault diagnosis in complex operating environments; their results show that the model has better load generalisation capability and noise immunity [8]. Wang et al. proposed RQA-Bayes-SVM for the health diagnosis of bearings; experiments showed that it performs better in fault mode diagnosis and fault degree differentiation [9]. Zhao et al. proposed DenseNet-BLSTM to address the difficulty of extracting features effectively with traditional fault diagnosis methods for rolling bearings; experiments show that it has good fault diagnosis capability [10].
The data-driven deep learning methods mentioned above are usually built on class-balanced datasets. When dealing with unbalanced datasets, however, these models tend to focus on the majority class and may ignore minority-class samples, resulting in low diagnostic accuracy for minority fault samples [11,12]. Current research on the unbalanced fault classification problem has focused on the following two areas:
(1) Data level: resampling techniques are mainly used to convert an unbalanced dataset into a balanced one and thereby enhance the diagnostic capability of the model. Zhang et al. used a generative adversarial network (GAN) to learn the mapping between the noise distribution and the actual sample distribution in order to extend the available dataset; their results show that augmenting the data with a GAN improves diagnostic accuracy [13]. Luo et al. used conditional generative adversarial networks to steer sample generation towards given constraints, resulting in higher-quality new samples [14].
(2) Model level: enhanced feature extraction and ensemble learning are mainly used. Lu et al. proposed an Improved Active Learning (IAL) diagnostic method that intelligently labels unlabelled samples given a limited number of labelled samples, showing that IAL can significantly improve the classification of unbalanced data [15]. Qin et al. proposed an IGAN method for bearing fault diagnosis on small and unbalanced datasets; it incorporates the coordinate attention mechanism to mine information effectively from a limited number of fault samples, thus increasing diagnostic accuracy [16]. Wei et al. proposed an improved channel-attention CNN for the fault diagnosis of rolling bearings; compared with shallow models, this method extracts features from unbalanced data more effectively [17].
Although the above-mentioned methods are effective in classifying and diagnosing unbalanced data, they have some limitations. (1) At the data level: most methods feed the original signals directly into the network for training, which may cause information loss; some 2D transformation methods also produce feature maps that may hinder feature recognition in the network. (2) At the model level: in most network structures, simply stacking convolutional layers is expected to recognise more features; in practice, however, network performance degrades and feature extraction becomes worse.
To address these shortcomings, this paper proposes a diagnosis method for rolling bearings based on the Markov Transition Field (MTF) and the Mixed Attention Residual Network (MARN). The four main contributions are as follows:
(1) The acquired vibration signals are converted into 2D images using the MTF method, which preserves temporal information and avoids losing information from the original signal.
(2) A mixed attention mechanism is introduced into the residual network structure to improve the recognition of signal features, avoiding network degradation while making full use of channel and spatial information.
(3) MTF is compared with other 2D transformation methods on the Case Western Reserve University (CWRU) bearing fault dataset, confirming the superiority of MTF as the data preprocessing method and validating the advantages of the mixed attention mechanism for feature extraction.
(4) Balanced and unbalanced datasets are constructed to verify the superiority of the model. Compared with current state-of-the-art fault diagnosis models, the MTF-MARN provides the best overall diagnostic results.
The rest of the paper is structured as follows. Section 2 describes the rationale for the methodology. Section 3 describes the structure of MARN and the bearing fault diagnosis process. Section 4 presents comparative experiments with different methods. Section 5 presents the conclusions and prospects.
3. Model Structure and Fault Diagnosis Process
This section details the MARN structure for bearing fault diagnosis and sets up the appropriate diagnostic process.
3.1. MARN Model Structure
The MARN model is shown in Figure 5. It consists of the residual feature extraction layers, the mixed attention feature enhancement layers and the Softmax classification layers.
To learn as many features as possible from the input images, the feature extraction layer uses 16 groups of residual structures (each containing two 3 × 3 convolutional layers); all the extracted features are then passed through the feature enhancement layer to strengthen their representation. In residual networks, a portion of the input feature matrices is passed directly from the bottom layer to the top layer, so further channel compression and refinement are required. In addition, the top-layer information obtained through the main convolutional path contains more global, abstract information but lacks location and detail information. A mixed channel and spatial attention mechanism is therefore introduced to fuse high- and low-level information and make the feature information more complete.
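The interplay between the skip path and the mixed attention described above can be illustrated with a small sketch. This is not the MARN implementation: the exact attention design is not specified in this section, so the code assumes a channel-then-spatial ordering and replaces the learned layers with parameter-free gates (a sigmoid of the average-pooled plus max-pooled value). The function names and the identity `branch` are hypothetical stand-ins chosen for illustration.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_attention(x):
    """Scale every channel by one gate built from its average- and max-pooled values."""
    out = []
    for fmap in x:
        flat = [v for row in fmap for v in row]
        gate = sigmoid(sum(flat) / len(flat) + max(flat))
        out.append([[gate * v for v in row] for row in fmap])
    return out

def spatial_attention(x):
    """Scale every location by one gate built from the mean and max across channels."""
    n_ch, height, width = len(x), len(x[0]), len(x[0][0])
    gates = []
    for i in range(height):
        gate_row = []
        for j in range(width):
            column = [x[c][i][j] for c in range(n_ch)]
            gate_row.append(sigmoid(sum(column) / n_ch + max(column)))
        gates.append(gate_row)
    return [[[gates[i][j] * x[c][i][j] for j in range(width)]
             for i in range(height)] for c in range(n_ch)]

def mixed_attention_residual(x, branch):
    """Return x + spatial(channel(branch(x))); the skip path carries x unchanged."""
    refined = spatial_attention(channel_attention(branch(x)))
    return [[[a + b for a, b in zip(row_x, row_r)]
             for row_x, row_r in zip(map_x, map_r)]
            for map_x, map_r in zip(x, refined)]

# Toy input: 2 channels of 2 x 2. The identity branch is a hypothetical stand-in
# for the two 3 x 3 convolutions of a real residual group.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
y = mixed_attention_residual(x, branch=lambda t: t)
print(len(y), len(y[0]), len(y[0][0]))  # 2 2 2
```

Because the input is added back after the attention gates are applied, the block can fall back to passing `x` through almost unchanged when the refined branch contributes little, which is the property that residual structures rely on to avoid degradation.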
After feature enhancement, global average pooling (GAP) is used to map each feature matrix, which carries its own intrinsic meaning, to the fault categories; this better integrates global spatial information and reduces the number of network parameters. Finally, the Softmax classifier is used to classify the faults.
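The classification head can be sketched in a few lines. The example assumes, for illustration only, that there is one feature map per fault category, so that GAP yields one score per class; the three toy feature maps and the helper names are hypothetical and not taken from the paper.

```python
import math

def global_average_pool(feature_maps):
    """Reduce each H x W feature map to its mean, giving one value per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

def softmax(scores):
    """Convert scores to probabilities; subtracting the max keeps exp() stable."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three toy 2 x 2 feature maps, assumed to correspond to three fault categories.
feature_maps = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
pooled = global_average_pool(feature_maps)  # [1.0, 0.0, 5.0]
probs = softmax(pooled)
predicted = probs.index(max(probs))         # 2
```

GAP has no trainable weights, which is why replacing a fully connected layer with it reduces the parameter count.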
3.2. Fault Diagnosis Process
The fault diagnosis process is shown in Figure 6 and can be split into the following steps:
Step 1: Data acquisition and preprocessing. The vibration signals from the CWRU bearing dataset are first converted into 2D MTF-encoded images and divided into the corresponding datasets (balanced and unbalanced).
Step 2: Build and train the network. The network is built, its parameters are set, and the datasets prepared in the previous step are input into MARN for training.
Step 3: Compare encoding methods. MARN is used to compare MTF with other 1D-to-2D encoding methods on the same data and to verify the performance of the mixed attention mechanism.
Step 4: Compare models. Popular fault diagnosis models are compared on the balanced and unbalanced datasets to validate the superiority of the proposed method.
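The MTF encoding in Step 1 can be sketched as follows. This is a minimal, dependency-free illustration of the standard MTF construction (quantile binning, a row-normalised first-order transition matrix, then a lookup for every pair of time steps); the bin count, the synthetic sine signal and the function names are assumptions for illustration, not the settings used in the paper.

```python
import math
from bisect import bisect_right

def quantile_bins(signal, n_bins):
    """Assign each sample to one of n_bins quantile bins (0 .. n_bins - 1)."""
    ordered = sorted(signal)
    n = len(ordered)
    # Interior bin edges taken at evenly spaced ranks of the sorted signal.
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return [bisect_right(edges, value) for value in signal]

def markov_transition_field(signal, n_bins=4):
    """Return an N x N image whose (i, j) entry is P(bin of x_i -> bin of x_j)."""
    bins = quantile_bins(signal, n_bins)
    # Count first-order transitions between the bins of consecutive samples.
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for current, following in zip(bins, bins[1:]):
        counts[current][following] += 1.0
    # Row-normalise the counts into transition probabilities.
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total else 0.0 for c in row])
    # Spread the probabilities over every pair of time steps.
    return [[transition[bi][bj] for bj in bins] for bi in bins]

# Synthetic stand-in for a vibration segment (not CWRU data).
signal = [math.sin(0.3 * t) for t in range(64)]
image = markov_transition_field(signal, n_bins=8)
print(len(image), len(image[0]))  # 64 64
```

Each pixel is indexed by two time steps, so the image keeps the ordering of the samples, which is how the encoding preserves temporal information from the 1D signal.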
5. Conclusions
Most existing methods struggle to extract fault features effectively and train unstably on unbalanced data. To address these problems, this paper proposes a new rolling bearing fault diagnosis method based on MTF and MARN and draws the following conclusions:
(1) The vibration signals are converted into two-dimensional images using MTF, which avoids the loss of original signal information while retaining the temporal correlation.
(2) Using the residual structure as the backbone of the network avoids the degradation problem of the model, and the introduction of the mixed attention mechanism enhances the feature extraction ability.
(3) Experiments on the CWRU bearing dataset confirmed the superiority of MTF as a data preprocessing method and further confirmed that the mixed attention mechanism can extract more features.
(4) The model performs stably on the unbalanced dataset, obtaining average recognition accuracies of 99.5% and 99.2% on the balanced and unbalanced data, respectively.
Although the proposed method improves on both the data level and the model level and achieves excellent fault diagnosis results on unbalanced data, the MARN is trained from scratch and therefore needs a long training period. In further research, transfer learning can be considered to decrease training time. Moreover, the MTF-MARN method should also be evaluated on datasets with complex operating conditions, such as strong noise and variable operating conditions.