1. Introduction
Rotating machinery is widely used in large-scale manufacturing systems and important technical equipment, such as ships and other marine equipment, wind turbines, and aerospace engines [1,2,3]. In these systems, key components of rotating machinery, such as rolling bearings and gears, often develop early defects such as mild wear and pitting. Bearings are among the most important of these components, and the operating state of rotating equipment depends heavily on their condition. If these defects are not diagnosed at the incipient stage, they deteriorate over time, eventually leading to system failure and considerable loss of life and property [4,5]. Hence, to ensure the normal operation of rotating machinery, it is necessary to carry out fault diagnosis and health assessment of bearings.
The key to bearing fault diagnosis is to extract useful information related to fault characteristics from the analyzed signals. Vibration analysis-based methods have been studied for decades, and vibration has long been one of the main parameters in the fault diagnosis of rotating machinery [6]. The vibration signal collected from a bearing carries important dynamic characteristics, so analyzing it is an effective way to identify the operating state of the bearing [7,8,9]. However, the working environment of bearings is complex, and their vibration signals are affected by various excitation sources, so the signals are often continuous, nonlinear, and non-stationary [10,11]. This makes signal analysis, feature extraction, and subsequent identification of the working state difficult. The fast Fourier transform (FFT) is only suitable for stationary signals, while the short-time Fourier transform, the wavelet transform (WT), and similar methods lack adaptability [12,13].
The widely used empirical mode decomposition (EMD) method produces unexplained negative frequencies when applied to non-stationary signal analysis and mode decomposition. It also has problems with end effects, envelope fitting, and mode mixing [14,15,16]. Zheng et al. [17] proposed a new signal decomposition method, extreme-point weighted mode decomposition (EWMD), to improve the accuracy of EMD. For non-stationary multi-component signals, a novel approach named RRP-RD was developed in [18] for ridge detection based on the gathering of ridge portions in the time-frequency plane. Fourer et al. [19] proposed the ASTRES Toolbox for mode extraction of non-stationary multi-component signals, which offers efficient tools for analyzing, synthesizing, and transforming any signal made of physically meaningful components. Variational mode decomposition (VMD) is an adaptive decomposition algorithm founded on rigorous mathematical reasoning [20]. It converts mode decomposition into a variational problem and solves that problem by iteratively searching for the optimal modes. VMD has received increasing attention in the diagnosis of rolling element bearings, and the optimal determination of its decomposition parameters is pivotal. In particular, the number of decomposed modes K must be set before VMD starts processing the signal, and the choice of K strongly influences the modes obtained from the decomposition [21,22]. However, K is usually chosen from human experience, which is not rigorous because there is no objective criterion.
To address the optimization of decomposition parameters in variational mode decomposition (VMD), Ni et al. [23] proposed a fault information-guided VMD (FIVMD) method for extracting weak repetitive transients of bearings. Li et al. [24] first used the maximum envelope kurtosis as an indicator to determine the mode number of VMD and introduced a method based on frequency band entropy (FBE) for selecting the optimal intrinsic mode functions (IMFs) that contain fault information. Liang et al. [25] presented a multi-objective multi-island genetic algorithm (MIGA) to optimize the VMD parameters and applied it to feature extraction for bearing faults. These methods solve the parameter optimization problem of VMD well and avoid the drawbacks of manual selection. However, parameter selection relies mostly on the optimization algorithm, and the analysis and processing of the decomposed IMFs are insufficient. Furthermore, these methods are not combined with popular machine learning algorithms (e.g., neural networks). Ye et al. [26] decomposed the original bearing vibration signal into several IMFs by VMD and reconstructed the signal according to the feature energy ratio (FER). They calculated the multiscale permutation entropy of the reconstructed signal to construct multidimensional feature vectors, which were fed into a PSO-SVM classification model for automatic identification of different fault patterns of the rolling bearing.
However, the above methods have certain limitations when applied to fault data at big-data scale [27]. Deep learning methods have therefore emerged for bearing fault diagnosis. A one-dimensional convolutional neural network has been established to diagnose bearing faults with a data-driven diagnostic model. However, this approach is aimed mainly at one-dimensional time-domain vibration signals and cannot fully extract the time–frequency-domain features of faults [28,29]. Hence, a two-dimensional convolutional neural network was established to overcome the shortcomings of the one-dimensional network. In this approach, the fault signal is spliced and processed, converted into a two-dimensional image, and input to the network for fault diagnosis [30,31]. However, this approach does not process the time-domain features of the fault signal, and redundant components in the original signal reduce the diagnosis accuracy.
This paper proposes an intelligent bearing fault diagnosis method based on an improved VMD (IVMD) and a convolutional neural network (CNN). The method uses the minimum average Pearson coefficient principle to analyze the fault signal in the time domain and determine the optimal mode number K for signal decomposition, so that the difference between adjacent frequency bands is maximized and the best decomposition is achieved. Correlation analysis is then performed on each component of the decomposed signal, and the components with large correlation coefficients are selected for reconstruction, which removes redundant interference from the signal while retaining its main features. The continuous wavelet transform (CWT) converts the reconstructed time-domain signal into a two-dimensional time–frequency image. The preprocessed image is then input to the CNN, which adaptively extracts time–frequency image features to achieve end-to-end fault diagnosis. The resulting intelligent fault diagnosis model is suitable for big-data settings and performs well in application.
4. Mechanical Rotation Fault Diagnosis Based on Convolutional Neural Network
At present, bearing fault diagnosis faces several problems, such as the demands of big data and the limitations of traditional fault diagnosis methods. An improved VMD (IVMD) method based on the minimum average Pearson coefficient principle is proposed to determine the optimal K for signal decomposition. Deep learning is then used to establish a convolutional neural network that extracts fault features and identifies and classifies the various fault types. The network performs multi-layer nonlinear learning on the preprocessed vibration signal, automatically extracting fault features and diagnosing the health status of the system. The flow chart of the established intelligent diagnosis model is presented in Figure 2.
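As an illustration of how a mode number might be chosen with a Pearson-based criterion, the following Python sketch scores each candidate K by the mean absolute Pearson coefficient between adjacent modes and keeps the K with the lowest score. The exact form of the criterion, the function names (`pearson`, `average_adjacent_pearson`, `select_mode_number`), and the toy modes are assumptions made for illustration; the modes are hand-built sinusoids, not the output of a real VMD run.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def average_adjacent_pearson(modes):
    """Mean |r| over neighbouring mode pairs (assumed form of the criterion)."""
    pairs = list(zip(modes[:-1], modes[1:]))
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


def select_mode_number(decompositions):
    """decompositions maps each candidate K to its list of K modes."""
    scores = {k: average_adjacent_pearson(m) for k, m in decompositions.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy stand-in for decomposition results: K = 3 splits one component in two,
# so two adjacent modes are strongly correlated (over-decomposition).
n = 200
low = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
high = [math.sin(2 * math.pi * 40 * t / n) for t in range(n)]
half_low = [0.5 * v for v in low]
candidates = {2: [low, high], 3: [half_low, half_low, high]}
best_k, scores = select_mode_number(candidates)
print(best_k, scores)
```

In practice, the dictionary of candidates would be filled by running the decomposition once for each K in a chosen search range; that range is a design choice not fixed by this sketch.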
4.1. Convolutional Neural Network
The CNN is a special neural network (NN) with a feed-forward structure. Three characteristics make it strong at 2-D image analysis: local receptive fields, weight sharing, and sub-sampling in the spatial domain. A typical 2-D CNN consists of three major layer types: convolutional layers, pooling layers, and fully connected layers. Several alternating convolutional and pooling layers are followed by fully connected layers that compute the class scores.
The convolutional layer applies a set number of kernels to obtain the feature maps of input images. The convolution layer can be described using the following equation:
$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right)$$
where $x_j^l$ denotes the output of the $j$th neuron in layer $l$, $*$ is the convolution operation, $M_j$ is a selection of input maps, $l$ is the $l$th layer in the network, $k_{ij}^l$ and $b_j^l$ denote the weights and bias of the $i$th convolution kernel in layer $l$, respectively, and $f$ is a nonlinear activation function. The rectified linear unit (ReLU) is used as the activation function between the convolution layers in this paper. It can be described using the following equation:
$$f(x) = \max(0, x)$$
where $x$ represents the output of the convolution layer.
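The convolution layer with ReLU activation described above can be sketched in plain Python as follows. This is a minimal illustration, not the network used in the paper: the function names are hypothetical, the sliding window uses "valid" borders with stride 1, and the kernel is applied without flipping (cross-correlation), as deep learning libraries commonly do.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return x if x > 0.0 else 0.0


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding, no flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def conv_layer(input_maps, kernels, bias):
    """One output map: sum the per-input-map responses, add bias, apply ReLU."""
    summed = None
    for x_i, k_ij in zip(input_maps, kernels):
        response = conv2d_valid(x_i, k_ij)
        if summed is None:
            summed = response
        else:
            summed = [[a + b for a, b in zip(ra, rb)]
                      for ra, rb in zip(summed, response)]
    return [[relu(v + bias) for v in row] for row in summed]


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0],
          [0.0, -1.0]]
# Every 2 x 2 window gives -4 before the bias, so a bias of 5 leaves 1
# after ReLU, while a bias of 0 is clipped to 0.
print(conv_layer([image], [kernel], bias=5.0))
print(conv_layer([image], [kernel], bias=0.0))
```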
The pooling layer is often connected after the convolutional layer to reduce the dimension of the feature map and decrease the number of parameters in the network. The max-pooling function can be described as the following equation:
$$x_j^l = f\left(\beta_j^l \, \mathrm{down}\left(x_j^{l-1}\right) + b_j^l\right)$$
where $x_j^l$ denotes the output of the $j$th feature map in layer $l$, and $\mathrm{down}(\cdot)$ is the max-pooling function in this paper. Each output map is given its own multiplicative bias $\beta$ and additive bias $b$.
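A minimal Python sketch of the max-pooling step follows, assuming non-overlapping square windows and an identity activation after pooling. The function names and the default window size of 2 are illustrative choices, not values stated in the text.

```python
def max_pool2d(feature_map, size=2):
    """Down-sample by taking the maximum of each non-overlapping window."""
    out = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        out.append(row)
    return out


def pooling_layer(feature_map, beta=1.0, bias=0.0, size=2):
    """Scale the pooled map by beta and shift it by bias (identity activation)."""
    return [[beta * v + bias for v in row]
            for row in max_pool2d(feature_map, size)]


feature_map = [[1.0, 3.0, 2.0, 0.0],
               [4.0, 2.0, 1.0, 5.0],
               [0.0, 1.0, 9.0, 8.0],
               [7.0, 6.0, 3.0, 2.0]]
print(max_pool2d(feature_map))                        # 4 x 4 map reduced to 2 x 2
print(pooling_layer(feature_map, beta=2.0, bias=1.0))
```

With `beta=1.0` and `bias=0.0` the layer reduces to plain max pooling, which is how pooling is usually configured in current CNN libraries.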
After the input image has passed alternately through multiple convolution and pooling layers, the extracted features are classified by the fully connected layers. The Softmax function is used as the activation function of the output layer in this paper. The loss function of the CNN is the cross-entropy between the estimated Softmax output probability distribution and the target class probability distribution. The Adam stochastic optimization algorithm was applied to train the CNN by minimizing this loss function.
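The Softmax output, the cross-entropy loss for a single labelled sample, and one Adam update of a scalar parameter can be sketched as below. The hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are the values commonly used with Adam and are assumptions here, since the text does not state the training settings; the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert class scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Cross-entropy against a one-hot target: -log of the true-class probability."""
    return -math.log(probs[target_index])


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    new_param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return new_param, m, v


probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy(probs, target_index=0)
param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(probs, loss, param)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient magnitude.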
4.2. Deep Learning-Based Rotating Machinery Fault Diagnosis Model
The bearing fault diagnosis model based on the IVMD and CNN methods is shown in Figure 3. The method developed in this paper consists of three main parts.
(1) Preprocessing.
Step 1: Select 10 types of fault signals in the dataset.
Step 2: The optimal decomposition number K of these fault vibration signals is determined by the minimum average Pearson coefficient principle proposed in this paper.
Step 3: These signals are decomposed by the IVMD method, and the decomposed components are subjected to correlation analysis and reconstruction.
Step 4: Convert the one-dimensional signals into two-dimensional time–frequency images by CWT.
Step 5: Resize these images to 3 × 64 × 64.
(2) Feature extraction.
Step 6: Establish a CNN and train it on the preprocessed fault image dataset; the network extracts the image features of these fault signals.
(3) Classification.
Step 7: Fully connected layers are used to classify different types of faults. The model is established to complete the task of fault diagnosis.
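The correlation analysis and reconstruction in Step 3 can be sketched as follows: each decomposed component is compared with the original signal, and only the components whose absolute Pearson coefficient reaches a threshold are summed. The threshold value of 0.3, the function names, and the hand-built sinusoidal "modes" are assumptions for illustration; a real pipeline would take the modes from the decomposition step.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def reconstruct(signal, modes, threshold=0.3):
    """Sum the modes whose |r| with the original signal reaches the threshold."""
    kept = [i for i, mode in enumerate(modes)
            if abs(pearson(signal, mode)) >= threshold]
    rebuilt = [sum(modes[i][t] for i in kept) for t in range(len(signal))]
    return rebuilt, kept


# Toy signal: two dominant components plus one weak component that
# stands in for redundant content.
n = 200
low = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
mid = [0.8 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
weak = [0.05 * math.sin(2 * math.pi * 60 * t / n) for t in range(n)]
signal = [a + b + c for a, b, c in zip(low, mid, weak)]

rebuilt, kept = reconstruct(signal, [low, mid, weak], threshold=0.3)
print(kept)
```

The reconstructed signal would then be passed to the time–frequency transform of Step 4; choosing the threshold is a tuning decision that this sketch does not settle.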