1. Introduction
The gearbox is a key component of the transmission systems of large machines, such as helicopters, cars, and fans. These machines are constantly exposed to the weather and are subjected to high-intensity loads for extended periods, resulting in frequent gearbox failures that disrupt normal operation and can even cause economic losses and casualties. Detecting failures at an early stage can prevent catastrophic consequences. Therefore, intelligent fault diagnosis of gearboxes has significant research value [1,2,3,4,5,6].
Currently, vibration signals are commonly used for gearbox fault diagnosis. The actual operating environment of a gearbox is severe, with constantly changing load conditions, which results in an irregular vibration signal. Interference between the internal components of the gearbox also makes the vibration signal nonlinear, so the collected signal contains a variety of complex noise components. Consequently, noise reduction of vibration signals and fault diagnosis based on them are active research topics, and fruitful results have been obtained [7,8,9]. Since the empirical mode decomposition (EMD) method was introduced for noise reduction of nonstationary, nonlinear signals, EMD-like methods have been widely applied to signal noise reduction. For example, Abdelkader et al. [10] used the average energy to optimize the thresholding of the intrinsic mode function (IMF) components of the EMD to achieve noise reduction, and their experiments verified that this method is more effective and sensitive for the detection and diagnosis of rolling bearing faults. Liu et al. [11] used kurtosis to select the main IMF components of the EMD and then filtered them with an impact dictionary, which separates the high-frequency resonant component from the meshing harmonics and part of the noise to achieve noise reduction. Gao et al. [12] used integrated evaluation and wavelet thresholding to select and process the IMF components obtained by ensemble empirical mode decomposition (EEMD), and used simulations to verify that the method can extract valid information from signals under high noise. Liu et al. [13] applied the complementary ensemble empirical mode decomposition (CEEMD) method to nonstationary, nonlinear vibration signals, and their experiments demonstrated that the CEEMD algorithm adapts well to unstable signals and can effectively extract fault features. Although the above algorithms achieve good noise reduction results for vibration signals, they have the following problems: the endpoint effect and mode mixing problems of EMD are difficult to resolve when decomposing vibration signals, and the Gaussian white noise added in EEMD results in a high computational cost and a tendency to produce spurious IMF components. Although CEEMD addresses the endpoint effect and mode mixing problems, the number of IMF components generated can differ between decompositions, leading to errors in ensemble averaging.
In 2014, Dragomiretskiy et al. [14] proposed the variational mode decomposition (VMD) method, which handles signals with strong noise and interference more effectively. The accuracy of the VMD of a vibration signal depends on the number of decomposition layers K and the penalty factor α, and different methods have been used to find their optimal values. For example, Fu et al. [15] used a central frequency observation method to determine the predefined decomposition level K. Yan et al. [16] determined K by computing the spectral centroid of each IMF component. Zhan et al. [17] used changes in scattering entropy to determine the optimal K for the VMD. Zhang et al. [18] used a genetic algorithm combined with nonlinear programming to solve for the VMD parameters. All of the above methods can determine the VMD parameters well, but each is one-sided. Therefore, a better optimization method is sought for noise reduction of the original signal, so that the data used for diagnosis better characterize the fault. Using the noise-reduced signal as sample data can then improve the accuracy of fault diagnosis.
With the deepening of research on deep learning theory, more and more scholars have applied deep learning to fault diagnosis, making it an effective means of fault diagnosis [19,20,21]. The advantage of deep learning methods over traditional machine learning is that they can automatically learn features from raw vibration data, which removes the need for manual extraction of fault features and improves diagnostic accuracy. Therefore, deep learning fault diagnosis methods based on vibration signals are widely used for all types of mechanical equipment. He et al. [22] proposed a method that combines vibration signal analysis with deep learning, forming a deep learning structure embedded with a time-synchronous resampling mechanism for early bearing fault diagnosis. Xu et al. [23] studied a hybrid deep learning model that substantially improved the accuracy of bearing fault diagnosis. Bai et al. [24] proposed a diesel engine fault diagnosis method that uses a stacked sparse autoencoder for fault feature dimension reduction and a support vector machine for diagnosis, with good results and engineering application value. Li et al. [25] proposed using deep learning and multimodal feature fusion to build fault diagnosis models. Shen et al. [26] used a multilabel convolutional neural network to learn relevant features in vibration signals for fault diagnosis, achieving higher diagnostic accuracy than conventional methods.
Although the above methods can solve the gearbox fault diagnosis problem to a certain extent, the following problems remain:
The gearbox operating environment is harsh, so weak early fault signals are seriously affected by noise and fault information is disturbed or masked, making it difficult to reveal fault characteristic information.
Deep learning requires a large amount of labelled data, but in practical engineering applications fault states last only a short time, so large amounts of fault data are difficult to obtain quickly. To obtain a sufficient amount of data, the equipment would need to fail several times and remain in a failed state for a long time. Even once enough data have been collected, training a robust deep learning model takes considerable additional time. These factors severely limit deep learning methods in practical engineering applications.
Transfer learning can effectively address the need of deep learning fault diagnosis for huge amounts of labelled data. With a transfer learning approach, the model does not need to be retrained from scratch, and only a small number of labelled samples are needed to fine-tune the model parameters to achieve good diagnostic results. Yu et al. [27] combined the wavelet packet transform and the multi-kernel maximum mean discrepancy to perform deep transfer diagnosis of bearing faults using residual networks (ResNet), which performs diagnosis well and suppresses the effects of noise. Bai et al. [28] proposed a fault diagnosis method based on transfer learning with optimized variational mode decomposition and deep residual networks, which is effective for noise reduction and fault diagnosis of diesel engines. Su et al. [29] extended convolutional deep belief networks to extract transferable features from the raw vibration data and used a dynamic multilayer perceptron for fault classification; experiments showed good classification accuracy for bearings under variable operating conditions. Luo et al. [30] replaced the sparse term divergence in the original stacked autoencoder with a convolutional shortcut to solve the vanishing gradient problem in deep transfer learning and improve feature extraction, achieving superior results in rolling bearing diagnosis.
To address these issues, a method combining the whale optimization algorithm (WOA), VMD, and deep transfer learning is proposed for the fault diagnosis of gearboxes. First, the WOA is used to find the optimal decomposition parameters (K and α), and the correlation coefficient between each IMF component and the original signal is used to select the components for signal reconstruction, which achieves noise reduction. Second, a continuous wavelet transform (CWT) is used to convert the reconstructed signal into a two-dimensional time–frequency map, and these maps form the dataset for fault diagnosis. Finally, the AlexNet network is used as the transfer model; after pretraining and fine-tuning, the AlexNet transfer learning (AlexNet-TL) network model is obtained and used to classify the 2D time–frequency maps. Experiments show that this method can identify fault types quickly and with high accuracy.
The main contributions and innovations of this paper are as follows:
To address the difficulty of processing nonlinear, nonstationary vibration signals, the weakness of fault signals under complex conditions, and the difficulty of extracting fault features, this paper proposes a WOA-VMD method for signal noise reduction. In complex environments with strong interference, this method removes the effects of noise well and makes the fault signature signal more visible.
This paper uses the continuous wavelet transform to convert a one-dimensional vibration signal into a two-dimensional time–frequency image. Because deep learning extracts features from two-dimensional images well, this avoids the blindness of manual fault feature extraction in traditional machine learning and can effectively improve fault diagnosis accuracy.
This paper adopts a fault diagnosis method based on model transfer. The large amount of data available in ImageNet can be used to train a stable model. With the help of model transfer, the need for labelled samples and the reliance on expert experience can be greatly reduced, making the diagnostic approach more general.
The remainder of this paper is organized as follows:
Section 2 describes the noise reduction method of WOA-VMD;
Section 3 presents the basic theory of CWT-based time–frequency transformed image generation and deep transfer learning;
Section 4 describes in detail the fault diagnosis method steps for WOA-VMD and transfer learning;
Section 5 is devoted to experiments and the comparative validation of the proposed fault diagnosis methods;
Section 6 presents the conclusions.
4. Gearbox Fault Diagnosis Based on WOA-VMD and Deep Transfer Learning
The flow of the gearbox fault diagnosis method proposed in this paper, shown in Figure 7, is divided into the following four key steps:
Step 1. Vibration signal noise reduction and reconstruction: Data are collected on the experimental platform under different working conditions. The WOA-VMD method is used to decompose the original signal, and the sample entropy of each IMF component is computed. The K and α values for which the sample entropy is smallest are taken as the optimal parameters. The correlation coefficient between each resulting IMF component and the original signal is then computed, and the IMF components with high correlation coefficients are selected and summed to reconstruct the denoised signal.
Step 2. 2D time–frequency map conversion and dataset generation: Taking the denoised, reconstructed signal from the previous step as input, the CWT is used to convert the one-dimensional vibration signal into two-dimensional time–frequency maps, which form the dataset.
Step 3. Generation of the AlexNet-TL network model: The pretrained AlexNet network is used as the transfer model. During fine-tuning, the parameters of the first five convolutional layers are kept frozen, the parameters of the last three fully connected layers are fine-tuned, and the training data from the generated dataset are fed into the pretrained model. The network obtained after fine-tuning is named AlexNet-TL.
Step 4. Fault pattern recognition with AlexNet-TL: The test set and validation set data for the four operating conditions are fed into the AlexNet-TL network model to obtain the fault diagnosis results.
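The selection logic of Step 1 can be illustrated with a minimal Python sketch. It assumes that the IMF components have already been produced by some VMD routine (not shown here), and it relies on names and settings that are illustrative assumptions rather than values taken from this paper: the helper functions, an embedding dimension m = 2 with tolerance r = 0.2 times the standard deviation for the sample entropy, a fitness defined as the smallest sample entropy among the IMFs of one (K, α) candidate, and a correlation threshold of 0.4. Synthetic components stand in for real IMFs.

```python
import math
import random


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) with r = r_factor * std(x) (assumed settings)."""
    n = len(x)
    mean = sum(x) / n
    r = r_factor * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        # Use the same n - m template start points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b, a = count_matches(m), count_matches(m + 1)
    return float("inf") if a == 0 or b == 0 else -math.log(a / b)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def decomposition_fitness(imfs):
    """Hypothetical WOA fitness for one (K, alpha) candidate: smallest IMF sample entropy."""
    return min(sample_entropy(imf) for imf in imfs)


def reconstruct(signal, imfs, threshold=0.4):
    """Sum the IMFs whose correlation with the signal reaches the (assumed) threshold."""
    kept = [imf for imf in imfs if abs(pearson(imf, signal)) >= threshold]
    denoised = [sum(vals) for vals in zip(*kept)] if kept else [0.0] * len(signal)
    return denoised, len(kept)


# Synthetic stand-ins for IMFs: two harmonics plus a weak noise component.
n = 300
rng = random.Random(0)
imf_low = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
imf_high = [0.8 * math.sin(2 * math.pi * 20 * k / n) for k in range(n)]
imf_noise = [0.1 * rng.gauss(0.0, 1.0) for _ in range(n)]
imfs = [imf_low, imf_high, imf_noise]
signal = [sum(vals) for vals in zip(*imfs)]

fitness = decomposition_fitness(imfs)
denoised, n_kept = reconstruct(signal, imfs)
print(f"fitness = {fitness:.3f}, kept {n_kept} of {len(imfs)} IMFs")
```

In a full pipeline, an optimizer such as the WOA would call a function like `decomposition_fitness` once per candidate (K, α) pair and keep the pair with the smallest value; the threshold used in `reconstruct` would need to be chosen for the data at hand.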
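The time–frequency conversion of Step 2 can be sketched in the same hedged way. This section does not state which mother wavelet is used, so the complex Morlet wavelet with ω0 = 6, the function names, and the small set of scales below are illustrative assumptions. The sketch evaluates the CWT by direct summation, which is slow but easy to follow, and returns the magnitude scalogram as a list of rows (one per scale); a practical implementation would use an optimized library routine and render the scalogram as an image for the network.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet; w0 = 6 is a common choice, assumed here."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_scalogram(signal, scales, w0=6.0):
    """Return |W(a, b)| as a list of rows, one row per scale a."""
    n = len(signal)
    rows = []
    for a in scales:
        norm = 1.0 / math.sqrt(a)
        row = []
        for b in range(n):
            # Samples farther than 4 scale-widths from b contribute negligibly.
            lo = max(0, int(b - 4 * a))
            hi = min(n, int(b + 4 * a) + 1)
            acc = 0j
            for k in range(lo, hi):
                acc += signal[k] * morlet((k - b) / a, w0).conjugate()
            row.append(abs(acc) * norm)
        rows.append(row)
    return rows


# Test tone at 0.05 cycles per sample; the Morlet response should peak
# near the scale a = w0 / (2 * pi * 0.05), which is roughly 19.
signal = [math.sin(2 * math.pi * 0.05 * k) for k in range(256)]
scales = [5, 10, 20, 40]
scalogram = cwt_scalogram(signal, scales)

# Average magnitude per scale, then pick the dominant scale.
mean_magnitude = [sum(row) / len(row) for row in scalogram]
best_scale = scales[mean_magnitude.index(max(mean_magnitude))]
print("dominant scale:", best_scale)
```

Each row of `scalogram` corresponds to one horizontal line of the time–frequency map, so a denser set of scales gives a finer frequency axis at a proportionally higher computational cost.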