1. Introduction
China has the longest and most complex high-speed rail system in the world [1]. The advantages of rail traffic, such as its economic efficiency and competitive transport prices, contribute to higher demand and increased traffic intensity, as well as to higher train loads and speeds, which ultimately results in the rapid deterioration of this infrastructure [2,3]. High-speed trains (HSTs) are often considered one of the best modes of transportation, and in China they operate at speeds of up to 380 km/h [4]. At such high speeds, bearings, as important parts of the transmission system, are subjected to complex alternating loads, which increase their wear rate and reduce their service life [5]. Because bearings are key components of HSTs, their health condition strongly affects the operation of the whole train.
Most existing research on bearing fault diagnosis focuses on the vibration signals generated by bearing impacts, which carry information about the health condition of the bearings; analyzing these signals is therefore crucial for diagnosing bearing faults in high-speed trains. With the development of machine learning, a growing number of researchers have studied intelligent diagnosis algorithms and have produced a variety of effective fault diagnosis algorithms based on vibration acceleration signals. Conventional intelligent diagnostic approaches typically combine signal processing with a classifier. Signal processing methods with a well-established theoretical foundation, such as variational mode decomposition (VMD) [6], empirical mode decomposition (EMD) [7], and the empirical wavelet transform (EWT) [8], are used to obtain signal features. These features are then input into classification methods, such as support vector machines (SVMs), extreme learning machines (ELMs) [9], and back-propagation neural networks (BPNNs) [10], to recognize the bearing fault state. Although these methods can classify and identify the bearing fault state, their diagnostic accuracy depends strongly on the fault features extracted in the preceding signal processing stage. Feature extraction using traditional signal processing methods relies mainly on personal experience, and it is difficult to extract the relevant information from the vibration signal sufficiently. This results in low-quality features, which in turn reduces the accuracy of the final fault identification.
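The two-stage pipeline described above (hand-crafted features followed by a classifier) can be illustrated with a minimal sketch. It is not the method of any cited work: simple time-domain statistics (RMS, kurtosis, crest factor) stand in for VMD/EMD/EWT features, a nearest-centroid rule stands in for an SVM, ELM, or BPNN so that the example needs only the standard library, and `synthetic_signal` is a hypothetical generator of toy vibration data.

```python
import math
import random


def time_domain_features(signal):
    """Return (RMS, kurtosis, crest factor) of a 1-D vibration signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    variance = sum(c * c for c in centered) / n
    rms = math.sqrt(sum(s * s for s in signal) / n)
    kurtosis = (sum(c ** 4 for c in centered) / n) / variance ** 2
    crest = max(abs(s) for s in signal) / rms
    return (rms, kurtosis, crest)


def synthetic_signal(faulty, n=2048, seed=0):
    """Hypothetical signal: noisy sinusoid, plus periodic impacts when faulty."""
    rng = random.Random(seed)
    signal = [math.sin(2 * math.pi * 50 * t / n) + 0.1 * rng.gauss(0, 1)
              for t in range(n)]
    if faulty:
        for t in range(0, n, 128):  # one impact every 128 samples
            signal[t] += 5.0
    return signal


def fit_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    grouped = {}
    for feat, label in zip(features, labels):
        grouped.setdefault(label, []).append(feat)
    return {label: tuple(sum(col) / len(col) for col in zip(*feats))
            for label, feats in grouped.items()}


def predict(centroids, feat):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], feat))


# Train on three toy signals per class, then classify an unseen one.
train = [(synthetic_signal(f, seed=s), "fault" if f else "healthy")
         for f in (False, True) for s in range(3)]
centroids = fit_centroids([time_domain_features(sig) for sig, _ in train],
                          [label for _, label in train])
print(predict(centroids, time_domain_features(synthetic_signal(True, seed=42))))
```

The sketch also shows the weakness noted in the text: the classifier can only be as good as the statistics chosen by hand, here three quantities picked because impacts are known to raise kurtosis and crest factor.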
Zhang et al. [11] combined spatial dropout regularization with separable convolution to extract features from the raw signals, achieving effective differentiation of the raw signals of bearings in different states. Liu et al. [12] used improved one-dimensional and two-dimensional convolutional neural networks to achieve intelligent recognition of the bearing states, demonstrating high recognition accuracy. Che et al. [13] combined a deep belief network and a convolutional neural network to extract relevant information from grayscale images and time series signals and used the fused deep learning model to recognize the fault types. Such deep-learning-based intelligent fault diagnosis approaches [14,15,16] are more adaptable because they extract features from vibration data automatically. However, their excellent performance depends on large amounts of well-labeled training data and requires that the training data and the test data be identically distributed. In practical engineering work, owing to complex working environments, it is difficult to obtain enough labeled data with the same distribution, which prevents deep-learning-based intelligent fault diagnosis methods from being applied effectively.
In an intelligent fault diagnosis approach based on transfer learning, the data used for model training and testing do not need to have the same probability distribution. Knowledge learned in the source domain is transferred to accomplish the task in the target domain, which is especially useful when the target domain has few or no labeled samples [17]. Recently, building on various deep neural networks, scholars have proposed several transfer learning methods for rotating machinery fault diagnosis and have validated them experimentally on various datasets collected under different operating conditions [18,19]. Domain adaptation is a transfer learning method that automatically aligns the deep feature distributions of the source and target domains by using distribution discrepancy metrics, such as the maximum mean discrepancy (MMD), multi-kernel maximum mean discrepancy (MK-MMD), and Wasserstein distance, thereby transferring knowledge from the source domain to meet the task demands of the target domain. Li et al. [20] added the MK-MMD to a deep transfer network model to reduce the distribution discrepancy between the data in the two domains and identified the faults in the target domain. Wan et al. [21] used the MK-MMD to adjust the marginal and conditional distributions in the multidomain discriminators of the two domains, extracting domain-invariant features and completing cross-domain fault identification. Guo et al. [22] embedded the MMD into a fully connected layer of a convolutional neural network, enabling domain-invariant feature extraction for fault pattern recognition. Wen et al. [23] combined an adversarial mechanism with an autoencoder to achieve cross-domain diagnosis from experimental bearings to the actual wheelset bearings used in the test models. Zhang et al. [24] presented an improved domain-adversarial neural network with improved multi-feature fusion to classify bearing health status. Wu et al. [25] combined the adversarial mechanism and MMD with a weight-sharing convolutional neural network to achieve fault identification under different operating conditions.
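To make the MMD metric mentioned above concrete, the following minimal sketch computes the standard biased estimate of the squared MMD with a single Gaussian kernel. It is an illustration only, not the implementation used in any cited work; the bandwidth `sigma`, the helper `sample`, and the toy 2-D "features" are assumptions introduced for the example.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))
    return (mean_kernel(source, source) + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def sample(mean, n, rng):
    """Hypothetical 2-D features drawn from a unit-variance Gaussian."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


rng = random.Random(0)
source = sample(0.0, 100, rng)
same_load = sample(0.0, 100, rng)    # same distribution as the source
other_load = sample(2.0, 100, rng)   # shifted distribution
print(mmd_squared(source, same_load), mmd_squared(source, other_load))
```

The estimate stays close to zero when both samples come from the same distribution and grows as the distributions move apart, which is why it can serve as a training penalty for aligning source and target features. A multi-kernel variant would replace the single kernel with a weighted combination of kernels at several bandwidths.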
In summary, many researchers have investigated deep transfer learning methods from various perspectives to improve bearing fault state recognition. However, compared with conventional bearings, the forces on rolling bearings in high-speed trains are more complex and variable. Based on variable load conditions and existing theories, this study establishes a transfer model for fault diagnosis that incorporates an adversarial mechanism and a channel reconstruction unit (CRU). In establishing the model, we analyze not only the influence of each module in the feature extractor on feature extraction performance but also the effect of the structures of the fault recognition module and domain discrimination module on diagnostic accuracy. Finally, the method is analyzed experimentally and found to perform well in identifying bearing fault states under different load conditions.
The rest of the paper is organized as follows. Section 2 introduces the theory of domain-adversarial neural networks (DANNs), Section 3 describes the transfer learning fault diagnosis framework, Section 4 validates the performance of the method through experiments, and Section 5 summarizes the paper.
2. Basic Theory
The feature space of the data and its marginal distribution are denoted by $\mathcal{X}$ and $P(X)$, respectively. Thus, $\mathcal{D} = \{\mathcal{X}, P(X)\}$ represents a domain. Two domains differ when their feature spaces or their marginal distributions differ. $\mathcal{D}_s$ denotes the source domain, while $\mathcal{D}_t$ denotes the target domain. These two domains comprise bearing vibration data under various loads with the same labels but distinct marginal distributions, $P_s(X) \neq P_t(X)$. The purpose of transfer learning is to aid in accomplishing tasks in the target domain by leveraging knowledge from the source domain. The domain-adversarial neural network (DANN) [26] is one of the classical methods of transfer learning.
A DANN mainly comprises a feature extractor $G_f$, a domain discrimination module $G_d$, and a fault recognition module $G_y$. The feature extractor performs deep feature extraction on the input data, while the fault recognition module identifies faults in the target-domain test samples after feature extraction. The domain discrimination module serves as a binary classifier with an added gradient reversal layer.
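The role of the gradient reversal layer can be sketched with a scalar toy model. The layer passes features through unchanged in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass, so a gradient-descent step on the feature extractor increases the domain loss. Everything below is an assumption made for illustration (a one-weight extractor `w`, a one-weight sigmoid discriminator `v`, and the class and function names); it is not the network used in this paper.

```python
import math


class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam going backward."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        return list(features)

    def backward(self, grad_output):
        return [-self.lam * g for g in grad_output]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss(w, v, x, d):
    """Binary cross-entropy of a one-weight discriminator on the feature w * x."""
    p = sigmoid(v * w * x)
    return -(d * math.log(p) + (1 - d) * math.log(1 - p))


def extractor_step(w, v, x, d, lam=1.0, lr=0.1):
    """One descent step on the extractor weight w through the reversal layer."""
    grl = GradientReversal(lam)
    feature = grl.forward([w * x])[0]
    p = sigmoid(v * feature)
    grad_feature = (p - d) * v                       # d(loss) / d(feature)
    reversed_grad = grl.backward([grad_feature])[0]  # sign flipped, scaled by lam
    grad_w = reversed_grad * x                       # chain rule: feature = w * x
    return w - lr * grad_w


w0, v, x, d = 0.5, 2.0, 1.0, 1
w1 = extractor_step(w0, v, x, d)
print(domain_loss(w0, v, x, d), domain_loss(w1, v, x, d))
```

Because the discriminator itself is still trained to minimize the same loss, the two modules are pushed in opposite directions, which is the adversarial mechanism that encourages domain-invariant features. In an automatic-differentiation framework the same behavior is usually written as a custom operation with an identity forward pass and a negated backward pass.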
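$x$ represents the input data, and the feature extractor $G_f$ learns the features of the data samples by updating the parameter $\theta_f$, which can be denoted as $G_f(x; \theta_f)$. The features learned by the extractor $G_f$ are then input into the fault recognition module $G_y$ for fault recognition, represented as $G_y(G_f(x; \theta_f); \theta_y)$. The loss function is expressed as Equation (1):

$$L_y^i(\theta_f, \theta_y) = L_y\big(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\big) \tag{1}$$

where $x_i$ denotes the $i$-th sample, $y_i$ denotes the fault category of the $i$-th sample, and $\theta_f$ and $\theta_y$ represent the parameters of the feature extractor $G_f$ and the fault recognition module $G_y$, respectively. The domain discrimination module $G_d$ is used to determine which domain a sample comes from, and its loss function is given in Equation (2):

$$L_d^i(\theta_f, \theta_d) = L_d\big(G_d(G_f(x_i; \theta_f); \theta_d),\, d_i\big) \tag{2}$$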
where $\theta_d$ represents the parameters of the domain discrimination module $G_d$ and $d_i$ denotes the domain label of the $i$-th sample. The loss function is shown in Equation (3).
where $x_i$ and $y_i$ denote the samples and their corresponding true labels, $x_j$ denotes the samples in the target domain, and $\lambda$ is the trade-off parameter. The total number of samples is represented by $N$. $n_s$ and $n_t$ represent the number of samples in the source and target domains, respectively. The total loss function is shown in Equation (4).