1. Introduction
Rolling bearings are integral components of rotating machinery [1]. They serve as critical and vulnerable support components in a wide range of applications involving large and complex equipment [2]. However, detecting early fault signals in rolling bearings can be challenging because of their low signal-to-noise ratio and subtle fault characteristics. Early bearing faults that go undetected and are allowed to propagate can result in substantial economic losses and even pose safety risks [3].
The early diagnosis of bearing faults plays a crucial role in identifying and addressing issues while they are still in their nascent or mild stages [4]. Timely diagnosis guides maintenance and repair activities, enabling prompt measures to prevent extensive losses. This not only enhances the operational safety of complex equipment, reducing the potential for major accidents, but also yields significant social and economic benefits [5]. For instance, Cai et al. [6] introduced a data-driven early fault diagnosis approach for permanent magnet synchronous motors. Leveraging vibration and acoustic emission data in Bayesian networks, they achieved fault diagnosis accuracy rates exceeding 90%. Kong et al. [7] proposed an attention recurrent autoencoder (AE) hybrid model for classifying early faults and assessing fault severity in rotating machinery. This method uses the original one-dimensional vibration signal as input, obviating the need for time–frequency conversion while delivering precise fault diagnosis and severity assessment for varying degrees of pitting. Wang et al. [8] introduced an intelligent early fault diagnosis framework founded on hierarchical diversity entropy (HDE) in conjunction with random forests. Compared with other entropy-based approaches, such as multi-scale sample entropy, multi-scale permutation entropy, multi-scale fuzzy entropy, and multi-scale diversity entropy, HDE demonstrated superior feature extraction capabilities. Chen et al. [9] introduced a spectral autocorrelation smoothing index, reciprocally sensitive to periodic pulses, for detecting resonant frequency bands. In contrast to methods such as Infogram, spectral smoothing index reciprocal, and improved VMD, this approach effectively identifies subtle faults and enables earlier detection of bearing faults.
The fractional Fourier transform (FRFT) is a generalization of the conventional Fourier transform [10]. It not only preserves the characteristics and advantages of the traditional Fourier transform, but also uses a single transformation order p to reveal signal details in the fractional domain. This capability overcomes limitations in handling non-stationary signals and broadens its applicability in fault diagnosis. For instance, Wu et al. [11] applied FRFT to motor fault data, extracting pertinent features from the fractional domain, which were subsequently fed into a support vector machine for diagnosis. The method's feasibility was then corroborated through testing on a dedicated bench. Ma et al. [12] first subjected bearing vibration signals to FRFT filtering to extract the relevant feature frequencies, and then trained a model using the sparrow search algorithm and deep belief networks for fault diagnosis and classification. Zhang et al. [13] combined the strengths of the empirical wavelet transform and FRFT, dynamically constructing wavelet filter banks within the fractional Fourier domain. This facilitated the extraction of fault feature components from vibration signals during rotor startup. Consequently, FRFT-based methods are a promising choice for fault feature extraction.
At present, deep learning has achieved remarkable success in fault diagnosis, mainly owing to its ability to extract features automatically [14]. However, contemporary models suffer from excessive parameter counts and training complexity, which has prompted the widespread adoption of lightweight neural networks [15]. For instance, Wang et al. [16] combined VMD-Hilbert spectrum techniques with ShuffleNet-V2 to realize gearbox fault diagnosis in mining scraper conveyors. Xue et al. [17] employed MobileNetV2 in combination with fast spectral kurtosis analysis for bearing fault diagnosis. Zhu et al. [18] introduced an improved MobileNet method that incorporates wavelet energy and global average pooling for diagnosing faults in rotating machinery. Consequently, lightweight neural networks are a strategy deserving wider adoption.
The primary objective of this study was to devise a comprehensive method for the early diagnosis of rolling bearing faults that combined frequency domain signal analysis with lightweight neural networks. To capture early fault signals from vibration sensor data, the mean square value served as an indicator for detecting abnormalities. Following this, the time domain signal was transformed into a frequency domain signal through fractional Fourier transform to extract more detailed frequency features. Subsequently, amplitude frequency and phase frequency information were integrated to present multiple frequency domain features intuitively, offering richer data for subsequent pattern recognition. In this research, the lightweight neural network, Xception, was utilized for rolling bearing fault diagnosis. Xception boasts a streamlined network structure that effectively reduces parameter complexity while preserving model performance. By inputting the fused frequency domain signals into the Xception model, automated classification and recognition of various fault types could be achieved.
This study’s contribution is the proposal of a method for early fault diagnosis of rolling bearings that seamlessly integrates frequency domain information and lightweight neural networks. The innovation is embodied in a novel approach that combines information fusion through fractional Fourier transform with the application of Xception, effectively bridging the gap in the field of early fault diagnosis of rolling bearings. By comprehensively harnessing the attributes of frequency domain signals and leveraging lightweight neural networks for automated fault classification, the aim was to attain more precise and efficient fault diagnosis of rolling bearings. Furthermore, this research holds potential implications for the adoption of deep learning technology in engineering applications and offers valuable insights for future investigations in related domains.
This article is structured as follows: the second section introduces basic principles; the third section delves into the methods of frequency domain information fusion and lightweight neural network fault diagnosis of early rolling bearing faults; the fourth section showcases the experimental design and result analysis; and finally, we summarize the research findings and anticipate this study’s application prospects in the field of rolling bearing fault diagnosis.
2. Materials and Methods
2.1. Definition of the Fractional-Order Fourier Transform
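The p-order fractional Fourier transform of a signal x(t) is defined as [19]

$$X_p(u) = \int_{-\infty}^{+\infty} x(t)\, K_p(t, u)\, \mathrm{d}t$$

When $\alpha \neq n\pi$, the kernel function is

$$K_p(t, u) = \sqrt{\frac{1 - j\cot\alpha}{2\pi}}\, \exp\left[ j\left( \frac{t^2 + u^2}{2}\cot\alpha - ut\csc\alpha \right) \right]$$

When $\alpha = 2n\pi$ or $\alpha = (2n + 1)\pi$, $K_p$ can be replaced by $\delta(u - t)$ or $\delta(u + t)$, respectively. In the formula, $\alpha = p\pi/2$ is the rotation angle, p is the order of the fractional Fourier transform, and n is an integer.
For $\alpha \neq n\pi$, the formula for the fractional Fourier transform is therefore

$$X_p(u) = \sqrt{\frac{1 - j\cot\alpha}{2\pi}} \int_{-\infty}^{+\infty} \exp\left[ j\left( \frac{t^2 + u^2}{2}\cot\alpha - ut\csc\alpha \right) \right] x(t)\, \mathrm{d}t$$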
The order p can take any real value. Typically, a counterclockwise rotation angle of α = pπ/2 is used for analysis; at other angles, owing to the symmetry and periodicity of the fractional Fourier transform, the signal analysis results are equivalent. The p value therefore generally ranges from 0 to 1. When p = 0, the transform returns the original signal; when p = 1, it reduces to the classical Fourier transform. As p increases from 0 to 1, the fractional Fourier transform transitions smoothly from the original signal to its classical Fourier transform. The fractional Fourier transform can therefore reveal how the signal's characteristics change during the gradual transition from the time domain to the frequency domain.
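As a rough numerical illustration of the definition above, the following Python sketch evaluates the kernel integral by direct summation on a uniform grid. It is not a fast discrete FRFT algorithm and is not necessarily the implementation used in this study; the function name `frft_direct`, the grid limits, and the Gaussian test signal are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def frft_direct(x, t, u, p):
    """Approximate the p-order FRFT of samples x(t) at the output points u.

    Direct quadrature of the kernel integral (cost len(t) * len(u)).
    Assumes a uniform grid t and a signal that has decayed to roughly
    zero at the grid edges. Hypothetical helper, for illustration only.
    """
    x = np.asarray(x, dtype=complex)
    alpha = p * np.pi / 2.0                  # rotation angle
    if np.isclose(np.sin(alpha), 0.0):
        # alpha = n*pi: the kernel degenerates to a delta function
        raise ValueError("use x(u) or x(-u) directly when alpha = n*pi")
    cot = 1.0 / np.tan(alpha)
    csc = 1.0 / np.sin(alpha)
    dt = t[1] - t[0]
    amp = np.sqrt((1.0 - 1j * cot) / (2.0 * np.pi))
    # phase[i, k] is the kernel phase for output point u_i and input time t_k
    phase = (0.5 * (u[:, None] ** 2 + t[None, :] ** 2) * cot
             - u[:, None] * t[None, :] * csc)
    kernel = amp * np.exp(1j * phase)
    return kernel @ x * dt                   # Riemann-sum approximation

t = np.linspace(-8.0, 8.0, 1024)
u = np.linspace(-3.0, 3.0, 61)
gauss = np.exp(-t ** 2 / 2.0)
half = frft_direct(gauss, t, u, p=0.5)       # halfway between time and frequency
full = frft_direct(gauss, t, u, p=1.0)       # classical Fourier transform
```

The Gaussian exp(−t²/2) is an eigenfunction of the FRFT with eigenvalue 1, so both `half` and `full` should stay close to exp(−u²/2); this gives a simple sanity check on the kernel's sign and normalization conventions.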
2.2. Gramian Angular Summation Fields
Time domain analysis, a commonly used method in signal processing, reflects the most authentic and comprehensive information in a signal and therefore has a wide range of applications in the fault diagnosis of rotating machinery. However, traditional time domain analysis methods cannot accurately identify the type of bearing fault, and directly converting one-dimensional signals into two-dimensional images does not preserve the temporal correlation of the signals, resulting in a loss of signal information. The Gramian Angular Summation Field (GASF) addresses this problem by preserving the temporal information of bearing signals in the images used as inputs to the neural network, so that the final predictions better reflect the temporal structure of the signal [20].
GASF is a method that encodes time series signals into two-dimensional matrices by way of polar coordinates. The process involves representing a time series X with n points, denoted as $X = \{x_1, x_2, \ldots, x_n\}$, in polar coordinates and converting it into an image through two main steps.
(1) Begin by normalizing the time series, scaling the values in X to the interval [0, 1]. The formula for this normalization is as follows:

$$\tilde{x}_i = \frac{x_i - \min(X)}{\max(X) - \min(X)}$$

Here, $\tilde{x}_i \in [0, 1]$. Then encode the value of $\tilde{x}_i$ as the angular cosine and the time stamp as the radius. The formula is as follows:

$$\phi_i = \arccos(\tilde{x}_i), \qquad r_i = \frac{t_i}{N}$$

Here, $\phi_i$ is the angle in polar coordinates whose cosine is $\tilde{x}_i$, $t_i$ is the time stamp, and N is a constant factor for adjusting the span of the polar coordinate system. For a time series within the interval [0, 1], the inverse cosine confines the angle to [0, π/2]. Moreover, because the inverse cosine is monotonic on this interval, the encoding of each $\tilde{x}_i$ is unique, as is its inverse mapping.
(2) Within the polar coordinate system $(\phi, r)$, compute the cosine of the sum of the angles for each pair of points and enter the result into the matrix. The formula is as follows:

$$\mathrm{GASF} = \begin{bmatrix} \cos(\phi_1 + \phi_1) & \cdots & \cos(\phi_1 + \phi_n) \\ \vdots & \ddots & \vdots \\ \cos(\phi_n + \phi_1) & \cdots & \cos(\phi_n + \phi_n) \end{bmatrix} = \tilde{X}^{\mathsf{T}} \tilde{X} - \left(\sqrt{I - \tilde{X}^2}\right)^{\mathsf{T}} \sqrt{I - \tilde{X}^2}$$

where $\tilde{X}$ is the normalized series written as a row vector and I is the unit row vector. The matrix operation takes the form of an inner product, and the main diagonal of the matrix, $\cos(2\phi_i)$, retains the original value and angle information of the mapped time-domain signal. The GASF matrix is then rendered as a two-dimensional GASF image, which preserves the absolute temporal relationships of the signal through the polar coordinate encoding.
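The two steps above can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy; the function name `gasf` and the sine test signal are hypothetical and are not taken from the study's code.

```python
import numpy as np

def gasf(series):
    """Return the Gramian Angular Summation Field of a 1-D series."""
    x = np.asarray(series, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        raise ValueError("constant series: min-max scaling is undefined")
    x_tilde = (x - x.min()) / span            # step (1): scale to [0, 1]
    x_tilde = np.clip(x_tilde, 0.0, 1.0)      # guard against rounding error
    phi = np.arccos(x_tilde)                  # angles in [0, pi/2]
    # step (2): entry (i, j) is cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))   # synthetic stand-in signal
image = gasf(signal)                                  # 64 x 64 matrix
```

The radius r_i = t_i / N is not needed to build the matrix: the time ordering is carried by the row and column indices. The result is symmetric, and its diagonal equals 2x̃² − 1, which is how the normalized series can be read back from the image.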
2.3. Xception Lightweight Neural Network
To address the large number of parameters in many existing models, this study adopted the Xception network, which is built on depthwise separable convolutions, as the backbone network and further optimized it. The Xception network is Google's refinement of the Inception-v3 [21] network. It replaces standard convolutions with depthwise separable convolutions, which increases the model's depth while reducing the number of parameters and thereby improves network performance. In contrast to standard convolution, depthwise separable convolution fully decouples channel correlation from spatial correlation: a depthwise convolution first learns the spatial correlation within each channel, and a pointwise (1 × 1) convolution then captures the information between channels and adjusts the channel dimension.
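The parameter saving from this factorization can be checked with simple arithmetic. The sketch below counts bias-free weights for one layer; the 3 × 3 kernel and the 128 and 256 channel counts are illustrative assumptions, not values taken from Table 1.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

std = standard_conv_params(3, 128, 256)   # 294912
sep = separable_conv_params(3, 128, 256)  # 33920
ratio = sep / std                         # equals 1/c_out + 1/k**2
```

For this hypothetical layer the separable form needs roughly 11.5% of the weights of the standard convolution, and the ratio 1/c_out + 1/k² shows that the saving is governed mainly by the kernel size once the output channel count is large.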
The Xception network structure can be divided into three segments: entry flow, middle flow, and exit flow, comprising 14 modules in total. The detailed structure is presented in Table 1, where it can be seen that the Xception network is composed of independent convolutional network blocks.
4. Experimental Platform and Experimental Analysis
4.1. Experimental Platform
To confirm the efficacy of the proposed method for the early fault diagnosis of rolling bearings with unbalanced datasets, this section uses the rolling bearing full-life-cycle dataset reported in [22] for verification and elucidation. The accelerated life test platform consists of an AC motor, a motor speed controller, a rotating shaft, a support bearing, a hydraulic loading system, and the test bearing. This setup allows accelerated life testing of different types of bearings (rolling or plain) under various working conditions and enables the collection of monitoring data throughout the entire life cycle of the test bearings. The composition of the experimental setup is illustrated in Figure 3. The sampling frequency was 25.6 kHz, with a 1 min sampling interval; each sampling lasted 1.28 s.
The testing platform provides flexibility in modifying crucial working conditions, with a primary emphasis on radial force and speed adjustments. The hydraulic loading system generates the radial force, applying it to the bearing seat of the selected test bearing. Simultaneously, the AC motor’s speed controller enables precise configuration and fine-tuning of the speed settings. In this particular test, the designated test bearing is the LDKUER204 rolling bearing.
Details of the data used in this experiment are outlined in Table 2. The early fault signal was selected from the initial fault signal of the dataset provided in Table 2. The rolling bearing's normal operation data were extracted from the Bearing2_1 dataset, encompassing six operating states. Figure 4 illustrates the test bearings' diverse health states, providing specific details for each fault.
In specific test scenarios, the Bearing2_1, Bearing2_2, and Bearing2_3 datasets correspond to a testing state with a speed of 2100 (r/min) and a radial force of 11 kN. On the other hand, the Bearing3_3 and Bearing3_5 datasets are associated with a test state featuring a speed of 2100 (r/min) and a radial force of 10 kN.
The experimental platform records vibration signals at a sampling frequency of 25,600 Hz, and each vibration signal for a specific state consists of 819,200 points. To ensure comprehensive coverage of fault information, the sample length was set to 1024 points, resulting in 800 samples per state (819,200 / 1024 = 800). This configuration ensured that the fault frequency was encompassed within each sample. Refer to Table 3 for the detailed data presentation.
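The segmentation described above can be sketched as follows. Non-overlapping windows are assumed here because 800 × 1024 equals the 819,200 points recorded per state; the random array is a synthetic stand-in for a vibration record, and the function name `segment` is hypothetical.

```python
import numpy as np

POINTS_PER_STATE = 819_200   # points per state, as stated in the text
SAMPLE_LENGTH = 1024         # points per sample, as stated in the text

def segment(signal, sample_length=SAMPLE_LENGTH):
    """Cut a 1-D record into consecutive, non-overlapping samples."""
    signal = np.asarray(signal)
    n_samples = signal.size // sample_length
    # Drop any trailing remainder, then reshape to (n_samples, sample_length)
    return signal[: n_samples * sample_length].reshape(n_samples, sample_length)

rng = np.random.default_rng(0)
record = rng.standard_normal(POINTS_PER_STATE)   # synthetic stand-in signal
samples = segment(record)                         # shape (800, 1024)
```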
Table 3 shows the random partitioning of the dataset into training, validation, and test sets using a 5:2:3 ratio, which gave 400 samples for training, 160 for validation, and 240 for testing in each state. Considering each sensor state independently, 3200 training samples, 1280 validation samples, and 1920 test samples were gathered across eight states. Each sample contained 1024 points, so the training set had 3200 × 1024 points, the validation set had 1280 × 1024 points, and the test set had 1920 × 1024 points. Before training, each sample was transformed into a Gramian angular field image using the information fusion method. The Gramian angular field images for cross-domain fusion corresponding to each state are illustrated in Figure 5.
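A random 5:2:3 partition of the 800 samples of one state can be sketched as below. The seed, the function name `split_indices`, and the use of a single index permutation are assumptions made for illustration; the text does not specify how the random partition was drawn.

```python
import numpy as np

def split_indices(n_samples, ratios=(5, 2, 3), seed=0):
    """Randomly split sample indices into train, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # shuffled sample indices
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]              # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(800)
```

With 800 samples this yields 400, 160, and 240 indices, matching the per-state counts given above; the three index sets are disjoint, so no sample appears in more than one subset.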
4.2. Performance Analysis of Information Fusion Fault Diagnosis Algorithm
This study used fractional Fourier transform frequency domain fusion data and Xception deep learning model to achieve early fault diagnosis of rolling bearings. Two experiments were conducted, one regarding the Xception deep learning model, and one regarding information fusion data.
All models were optimized using Adam with cross-entropy as the loss function and an initial learning rate of 0.0001. Training concluded after 10 iterations, and the optimal parameters were selected according to the validation loss. TensorFlow 2.6.1 served as the deep learning framework for all networks, with Python as the programming language, running on hardware comprising an i7-11800H CPU (Intel, Santa Clara, CA, USA), an RTX 3070 graphics card (NVIDIA, Santa Clara, CA, USA), and 16 GB of RAM.
To visually illustrate the model training process, a line chart was employed, as depicted in Figure 6.
Figure 6 shows that the model's training accuracy gradually increased from an initial value of 0.7688 and approached 1 towards the end of training, indicating that the model continuously learned from and adapted to the training data. The initial validation accuracy was relatively low, at approximately 0.1667, but improved steadily. Despite a gap between training and validation accuracy in the early stages, which may indicate some initial overfitting, validation performance progressed. In the later training stages, training accuracy approached 100%, while validation accuracy stabilized at approximately 0.994. This suggests that the model fitted the training data closely while its validation performance remained consistently high.
To better illustrate the model's recognition performance, Figure 7 displays the diagnostic results using a confusion matrix.
The confusion matrix shows the model's classification performance across six categories (inner circle, outer ring, cage, inner circle1, outer ring1, and health), with an overall recognition rate of 99.79%. The diagonal entries show that the model classified samples into their correct categories with high accuracy in most cases. For example, in the inner circle, outer ring, and cage categories, 240, 240, and 240 samples, respectively, were correctly classified.
However, we observed some misclassifications in the off-diagonal numbers. In particular, there were some misclassifications between label 3 (inner circle1) and label 4 (outer ring1) and between label 3 and label 5 (health). Although such misclassifications were relatively few, they may require further attention. These misclassifications may have originated from similarities between categories or the influence of noisy data.
Comprehensive analysis results show that this model has potential in the early fault diagnosis of rolling bearings, especially for key fault types (such as inner circle, outer ring, and cage), and does not identify faults as healthy states. This means it can provide efficient fault detection and has prediction capabilities in industrial applications.
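For reference, the overall recognition rate and the per-category rates can be read from a confusion matrix as sketched below. The 3 × 3 matrix is a toy example and not the data of Figure 7; rows are assumed to hold true labels and columns predicted labels, and the function names are hypothetical.

```python
import numpy as np

def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()

def per_class_recall(cm):
    """Correctly classified fraction of each true class (row-wise)."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)

# Toy confusion matrix (illustrative values only, not the paper's results)
toy_cm = np.array([[10, 0, 0],
                   [0, 9, 1],
                   [0, 0, 10]])
acc = overall_accuracy(toy_cm)      # 29 correct out of 30
recall = per_class_recall(toy_cm)   # the second class loses one sample
```

Off-diagonal entries identify which pairs of categories are confused, which is the information used above to single out the misclassifications involving labels 3, 4, and 5.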
4.3. Comparative Analysis of Different Data Sets and Methods
This section presents an in-depth model comparison and performance evaluation, which were conducted to emphasize the superiority of information fusion data and the Xception model. Three neural network architectures, namely, CNN, ResNet, and Xception, are discussed, and their performance in early fault diagnosis tasks for rolling bearings are compared under both time domain data and information fusion data.
Figure 8 illustrates different models’ training performance under time domain data.
As depicted in Figure 8, the three deep learning models (CNN, ResNet, and Xception) exhibited distinct characteristics during training on the rolling bearing early fault diagnosis task. The Xception model demonstrated a clear advantage during the early training stages, with notably higher initial accuracy than the other models. This suggests that the Xception model possessed superior feature capture and learning capabilities in the initial phases, potentially attributable to its structural design, including depthwise separable convolution, which enabled more effective data comprehension early on. Although the Xception model led in initial accuracy, both the CNN and ResNet models improved steadily throughout training. Despite its lower initial accuracy, the ResNet model improved gradually with continued training and ultimately approached the Xception model's performance level. This shows that, even with a less favorable starting point, the ResNet model could achieve satisfactory results through prolonged training.
To better illustrate the information fusion method's advantages, Table 4 presents each model's accuracy on the different datasets.
First, on the time domain dataset, the Xception model achieved 96.25% accuracy, higher than both the CNN and ResNet models. This underscores Xception's strength in time domain feature extraction and vibration sensor signal classification. This high accuracy can be attributed to Xception's architecture, which is effective at capturing time domain features, including amplitude and phase information.
For the fused information dataset, the Xception model exhibited outstanding performance, with 99.79% accuracy. This near-perfect classification result highlights Xception’s supremacy in processing fused information data. It also underscores the significant potential of combining information fusion data and the Xception model in early fault diagnosis tasks for rolling bearings. Information fusion data, encompassing amplitude–frequency and phase–frequency information, provided more detailed insights into the spectral characteristics of vibration signals, and Xception adeptly leveraged this information for classification.
Moreover, ResNet demonstrated commendable performance on the time domain dataset, achieving 95.21% accuracy and slightly surpassing the CNN. This suggests that deep residual learning retains certain advantages when processing original time domain data. On the fused information dataset, ResNet attained 99.31% accuracy, indicating that it also made effective use of the fused information data for classification.
These outcomes underscore Xception’s robust performance across different datasets; it particularly excelled on fused information datasets and achieved nearly perfect classification accuracy. This further accentuates the significance and efficacy of the combination of information fusion data and the Xception model in the early fault diagnosis of rolling bearings.
5. Conclusions
This study comprehensively compared and analyzed three distinct deep learning models—CNN, ResNet, and Xception—addressing challenges in the early fault diagnosis of rolling bearings, such as low accuracy in identifying complex operating conditions, insufficient information in single-domain data, and model complexity. It focused on evaluating these models’ accuracy using both time domain and cross-domain data, aiming to understand their data generalization and practical application performance.
First, the Xception model exhibited outstanding performance in early rolling bearing fault diagnosis tasks. It achieved high accuracy across the datasets, especially on the fused information dataset, where it reached 99.79%. This underscores the Xception model's significant advantages in multi-source information fusion tasks and its efficiency in extracting time domain features from vibration signals.
Second, the application of information fusion data proved to be pivotal in the early fault diagnosis of rolling bearings. The fusion of amplitude frequency and phase frequency information from vibration signals enabled the model to develop a more comprehensive understanding of spectral characteristics, thereby enhancing its classification accuracy. The synergistic effect of information fusion data and the Xception model provided robust support for accurate fault diagnosis.
Finally, deep learning models showcase strong potential in this field. The investigation of different architectures (including CNN, ResNet, and Xception) demonstrated commendable performance using diverse datasets. The Xception model’s lightweight design positioned it as an efficient choice that struck a balance between accuracy and model parameter efficiency.
Looking ahead, the results achieved pave the way for exciting avenues in future research. The significant advantages demonstrated by integrating information fusion data with Xception models underscore their potential for fault detection and prediction in the industrial field. The emphasis on multi-source information fusion opens the door to further exploration of data integration strategies for early fault diagnosis. Furthermore, the lightweight design of the Xception model implies scalability and adaptability for practical applications, encouraging future research on optimizing deep learning models for industrial environments.
In summary, this study furnished compelling evidence in the realm of early rolling bearing fault diagnosis that emphasized the substantial advantages of integrating information fusion data with the Xception model to enhance accuracy. These findings hold significant practical application potential for fault detection and prediction in the industrial sector, offering valuable insights for future deep learning applications.