1. Introduction
At present, the manufacturing industry operates in a constantly changing technological environment, and it is essential that manufacturing processes in this environment remain efficient and reliable. Rapidly detecting and responding to the various failures that may occur in a manufacturing process directly contributes to improving productivity and reducing costs. Motor-related failures account for a large proportion of failures in manufacturing processes, and bearing faults [1] are among the most common and critical problems.
Bearings are mechanical elements that provide accurate and smooth movement, and they play an important role in mechanical equipment with shafts. Because of this importance, many studies have focused on diagnosing bearing failures [2,3,4]. However, due to the imbalance between normal and abnormal data and the complexity of time series data, it remains difficult to diagnose bearing failures in manufacturing. This has led to the use of established AI models such as the convolutional neural network (CNN) [5], long short-term memory (LSTM) [6], support vector machine (SVM) [7], and extreme gradient boosting (XGBoost) [8] for fault diagnosis.
The methods currently used to maintain machines in the manufacturing industry [9] can be broadly divided into post-failure maintenance, preventive maintenance, and predictive maintenance. Post-failure maintenance takes measures only after a machine breaks down, which incurs enormous costs in terms of productivity and finances. Preventive maintenance performs maintenance at set intervals, but it has the disadvantage of incurring regular maintenance costs. Predictive maintenance anticipates machine failures before they occur, primarily by monitoring equipment conditions with various sensors such as vibration, temperature, and camera sensors [10]; it detects abnormal signs in advance, allowing maintenance or repair to be carried out in time. Predictive maintenance can thus minimize the production interruptions and cost increases that stem from unexpected failures.
However, several problems must be solved before predictive maintenance can be implemented effectively. The first is the imbalance between normal and abnormal data: in actual manufacturing sites, there is generally far less abnormal data than normal data, which makes it difficult to train a model. This paper addresses this problem by generating various types of bearing fault data using a directly constructed fault simulator testbed. The second is the limited accuracy of existing artificial intelligence models. To diagnose bearing faults with high accuracy, a graph convolutional network (GCN)-based LSTM autoencoder with self-attention is proposed. An experiment using this model on the Case Western Reserve University (CWRU) open dataset [11] showed 97.3% accuracy. Moreover, the proposed model achieved 99.9% accuracy when tested on the dataset extracted from the directly constructed fault simulator testbed. These results verify the reliability and effectiveness of the proposed model.
The major contributions of this paper can be summarized as follows:
A fault simulator testbed was constructed to extract and utilize various bearing data.
A bearing fault diagnosis model that can be used in various situations was developed by using the CWRU dataset, which is an open dataset, along with directly extracted data.
The proposed model achieves 97.3% accuracy on the open CWRU dataset and 99.9% accuracy on the data extracted through the fault simulator testbed.
The rest of this paper is organized as follows:
Section 1 introduces the research purpose and methodology.
Section 2 reviews related work and explains the time series data analysis techniques and various deep learning models used therein.
Section 3 elucidates the bearing failure diagnosis method proposed in this paper.
Section 4 presents the experimental environment and experimental results. Finally,
Section 5 synthesizes the experimental results and discusses future research plans.
3. GCN-Based LSTM Autoencoder with Self-Attention Model
This chapter proposes a GCN-based LSTM autoencoder with self-attention model for bearing fault diagnosis using multivariate time series data. The proposed model consists of two steps, as shown in Figure 4. This two-step process enables the successful diagnosis of bearing failures.
3.1. Step 1
Figure 5 shows step 1, the data pre-processing stage. In this step, data covering various fault states and the steady state are loaded, and standardization and normalization are performed. For the various types of bearing equipment shown in Figure 6, the data extracted from the three-channel (X-, Y-, and Z-axis) vibration sensor and the sound sensor are normalized to make them suitable for analysis.
The normalized data are divided into segments of fixed size using the sliding window technique. This technique splits the time series into fixed-length windows and uses each window as one sample; by adjusting the window size and stride, the model can learn data patterns over various time intervals.
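As an illustration, the sliding window split can be expressed with a short Python sketch; the window size of 256 samples and stride of 128 used here are illustrative values, not the settings used in the experiments.

```python
import numpy as np

def sliding_windows(series: np.ndarray, window_size: int, stride: int) -> np.ndarray:
    """Split a (time, channels) array into overlapping fixed-length windows.

    Each window becomes one sample for the model; window_size and stride
    are hyperparameters chosen per experiment.
    """
    n_steps = series.shape[0]
    windows = [
        series[start:start + window_size]
        for start in range(0, n_steps - window_size + 1, stride)
    ]
    return np.stack(windows)  # shape: (num_windows, window_size, channels)

# Example: 4-channel sensor stream (X, Y, Z vibration + sound), 10,000 time steps
data = np.random.randn(10_000, 4)
samples = sliding_windows(data, window_size=256, stride=128)
print(samples.shape)  # (77, 256, 4)
```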
The K-NN algorithm is then used to find the k nearest neighbors of each data point, and a graph is generated based on the similarity between data points. This process makes it possible to effectively model the spatial and structural relationships in the data. The generated graph is then processed by the GCN layer.
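A minimal sketch of this graph construction is shown below using scikit-learn's k-nearest-neighbors utility; the number of windows, the feature dimension, and k = 5 are illustrative assumptions rather than the paper's actual settings.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# node_features: one row per window/sample; k is a tunable hyperparameter.
node_features = np.random.randn(500, 64)   # e.g. 500 windows, 64 features each
k = 5                                      # illustrative number of neighbors

# Sparse adjacency matrix: entry (i, j) = 1 if j is among the k nearest neighbors of i.
adjacency = kneighbors_graph(node_features, n_neighbors=k,
                             mode="connectivity", include_self=False)

# Convert to the edge-index format expected by graph libraries such as PyTorch Geometric.
rows, cols = adjacency.nonzero()
edge_index = np.vstack([rows, cols])       # shape: (2, num_edges)
```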
The GCN is a neural network architecture optimized for processing graph data: it extracts new features by considering the characteristics of each node together with those of its neighboring nodes. This makes it possible to learn complex patterns and structural information in the data and to effectively extract important features from the frequency domain of the time series. The main advantage of the GCN is that it learns while preserving the relationships between data points by exploiting the graph structure, which contributes substantially to increasing the accuracy of fault diagnosis. The frequency-domain data obtained through this process are then transferred to step 2.
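The GCN stage can be sketched as follows using PyTorch Geometric's GCNConv layer (assuming that library is available); the two-layer depth and channel sizes are illustrative choices, not the exact architecture used in the paper.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv  # assumes PyTorch Geometric is installed

class GCNFeatureExtractor(torch.nn.Module):
    """Two-layer GCN that refines each node's features using its k-NN neighbors."""

    def __init__(self, in_channels: int, hidden_channels: int, out_channels: int):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        # x: (num_nodes, in_channels); edge_index: (2, num_edges) from the k-NN graph
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)
```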
3.2. Step 2
Step 2 is shown in
Figure 7, which represents the model experiment stage. It begins by transforming the features extracted by the GCN at the end of step 1 with the short-time Fourier transform (STFT) for analysis in the frequency domain. In this step, the temporal characteristics of the time series data are converted into the frequency domain for analysis. The STFT divides the time series into short time intervals and performs the Fourier transform on each segment, which makes it possible to analyze how the frequency content changes over time. Through this process, the spatial and structural characteristics obtained by the GCN can be expressed in terms of changes in time and frequency. The GCN is used to extract the spatial and structural characteristics of the time series data: it learns the features of each node through its relationships with its neighbors, which is advantageous for identifying multidimensional patterns in the data. GCNs are powerful tools that reflect the inherent relationships and interactions of the data through the graph structure, which is especially useful for complex network structures.
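A hedged sketch of the STFT step using scipy.signal.stft is given below; the sampling rate, segment length, and overlap are placeholder values chosen for illustration and are not the experimental settings.

```python
import numpy as np
from scipy.signal import stft

fs = 12_000                      # illustrative sampling rate (Hz)
signal = np.random.randn(fs)     # one second of a single vibration channel

# Split the signal into short overlapping segments and Fourier-transform each one.
frequencies, times, Zxx = stft(signal, fs=fs, nperseg=256, noverlap=128)

spectrogram = np.abs(Zxx)        # magnitude per (frequency, time) bin
print(spectrogram.shape)         # (129, number of time segments): nperseg//2 + 1 frequency bins
```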
The LSTM autoencoder with a self-attention layer simultaneously learns the long-term and short-term dependencies of the time series data. In this model, self-attention assigns weights to the important parts of the encoder's output sequence [44]. The LSTM encoder produces a hidden state for each time step while processing the time series, and self-attention is applied to weight the important parts. For each time step in the sequence, the self-attention mechanism calculates the importance of every other time step and weights them accordingly, emphasizing information in order of relative importance. These weights are learnable and are gradually adjusted as the model trains. The mechanism applies a probability function to estimate a score from the bottleneck vector representation, and the estimated score is multiplied by the bottleneck representation to obtain a context vector. The context vector is then fed to the decoder and reconstructed into the dimensions of the original encoder input. The computation follows the global attention technique of Luong et al. [45], which allows the model to emphasize important information and learn the key features of time series data more effectively. The self-attention layer is defined as follows:
$$ e_i = \tanh(W h_i + b), \qquad \alpha_i = \frac{\exp(e_i)}{\sum_{j} \exp(e_j)}, \qquad c = \sum_{i} \alpha_i h_i $$
In the above equation, $h_i$ represents the vector corresponding to time step $i$ of the input sequence, while $W$ and $b$ are learnable weights and biases. $e_i$ is the score representing the importance of each time step, and $\alpha_i$ is the weight normalized through the softmax function. Finally, $c$ is the sum of the weighted input vectors, which represents an output vector emphasizing the important information. By defining the self-attention layer in this way, abnormal states can be detected by assigning weights to the important portions of the encoder's output sequence and passing them to the decoder to reconstruct the original data.
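A minimal PyTorch sketch of this scoring scheme is given below; it assumes a single linear scoring layer (the learnable W and b above) and illustrative tensor shapes, and it is not presented as the exact implementation used in the paper.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Additive self-attention over the LSTM encoder outputs (illustrative sketch)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)   # learnable W and b

    def forward(self, encoder_outputs):
        # encoder_outputs: (batch, time_steps, hidden_dim)
        e = torch.tanh(self.score(encoder_outputs))      # scores e_i, shape (batch, T, 1)
        alpha = torch.softmax(e, dim=1)                   # normalized weights alpha_i
        context = (alpha * encoder_outputs).sum(dim=1)    # weighted sum c, shape (batch, hidden_dim)
        return context, alpha.squeeze(-1)
```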
The important features weighted by self-attention are converted into a repeat vector and passed to the decoder, which reconstructs the original data from the features extracted by the encoder. The decoder uses the last state of the encoder as its initial state to generate a sequence, and this reconstruction is used to compute the reconstruction error and detect anomalies.
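The repeat-vector hand-off from the attention context to the decoder can be sketched as follows, inlining the same attention scoring as the previous sketch; the hidden size of 64 and single-layer LSTMs are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionLSTMAutoencoder(nn.Module):
    """Minimal encoder / self-attention / repeat-vector / decoder sketch."""

    def __init__(self, n_features: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)              # attention scoring (W, b)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.output = nn.Linear(hidden_dim, n_features)

    def forward(self, x):
        # x: (batch, time_steps, n_features)
        enc_out, (h_n, c_n) = self.encoder(x)
        alpha = torch.softmax(torch.tanh(self.score(enc_out)), dim=1)
        context = (alpha * enc_out).sum(dim=1)             # weighted summary of the sequence
        # "Repeat vector": copy the context once per time step as the decoder input.
        repeated = context.unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(repeated, (h_n, c_n))    # encoder state as initial state
        return self.output(dec_out)                        # reconstruction of x
```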
The output layer of the model is responsible for classifying the failure types. It consists of a fully connected (dense) layer, and the model is trained with the CrossEntropyLoss function. For multi-class classification, the LeakyReLU and Softmax activation functions are used to predict the probability of each type, allowing the model to distinguish between normal and abnormal data and to accurately classify the different defect types. While many previous studies mainly used the ReLU activation function, the proposed model uses LeakyReLU for multi-class classification. LeakyReLU is applied in the hidden layers to add nonlinearity so that the model can learn complex patterns, and it mitigates the dead ReLU problem by maintaining a small gradient when a node's output is negative, which makes learning more stable. Softmax is applied at the output layer to convert the predicted value for each class into a probability distribution whose values sum to 1: it exponentiates the score of each class and normalizes the results so that their sum equals 1, producing the final probability distribution. This process allows the model to learn the features of the input data and accurately predict the probability of each class. Finally, the CrossEntropyLoss function is used to train the model and optimize the classification performance.
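The classification head and loss can be sketched as below; note that PyTorch's CrossEntropyLoss applies log-softmax internally, so the head outputs raw logits during training and the explicit softmax is used when class probabilities are needed. The hidden width of 128 and the four fault classes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FaultClassifierHead(nn.Module):
    """Fully connected classification head with LeakyReLU (illustrative sizes)."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128),
            nn.LeakyReLU(negative_slope=0.01),   # keeps a small gradient for negative inputs
            nn.Linear(128, num_classes),          # raw logits, one per class
        )

    def forward(self, features):
        return self.net(features)

head = FaultClassifierHead(in_features=64, num_classes=4)
criterion = nn.CrossEntropyLoss()                 # applies log-softmax internally during training
logits = head(torch.randn(8, 64))
loss = criterion(logits, torch.randint(0, 4, (8,)))
probabilities = torch.softmax(logits, dim=1)      # per-class probabilities at inference time
```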
Through the GCN's feature learning and the self-attention mechanism, the important parts of the time series data are highlighted, which greatly improves the accuracy of fault diagnosis. The proposed model learns the patterns of normal and abnormal data and detects anomalies in the test data through reconstruction errors, providing an early diagnosis of bearing failure in multivariate time series data. The model achieves significantly improved fault diagnosis accuracy by combining graph-based spatial feature learning with the attention mechanism, which helps it extract and learn important information effectively, particularly from complex time series data.
The proposed model also evaluates performance by comparing the reconstruction loss in terms of the mean squared error (MSE) and the mean absolute error (MAE). The MSE is the average of the squared differences between the predicted and actual values, and it penalizes large errors more heavily. The MAE is the average of the absolute differences between the predicted and actual values, and it treats all errors equally. Comparing the two metrics allows the reconstruction capability and fault detection performance of the model to be evaluated in more detail. Through this process, the reliability and superiority of the proposed model are verified.
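The two reconstruction-error metrics can be computed per window as in the sketch below; the thresholding rule mentioned in the comment is an assumption about how the errors would be turned into an anomaly decision.

```python
import numpy as np

def reconstruction_errors(original: np.ndarray, reconstructed: np.ndarray):
    """Per-sample MSE and MAE between input windows and their reconstructions.

    Inputs have shape (num_windows, window_size, channels).
    """
    diff = original - reconstructed
    mse = np.mean(diff ** 2, axis=(1, 2))     # penalizes large errors more heavily
    mae = np.mean(np.abs(diff), axis=(1, 2))  # treats all errors equally
    return mse, mae

# A window could be flagged as anomalous when its error exceeds a chosen threshold,
# e.g. one derived from the error distribution on normal training data.
```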
5. Conclusions
In this paper, a GCN-based LSTM autoencoder with a self-attention model for bearing fault diagnosis was proposed and evaluated using multivariate time series data. The proposed model was found to increase the accuracy of fault diagnosis by combining the GCN layer and the LSTM layer to extract important features from the frequency domain. In the data pre-processing step, data including various fault states and steady states were standardized, while features in the frequency domain were extracted through STFT conversion. Moreover, time series data were divided into fixed-length windows using the sliding window technique, which was used as input for the model.
The GCN-based LSTM autoencoder with self-attention model using multivariate time series data performed fault diagnosis by synthesizing the data obtained from multiple sensors. The model learned the spatial and structural relationships in the data using the GCN layer, and it simultaneously learned both long-term and short-term dependencies through the LSTM layers. The self-attention mechanism added between the encoder and the decoder highlights important features and synthesizes them to classify the different types of bearing faults.
The comparative performance evaluation indicated that the GCN-based LSTM autoencoder with self-attention model outperformed the other models in accuracy and F1 score on both the open CWRU dataset and the directly constructed fault simulator dataset. In particular, abnormal conditions could be detected more accurately through reconstruction errors, which is an important advantage for quickly detecting and responding to the various failures that occur in manufacturing processes. A visual check of the classification performance through the confusion matrix showed that the proposed model correctly classified most fault states and normal states, with only a small number of misclassified samples.
These findings can make an important contribution to early diagnosis and response to various failures that may occur in manufacturing sites, and they can be expected to substantially contribute to increasing the practicality of the model.
To further enhance the performance of the GCN-based LSTM autoencoder with self-attention model proposed in this study, future research is planned in the following directions. First, a real-time fault diagnosis system based on the proposed model will be developed, which is expected to improve the model's real-time inference capabilities through real-time data collection, processing, and online learning. Such a system could quickly detect and respond to failures occurring in the manufacturing process, contributing substantially to improving productivity and reducing costs.
Next, the experiment in the current research was only conducted for specific bearing fault types, but in future research, we will conduct a study to increase the versatility of the model while including various failure types other than bearings. Through this, we will develop a general-purpose fault diagnosis model that can be used in various industrial environments as well as build a fault diagnosis system that can exhibit high performance even in various situations. We will also apply additional data augmentation techniques to increase the diversity of data. The data augmentation technique that is selected for use plays an important role in improving the generalizability of the model and enhancing the fault diagnosis performance in various situations. This will enable the model to learn various data patterns and facilitate more accurate fault diagnosis.
The practical implications of the proposed model include failure prevention and maintenance cost reduction, productivity improvement, reliable data-driven decision-making, industrial applicability, and implementation of smart manufacturing systems. By monitoring machine health in real time and diagnosing faults at an early stage, unexpected machine failures can be prevented, maintenance costs can be reduced, and overall productivity can be improved by minimizing production downtime. Reliable decisions can be made based on the data provided by the fault diagnosis system, which is very helpful for machine maintenance planning, resource allocation, production schedule adjustment, etc. Furthermore, the proposed model is not limited to a specific industry but can be applied in various manufacturing sectors, which can be utilized as a bearing fault diagnosis and preventive maintenance system in various industries such as automotive, aviation, energy, marine, etc. In this way, the proposed model can contribute to optimizing manufacturing processes and achieving innovative productivity by leveraging IoT, big data, and AI, which are key elements of Industry 4.0. These practical implications show that the proposed model can serve as an efficient and reliable fault diagnosis solution in various fields of manufacturing in the future.
Also, we plan to introduce a transfer learning technique to apply the model learned from one dataset to another. Through this, we intend to develop a model that can maintain high performance in various environments and conditions. Using this method will have the following advantages:
Improved generalization: transfer learning allows the model to perform consistently well on different datasets.
Reduced training time: applying already trained models to other datasets can shorten training.
Improved adaptability: the model can adapt more readily to new datasets or domains, which makes it easier to deploy at new manufacturing sites.
We plan to review practical applicability by applying the research results to actual industrial sites. Through such research, the performance of the model in the actual environment will be evaluated, and the model can be improved as necessary. This will increase the practicality of the model through verification in the actual industrial environment and contribute to solving problems in the actual field.
Finally, in addition to the GCN-based LSTM autoencoder with the self-attention model, we will conduct a fault diagnosis study using other recent deep learning models. We plan to compare the performance of various models and derive the optimal model. We expect that comparative studies with various deep learning models will make it possible to comprehensively evaluate the performance of the model and develop an optimal fault diagnosis model.
With these future research directions, we plan to further improve the performance of the models proposed in this study and eventually develop a practical fault diagnosis system that can be applied in various industrial environments. We believe that this will ultimately help increase the efficiency and reliability of the manufacturing industry, thereby contributing to productivity improvement and cost reductions.