1. Introduction
As the measuring elements of the aero-engine control system, sensors perform a function that is fundamental and essential to the system. However, growing control demands and the increasing complexity of the control system impose new requirements on the number and reliability of sensors, and in the harsh working environment of an aero-engine, sensor faults are difficult to avoid [1,2]. According to statistics, sensor faults account for more than 80% of the faults in aero-engine control systems [3], so the fault detection and isolation (FDI) of aero-engine sensors is vital to improving the reliability of these systems.
At present, the FDI methods for aero-engine sensors can be divided into two types: model-based and signal-based. The main idea of the model-based method is to establish a mapping relationship between the actual inputs and outputs of the controlled object by analyzing its physical characteristics and then to diagnose and isolate faults from the residuals between the output of the actual system and the output of the model [4]. Prominent representatives are the Kalman filter [5,6,7,8,9,10], the particle filter [11], etc. Given an accurate model mapping, model-based sensor fault detection becomes very simple. However, the complexity of actual controlled objects makes accurate modeling the main difficulty in practical application.
Signal-based methods analyze the signals directly through reliability analysis or machine learning, and machine learning in particular has made significant progress in the field of sensor FDI. Regarding fault identification based on one-dimensional sensor signals, Reference [12] expands the one-dimensional signal of a single sensor into two-dimensional data through time-series misalignment and feeds it into a CNN for fault detection. However, this approach copes poorly with the randomness of the data and lacks reliability for aero-engines with complex control systems and diverse working conditions. Therefore, the most common solution is to use the analytical redundancy information among multiple sensors to detect faults in single or multiple sensors. In References [13,14,15], the support vector machine (SVM) algorithm is used to build a nonlinear mapping relationship between faulty sensors and healthy sensors; an appropriate threshold is then applied to the predicted value to determine whether a sensor is faulty. References [16,17,18] use the Levenberg–Marquardt algorithm, the Artificial Bee Colony algorithm, and an adaptive method, respectively, to optimize a BP neural network that constructs the inter-sensor mapping relationship in place of the SVM. All of the above methods achieve high accuracy for single-sensor fault diagnosis. For multi-sensor diagnosis, however, multiple algorithm units must be built and combined with a complex logical judgment or fusion judgment algorithm before the accuracy becomes usable. As mentioned in Reference [19], under the assumption that no more than three of nine sensors fail simultaneously, 84 multi-input multi-output generalized regression neural networks (MIMOGRNNs) have to be constructed for fusion diagnosis. In References [20,21], the structure of a self-associative neural network and an extreme learning machine are optimized by a genetic algorithm, respectively, to achieve multi-sensor detection, but the detection accuracy still falls far short of the 95% specified in Reference [22].
In recent years, the CNN, as an essential part of deep learning, has found many applications in different engineering fields. In data-based prediction especially, the CNN shows good performance thanks to its robust feature extraction ability. Reference [23] proposed a prediction model fusing a CNN with the lion algorithm; the model was improved by niche immunity and applied to the short-term load prediction of electric vehicles, where the optimized model tolerates deformed data and achieves good prediction performance. Reference [24] proposed an innovative CNN-based classification method for hyperspectral remote sensing images in which the model parameters are updated by an extreme learning machine; the model's accuracy was verified on remote sensing images of Jiuzhaigou. Reference [25] proposed a prediction model for detecting power theft by combining a bidirectional gated recurrent unit with a CNN; compared with the multilayer perceptron, the long short-term memory network (LSTM), and a single gated recurrent unit, the model performs better in this application scenario. Also in the field of power protection, the authors of Reference [26] propose a deep hybrid neural network model that combines a CNN with particle swarm optimization (PSO); the CNN serves as the feature extractor for the PSO algorithm and condenses useful feature information from the original time series. Reference [27] proposed a one-dimensional CNN prediction model for the real-time detection of motor faults; the model has a simple structure, low hardware requirements, and a fast detection speed. Reference [28] uses a CNN to predict the sound source of plate-like structures; compared with stacked autoencoders, the CNN accepts more input information and offers a more flexible input format. In Reference [29], the vibration sensor signals of a motor are converted into a three-dimensional feature matrix as the CNN input, with different sensors occupying different channel dimensions; this work provides ideas for handling different kinds of feature data as CNN inputs. In Reference [30], the outputs of the virtual sensors of an aero-engine physical model and a Kalman filter are used as the input of a one-dimensional CNN to realize the life prediction of the aero-engine; compared with feed-forward neural networks (FNN) and LSTM, this method shows smaller variance and higher accuracy over multiple predictions. Reference [31] applies the CNN-based object detection algorithm YOLOv3 to detect the number of people in an air-conditioned room and thereby reduce air-conditioning energy consumption. Reference [32] built a predictive model for the diagnosis of power transformer faults and cyberattacks.
This paper proposes a convolutional neural network based on the inception block (Inception-CNN). The network uses convolutional and pooling layers to extract the analytical redundancy information among the sensors; this information is then used to adjust the network parameters and to predict the failure probability of each individual sensor.
The innovations of this paper are as follows.
- (1) The construction of the forecasting model. Compared with the various signal-based methods mentioned above, Inception-CNN exploits the characteristics of the inception block to incorporate inter-sensor information at different scales into the fault detection model, which is undoubtedly beneficial to a signal-based method. In addition, because the method relies on the self-iterative nature of neural networks, it does not require a tedious parameter optimization process and allows more attention to be devoted to the construction of the network model itself.
- (2) Generation of fault datasets. Actual aero-engine sensor failure data are very scarce, which conflicts with the large amount of training data required by deep learning algorithms. This paper generates sufficient training data by the Monte Carlo simulation method.
- (3) The optimization of the forecasting model input. The one-dimensional sensor data are converted into a two-dimensional feature matrix by a pan-and-regroup method. This makes the extraction of inter-sensor information more thorough and improves the prediction model's accuracy.
- (4) The optimization of the forecasting model details. The activation function and output mode of the predictive model are optimized so that the model can perform fault detection on all target sensors simultaneously, which further improves its accuracy.
The rest of the paper is organized as follows: Section 2 introduces the research object of the prediction model in this paper and the construction of the training and validation data; Section 3 shows the various parts of the CNN and how they work and then presents the specific architecture of the Inception-CNN prediction model; Section 4 verifies the feasibility and effectiveness of Inception-CNN in terms of the fault detection effect and the FDI process, respectively; Section 5 concludes this paper.
2. Data Preparation
This paper takes a Geared Turbofan Engine (GTF) as the research object, which can be expressed as Equation (1):

$$ \left[ y,\ z \right] = f\left( w,\ h \right) \qquad (1) $$

Among them, $w$ is the operating condition, including the height (alt), flight speed (Mach), power lever angle (PLA), etc.; $h$ is the health parameter; $y$ is the measurable physical quantity, including the value of each sensor; and $z$ is the unobservable physical quantity.
Due to the limitation of calculation conditions, this paper only studies the ground operation of the aero-engine, namely height = 0 and speed = 0. Ten sensors are selected as the research object of this paper, as shown in Equation (2), and the cross-section positions of the aero-engine are shown in Figure 1.
In this paper, the method is tested and verified with the GTF model of the NASA T-MATS toolbox [33] in the MATLAB/Simulink 2021a simulation environment.
Referring to the verification method of the civil turbofan engine fault diagnosis system proposed by Donald et al. [34,35], the Monte Carlo simulation method is used to generate the training and verification data sets of the neural network. First, the power lever angle (PLA) profile of the healthy running engine model is given, as shown in Figure 2, and the sensor data $y$ are derived. Then, the data of the ten sensors are randomly disturbed with normally distributed noise, as shown in Equation (3):

$$ \tilde{y}_{i,j} = y_{i,j} + \varepsilon_{i,j}, \qquad \varepsilon_{i,j} \sim N\left(0,\ \sigma_i^{2}\right) \qquad (3) $$

where $y_{i,j}$ is the original value of sensor $i$ at data point $j$, and $\tilde{y}_{i,j}$ is the disturbed value. The coefficient $\sigma_i$ is determined by the average of the input values of sensor $i$.
In the training and validation data sets, a data point whose disturbance exceeds 5% of the original sensor value is labeled as a fault point. For example, if $\left| \tilde{y}_{i,j} - y_{i,j} \right| > 0.05\, y_{i,j}$, sensor $i$ is considered faulty at data point $j$; otherwise, sensor $i$ is considered healthy at that point. Since class imbalance can adversely affect network training, this paper adjusts $\sigma_i$ to reduce the gap between the numbers of faulty and healthy data points and loops the generation process multiple times. Because the probability of more than three sensors failing at the same time is extremely small, and because the redundant information between the sensors decreases as the number of simultaneously failed sensors increases, the data points where more than three sensors fail at the same time are deleted. A scatter plot of a small part of the disturbed values of the NL, NH, Pt25, and Tt25 sensors is shown in Figure 3, where the abscissa represents the index of the data points.
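To make the data generation procedure concrete, the following sketch illustrates one possible implementation of the perturbation and labeling steps described above. It is a minimal example, assuming a 5% fault threshold, a disturbance standard deviation proportional to each sensor's mean value, and at most three simultaneously faulty sensors; the array names and the scale factor are illustrative and not taken from the paper.

```python
import numpy as np

def perturb_and_label(y, scale=0.03, threshold=0.05, max_faulty=3, seed=0):
    """Monte Carlo disturbance and fault labeling (illustrative sketch).

    y : array of shape (n_points, n_sensors) with healthy sensor values.
    Returns the disturbed data and a 0/1 fault label per sensor and point.
    """
    rng = np.random.default_rng(seed)
    sigma = scale * y.mean(axis=0)              # per-sensor std, scaled by mean value
    noise = rng.normal(0.0, sigma, size=y.shape)
    y_disturbed = y + noise
    # A point is faulty if the disturbance exceeds 5% of the original value.
    labels = (np.abs(noise) > threshold * np.abs(y)).astype(int)
    # Discard points where more than three sensors fail simultaneously.
    keep = labels.sum(axis=1) <= max_faulty
    return y_disturbed[keep], labels[keep]

# Example usage with dummy healthy data for ten sensors.
healthy = np.tile(np.linspace(100.0, 1000.0, 10), (5000, 1))
data, labels = perturb_and_label(healthy)
```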
In the final method verification stage, we adjust the coefficient $\sigma_i$ so that the sensor data are equivalent to actual engine data after collection and filtering, add the typical faults shown in Table 1, and change the PLA to verify the generalization performance of the neural network.
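As an illustration of this verification step, the sketch below injects two fault types that are common in the sensor FDI literature, a constant bias and a slow drift. The actual fault types and magnitudes used in this paper are those of Table 1, which is not reproduced here, so the function names and values below are purely hypothetical.

```python
import numpy as np

def inject_bias(signal, start, bias):
    """Add a constant bias fault from index `start` onward (hypothetical example)."""
    faulty = signal.copy()
    faulty[start:] += bias
    return faulty

def inject_drift(signal, start, rate):
    """Add a linearly growing drift fault from index `start` onward (hypothetical example)."""
    faulty = signal.copy()
    steps = np.arange(len(signal) - start)
    faulty[start:] += rate * steps
    return faulty

# Example: inject faults into one sensor channel of the validation data.
sensor = np.full(1000, 500.0)                        # placeholder healthy signal
sensor_bias = inject_bias(sensor, start=600, bias=30.0)
sensor_drift = inject_drift(sensor, start=600, rate=0.05)
```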
3. Introduction to Inception-CNN
3.1. Introduction to CNN
As a transformed form of the multilayer perceptron, the convolutional neural network (CNN) was developed based on studies of the visual cortex of cats [36]. It was initially applied in the field of image recognition. Now, the CNN has become a hot spot in many research fields. A typical CNN consists of convolutional layers, pooling layers, and fully connected layers, as shown in Figure 4.
Convolutional layers extract local features of the input data through a local receptive field structure and reduce the number of parameters of the CNN by weight sharing. The pooling layer merges adjacent data into a single value, which reduces the dimensionality of the data, speeds up the calculation, and prevents the overfitting of the parameters. The fully connected layer is the basic unit of the BP neural network and generates the output based on the feature data extracted by the previous layers.
3.2. The Basic Module of CNN
The input of each node of the convolutional layer consists only of local features of the previous layer, and the convolution kernel converts a sub-node matrix of the current layer into a unit node matrix with an arbitrary channel dimension in the next layer. Under normal circumstances, the convolution layer only changes the channel dimension of the input data and adopts edge zero-padding to ensure that the length and width of the input data remain unchanged. For example, the convolution kernel transforms the sampled sub-matrix $a$ of the input into the unit node value $g(i)$ as in Equation (4):

$$ g(i) = f\left( \sum_{x,y,z} w^{(i)}_{x,y,z}\, a_{x,y,z} + b^{(i)} \right) \qquad (4) $$

Among them, $g(i)$ represents the value of the $i$th channel of the unit node; $a$ represents the sub-matrix, with a certain length and width and the full channel dimension, sampled by the convolution kernel from the input matrix of this node; $w^{(i)}$ represents the convolution kernel weight matrix corresponding to the $i$th channel; $b^{(i)}$ represents the bias corresponding to the $i$th channel; and $f$ represents the activation function. In this paper, the Scaled Exponential Linear Unit (SELU) is selected as the activation function, which normalizes the data distribution and ensures that the gradient will neither explode nor vanish during training. The SELU activation function can be expressed as Equation (5):
$$ \mathrm{SELU}(z) = \lambda \begin{cases} z, & z > 0 \\ \alpha\left(e^{z} - 1\right), & z \le 0 \end{cases} \qquad (5) $$

where $z$ represents the output value of the convolution operation, and $\lambda$ and $\alpha$ are constants, $\lambda \approx 1.0507$, $\alpha \approx 1.6733$.
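A minimal numerical sketch of Equations (4) and (5) is given below, assuming a single output channel, a 3 × 3 × 2 kernel, and the standard SELU constants; the variable names are illustrative only.

```python
import numpy as np

LAMBDA, ALPHA = 1.0507, 1.6733   # standard SELU constants

def selu(z):
    """Scaled Exponential Linear Unit, Equation (5)."""
    return LAMBDA * np.where(z > 0, z, ALPHA * (np.exp(z) - 1.0))

def conv_at_position(a, w, b):
    """Value of one output channel at one position, Equation (4).

    a : sampled input sub-matrix, shape (kh, kw, channels)
    w : kernel weights of the same shape, b : scalar bias
    """
    return selu(np.sum(w * a) + b)

# Example with a 3x3 receptive field over 2 input channels.
rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3, 2))
w = rng.normal(size=(3, 3, 2))
print(conv_at_position(a, w, b=0.1))
```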
Pooling layers sample the data by sliding a pooling kernel. Unlike convolutional layers, pooling layers generally only change the length and width of the input matrix. The most common types are max pooling and average pooling. The process of converting the sampled sub-matrix $a^{(i)}$ into the output value $g(i)$ by max pooling is given by Equation (6):

$$ g(i) = \max_{x,y}\ a^{(i)}_{x,y} \qquad (6) $$

Among them, $g(i)$ represents the value of the $i$th channel of the unit node, and $a^{(i)}$ represents the sub-matrix, with a certain length and width, sampled by the pooling kernel from the $i$th channel of the input matrix of this node. The pooling layer reduces the data size, speeds up computation, and prevents parameter overfitting without losing data features.
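The following short sketch applies 2 × 2 max pooling with stride 2 to a single-channel matrix, as a concrete instance of Equation (6); the kernel size and stride are assumptions for illustration.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single-channel matrix (illustrative)."""
    h, w = x.shape
    out = np.empty((h // 2, w // 2))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            out[i // 2, j // 2] = x[i:i + 2, j:j + 2].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(x))   # -> [[ 5.  7.] [13. 15.]]
```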
Finally, the network expands the feature matrix extracted by the convolutional and pooling layers into a one-dimensional vector, which is input to the fully connected layer and produces an output based on these features.
3.3. The Calculation Process of CNN
Minimizing the loss function $J(W, b)$ as much as possible is the goal of neural network training, where $W$ and $b$ represent the weights and bias parameters of the neural network, respectively. The loss function consists of two parts: the first part is the residual between the output value and the expected value, and the second part is the regularization loss that counteracts overfitting, regulated by the parameter $\beta$. The loss function can be expressed as Equation (7):

$$ J(W, b) = L_{\mathrm{res}}(W, b) + \beta\, L_{\mathrm{reg}}(W) \qquad (7) $$
This paper uses Stochastic Gradient Descent with Momentum (SGDm) as the solver to update the CNN parameters. Through the back-propagation of the loss function, the trainable parameters of each layer of the CNN can be updated layer by layer. The mathematical representations are given in Equations (8) and (9):

$$ v_{t+1} = \gamma\, v_t - \eta\, \nabla_{\theta} J(\theta_t) \qquad (8) $$

$$ \theta_{t+1} = \theta_t + v_{t+1} \qquad (9) $$

where $\theta$ denotes the trainable parameters $W$ and $b$, $v$ is the momentum term, $\eta$ is the learning rate, and $\gamma$ is the momentum coefficient.
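A minimal sketch of the SGDm update in Equations (8) and (9) follows; the learning rate and momentum coefficient values are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def sgdm_step(theta, velocity, grad, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update, Equations (8) and (9)."""
    velocity = momentum * velocity - lr * grad     # Equation (8)
    theta = theta + velocity                       # Equation (9)
    return theta, velocity

# Example: minimize J(theta) = theta^2, whose gradient is 2*theta.
theta, velocity = np.array([5.0]), np.zeros(1)
for _ in range(100):
    theta, velocity = sgdm_step(theta, velocity, grad=2.0 * theta)
print(theta)   # converges toward 0
```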
3.4. Inception Block
Although a deep neural network can be abstractly regarded as the superposition of several neural networks, different network architectures and hyperparameter choices make a huge difference to performance. Furthermore, as the network deepens, its interpretability worsens. Designing a deep neural network with good performance often relies on design intuition accumulated in this field over many years, so it is vital to master and build on previous exploration results.
The inception block is the basic module of the GoogLeNet architecture proposed in 2014 [37], and GoogLeNet prevailed in the ImageNet competition that year.
Earlier convolutional neural networks often adopt purely sequential architectures, and the sizes of their convolution kernels vary widely from one design to another. The inception block, in contrast, places convolutional layers with kernels of different sizes in parallel so that correlated features can be extracted at different scales. It has been shown that using convolution kernels of different sizes is beneficial for feature extraction.
In addition, deeper neural networks often help to improve prediction accuracy. However, deep networks also bring the risk of vanishing gradients, which can ultimately prevent the model from converging. Furthermore, the rapid growth in the number of parameters of deeper networks further increases the demands on computing equipment. The inception block can effectively reduce the number of computing parameters of the network.
As shown in Figure 5, the inception block consists of four parallel paths. The first three paths use convolutional layers with kernel sizes of 1 × 1, 3 × 3, and 5 × 5 to extract information at different spatial scales. The two middle paths first perform a 1 × 1 convolution on the input to reduce the number of channels and thus the complexity of the model. The fourth path uses a pooling layer with a 3 × 3 pooling kernel and then a 1 × 1 convolution kernel to change the number of channels. All four paths use appropriate padding so that the height and width of the output match those of the input; finally, the outputs of the four paths are concatenated in the channel dimension to form the output of the inception block. In the inception block, the hyperparameters to be adjusted are mainly the numbers of output channels of each layer.
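The following PyTorch sketch implements an inception block with the four-path structure just described. The channel counts are left as constructor arguments because the paper tunes them per block (see Table 2); the values in the usage line are placeholders, not the paper's settings, and the class name InceptionBlock is ours.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Four-path inception block: 1x1, 1x1->3x3, 1x1->5x5, 3x3 maxpool->1x1."""
    def __init__(self, in_ch, c1, c2, c3, c4):
        super().__init__()
        # c2 and c3 are (reduction channels, output channels) pairs.
        self.p1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.p2 = nn.Sequential(nn.Conv2d(in_ch, c2[0], kernel_size=1),
                                nn.Conv2d(c2[0], c2[1], kernel_size=3, padding=1))
        self.p3 = nn.Sequential(nn.Conv2d(in_ch, c3[0], kernel_size=1),
                                nn.Conv2d(c3[0], c3[1], kernel_size=5, padding=2))
        self.p4 = nn.Sequential(nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                                nn.Conv2d(in_ch, c4, kernel_size=1))
        self.act = nn.SELU()

    def forward(self, x):
        # Concatenate the four paths along the channel dimension.
        return torch.cat([self.act(self.p1(x)), self.act(self.p2(x)),
                          self.act(self.p3(x)), self.act(self.p4(x))], dim=1)

# Placeholder channel numbers; the paper's values are listed in Table 2.
block = InceptionBlock(48, 16, (16, 32), (8, 16), 16)
out = block(torch.randn(1, 48, 10, 10))   # output has 16+32+16+16 = 80 channels
```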
The specific structure of the first inception block used in this paper is shown in Figure 6. The input is a feature matrix with 48 channels, which passes through the four paths, respectively. After the different convolution and pooling operations shown in Figure 5, the number of channels is changed and the path outputs are merged into the output matrix in the channel dimension, which becomes the input to the next layer of the neural network.
3.5. Architecture of Inception-CNN
Because the convolution kernel operates on a local receptive field, it is difficult for it to extract, from one-dimensional data, the inter-sensor features of data that are separated by a large distance. In order to fully extract the relevant features between sensor signals, this paper performs a two-dimensional expansion and recombination of the original data before they are input into the neural network; that is, the one-dimensional sensor data group is expanded into a two-dimensional feature matrix.
As shown in Figure 7, a one-dimensional sensor data group is first obtained by collecting data from the engine model through the method described in Section 2; the differently colored and numbered squares represent data from different sensors. Then, the one-dimensional data group is expanded into a two-dimensional data group while the data are rearranged through a translation operation. For example, the second row of the two-dimensional data group in the figure is obtained by shifting the original one-dimensional data group by two units.
Suppose that a convolution kernel of size three is used to extract the inter-sensor features. In that case, it would be impossible to extract features between sensors whose interval is greater than three in the original one-dimensional data group, whereas this becomes possible in the two-dimensional data group obtained after the translation expansion.
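A small numpy sketch of this pan-and-regroup (shift-and-stack) expansion is shown below. The shift step of two units per row follows the description above; treating the number of rows and the shift as configurable parameters is an assumption for illustration.

```python
import numpy as np

def pan_and_regroup(row, n_rows, shift=2):
    """Expand a 1-D sensor data group into a 2-D matrix by cyclic shifts."""
    return np.stack([np.roll(row, -shift * k) for k in range(n_rows)])

# Ten sensor readings at one sampling instant (dummy values 0..9).
one_dim = np.arange(10)
two_dim = pan_and_regroup(one_dim, n_rows=5)
print(two_dim)
# Row 0 keeps the original order; row 1 is shifted by two units, and so on,
# so sensors that are far apart in row 0 become neighbors in later rows.
```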
Since the neural network model requires a large amount of data for parameter updating, this paper adopts the operating mode of offline training and online detection. The Inception-CNN architecture is shown in Figure 8. During offline training, a large amount of perturbed and labeled sensor data is used as the network input, and the network parameters are updated. During online diagnosis, the data format is kept unchanged, the sensor data that match the actual situation are input to the network, and the prediction is made based on the parameters updated during offline training.
As shown in Figure 8, the two-dimensional sensor data group obtained by the translation expansion is input into the network and passes through one convolutional layer, four inception blocks, two max pooling layers, and one average pooling layer. The first convolutional layer has 48 convolution kernels and converts the original single-channel input into a 48-channel feature response matrix of the same length and width. All pooling layers in the network use the same pooling kernel size. After the last average pooling, the three-dimensional feature matrix is converted into a one-dimensional vector with no information loss. Finally, the extracted feature response vector is input to the fully connected layer, which produces an output with 20 nodes.
The convolution kernel and pooling kernel parameters of the four inception blocks are as shown in Figure 5; the only hyperparameters that change are the numbers of channels output by each layer, which are listed in Table 2.
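The following sketch assembles a network with the layer sequence described above, reusing the InceptionBlock class from the earlier sketch (both sketches assume they are placed in the same file). The input size (a 10 × 10 single-channel matrix), the 3 × 3 first-layer kernel, the 2 × 2 pooling kernels, and all channel counts are assumptions for illustration; the paper's actual values are those of Figure 8 and Table 2.

```python
import torch
import torch.nn as nn

class InceptionCNNSketch(nn.Module):
    """Illustrative pipeline: conv -> 4 inception blocks with interleaved pooling
    -> flatten -> fully connected layer with 20 output nodes (10 sensors x 2)."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(1, 48, kernel_size=3, padding=1), nn.SELU())
        self.block1 = InceptionBlock(48, 16, (16, 32), (8, 16), 16)    # -> 80 channels
        self.pool1 = nn.MaxPool2d(kernel_size=2)                       # 10x10 -> 5x5
        self.block2 = InceptionBlock(80, 24, (24, 48), (8, 16), 16)    # -> 104 channels
        self.block3 = InceptionBlock(104, 32, (32, 64), (8, 16), 16)   # -> 128 channels
        self.pool2 = nn.MaxPool2d(kernel_size=2)                       # 5x5 -> 2x2
        self.block4 = InceptionBlock(128, 32, (32, 64), (16, 32), 32)  # -> 160 channels
        self.pool3 = nn.AvgPool2d(kernel_size=2)                       # 2x2 -> 1x1
        self.fc = nn.Linear(160, 20)

    def forward(self, x):
        x = self.pool1(self.block1(self.stem(x)))
        x = self.pool2(self.block3(self.block2(x)))
        x = self.pool3(self.block4(x))
        return self.fc(torch.flatten(x, start_dim=1))

model = InceptionCNNSketch()
print(model(torch.randn(4, 1, 10, 10)).shape)   # torch.Size([4, 20])
```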
3.6. Loss Function
In a traditional multi-classification problem solved with a neural network, if the number of categories is $n$, an $n$-dimensional one-hot encoding vector is constructed to represent each category, and $n$ network output nodes are required. For example, in a three-class problem, the three one-hot vectors $[1, 0, 0]$, $[0, 1, 0]$, and $[0, 0, 1]$ represent the three different categories; one element of each vector is 1, and the others are 0.
Equation (10) is the expression of the softmax function, where $a$ is the output value of the last fully connected layer:

$$ p_i = \frac{e^{a_i}}{\sum_{k=1}^{n} e^{a_k}} \qquad (10) $$

Through the natural exponential, the softmax function converts the network's output into the probability $p_i$ of each category of the one-hot encoding, with $\sum_{i=1}^{n} p_i = 1$. The softmax function is widely used in multi-classification problems because it fits one-hot encoding and, in most cases, widens the differences between categories.
However, in the sensor fault detection problem of this paper, because multiple sensors are assumed to be able to fail simultaneously, the detection vector output by the network may contain several 1s at the same time; for example, a label vector with 1s in the positions of the NL sensor and the Pt25 sensor means that these two sensors fail simultaneously. The softmax function is therefore no longer applicable in the network, and the sensor failure probability is only calculated during model validation.
For multi-classification problems, the softmax function is often combined with the cross-entropy loss function. The cross-entropy loss function is expressed as Equation (11):

$$ L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{M} y_{ic} \log\left(p_{ic}\right) \qquad (11) $$

Among them, $M$ is the number of categories; $y_{ic}$ is the indicator function, which equals 1 if sample $i$ belongs to category $c$ and 0 otherwise; $N$ is the batch size; and $p_{ic}$ is the predicted probability of the category, which is normally the output of softmax. Since softmax is no longer applicable here, $p$ is replaced with the output $a$ of the fully connected layer.
Each individual sensor has two possible states, faulty and healthy, and a one-hot encoding is constructed for each sensor separately, so the number of final output nodes of Inception-CNN is 2 × 10 = 20. Every two nodes constitute the prediction unit of one sensor. The cross-entropy loss values of the individual prediction units are accumulated into a total loss value for the back-propagation of the network.
In summary, the loss function used in this paper is expressed as Equation (12), where $j$ is the index of the prediction unit, and $a$ is the output of the last fully connected layer of the network.
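To make the output layout concrete, the sketch below splits the 20-node output into ten two-node prediction units, accumulates a per-unit cross-entropy as the training loss, and computes the per-sensor failure probability with a per-unit softmax at validation time, as described above. Applying a per-unit softmax inside the loss for numerical stability is an implementation assumption of this sketch and may differ from the exact form of Equation (12).

```python
import torch
import torch.nn.functional as F

N_SENSORS = 10

def prediction_unit_loss(outputs, labels):
    """Sum of per-sensor cross-entropy losses over ten two-node prediction units.

    outputs : (batch, 20) raw network outputs, two nodes per sensor
    labels  : (batch, 10) with 0 = healthy, 1 = faulty
    """
    total = 0.0
    for j in range(N_SENSORS):
        unit = outputs[:, 2 * j: 2 * j + 2]          # nodes of prediction unit j
        total = total + F.cross_entropy(unit, labels[:, j])
    return total

def failure_probabilities(outputs):
    """Per-sensor failure probability via a softmax over each prediction unit
    (computed only at validation time, as stated in the paper)."""
    units = outputs.view(-1, N_SENSORS, 2)
    return torch.softmax(units, dim=-1)[..., 1]      # probability of the 'faulty' node

# Example usage with random data.
out = torch.randn(4, 20)
lab = torch.randint(0, 2, (4, N_SENSORS))
print(prediction_unit_loss(out, lab), failure_probabilities(out).shape)
```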
5. Conclusions
In this paper, a convolutional neural network based on the inception block is proposed and applied to aero-engine sensor fault detection. The traditional detection unit composed of multiple fused algorithms is simplified into a single detection algorithm, which mainly addresses the large number of parameters and the structural complexity of traditional sensor FDI methods.
The effectiveness and feasibility of the method are verified by the detection effect and the FDI process. On the data set of this paper, the detection accuracy of Inception-CNN is 95.41%, which improves the prediction accuracy by 17.27% and 12.69% compared with the best-performing non-neural network algorithm and simple BP neural networks tested in the paper, respectively.
In addition, this paper constructs the training and validation data sets of the signal-based sensor FDI method through the Monte Carlo simulation method, which solves the problem that experiments cannot be carried out because of insufficient fault data.
The sensor fault detection method based on Inception-CNN proposed in this paper is a data-driven algorithm, and its accuracy and applicability are positively related to the quality of the data. In the future, this method can therefore be combined with mechanism studies of the engine to improve the algorithm's performance through higher-quality data. In addition, the research content of this paper can be combined with research on the safety control strategy of aero-engines based on sensor value judgment. Taking the research of Cao et al. [38] as an example, if FDI is used as the front module of the safety protection control module in the control loop, the possibility of control strategy failure due to sensor failure can be reduced, and the robustness of the system can be improved.