1. Introduction
Large-amplitude pressure oscillations are generally undesirable in a variety of industrial settings, such as furnaces [
1], gas turbines [
2] and rocket engines [
3], with the exception of thermoacoustic prime movers [
4] or cooling systems [
5]. Heat-to-sound conversion [
6] is attracting increasing attention from researchers and engineers due to its potential applications in the energy industry, for example, as a prime mover, in refrigerators [
7] and in mixture separation. Based on the energy conversion mechanism from heat to sound, many heat sources can be utilized or recycled, such as solar energy and industrial waste heat [
8]. In both classes of devices, the onset of thermoacoustic instability is of great significance. The consequences of unstable oscillations include low efficiency in industrial furnaces, high pollutant emissions in gas turbines and catastrophic failure in rocket engines [
9]. Thermoacoustic instability (TAI) is characterized by the fact that the main frequencies of pressure oscillations are linked to the acoustic mode of the combustion chamber [
10]. At this point, periodic pressure oscillations at these dominant frequencies form limit cycles. To mitigate thermoacoustic oscillations, predicting TAI is crucial for the active control of combustion. This study investigates the use of deep learning tools and chaotic analysis for TAI prediction. The work is motivated by two aspects: on the one hand, chaotic methods are effective in analyzing the nonlinear dynamic structure of combustion systems; on the other hand, deep learning techniques provide a great advantage in analyzing experimental datasets. A prediction framework, focused on the MIRADAS combustor at IMFT [
11], is thus proposed for the prognosis of instability. Based on recorded datasets, it predicts the proximity of TAI when the operating conditions are changed away from a stable point. It not only provides real-time warning signals, but also clarifies the TAI boundary of the combustion system.
Chaotic analysis is an effective approach to investigate combustion noise and combustion instability. TAI occurs due to the coupling between the unsteady heat release and acoustic waves [
12], known as Rayleigh’s criterion [
13]. When the unsteady heat release is in phase with the pressure oscillations, energy is transferred to the acoustic field, leading to large-amplitude periodic pressure oscillations that can damage the combustor structure. These large-amplitude periodic oscillations indicate that the combustion system has reached a limit cycle. Reference [
14] shows that the combustion noise stage is deterministic–chaotic, and that the process of operating conditions approaching combustion instability must be accompanied by a gradual loss of chaos. According to chaotic dynamics theory, the transition from non-periodic combustion noise to periodic thermoacoustic oscillations, i.e., the limit cycle, is referred to as a Hopf bifurcation [
15]. If the initial perturbation lies within the basin of attraction of the limit cycle in phase space, the system will eventually evolve to the limit-cycle state, which is called triggering in the thermoacoustic community. The chaotic characteristics of the combustion system mean that very small differences in the initial perturbations eventually lead to disparate behaviors, making it challenging to deterministically predict combustion instability.
The high-complexity nonlinear and high-dimensional chaotic characteristics of combustion systems have inspired a lot of research on flame dynamics [
16,
17,
18,
19], combustion chaos [
20,
21,
22,
23] and intermittencies [
24,
25,
26,
27,
28,
29,
30]. Flame dynamics exhibit bifurcation behaviors [
18] or multi-bifurcation behaviors [
19] depending on the geometric parameters and the combustion system. However, eliminating flame bifurcation in practical combustors is still challenging due to the highly complex nonlinearity. Thermoacoustic instability is considered high-dimensional chaos [
20] and multifractality has been observed in combustion noise [
21]. Additionally, the occurrence of TAI can be divided into three stages: the stable combustion noise stage, the intermittency stage, composed of large-amplitude fluctuations and small non-periodic oscillations [
25], and the unstable periodic oscillation stage (limit cycle) [
28]. Chaotic analysis has been used to examine pressure oscillations in terms of dominant frequency, attractors in phase space and recurrence plots, revealing their differences at different stages [
29]. These studies emphasize the potential of chaotic methods to investigate combustion systems. Chaotic methods, such as phase space reconstruction, recurrence analysis and nonlinear time series analysis, are extensively used to investigate precursors of combustion instability. For instance, Nair developed a reduced-order model to capture the onset of TAI [
31]. Sujith conducted a comprehensive survey on the application of dynamic theories in TAI and blowout, including phase space reconstruction, recurrence plots, complex networks and fractals [
32]. The complex network method was employed to detect the onset of TAI and flame blowout by Murugesan [
33] and Gotoda [
34], respectively. These studies highlight the benefits of chaotic analysis methods in the precursor detection of combustion instability. In addition, wavelet and recurrence analyses have also been shown to capture the onset of combustion instability [
35]. However, as Ref. [
32] pointed out, the nonlinear methods used in these works are scattered and lack a unified framework. The warning signals of combustion instability provided by most of these methods lack a unified threshold and depend on the operating conditions, propellant combinations and combustion chamber geometries. For the on-line prediction of combustion instability, a more accurate prediction model and a unified framework are required.
Recently, machine learning methods have been employed to predict combustion instability, including Support Vector Machines [
36,
37] and Hidden Markov Model [
38,
39]. However, the prediction performance is limited by the shortcomings of conventional machine learning methods, in which the manually designed features are subjective. The major concern is that such manual features may fail to detect the precursors of TAI. Deep learning methods include more advanced feature-generation techniques, providing great advantages in training an effective prediction model. The development of deep learning models has promoted their application in the combustion instability community, including convolutional recurrent neural networks [
40], convolutional neural networks [
41], autoencoders [
42,
43] and long short-term memory networks [
44]. Cellier employed a convolutional recurrent neural network to predict TAI in a swirled combustor based on three channels of recorded signals [
40]. The prediction system achieved 87.7% accuracy on the test data. A convolutional neural network and an autoencoder have been introduced to monitor the combustion state [
41,
42] but are unable to predict the impending TAI. Zengyi Lyu used a stacked long short-term memory network to predict acoustic pressure in an annular combustor [
44].
To generate and utilize meaningful features for TAI prediction, we propose a novel pre-treatment strategy that associates chaotic dynamical analysis with deep learning methods to investigate combustion instability, an approach that has already proved useful in an experimental-scale combustor [
45]. Based on phase space reconstruction [
46] and Takens' delay embedding theorem [
47], acoustic pressure signals, for example, are transformed into recurrence matrices, which will be further trained in deep learning models, i.e., ResNet networks [
48]. Further, the Convolutional Block Attention Module (CBAM) attention mechanism is incorporated into ResNet [
49]. It is often combined with a backbone network to improve the performance. This unified framework is abbreviated as RRC (Recurrence-ResNet-CBAM) and is verified through a dataset obtained from the MIRADAS combustor. Consistent with Ref. [
40], the objective of this study is to predict, based on time series datasets of acoustic pressure, velocity and heat release rate signals recorded in MIRADAS, the proximity of combustion instability when the operation conditions are changed. There are two main challenges in the prediction task: one is detecting instability precursors when the system is close to an unstable region; the other is ensuring that the RRC framework works in the multi-fuel/multi-injection combustor. The combinations of different fuels and injection configurations greatly enhance the diversity of the combustor datasets. Therefore, the prediction model requires an excellent generalization capability to robustly predict TAI.
The remainder of the study is organized as follows. First, the experimental setup and datasets are presented in
Section 2.1. A novel pre-treatment technique is proposed in
Section 2.2, which integrates chaotic analysis and deep learning. Based on the pre-treatment, a unified prediction framework is proposed in
Section 2.3, which is able to work with different combustion systems. In terms of deep learning models, the learning strategy and parameters are given in
Section 2.4. To evaluate the prediction performance, some common evaluation tools are introduced in
Section 2.5. The prediction results and performance evaluations are presented in
Section 3. Based on the optimal trained framework, on-line prediction simulations are implemented in
Section 4, with potential for application in the active control of combustion instability. Finally, we summarize this work in the last section and present some future research directions to improve the current work.
2. Materials and Methods
2.1. Data Acquisition and Labeling
The datasets were recorded in a lean premixed combustor at the IMFT, referred to as MIRADAS [
11,
40], which is controlled by two operational conditions: bulk velocity
Usw and equivalence ratio
ϕ. The total mass flow is controlled by the bulk velocity. The datasets and labeling strategy are consistent with Ref. [
40], so that the prediction performances can be compared fairly. The experimental setup is presented in
Figure 1 and more detailed information of the geometry is provided in Ref. [
11].
The main mixture of fuel and air is injected at the bottom plenum of the combustor and then passes through a convergent nozzle and a swirler oriented 15° off the swirler axis. The cross-section of the bottom plenum has a diameter of 65 mm. A honeycomb is located between the bottom plenum and the convergent nozzle; it is fitted to break up the turbulent structures resulting from the injection process. The piloting fuel passes through the injection tube in the center of the combustor. The flame is confined within a cylindrical quartz combustion chamber with a length of 297 mm and a diameter of 46 mm. In MIRADAS, methane (CH4) and hydrogen (H2) are used in different configurations. A total of 366 experiments were performed with seven different injection configurations, providing sufficient data to train the deep learning model. Additionally, the combinations of different propellants and injection strategies make it possible to comprehensively assess the performance of the prediction model, referred to as its generalization capability. In each injection configuration, experiments were recorded by changing the operational conditions step by step.
Three types of combustion diagnostics were employed in the experiments, as shown in
Figure 2. A microphone (Brüel and Kjær-TYPE 4954, Brüel & Kjær Sound & Vibration Measurement, Egham, Surrey, United Kingdom) was used to measure the acoustic pressure
p, coupled with a constant-temperature hot-wire (Dantec 55P16 miniature hot wire probe and a Dantec 54T42 MiniCTA) to measure the acoustic velocity
v. The heat release rate
h was recorded by a photomultiplier (THORLABS FB430-10).
A total of 366 experiments were conducted on the MIRADAS combustor. They were classified into two groups based on the Root Mean Square (RMS) of the pressure measurements. More specifically, 179 experiments were categorized as instability samples (beyond the empirical limit of 250 Pa), while the remaining 187 experiments were stable. Instability maps (
Usw,
ϕ) can depict the relationship between combustion instability and operation conditions, as shown in
Figure 2a. Bulk velocity
Usw was varied between 14 m/s and 40 m/s in steps of 2 m/s, while the equivalence ratio was varied between 0.70 and 0.85 in steps of 0.05. In the instability maps, unstable experiments are marked as solid circles whose diameters are proportional to the RMS values. The remaining regions indicate stable runs. As presented in
Figure 2a, it is difficult to separate the stable and unstable runs monotonically.
As mentioned before, the objective is to predict the proximity of instability when the combustion system is still stable. Thus, only the 187 stable experiments were used in our datasets. When the system is running at one of these 187 stable operation conditions, we need to predict the risk of instability in each direction in which the operation conditions may be changed. We introduce instability vectors
T as data labeling, as shown in
Figure 2b. Each vector is formed by four digits related to the directions of operation condition changes (1 denotes that TAI will occur in that direction). The directions Up and Down denote an increase and a decrease in the equivalence ratio, respectively, while Left and Right denote a decrease and an increase in the bulk velocity, respectively. For example, the instability vector [0 1 0 0] means that TAI will occur when the equivalence ratio is decreased.
During each experiment, signals were acquired at a sampling frequency of 10 kHz for a duration of 10.3 s. To enable the practical on-line prediction of combustion instability, a rolling-window approach is preferred. Three hundred time windows of 0.3 s were randomly cropped from the complete 10.3 s signal series, so the length of each time window was 3000 samples. This process generated a total of 56,100 samples (300 windows × 187 stable experiments), which were subsequently divided into three datasets for the deep learning model, i.e., training, validation and testing datasets. For data labeling, the instability vectors of these datasets are given in
Table 1. In total, 15 types of instability vectors were recorded with the absence of [0 1 1 1].
Usually, the training and validation datasets are used to train and monitor the deep learning models, with the aim of achieving high accuracy by optimizing the training loss. The testing datasets were used to evaluate the generalization capability and to compare the performance of different models. Each experiment contained three channels of signals: p, v and h. Each channel’s data were normalized by their maximum value to improve the robustness of the model. Therefore, 56,100 three-channel datasets were used as inputs and the outputs were the corresponding instability vectors.
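As an illustration of this windowing and normalization step, the Python sketch below randomly crops 0.3 s windows from one recorded three-channel series and normalizes each channel. The array layout, the file name in the usage comment and the use of the maximum absolute value for normalization are assumptions made for illustration only, not the exact pipeline used in this work.

```python
import numpy as np

def make_windows(record, n_windows=300, win_len=3000, seed=0):
    """Randomly crop fixed-length windows from a (n_samples, 3) record of
    [pressure, velocity, heat release rate] and normalize each channel by its
    maximum absolute value (the exact normalization used in the paper is assumed)."""
    rng = np.random.default_rng(seed)
    n_samples = record.shape[0]
    windows = []
    for _ in range(n_windows):
        start = rng.integers(0, n_samples - win_len + 1)
        win = record[start:start + win_len].astype(float)
        win = win / np.abs(win).max(axis=0)   # per-channel normalization
        windows.append(win)
    return np.stack(windows)                  # shape (n_windows, win_len, 3)

# usage (hypothetical file name):
# x = make_windows(np.loadtxt("run_001.csv", delimiter=","))  # 300 windows of 0.3 s at 10 kHz
```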
2.2. Pre-Treatment (Generation of Recurrence Matrices)
To associate chaotic dynamical analysis with deep learning methods to investigate combustion instability, a novel pre-treatment approach is proposed in this study. Based on phase space reconstruction, the recorded signals are embedded in a high-dimensional space. For each signal channel, recurrence analysis in this high-dimensional space provides a quantitative description of the reconstructed dynamical system in the form of a recurrence matrix. The recurrence matrix provides a format that can be fed directly into deep learning models. Making use of the strong feature extraction and learning ability of deep learning models, the recurrence matrix, which reflects the nonlinear dynamical features, plays a very important role in the pre-treatment process.
To some extent, this pre-treatment can be considered an improved version of the method of Recurrence Quantification Analysis (RQA), which was adopted to investigate TAI as well [
29,
36]. The major concern, as mentioned before, is that the number of manual indices is too small to characterize the high-dimensional combustion system. This section briefly describes the generation of the recurrence matrix.
Let $p_i$, $v_i$ and $h_i$ denote the three channels of measured data, i.e., the acoustic pressure, velocity and heat release rate, respectively, and let $x_i$ stand for any one of them. The time-delay vectors are then reconstructed with time delay $\tau$ and embedding dimension $m$ according to Takens' delay embedding theorem [47], as follows:

$$\mathbf{X}_i = \left[x_i,\ x_{i+\tau},\ x_{i+2\tau},\ \ldots,\ x_{i+(m-1)\tau}\right], \quad i = 1, 2, \ldots, N-(m-1)\tau, \tag{1}$$

where $N$ is the length of the input data, $x_i$ is the measured time series signal and $\mathbf{X}_i$ is the $i$th reconstructed vector.
The most important step is to determine the time delay τ and the embedding dimension m. Two numerical methods can determine the optimum time delay τ, i.e., the autocorrelation method and the mutual information method [27]. The former can only extract a linear correlation from the time series data. Thus, the mutual information method is used in this study, and the mutual information index I(X, Y) is given as follows:
$$I(X,Y)=\sum_{i=1}^{N_X}\sum_{j=1}^{N_Y} P_{XY}(x_i, y_j)\,\log_2\frac{P_{XY}(x_i, y_j)}{P_X(x_i)\,P_Y(y_j)}, \tag{2}$$

where $X=\{x_i\}$ and $Y=\{x_{i+\tau}\}$ denote the original time series and the time series delayed by $\tau$, respectively; $N_X$ and $N_Y$ are the numbers of bins of the $X$ and $Y$ histograms, respectively; $P_X$ and $P_Y$ are the marginal probabilities; and $P_{XY}$ is the joint probability.
The mutual information index I(X, Y) is thus a function of the time delay τ, quantifying the correlation between the original and the delayed series. A small mutual information index means that the two series share little information, so the corresponding delay is a good candidate for the optimum. The time-mesh method is a practical way to estimate the marginal and joint probabilities, as indicated in Ref. [29], from which the mutual information is obtained as a function of the time delay. By gradually increasing the time delay, Equation (2) is evaluated until I(X, Y) reaches its first minimum; the corresponding time delay is taken as the optimum value τ*.
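The following Python sketch illustrates this delay selection under stated assumptions: the probabilities are estimated from a simple 2D histogram and the first local minimum of I(τ) is used. The bin count and the function names are illustrative choices, not the exact implementation used in this work.

```python
import numpy as np

def mutual_information(x, tau, n_bins=64):
    """Mutual information between x(t) and x(t + tau), Equation (2), using a
    2D-histogram estimate of the marginal and joint probabilities."""
    X, Y = x[:-tau], x[tau:]
    pxy, _, _ = np.histogram2d(X, Y, bins=n_bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)       # marginal of X
    py = pxy.sum(axis=0, keepdims=True)       # marginal of Y
    nz = pxy > 0                              # avoid log(0)
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

def optimal_delay(x, max_tau=100):
    """tau* is taken at the first local minimum of I(tau)."""
    mi = [mutual_information(x, tau) for tau in range(1, max_tau + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] < mi[k + 1]:
            return k + 1                      # tau is 1-indexed
    return int(np.argmin(mi)) + 1             # fall back to the global minimum
```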
To investigate the high-dimensional chaos in a combustion system, it is crucial to determine the embedding dimension
m of the nonlinear system so that the attractor orbits will be unfolded exactly and the chaotic characteristics will be exposed. With regard to the embedding dimension
m, two main methods, the so-called false nearest neighbor [
50] (FNN) and Cao’s method [
51], are usually employed [
29,
36]. The latter was developed to improve the robustness of FNN. As noted by Cao, for realistic time series, FNN may yield different optimal embedding dimensions depending on its subjectively chosen threshold parameters. Hence, Cao's method was adopted in the present study.
There are adjacent points in low-dimensional phase space that are actually projections of nonadjacent points in high-dimensional space, i.e., false neighbors. As the embedding dimension increases, the chaotic orbits unfold gradually and the false neighbors are excluded. In a phase space of dimension $m$, the $i$th vector $\mathbf{X}_i^{(m)}$ possesses a nearest neighbor $\mathbf{X}_{n(i)}^{(m)}$, and the Euclidean distance between them is defined by the following:

$$R_m(i)=\left\|\mathbf{X}_i^{(m)}-\mathbf{X}_{n(i)}^{(m)}\right\|. \tag{3}$$

An increase in the embedding dimension leads to a change in this nearest-neighbor distance, i.e.,

$$R_{m+1}(i)=\left\|\mathbf{X}_i^{(m+1)}-\mathbf{X}_{n(i)}^{(m+1)}\right\|. \tag{4}$$

The difference between $R_{m+1}$ and $R_m$ is associated with the increase in $m$, from which the following index is proposed in FNN:

$$a_{\mathrm{FNN}}(i,m)=\sqrt{\frac{R_{m+1}^{2}(i)-R_m^{2}(i)}{R_m^{2}(i)}}, \tag{5}$$

while Cao proposed the following index, with a higher noise resistance performance [46]:

$$a(i,m)=\frac{\left\|\mathbf{X}_i^{(m+1)}-\mathbf{X}_{n(i)}^{(m+1)}\right\|}{\left\|\mathbf{X}_i^{(m)}-\mathbf{X}_{n(i)}^{(m)}\right\|}=\frac{R_{m+1}(i)}{R_m(i)}. \tag{6}$$

Further, the average performance of all vectors in phase space is:

$$E(m)=\frac{1}{N-m\tau}\sum_{i=1}^{N-m\tau} a(i,m). \tag{7}$$
Finally, E(m + 1)/E(m) is used to depict the trend of E(m).
The best embedding dimension m* will be determined when E(m + 1)/E(m) stops changing. Once τ* and m* are determined, the phase space will be reconstructed by Equation (1).
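A minimal sketch of Cao's criterion is given below, assuming Euclidean distances and a brute-force nearest-neighbor search over the reconstructed vectors; it is intended only to illustrate Equations (1), (6) and (7), not to reproduce the exact implementation of this work.

```python
import numpy as np
from scipy.spatial.distance import cdist

def embed(x, m, tau):
    """Delay embedding, Equation (1): each row is one reconstructed vector."""
    n = len(x) - (m - 1) * tau
    return np.stack([x[i:i + n] for i in range(0, m * tau, tau)], axis=1)

def cao_E1(x, tau, m_max=12):
    """E1(m) = E(m+1)/E(m); the optimum m* is the dimension beyond which
    E1 stops changing appreciably."""
    E = []
    for m in range(1, m_max + 2):
        Xm, Xm1 = embed(x, m, tau), embed(x, m + 1, tau)
        n = len(Xm1)                                  # vectors valid in both spaces
        d = cdist(Xm[:n], Xm[:n])                     # pairwise Euclidean distances
        np.fill_diagonal(d, np.inf)                   # exclude self-matches
        nn = d.argmin(axis=1)                         # nearest neighbor in dimension m
        a = np.linalg.norm(Xm1 - Xm1[nn], axis=1) / (d[np.arange(n), nn] + 1e-12)
        E.append(a.mean())                            # Equation (7)
    E = np.array(E)
    return E[1:] / E[:-1]                             # E1(m) for m = 1, ..., m_max
```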
After the reconstruction of phase space, the recurrence matrix is determined by the following:

$$R_{ij}=H\left(\varepsilon-\left\|\mathbf{X}_i-\mathbf{X}_j\right\|\right), \quad i,j=1,\ldots,N-(m-1)\tau, \tag{8}$$

where $\varepsilon$ is the constant threshold, $H$ is the Heaviside function and $\left\|\mathbf{X}_i-\mathbf{X}_j\right\|$ is the Euclidean distance between the reconstructed vectors.
The recurrence matrix R_ij describes the relationship between every pair of reconstructed vectors from a mathematical perspective. When a vector recurs within the constant threshold, R_ij is set to 1; otherwise, R_ij is set to 0. The recurrence matrix is obtained once this calculation has been performed for all vector pairs. Instead of using the statistical indices derived through the RQA method, this study treats the recurrence matrix as a whole, which can then be analyzed by deep learning models. This ensures that TAI-relevant features are not missed.
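The computation of the recurrence matrix can be sketched as follows, reusing the delay embedding of Equation (1); the threshold eps is supplied by the user and the helper names are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import cdist

def embed(x, m, tau):
    """Delay embedding, Equation (1)."""
    n = len(x) - (m - 1) * tau
    return np.stack([x[i:i + n] for i in range(0, m * tau, tau)], axis=1)

def recurrence_matrix(x, m, tau, eps):
    """Binary recurrence matrix R_ij = H(eps - ||X_i - X_j||), Equation (8):
    1 if the i-th and j-th reconstructed vectors lie within eps, 0 otherwise."""
    X = embed(x, m, tau)
    d = cdist(X, X)                    # Euclidean distances between all vector pairs
    return (d <= eps).astype(np.uint8)
```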
2.3. Overview of the Framework of RRC
The pre-treatment produces three channels of recurrence matrices that can be fed naturally into deep learning modules.
Figure 3 illustrates the overall strategy of the RRC framework. The inputs of original signals consist of dynamic pressure
p, acoustic velocity
v and heat release rate
h. The output of RRC is the instability vector
T, which gives the probability of TAI occurring for each direction of change in the operation variables. Each operation variable can change in two directions, i.e., increase or decrease, so the four components of the instability vector correspond to an increase or decrease in the equivalence ratio and an increase or decrease in the bulk velocity. For example, an instability vector whose digits are all zero indicates that the combustion system remains stable for any change in the operation conditions. The TAI prediction task is essentially a binary, multi-label classification task: each sample is classified as unstable or stable in each of the four directions, so one sample carries four labels.
The RRC model mainly consists of two stages: Stage 1 involves obtaining the recurrence matrix from the original data through phase space reconstruction and a recurrence analysis, i.e., pre-treatment. Stage 2 involves training the deep learning models and outputting the instability vector.
Step (a): The recorded signals are obtained using a microphone (Brüel and Kjær-TYPE 4954, Brüel & Kjær Sound & Vibration Measurement, Egham, Surrey, United Kingdom) to measure the pressure signal, a constant-temperature hot-wire (Dantec 55P16 miniature hot-wire probe and a Dantec 54T42 MiniCTA, Constant Temperature Anemometer (CTA) technology, DK-2740 Skovlunde, Denmark) to measure the acoustic velocity, and a photo multiplier (THORLABS FB430-10, Thorlabs, Newton, NJ, USA) to measure the heat release rate.
Step (b): Takens’ delay embedding theory [
47] is used to reconstruct three channels of normalized signals in phase space based on an appropriate time delay and embedding dimension. This helps to expose the attractor of the combustion system. In this way, the original dynamical structure is revealed.
Step (c): After the phase space reconstruction, the recurrence matrix is calculated to describe the characteristics of the chaotic system.
The three channels of recurrence matrices, in terms of pressure, velocity and heat release rate, are fed into RRC as the training set. RRC then generates abundant features by performing convolution operations in different layers and channels. Finally, all the feature maps converge into fully connected layers with a suitable activation function, through which the framework outputs probabilities ranging from 0 to 1.
The backbone network is standard ResNet-18 [
48] and the CBAM attention mechanism is combined with this to improve performance [
49]. This mechanism consists of a channel attention module and a spatial attention module. The channel attention module focuses on discovering important channels, while the spatial attention module is focused on the important features in each channel. Different features are obtained by adjusting millions of parameters in RRC, and specific loss functions are used to assess the difference between the true labels and the outputs of RRC. Then, optimization methods are used to continuously adjust these parameters until the loss curve converges to a low level. The optimal model is then selected and frozen to test the generalization capability on a testing dataset. Finally, the RRC model is employed to predict TAI using a complete piece of 10.3 s data to test its on-line performance.
2.4. Deep Learning Strategy
Basic deep learning models and the corresponding parameters are introduced in this section. Deep learning methods are very successful in many prediction tasks. However, adding too many layers to a model can lead to the degradation problem and an increase in training error. The ResNet architecture was proposed to address this problem, using a shortcut structure to eliminate degradation [
48]. The ResNet-18 model is presented in detail in
Figure 4, consisting of eight residual blocks with 16 convolution layers. The convolution layers are used to extract features from recurrence matrices or tensors from prior layers, with various sizes of convolution kernels producing different channels within the layers.
Activation function and classification function: Consistent with most convolutional neural networks, the rectified linear unit (ReLU) is used as the activation function in the layers, i.e., $\mathrm{ReLU}(x)=\max(0,x)$. The classification functions of the last layer are four sigmoid functions outputting probabilities between 0 and 1, where the sigmoid function is defined as $\sigma(x)=1/(1+e^{-x})$. These outputs are not identical to true probabilities; rather, they measure the confidence of the RRC's decisions. A higher value (beyond 0.5) means that the combustion system is prone to TAI in the corresponding direction.
Batch normalization: A well-known problem that can hinder the convergence of neural networks is vanishing/exploding gradients. This is mainly because the distributions of parameters in the different layers are mutually influenced, and the increase in network layers aggravates this problem. Ioffe referred to this issue as an internal covariate shift and proposed that the batch normalization layer could address this by normalizing the layer inputs [
52]. Thus, batch normalization can accelerate the convergence of neural networks.
Loss function: The loss function is used to describe the deviation of the network's outputs from the true labels. Different from general tasks in the image classification or object recognition fields, where multi-class tasks mostly have one label per sample, TAI prediction is a binary-class, multi-label task. The softmax function and cross-entropy loss are often employed in multi-class tasks, while the sigmoid function and the binary cross-entropy loss used in binary classification are confronted with the severely imbalanced label problem. The imbalanced label issue is rather sensitive in combustion instability prediction tasks: on the one hand, instability samples and stability samples are imbalanced; on the other hand, the numbers of instability samples may differ a lot among the four directions. Therefore, new criteria focused on multi-label tasks should be employed.
For general binary classification tasks, binary cross-entropy is commonly used as the loss function [40]:

$$L_{\mathrm{BCE}}=-\frac{1}{N_S}\sum_{s=1}^{N_S}\left[y_s\log \hat{y}_s+\left(1-y_s\right)\log\left(1-\hat{y}_s\right)\right], \tag{9}$$

where $y_s$ is the true label and $\hat{y}_s$ the predicted probability.
However, binary cross-entropy struggles both with the imbalanced label issue and with multi-label tasks. In this regard, the circle loss $L_{CL}$, first proposed by Yifan Sun [53], is adopted as the loss function in this study:

$$L_{CL}=\log\left[1+\sum_{j=1}^{L}\sum_{i=1}^{M}\exp\left(\gamma\left(s_n^{j}-s_p^{i}+b\right)\right)\right], \tag{10}$$

where $s_p^{i}$ and $s_n^{j}$ are the scores of the positive and negative classes, $L$ and $M$ are the numbers of negative and positive classes, respectively, $\gamma$ is a scale factor and $b$ is a margin factor.
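A compact PyTorch sketch of this loss, written in the unified form of Equation (10) and pooling the per-direction logits of a batch into positive and negative sets, is given below. The per-class re-weighting of the full circle loss is simplified, and the default gamma and margin values are illustrative, so this is an assumed variant rather than the exact loss used in this work.

```python
import torch
import torch.nn.functional as F

def circle_loss(logits, labels, gamma=64.0, margin=0.25):
    """Sketch of the unified form of Equation (10):
    L = log(1 + sum_j sum_i exp(gamma * (s_n^j - s_p^i + margin))),
    where positives are the logits with label 1 and negatives those with label 0."""
    labels = labels.float()
    sp = logits[labels > 0.5]                  # scores of positive labels
    sn = logits[labels <= 0.5]                 # scores of negative labels
    if sp.numel() == 0 or sn.numel() == 0:     # degenerate batch: fall back to BCE
        return F.binary_cross_entropy_with_logits(logits, labels)
    # log(1 + A*B) with A = sum_j exp(g*(sn_j+m)), B = sum_i exp(-g*sp_i)
    return F.softplus(torch.logsumexp(gamma * (sn + margin), dim=0)
                      + torch.logsumexp(-gamma * sp, dim=0))
```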
Optimization method: The selection of the optimization method influences the convergence performance of the neural network. Among the various methods used to optimize the training loss, Adam, which is based on adaptive estimates of lower-order moments, is considered one of the most stable and computationally efficient [
54].
CBAM: The attention mechanism of CBAM [
49] allows the ResNet model to focus on more important features, i.e., more instability-relevant features. As in other self-attention mechanisms, the CBAM module does not change the structure of the original neural network and hence can easily be added to ResNet, without modifying the other parts. Ref. [
55] demonstrated that CBAM improved the performance of ResNet backbone network more than other self-attention mechanisms. An empirical comparison with other self-attention mechanisms, including SE [
56] and GC [
57], will also be presented in this study.
CBAM consists of channel and spatial attention, obtained by decomposing the 3D attention tensor into 1D channel attention and 2D spatial attention [
55]. The channel attention module produces a 1D tensor, while the spatial attention module produces a 2D tensor. The structures of the channel attention module and the spatial attention module are illustrated in
Figure 1. Assume that the input feature map is $F$, $\sigma$ denotes the sigmoid function, $f_{ch\text{-}avg}$ and $f_{ch\text{-}max}$ are the global average and global maximum pooling operations over the spatial dimensions, $f_{sp\text{-}avg}$ and $f_{sp\text{-}max}$ are the corresponding pooling operations over the channel dimension, MLP is the two-layer multi-layer perceptron, and $SA_{CH}$ and $SA_{SP}$ denote the channel attention and spatial attention, respectively. CBAM can then be described by Equation (11):

$$SA_{CH}(F)=\sigma\big(\mathrm{MLP}(f_{ch\text{-}avg}(F))+\mathrm{MLP}(f_{ch\text{-}max}(F))\big),\quad SA_{SP}(F)=\sigma\big(f^{7\times 7}\big([\,f_{sp\text{-}avg}(F);\,f_{sp\text{-}max}(F)\,]\big)\big),$$
$$F'=SA_{CH}(F)\otimes F,\qquad F''=SA_{SP}(F')\otimes F', \tag{11}$$

where $f^{7\times 7}$ is a convolution with a 7 × 7 kernel and $\otimes$ denotes element-wise multiplication. The process of adding CBAM to ResNet is presented in
Figure 3 and more detailed specifications can be found in Ref. [
49].
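For illustration, a minimal PyTorch sketch of the CBAM channel and spatial attention modules following Equation (11) is given below; the reduction ratio and the 7 × 7 convolution follow the common defaults of Ref. [49] and are assumptions here, not the exact settings of this work.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(                       # shared two-layer MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))              # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))               # global max pooling
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)               # channel-wise average
        mx = x.amax(dim=1, keepdim=True)                # channel-wise maximum
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca, self.sa = ChannelAttention(channels), SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)                              # F' = SA_CH(F) * F
        return x * self.sa(x)                           # F'' = SA_SP(F') * F'

# usage: y = CBAM(64)(torch.rand(2, 64, 56, 56))
```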
2.5. Post-Mortem Performance Evaluation Tools
The proposed TAI prediction model, RRC, is essentially a binary classification model from a mathematical point of view. Common tools, including accuracy, the F1-score, ROC curves and the AUC-ROC index, are used to evaluate the prediction performance [
36,
40].
The most common criterion is accuracy, denoted by acc. The accuracy index, i.e., the ratio of correct outputs, measures the overall deviation between the RRC outputs (instability vectors T) and the true labels (experimental results). The RRC outputs range from 0 to 1 after the activation of the sigmoid functions; output values below 0.5 are classified as stable and vice versa. The accuracy criterion is then given by the following:

$$acc=\frac{1}{4N_S}\sum_{s=1}^{N_S}\sum_{d=1}^{4}\mathbb{1}\left[\hat{T}_{s,d}=T_{s,d}\right],$$

where $N_S$ is the total number of samples, $\hat{T}_{s,d}$ and $T_{s,d}$ are the thresholded output and the true label of sample $s$ in direction $d$, respectively, and $\mathbb{1}[\cdot]$ denotes the indicator function.
For example, if the output instability vector is T = [0.1, 0.2, 0.6, 0.8] and the true label is [0 0 1 0], then the accuracy is 75%: three prediction directions are classified correctly and only the fourth direction is incorrectly predicted as unstable.
Note that the accuracy criterion is unable to distinguish specific incorrect predictions. To further measure the specific performance of TAI prediction, other indices are introduced. Precision is defined as the ratio of correctly predicted unstable samples to the total number of outputs predicted as unstable. A higher precision indicates that when the prediction framework makes a positive decision of combustion instability, combustion instability is more likely to actually be present. This is very meaningful when applying prediction tools in an active control system for combustion instability. Additionally, recall is defined as the ratio of unstable samples that are correctly predicted as positive. A higher recall indicates that fewer combustion instability samples are missed. To examine both precision and recall, their harmonic mean, the F1-score, is introduced as follows:

$$\mathrm{Precision}=\frac{TP}{TP+FP},\qquad \mathrm{Recall}=\frac{TP}{TP+FN},\qquad F_1=\frac{2\cdot \mathrm{Precision}\cdot \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}},$$

where TP/TN/FP/FN represent True Positive/True Negative/False Positive/False Negative, respectively (True/False corresponds to correct/incorrect classification and Positive/Negative corresponds to the output of RRC).
The Receiver Operating Characteristic (ROC) curve is a robust measure of a binary classification model [
58]. It measures how many unstable samples are classified correctly at the cost of incorrectly classifying stable samples as the decision threshold is varied. For instance, if all samples are classified as unstable by RRC (threshold close to 0), the recall index is 1, meaning that all instability samples are correctly classified, but the percentage of wrongly classified stable samples is also 1. In other words, all unstable samples are recalled at the great cost of wrongly classifying all stable samples. Thus, the core idea of ROC curves is that a good binary classification model has a higher rate of correctly classified unstable samples and a smaller rate of wrongly classified stable samples. The rate of wrongly classified stable samples is defined as the false positive rate FPR, while the rate of correctly classified unstable samples is referred to as the true positive rate TPR. By adjusting the threshold from 0 to 1, the true positive rate is plotted against the false positive rate, and this curve is the ROC curve.
One ROC curve is obtained for each of the four directions. All curves are bounded by a square of side 1, linking the point {0, 0} to the point {1, 1}. Point {1, 1} represents the extreme situation mentioned in the last paragraph. The ROC curve of a good classification model approaches the upper left corner as much as possible, and the corresponding area under the curve (AUC) approaches 1 as well. Generally, a good model is characterized by an AUC between 0.8 and 1. It is then possible to compare two binary classifiers by comparing their respective AUCs.
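A threshold-sweeping sketch of this construction is shown below; the number of thresholds and the trapezoidal AUC estimate are illustrative choices rather than the exact procedure used in this study.

```python
import numpy as np

def roc_curve_manual(scores, labels, n_thresholds=101):
    """ROC curve for one direction by sweeping the decision threshold from 1 to 0:
    FPR = wrongly classified stable samples / all stable samples,
    TPR = correctly classified unstable samples / all unstable samples."""
    scores, labels = np.asarray(scores), np.asarray(labels).astype(bool)
    fpr, tpr = [], []
    for thr in np.linspace(1.0, 0.0, n_thresholds):
        pred = scores >= thr
        tpr.append(np.sum(pred & labels) / max(labels.sum(), 1))
        fpr.append(np.sum(pred & ~labels) / max((~labels).sum(), 1))
    auc = np.trapz(tpr, fpr)                 # area under the curve
    return np.array(fpr), np.array(tpr), auc
```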
3. Results and Evaluations
In the pre-treatment process, the results of phase space reconstruction can be displayed by the so-called attractors, which can also provide some qualitative insights into the combustion instability. As shown in
Figure 5, the attractors are characterized by their different dynamical structures. To further explore the nonlinear features related to TAI quantitatively, recurrence matrices and recurrence plots were obtained. The recurrence plots in
Figure 6 illustrate the recurrence behavior. After the pre-treatment stage, the recurrence matrices of the datasets are fed into the RRC to extract precursors for TAI prediction.
The training process is performed on an Nvidia GeForce RTX 3080 GPU using PyTorch 1.8.0. One of the most critical hyper-parameters in training the neural network is the learning rate, which controls the step size used to update the weights when optimizing the loss function. In other words, the learning rate is associated with the convergence performance, as indicated by:

$$\theta_1 \leftarrow \theta_1-\alpha\,\nabla_{\theta_1} L,$$

where $\theta_1$ denotes the weights that need to be optimized, $\alpha$ is the learning rate and $L$ is the proposed loss function.
To determine the optimal learning rate, the Cyclical Learning Rates technique proposed by Leslie Smith is adopted for the training process [
59]. After 30 epochs, RRC achieves rather high training and validation accuracies of 99.80% and 90.21%, respectively. The training loss and validation loss are represented in
Figure 7. The convergence speeds of both the training and validation losses highlight the benefit of suitable training parameters. A risk of overfitting appears beyond 23 epochs, because a further decrease in the training loss no longer improves the validation performance. At this point, the optimal model is frozen for further testing.
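A hypothetical training loop corresponding to this setup is sketched below, combining Adam with PyTorch's CyclicLR scheduler. The learning-rate bounds, batch handling and the use of a plain binary cross-entropy on the sigmoid outputs are assumptions (the circle loss of Section 2.4 would instead operate on the pre-sigmoid logits), and `RRC` and `train_loader` are taken from the earlier sketches.

```python
import torch
import torch.nn.functional as F

model = RRC().cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-5, max_lr=1e-3, step_size_up=500, cycle_momentum=False)

for epoch in range(30):                              # 30 epochs, as in Section 3
    model.train()
    for r, t in train_loader:                        # recurrence maps, instability vectors
        optimizer.zero_grad()
        loss = F.binary_cross_entropy(model(r.cuda()), t.cuda().float())
        loss.backward()
        optimizer.step()
        scheduler.step()                             # cyclical learning-rate update
```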
The testing dataset is used to validate the generalization capability of the prediction model. The outputs of the trained RRC are compared with the true labels to evaluate whether the deep learning approach is feasible. Although the accuracy index, as a powerful tool, can measure the overall performance directly, it is occasionally insufficient for the validation of a classification model. It is possible that a model with high accuracy could perfectly predict the great majority of stability samples in imbalanced datasets while poorly predicting the remaining instability samples. Obviously, the minority of samples showing combustion instability are expected to be predicted correctly. The precision, recall and F1-score indices are thus introduced to avoid the aforementioned extreme situations.
The ROC curve in each direction is presented in
Figure 8. The diagonal line represents the performance of a random classifier. The ROC curves of good classifiers are close to the upper left part and their AUC values often range from 0.85 to 1. The AUC values of the “Up”, “Right” and “Down” directions indicate that these directions are predicted perfectly, outperforming the state-of-the-art method. The ROC curves of Ref. [
45] are provided in the appendices for comparison.
Comparisons between different models are important when constructing on-line TAI prediction systems. The compared models are divided into two types. The first type adopts the pre-treatment strategy and the second type is based on the original time series data. A comparison of the two types also allows us to examine the benefits of the pre-treatment. Several models of each type are presented in
Table 2 to show which mechanisms are effective.
In RRC, CBAM is introduced to enhance the performance of ResNet-18 networks. In addition, other attention mechanisms are used. Squeeze-and-Excitation (SE) is adopted with the ResNet-18 backbone, which is focused on channel attention [
56]. This model is referred to as RRS in this section, and the only difference between it and RRC is the attention mechanism. Another backbone network, AlexNet, is introduced to replace ResNet-18 while keeping the CBAM attention mechanism; we refer to this framework as RAC to examine the effect of the backbone network [
60]. In the third model, referred to as RVC, the ResNet-18 model is substituted by the VGG network [
61]. Note that AlexNet and VGG, despite their different architectures and parameters, are consistent with ResNet-18 in terms of inputs and outputs. In the second type, the Multi-Layer Perceptron (MLP) and the long short-term memory (LSTM) network are the models usually used as baselines, and they take the original time series data as inputs.
Table 2 presents a comprehensive comparison of the experimental results, including general accuracy, precision, recall, F1-score and AUC score, for different models.
Appendix A contains the ROC curves with the AUC scores of the involved models. Overall, the RRC model outperforms the CNN-LSTM and other models in most indices. The RRC model achieves the highest accuracy score among all the models. Moreover, the accuracy scores of the other three recurrence matrix-based models are close to the CNN-LSTM model, with slight fluctuations, indicating that these models are comparable to the state-of-the-art model, CNN-LSTM. Additionally, the ROC-AUC values of the models are also presented in
Table 2. Though the AUC scores of the CNN-LSTM model are very good, the RRC model achieves a better performance in most directions; only in the Left direction is the AUC value of RRC about 5% smaller than that of CNN-LSTM. In the other three directions, the AUC values of the CNN-LSTM model are close to those of the RRS, RAC and RVC models. Thus, the recurrence matrix-based methods achieve results as good as the state-of-the-art method, and the RRC framework is the best among them. The performance of the two original data-based baselines, namely MLP and LSTM, fluctuates in terms of the precision and recall indices. Overall, RRC outperforms the CNN-LSTM model in most directions, as shown in
Table 2. The advantages of the RRC model over the CNN-LSTM model indicate that the RRC model has high accuracy and robustness. Two main conclusions can be derived: (1) the pre-treatment strategy proposed in this study is effective, since all models combined with the pre-treatment achieve results at least comparable with the existing optimal method; (2) deep learning methods provide a great advantage compared to the baseline models.
For feature visualization, we borrowed some ideas regarding feature extraction from the computer vision field since recurrence matrices can be considered images. In this way, the process from pre-treatment to RRC output can be visualized. In the initial stage of the deep learning framework, a few feature maps are generated by sliding multiple convolutional filters over the input recurrence matrix. More abundant feature maps are generated by further convolutional operations and, thus, the neural network is characterized by multiple layers. Shallow layers characterize the basic features of the input matrices, including the outline, texture, edges and corners, which are relevant to the local information of the recurrence matrix. The local information is referred to as the small receptive field in the computer vision field. Conversely, deep layers relevant to regions with a large receptive field can focus on more abstract features, which are hard to understand in the physical world.
Figure 9 illustrates the process from basic to abstract features using the first and second samples in the validation datasets as examples. The corresponding instability vectors are [0 1 1 0] and [1 1 0 0], respectively, which are relevant to the impending instability in some directions. Thus, some regular structures can be seen in
Figure 9a,f. These features in recurrence plots are typical chaotic dynamical characteristics.
Figure 9 describes the effects of different layers on the generated feature maps. The four main layers comprise the main structure of the ResNet-18 network and each layer consists of four convolution layers. With the increase in the depth of the network, the number of channels, i.e., feature maps, will increase as well. Sixteen channels of the batch normalization layers in each main layer are presented in
Figure 9 for brevity. Basic and abstract features are both important for the training process. The significance of these features increases with the increase in the depth of the network and, thus, can play more of a role in feature extraction than the indices in the RQA method might suggest. Feature maps in shallow layers are fine-grained, while those in deep layers are coarse-grained.
Figure 9b,c are relevant to some fine-grained features, including texture, edges and corners, while
Figure 9d,e consist of abstract features in a coarse-grained fashion, along with
Figure 9i,j.
To further visualize how the networks extract features, the feature maps in main layer 1 of the second sample are compared with the features generated by endowing weights of the neural network with random values, as shown in
Figure 10. Altogether, 64 feature maps marked with different colors in
Figure 10b are generated in the first main layer of ResNet-18. It is easy to classify these features due to their clear focus. Specifically, these features are characterized by either overall structures or local recurrence behaviors. This is achieved thanks to the well-trained convolution filters and other parameters in ResNet-18. When the convolution filters are assigned random values ranging from −1 to 1, the resulting feature maps become very uniform and thus cannot distinguish the various recurrence characteristics.
For better visualization, four types of feature maps in
Figure 10b are marked with colored boxes. Red boxes are relevant to the overall texture in the recurrence plots and blue boxes denote regions with local recurrence behavior. Green and purple boxes represent transition states between the blue and red boxes: green boxes are more relevant to local recurrence regions, whereas purple boxes are more focused on the overall texture. The number of feature channels increases with the depth of the network; thus, more profound features can be captured.
4. On-Line Prediction System with Rolling Windows
The optimal network trained in the previous section is now frozen to predict combustion instability in a segment of recorded signals. Prediction is the first step in active combustion control. To achieve this purpose, this section presents on-line simulations for the application of active control in a combustion system.
In general, an on-line prediction system is obtained by continuously outputting instability vectors based on the RRC framework in the form of rolling windows, as shown in
Figure 11. The most important parameters in this system are the delay and the length of the rolling time windows. The delay is associated with the inference time spent calculating the instability vectors: the shorter the time delay, the more RRC predictions can be completed. If the process of outputting results were extremely rapid, the delay of the rolling windows could be as short as the sampling period of the sensors, i.e., 0.1 ms at 10 kHz. However, experiments on an Nvidia RTX 3080 GPU show that the time needed to output a single instability vector is about 50 ms, which is longer than in Ref. [
46] (19 ms) because the ResNet-18 neural network has more parameters. The delay of the rolling windows is thus set so that predictions are output at intervals of 500 sampling points, which is essentially the rolling gap between adjacent windows. Note that the inputs are composed of 0.3 s time series samples; therefore, the length of a rolling window is 3000 samples.
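The rolling-window inference described here can be sketched as follows; `to_recurrence` stands for the pre-treatment of Section 2.2 and is a hypothetical helper, as are the tensor shapes in the comments.

```python
import torch

WIN_LEN, HOP = 3000, 500          # 0.3 s windows at 10 kHz, 500-sample rolling gap

@torch.no_grad()
def online_predict(model, signals, to_recurrence, threshold=0.5):
    """Rolling-window on-line prediction: every HOP samples, the last WIN_LEN
    samples of the three channels (p, v, h) are pre-treated into recurrence
    matrices and fed to the frozen RRC model. `to_recurrence` is a hypothetical
    helper implementing the pre-treatment and returning a (3, H, W) tensor."""
    model.eval()
    warnings = []
    for end in range(WIN_LEN, len(signals) + 1, HOP):
        window = signals[end - WIN_LEN:end]           # shape (WIN_LEN, 3)
        r = to_recurrence(window).unsqueeze(0)        # shape (1, 3, H, W)
        probs = model(r).squeeze(0)                   # four direction probabilities
        warnings.append((probs >= threshold).int().tolist())
    return warnings                                   # one instability vector per window
```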
After determining the delay and length of rolling windows, the optimal RRC model is implemented on the complete 10.3 s time series recorded from the operation point referred to as PC1 in Ref. [
40], with a bulk velocity of 30 m/s and an equivalence ratio of 0.70, for which the instability vector is
T = [1 1 0 0]. It should be noted that continuous time series data from PC1 fed into RRC have never been used to train the neural network. If the RRC framework is accurate and robust, it can predict instability vectors correctly. With regard to the four directions, the continuous results of instability vectors are represented in
Figure 12. The threshold is still set to 0.5 and the overall results are consistent with the true instability vector. In the Up direction, all predictions are correct, with the outputs fluctuating gently around 1.0. Similarly, the predictions in the Down and Left directions are all correct as well. However, there are short periods in the Right direction with incorrect predictions. This is caused by the intermittent nature of the recorded signals when the system approaches instability at a higher bulk velocity. We simply calculate the accuracy index to quantitatively verify the results:
acc = 99.54%. The very high on-line accuracy shows that the RRC framework is robust and precise. The convincing accuracy results demonstrate that the prediction framework is able to capture dynamical features in an on-line combustion control system.
5. Conclusions
In summary, this paper proposes a new framework integrating chaotic methods and deep learning models to predict combustion instability. To the authors' knowledge, this is the first attempt in which recurrence matrices have been used in deep learning models to predict combustion instability. Specifically, the acoustic pressure, acoustic velocity and heat release rate fluctuations are transformed into recurrence matrices in the pre-treatment stage, and these matrices form the input of the RRC framework. The output of the model indicates the proximity of unstable operation points in terms of bulk velocity and equivalence ratio.
The promising prediction results achieved by the proposed RRC demonstrate the ability of deep learning to capture relevant features of impending instability. It can robustly predict unstable operation points without entering unstable regions, contributing to the monitoring and early warning of combustion instability in a combustion control system. Furthermore, this study introduces feature extraction and visualization techniques to reveal how deep learning models extract features and, as an initial step, to analyze the mechanism quantitatively. The comparison of various models suggests that recurrence matrix-based models are comparable to the state-of-the-art CNN-LSTM model and that the proposed RRC model outperforms the other models, achieving the best results.
There are several ways in which this study could be improved. First, it is hard for data-driven methods to reveal physical mechanisms. Second, the feasibility of applying the method to other combustors must be determined to enhance the robustness of the proposed framework; validation of the framework on other experiments is therefore also necessary. Furthermore, the computational complexity of the different models should be compared to provide computational information for future combustion control systems. Finally, a better comprehension of how network models predict the proximity of instability, together with more physical insight, is desired to contribute to the theoretical understanding of how combustion instability can be avoided.