1. Introduction
Lean premixed swirl flames are the main combustion mode in low-emission gas-turbine engines and aero-engines, where the swirling flow generates recirculation zones that stabilize the flame [1,2]. To reduce NOx emissions, these flames are operated at very low equivalence ratios, which makes them prone to extinction. Extinction events and vortex breakdown become more likely once the lean limit is reached, which further complicates the interaction between flow and combustion [3,4,5]. Investigating the flame structure features that characterize the flame evolution process is therefore important for understanding the dynamic behavior and recognizing the flame state. Lean blowout (LBO) is a common and actively studied problem in swirl-stabilized combustors. A flame approaching LBO becomes unstable; thus, understanding LBO flame behavior, such as the evolution of the flame structure, is extremely important for the efficient, reliable, and safe operation of gas-turbine engines and aero-engines. Recently, LBO characteristics have received widespread attention, especially flame features at near-LBO conditions. However, identifying the LBO limit is very challenging because the underlying processes are strongly coupled and unsteady [6,7]. Reliable flame features are essential for the early identification of LBO and for understanding the instability mechanism, both of which matter in the design of reliable and safe gas-turbine combustors. Vortex fragmentation can lead to flame surface rupture and local flameout, generating abundant local structural features such as flame holes. Hole features reflect, to some extent, the details of the interaction between turbulence and the flame. Several studies have reported a connection between local structure features and flame characteristics [8,9], but further research is needed to determine whether these features can be used to analyze combustion states. The generation and evolution of local structure features are crucial to the investigation of flame stability and LBO prediction methods, and such features may serve as precursors for LBO state recognition [10].
High-speed planar laser-induced fluorescence (PLIF) and chemiluminescence imaging are the techniques most frequently used to obtain spectral information on intermediate combustion species (such as CH and OH radicals). They also provide multi-dimensional dynamic information on the combustion field, enabling an in-depth analysis of combustion characteristics [11,12]. For example, high-speed chemiluminescence imaging has been developed and applied to analyze the unsteady process of swirling flame extinction. However, chemiluminescence signals are integrated along the line of sight, which makes it difficult to resolve the fine structure of the flame. In contrast, planar laser-induced fluorescence of the OH radical (OH-PLIF) is frequently used to obtain the flame location and local flame structure with high spatial resolution. Taamallah et al. [13] investigated premixed swirl flame macrostructures in different stabilization modes and, by means of the OH-PLIF technique, found a vortex structure along the inner shear layer zone. However, the evolution process could not be resolved because of the low acquisition rate of 10 Hz. Zhang et al. [14] used simultaneous 10 kHz PLIF and stereoscopic particle image velocimetry (S-PIV) to study the dynamic behavior of local features, including the precessing vortex core, and quantified their growth or decay as an instability feature of the flow. The local structure features and their effects on flame dynamics and heat-release fluctuations were also investigated by Wang et al. [15]. Skiba et al. [16] investigated the effects of large eddies on turbulent premixed flame structures using high-speed multi-species PLIF and PIV, and identified two common flow–flame events in the PLIF–PIV movies that express the interactions between turbulent structures (eddies) and premixed flame fronts. These results confirm the role of local flame structures such as holes. The details of the local flame structure are thus closely related to the degree of flow–flame interaction and can be used as an indicator of the combustion condition. Nevertheless, efficient local feature extraction methods are lacking, so very few features are available for analyzing the lean-burn extinction process of swirling flames.
Research at the intersection of energy and artificial intelligence has produced many meaningful results, for example, reconstructing PLIF images from chemiluminescence images [17], and improving the spatiotemporal resolution of flame evolution with a frame interpolation method together with an accurate prediction of the combustion state [18]. To investigate local flame structure features, researchers have gone beyond traditional descriptors such as flame geometry and intensity and have conducted extensive work on the intelligent processing of flame images and pattern recognition of combustion field images, especially by combining big data, machine learning, and artificial intelligence [19]. To improve the interpretability of a model and reduce its complexity, quantified flame structure features should be extracted and investigated instead of raw images. Many structure feature extraction and analysis methods have been developed to obtain the image features most directly related to the combustion state and to establish a mapping between these features and that state. In our recent work, flame area and moment features were used to investigate near-LBO flame dynamics, and the heat-release frequencies and dominant oscillation modes were obtained to demonstrate the oscillation characteristics of the near-LBO flame [20]. In that study, a flame hole structure originating from the turbulent flame front was found, but whether it can be used as a novel flame feature requires further research. Meanwhile, basic image features such as flame area, flame circumference, and vertical range were also used to establish a scramjet/ramjet model classifier by means of K-nearest neighbors (KNN) [21]. Other potential features need to be explored further. Hasti et al. [22] proposed a data-driven method based on a support vector machine (SVM) to identify the critical flame location of the LBO flame. The temperature and OH mass fraction from a large eddy simulation were used as training data for the machine learning model, and the flame root region was found to have significant advantages in characterizing the critical flame location. Roncancio et al. [23] combined OH-PLIF images with a convolutional neural network (CNN) model to classify the burned and unburned turbulent media; the local flame structure features, including pockets or islands, were extracted, and the computational time was reduced significantly. Seeking a new quantifiable feature is therefore highly beneficial for flame condition recognition.
In this paper, the properties of flame images are further explored. Accurately extracting flame structure features from PLIF images requires image segmentation. Although simple computer vision algorithms such as improved threshold methods can be effective, their parameters must be carefully tuned for each combustion scene, and they are not robust to scene changes. For more accurate segmentation, image matting has been applied as a representative fine-grained subject segmentation technique. For example, Xu et al. [24] first proposed a two-stage neural network architecture for tri-map-based image matting and released the Composition-1K dataset. MatteFormer [25] uses a transformer architecture and enhances the global information of the network through prior tokens. Although these methods have made great progress, they are mostly used for foreground segmentation of natural images or portraits, and, at present, no work in the image matting area can be directly used to analyze flame images. The tri-map-based segmentation method suits the scene analyzed in this paper, but it cannot be applied to flame images without modification. Existing work that applies deep learning-based computer vision to flame images mainly analyzes flame properties in natural images and lacks an understanding of the internal mechanism and interactions of the flame. It is therefore reasonable to design a framework that introduces tri-map-based segmentation into the analysis of flame images.
Most prior LBO studies have observed significant changes in flame structure features as a flame approaches LBO. However, to our knowledge, only a limited set of LBO features has been extracted and analyzed, owing to the lack of experimental data and of an efficient flame feature extraction method. In particular, few local flame structure features derived from experimental data have been reported for LBO recognition and for the investigation of combustion characteristics. In this paper, we develop a neural network-based method that extracts quantified flame structure features and establishes their correlation with the LBO state. In response to the demand for acquiring and analyzing high-resolution local flame structure information in LBO flames, this work explores a flame structure feature for LBO recognition based on high-speed OH-PLIF images. To address the gap in combining deep neural networks with available flame images, engine flame images are analyzed with a self-attention-based transformer architecture to extract the swirl structures of the flame. Furthermore, a novel spatiotemporal matching analysis framework is introduced to analyze the extracted results and to document the statistical patterns of the combustion process, providing a more in-depth understanding of flame combustion dynamics and guidance for the study of combustion patterns. Overall, the work combines flame image properties, advanced image segmentation techniques, and flame combustion mechanisms into a single framework built on computer vision and deep learning.
Based on the above issues, this work aims to develop two high-speed PLIF diagnostic modes, a burst mode (10 kHz repetition rate) and a continuous mode (1 kHz repetition rate), to obtain experimental data on LBO flame evolution. To understand the lean blowout process and further analyze the combustion instability, an efficient feature extraction and analysis method is presented, and several significant LBO evolution parameters are obtained, including the relationship between the life cycle of hole structures (from generation to disappearance) and their area, perimeter, and total number, from which the evolution time is derived. It is important to note that this study aims to extract hole features for LBO recognition rather than to predict LBO. Nevertheless, the data analysis method and flame features can provide new insights for flame blowout prediction in the future.
2. Methodology
2.1. Experiment Details
An aero-engine model combustor with an optical window was used to investigate the LBO flame in this paper. The swirl-stabilized burner was similar to the dual-swirl burner model of the German Aerospace Center (DLR), a typical research object for swirl flow structure and dynamics. The schematic of the dual-swirl burner setup has been reported in our recent work [20]. The angle of the primary swirler is 45° with a swirl number of 0.11. The angle of the secondary swirler is 60° with a swirl number of 1.79. Methane fuel is transported in three streams to the combustion chamber, and the inlet is connected to the fuel nozzle through six inclined holes, with an inner diameter of 15.6 mm and an outer diameter of 16.4 mm. By changing the equivalence ratio, two typical flame conditions are obtained: stabilization (φ = 0.4) and near-LBO (φ = 0.1). For optical diagnostics, high-speed OH-PLIF at 10 kHz and 1 kHz repetition rates was used to obtain a large amount of flame data representing the LBO flame evolution. The acquisition speed of the PLIF system refers to the image frame rate. The laser pulse width is 10 ns, which is short enough to freeze the reacting flow field. In the 1 kHz PLIF mode, the CMOS array was 1856 × 970 pixels, corresponding to a 70 mm × 50 mm imaging field of view, giving a spatial resolution of about 67 μm per pixel. In the 10 kHz PLIF mode, the CMOS array was 1000 × 1000 pixels, and the spatial resolution was about 75 μm per pixel.
For the extraction of structure features, the hole structure extraction and analysis model was trained and validated on 10 kHz and 1 kHz PLIF image data, respectively. Two OH-PLIF working modes were used in the present study: a short-duration burst mode and a continuous mode. OH fluorescence was generated by exciting the Q1(8) transition in the (1,0) band of the A²Σ⁺–X²Π system at 283.553 nm with a frequency-doubled dye laser (Sirah Credo, with Rhodamine 6G) pumped by an in-house developed high-speed Nd:YAG laser. The ultraviolet pulse energy used to excite the OH radicals was 1.5 mJ at 1 kHz and 1.8 mJ at 10 kHz. The laser sheet thickness was nearly 200 μm to allow high-spatial-resolution measurement of the flame structure. Fluorescence was collected with a high-speed intensified CMOS camera fitted with a combination of Semrock 315 nm/15 nm and Schott UG11 filters. On the one hand, the 10 kHz burst-mode PLIF technique was used to analyze local hole development on small time scales (~millisecond). In this mode, a single burst train contains 30 pulses with a minimum pulse interval of 100 μs, so one burst spans about 3 ms. The fine hole structure evolution obtained in this way was used to train the feature extraction model. On the other hand, the continuous 1 kHz OH-PLIF mode was used to obtain a large number of images of a continuous flame evolution process, for example, 3000 pairs of images to validate the deep neural network segmentation method. Furthermore, a statistical analysis of the structure features was adopted to assess their relationship with the combustion condition, i.e., stabilization or near-LBO.
In this paper, a data model for the extraction and analysis of swirl structure features is described in detail. As a continuation of our research on LBO flame characteristics, this article aims to deepen the understanding of the near-LBO condition and to provide an effective structure feature parameter and model for LBO recognition and online prediction.
2.2. Neural Network for Feature Extraction
The framework proposed in this paper is shown in Figure 1. For each input flame image, a tri-map is first generated. The flame image and its tri-map are then sent to Flame-MatteFormer, a neural network that performs tri-map-based segmentation, to extract the main contour of the flame. A series of post-processing steps follows: main contour correction, edge compensation, global instance extraction, and instance feature analysis. These steps reduce the noise of the main contour, extract all closed contours as the hole instances of a single image, and analyze their features. Finally, a spatio-temporal matching and analysis algorithm is applied to the hole extraction results of all frames. It yields the number of holes in each frame, the total number of holes, and the life cycle of each hole, from which the statistical results over time are obtained.
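The spatio-temporal matching step can be illustrated with a minimal sketch. It assumes that the holes of each frame have already been reduced to centroids and links each tracked hole to the nearest unmatched hole of the next frame within a distance gate. The names `track_holes`, `lifetimes`, and `max_dist`, as well as the greedy nearest-centroid rule itself, are illustrative assumptions and not necessarily the matching algorithm used in this work.

```python
import math

def track_holes(frames, max_dist=5.0):
    """Greedy nearest-centroid linking of holes across consecutive frames.

    frames: list of per-frame lists of (x, y) hole centroids.
    Returns a list of tracks; each track is a list of (frame_index, (x, y)).
    NOTE: the linking rule and max_dist are illustrative assumptions.
    """
    tracks = []   # every track found so far
    active = []   # indices of tracks that were extended in the previous frame
    for t, holes in enumerate(frames):
        unused = set(range(len(holes)))
        next_active = []
        for ti in active:
            _, (px, py) = tracks[ti][-1]
            best, best_d = None, max_dist
            for hi in sorted(unused):
                d = math.hypot(holes[hi][0] - px, holes[hi][1] - py)
                if d <= best_d:
                    best, best_d = hi, d
            if best is not None:          # the hole survives into frame t
                tracks[ti].append((t, holes[best]))
                unused.discard(best)
                next_active.append(ti)
        for hi in sorted(unused):         # unmatched holes start new tracks
            tracks.append([(t, holes[hi])])
            next_active.append(len(tracks) - 1)
        active = next_active
    return tracks

def lifetimes(tracks, frame_interval):
    """Life cycle of each hole: number of frames observed times the frame interval."""
    return [len(track) * frame_interval for track in tracks]

# Three frames: one hole drifts slowly, a second one appears for a single frame.
frames = [
    [(10.0, 10.0)],
    [(11.0, 10.5), (40.0, 40.0)],
    [(12.0, 11.0)],
]
tracks = track_holes(frames)
life = lifetimes(tracks, frame_interval=1e-4)  # 100 us between 10 kHz frames
```

Per-hole area and perimeter could be carried along in each track entry in the same way, which is what a statistical analysis of life cycle against hole size would need.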
For the extraction of the main contour in the swirling flame, a neural network tri-map-based segmentation method is introduced. The tri-map is a concept originally introduced in the field of image matting [26,27,28], where the goal is to better segment the foreground from various backgrounds with priors. The tri-map is obtained by setting the thresholds for the background and the flame to relatively extreme values, while regions with other values are classified as uncertain. With this definition, the network’s learning objective is to generate the correct foreground segmentation map given an input image $I$ and a tri-map $T$. In this paper, we adopt the concept of the tri-map and modify its definition to better fit the characteristics of flame images. The tri-map $T_i$ for the $i$-th flame image $I_i$ can be obtained by

$$T_i(x,y)=\begin{cases}\text{background}, & I_i(x,y)\le \tau_b,\\ \text{flame}, & I_i(x,y)\ge \tau_f,\\ \text{transition}, & \text{otherwise},\end{cases}$$

where $\tau_b$ and $\tau_f$ are threshold values for the determined background and the flame. We can set the thresholds to be strict so that most areas are classified as transition areas. The neural network then learns to combine brightness and spatial information to automatically resolve these large transition areas. Compared with threshold methods that eliminate spatial information through statistics, the tri-map retains spatial relationships, allowing the network to determine flame or hole locations from brightness and the surrounding context. Moreover, the interior of a hole in the flame image is sometimes bright and sometimes dark, so a fixed threshold causes confusion: brighter holes are easily assigned to the foreground, and vice versa. In this paper, we modified the deep image matting method MatteFormer [25] to meet the requirements of segmenting flame images, and we call the modified network Flame-MatteFormer.
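A minimal sketch of the tri-map generation described above is given here. The function name `make_trimap`, the label values 0/128/255, and the example thresholds are assumptions made for illustration; the text only states that two strict thresholds separate determined background, determined flame, and a transition region.

```python
import numpy as np

# Label values are an assumption; any three distinct codes would do.
BACKGROUND, TRANSITION, FLAME = 0, 128, 255

def make_trimap(image, t_bg, t_fl):
    """Label pixels as background (<= t_bg), flame (>= t_fl) or transition."""
    if t_bg >= t_fl:
        raise ValueError("t_bg must be smaller than t_fl")
    image = np.asarray(image)
    trimap = np.full(image.shape, TRANSITION, dtype=np.uint8)
    trimap[image <= t_bg] = BACKGROUND
    trimap[image >= t_fl] = FLAME
    return trimap

# Toy 3 x 3 intensity patch with strict (hypothetical) thresholds.
img = np.array([[5, 60, 250],
                [20, 130, 240],
                [0, 90, 200]], dtype=np.uint8)
tri = make_trimap(img, t_bg=10, t_fl=235)
```

With strict thresholds, as in this toy patch, most pixels end up in the transition class, which is the part the network is then asked to resolve.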
MatteFormer adapts the popular transformer architecture [29] to foreground segmentation by enhancing contextual modeling through self-attention, which outperforms traditional convolutional neural network-based methods in terms of global perception. Meanwhile, its window-based processing preserves sufficient local information. MatteFormer with self-attention is therefore well suited to the joint processing of global and local information, which is crucial for segmenting flame contours.
The self-attention can be computed as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,$$

where $Q$, $K$, and $V$ are obtained by transforming the input $X$ with learnable parameters, and $d_k$ is the dimension of the keys. The key to the attention mechanism is to generate weights based on the inputs and then use them to weight the values. Self-attention differs in that the weights and the values are generated from the same input.
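As an illustration, standard scaled dot-product self-attention can be written in a few lines of NumPy. The projection matrices `Wq`, `Wk`, and `Wv` stand in for the learnable parameters and are filled with random values here; the shapes are arbitrary assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Queries, keys and values all come from the same input X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # one row of weights per token
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, 8 features each
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each row of `weights` sums to one, so every output token is a convex combination of the value vectors.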
MatteFormer first defines the prior-token, which represents the global context feature of a tri-map region. In Flame-MatteFormer, it carries global information about the flame, background, or transition region, such as its area. The first prior-token of the network corresponds to the number of pixels in the flame, background, or transition region. These prior-tokens are used as global priors and participate in the attention mechanism of each basic module. The prior-tokens are generated as follows:

$$p_k=\frac{1}{N_k}\sum_{i\in\Omega} r_i^{k}\,z_i,$$

where $k$ is one of the foreground (flame), background, and transition regions, $N_k$ is the area of that region, $\Omega$ is the total space, $r_i^{k}$ indicates whether position $i$ is part of $k$, and $z_i$ is the density at position $i$.
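The prior-token computation amounts to a masked average of the features over each tri-map region. The sketch below assumes a feature map of shape (H, W, C) and a tri-map that uses the hypothetical label codes 0, 128, and 255; the function name `prior_tokens` and the zero vector returned for an empty region are illustrative choices.

```python
import numpy as np

def prior_tokens(features, trimap, labels=(0, 128, 255)):
    """Average the features over each tri-map region.

    features: (H, W, C) array; trimap: (H, W) array of region labels.
    Returns a dict mapping each label to a (C,) token.
    """
    tokens = {}
    for k in labels:
        mask = trimap == k            # indicator of region k
        n_k = int(mask.sum())         # area of region k in pixels
        if n_k == 0:
            tokens[k] = np.zeros(features.shape[-1])  # assumption: empty region gives zeros
        else:
            tokens[k] = features[mask].sum(axis=0) / n_k
    return tokens

features = np.arange(8, dtype=float).reshape(2, 2, 2)
trimap = np.array([[0, 0],
                   [128, 255]])
tokens = prior_tokens(features, trimap)
```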
Secondly, MatteFormer consists of Prior-Attentive Swin Transformer (PAST) blocks, which are based on Swin Transformer blocks [30] but introduce the Prior-Attentive Window Self-Attention (PA-WSA) layer. In this layer, self-attention is computed not only with spatial tokens but also with prior-tokens, so each basic module considers both the local spatial information and the global statistical information of the flame. The expression of PA-WSA is as follows:

$$\mathrm{PA\text{-}WSA}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}+B\right)V,$$

where the keys $K$ and values $V$ are computed from both the spatial tokens of the window and the prior-tokens, and $\sqrt{d_k}$ is the proportion factor, which is the same as the $\sqrt{d_k}$ in the original attention equation. The modification in this paper is the addition of the positional offset $B$, which is used to adjust the self-attention further. In addition, MatteFormer introduces prior-memory, which stores the prior-tokens generated by each block. This allows a block to reuse the tokens generated by the previous blocks, which strengthens the global information by reminding the network of the statistics of known regions. The working mechanism of this part is further illustrated in Figure 2.
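The idea of attending to spatial tokens and prior-tokens at the same time can be sketched as follows. Appending the prior-token keys and values to those of the window, and adding a bias of matching shape to the logits, are assumptions made for illustration; the function name `pa_wsa_sketch` and all shapes are hypothetical, and multi-head splitting and prior-memory are left out.

```python
import numpy as np

def pa_wsa_sketch(Q, K, V, K_prior, V_prior, bias=None):
    """Window attention whose keys/values are extended with prior-tokens.

    Q, K, V: (N, d) projections of the N spatial tokens in one window.
    K_prior, V_prior: (P, d) projections of the P prior-tokens.
    bias: optional (N, N + P) positional offset added to the logits.
    """
    K_all = np.concatenate([K, K_prior], axis=0)
    V_all = np.concatenate([V, V_prior], axis=0)
    logits = Q @ K_all.T / np.sqrt(Q.shape[-1])
    if bias is not None:
        logits = logits + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V_all, weights

rng = np.random.default_rng(1)
N, P, d = 4, 3, 8                      # 4 window tokens, 3 prior-tokens
Q, K, V = (rng.normal(size=(N, d)) for _ in range(3))
K_p, V_p = (rng.normal(size=(P, d)) for _ in range(2))
B = rng.normal(size=(N, N + P))
out, weights = pa_wsa_sketch(Q, K, V, K_p, V_p, bias=B)
```

The last `P` columns of `weights` show how much each spatial token draws on the global region statistics rather than on its neighbors in the window.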
Finally, the extracted main contours are further refined through four post-processing steps. The first step is main contour correction, which applies closing and opening operations with a 5 × 5 morphological kernel to the extracted contour to eliminate small holes and spurious connections. The second step is edge compensation, which fills in gaps in the flame on both sides by using a fixed brightness ratio to detect potential holes and seal their outlines for better analysis. The third step is global instance extraction, which uses the Suzuki algorithm [31]; as shown in Figure 1, all significant contours of the flame are extracted. The last step is instance feature analysis, which differentiates noise, flames, and holes. Noise is first separated according to the size of the area. Flames and holes are then separated by the average brightness within the area: if the average brightness is low, the region is considered a hole; otherwise, it is considered a flame.
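The contour correction and instance feature analysis steps can be sketched with NumPy alone. This is a simplified stand-in, not the pipeline of this work: regions are found by a 4-connected flood fill instead of Suzuki contour tracing, edge compensation is omitted, and the thresholds `min_area` and `hole_brightness` as well as all function names are hypothetical. In practice, OpenCV's `cv2.morphologyEx` and `cv2.findContours` would be the usual tools.

```python
from collections import deque

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _morph(mask, k, reduce_fn, pad_value):
    """Apply a k x k square structuring element with the given reduction."""
    padded = np.pad(mask, k // 2, mode="constant", constant_values=pad_value)
    return reduce_fn(sliding_window_view(padded, (k, k)), axis=(-2, -1))


def close_then_open(mask, k=5):
    """Main contour correction: closing fills small gaps, opening drops specks."""
    dilate = lambda m: _morph(m, k, np.any, False)
    erode = lambda m: _morph(m, k, np.all, True)
    closed = erode(dilate(mask))
    return dilate(erode(closed))


def connected_regions(mask):
    """4-connected components of a boolean mask, as a list of boolean masks."""
    seen = np.zeros(mask.shape, dtype=bool)
    h, w = mask.shape
    regions = []
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        region = np.zeros(mask.shape, dtype=bool)
        seen[sy, sx] = region[sy, sx] = True
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = region[ny, nx] = True
                    queue.append((ny, nx))
        regions.append(region)
    return regions


def touches_border(region):
    return bool(region[0].any() or region[-1].any()
                or region[:, 0].any() or region[:, -1].any())


def analyse_instances(image, mask, min_area=30, hole_brightness=50.0):
    """Classify foreground regions and enclosed dark regions as noise, flame or hole."""
    candidates = connected_regions(mask)
    candidates += [r for r in connected_regions(~mask) if not touches_border(r)]
    results = []
    for region in candidates:
        area = int(region.sum())
        if area < min_area:
            kind = "noise"                      # too small to be meaningful
        elif image[region].mean() < hole_brightness:
            kind = "hole"                       # dark enclosed region
        else:
            kind = "flame"
        results.append((kind, area))
    return results


image = np.zeros((40, 40))
image[4:26, 4:26] = 200.0      # flame body
image[10:17, 10:17] = 10.0     # 7 x 7 hole inside the flame
image[21, 21] = 10.0           # pinhole, filled by the closing step
image[32:37, 32:37] = 200.0    # small detached blob, rejected by area
image[0, 39] = 200.0           # single-pixel speck, removed by the opening step
cleaned = close_then_open(image > 100.0)
results = analyse_instances(image, cleaned)
```

In this toy image, the closing step fills the pinhole, the opening step removes the isolated speck, and the remaining regions are labeled by area and mean brightness.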
Overall, the proposed Flame-MatteFormer first divides the input flame image into 16 patches; each patch is processed and projected separately, and the projected features are flattened into a sequence for subsequent processing. Each stage of the network consists of patch embedding or patch merging followed by several PAST blocks. Patch embedding applies a feature projection to the sequence features, while patch merging rearranges the sequence features to increase the feature dimension and reduce the size of the feature maps. The sequence features are then fed into the PAST blocks described above. At the end of each stage, the sequence features are restored to a spatial representation similar to a feature map; thus, the feature maps obtained with the transformer can be used for fine-grained spatial tasks. Finally, the output feature maps of the different stages are collected and sent to the decoder, a convolutional neural network that recovers the foreground segmentation result from the input feature maps. It combines convolution blocks with residual connections and nearest-neighbor upsampling, gradually restoring the features to the size of the original image.
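The rearrangements behind patch embedding and patch merging can be sketched with plain array reshapes. The patch size of 4, the image size, and the function names are assumptions for illustration, and the learnable linear projections that normally follow each rearrangement are omitted.

```python
import numpy as np

def patch_partition(image, patch=4):
    """Split an (H, W, C) image into non-overlapping patch x patch tokens."""
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)               # group the pixels of each patch
    return x.reshape(h // patch, w // patch, patch * patch * c)

def patch_merging(tokens):
    """Concatenate each 2 x 2 neighborhood: halves H and W, multiplies channels by 4."""
    h, w, c = tokens.shape
    if h % 2 or w % 2:
        raise ValueError("token grid must have even height and width")
    x = tokens.reshape(h // 2, 2, w // 2, 2, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h // 2, w // 2, 4 * c)

image = np.arange(16 * 16, dtype=float).reshape(16, 16, 1)
tokens = patch_partition(image, patch=4)   # (4, 4, 16): one token per 4 x 4 patch
merged = patch_merging(tokens)             # (2, 2, 64): coarser grid, wider features
```

Merging trades spatial resolution for feature dimension, which is how successive stages build a feature pyramid that the decoder can later upsample.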
For training, the network is initialized with the MatteFormer weights trained on the Composition-1K dataset and then fine-tuned on the collected LBO images. A total of 2000 LBO images are collected, including 1000 in a stable state and 1000 in an unstable state, of which 700 are used for training and 300 for testing. During fine-tuning, the batch size is set to 20, the learning rate is initialized to 0.0002, the Adam optimizer is used, and training runs for 20k iterations. The best result on the test set is a mean squared error (MSE) of 0.021.
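For reference, the fine-tuning settings listed above can be collected in one place, together with a simple MSE metric. The dictionary keys and the function name `mse` are hypothetical, and whether the reported MSE is evaluated over the whole image or only over the transition region is not stated, so the optional `mask` argument is an assumption.

```python
import numpy as np

# Settings taken from the text; the key names are illustrative.
FINE_TUNE_CONFIG = {
    "init_weights": "MatteFormer trained on Composition-1K",
    "optimizer": "Adam",
    "batch_size": 20,
    "learning_rate": 2e-4,
    "iterations": 20_000,
}

def mse(pred, target, mask=None):
    """Mean squared error between two maps, optionally restricted to a mask."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    squared = (pred - target) ** 2
    if mask is None:
        return float(squared.mean())
    return float(squared[np.asarray(mask, dtype=bool)].mean())

pred = np.array([[0.0, 0.5], [1.0, 1.0]])
target = np.array([[0.0, 1.0], [1.0, 0.0]])
full_error = mse(pred, target)
top_row_error = mse(pred, target, mask=[[True, True], [False, False]])
```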