1. Introduction
Pipelines are among the most efficient and economical means of transporting gases and liquids. While alternatives such as road tankers, rail cars, LNG ships, and compressed gas cylinders exist, they are often constrained by limited capacity, higher operational costs, and increased safety risks. Pipelines, in contrast, offer greater efficiency, cost-effectiveness, and safety. They can handle large-scale, continuous transportation over long distances with minimal environmental impact, making them the preferred choice in modern infrastructure for oil and gas transportation [1]. Despite their advantages, pipelines are subject to leaks caused by corrosion, earthquakes, mechanical cracks, environmental factors, and material defects [2]. These leaks can have severe consequences, including pollution, public safety hazards, economic losses, and resource wastage [3,4]. Alarmingly, 46% of pipeline leak incidents globally result in casualties. For instance, a diesel pipeline leak in Guizhou, China, in 2020 resulted in a 1.5 million RMB loss and widespread environmental pollution [5]. Similarly, a petroleum pipeline explosion in Hidalgo, Mexico, caused by a leak, led to over 120 deaths and numerous injuries [6]. These examples underscore the urgent need for effective leak-detection systems [7,8].
Leak detection strategies have evolved over the decades. Traditional methods such as visual inspection, pressure monitoring, and acoustic-based methods were initially employed. However, these techniques were limited by delayed detection, operator dependency, and the inability to pinpoint leaks in real time. These challenges led to the adoption of advanced approaches that use acoustic emission (AE) signals and machine learning and that focus on real-time leak detection and precise localization. Real-time leak detection systems continuously monitor pipelines for signals such as pressure fluctuations, acoustic anomalies, or vibration changes [9]. These systems enable immediate detection and corrective action, significantly reducing the environmental, economic, and safety impacts of leaks. By eliminating delays and ensuring continuous surveillance, real-time systems enhance pipeline reliability and operational efficiency. The pipeline industry now focuses on cost-effective methods for repairing minor leaks, such as fitting encapsulating collars and clamps rather than replacing entire sections [10,11]. Thus, intelligent leak detection methods are necessary for minimizing maintenance costs [12,13]. Recent advancements in machine learning and artificial intelligence have greatly improved the efficiency and accuracy of pipeline leak detection [14,15]. Techniques such as vibration-based methods, pressure wave techniques, time-domain reflection methods, and AE technology have been developed to ensure pipeline leak detection reliability [16,17]. AE technology is particularly notable for its high sensitivity, easy installation, and real-time leak-detection capabilities [18]. Various studies have demonstrated the potential of AE technology in this domain, including its use for detecting the onset of cracks in pipelines [19,20].
Related Research Work
A pipeline leak releases elastic energy, producing AE events. These events are recorded as AE hits (AEHs) by AE sensors on the pipeline’s surface. The variations in the AE signal caused by these AEHs are essential for detecting leaks [21]. Researchers have concentrated on pattern recognition and feature extraction techniques to make effective use of these signal variations [22]. Techniques for detecting pipeline leaks using intelligent pattern recognition can be classified into three main types according to the domain of the features: frequency domain (FD), time domain (TD), and time-frequency domain (TFD). Wang et al. [23] used principal component analysis and pre-processed AE signals to obtain low-dimensional discriminant features before using TD statistical features for leak identification. An artificial neural network (ANN) was applied in conjunction with the amplitude of the TD AE signal to detect leaks [24]. AE signals are complicated and non-stationary, whereas FD analysis is better suited to stationary signals. Thus, to extract useful features from non-stationary AE data, TFD techniques such as the wavelet transform and empirical mode decomposition are used [25]. Combined with pattern recognition methods such as support vector data description (SVDD), ANN, and fuzzy-SVDD, these features assist in the identification of pipeline health issues. However, TFD preprocessing involves significant computational costs, and choosing the appropriate base wavelet for wavelet transforms requires experimental validation [26].
AEH characteristics have demonstrated good results compared to traditional characteristics extracted from AE signals in the FD, TD, and TFD for various approaches [27]. However, AEH features have limitations because AEHs arise from multiple sources, such as fluid pressure, background noise, increased vibrations, and leaks, which reduces their sensitivity to leak identification [28]. Extracting leak-related characteristics from the AE signal is the first step in a pipeline leak diagnosis system. Deep learning (DL) techniques have an advantage over typical machine learning techniques in their ability to analyze complex data [29]. For pattern recognition tasks, DL techniques independently extract discriminative, meaningful information from complex data. Convolutional neural networks (CNNs), neural auto-encoders, recurrent neural networks, and deep belief networks are the most popular DL techniques for defect identification. CNNs use local receptive fields, share weights within the network, and use spatial-domain subsampling to reduce the risk of overfitting and keep computational complexity low. Moreover, CNNs have shown promising pattern recognition performance in pipeline, centrifugal pump, and bearing failure diagnosis [30,31].
Various studies have demonstrated the potential of AE technology in leak detection and have developed methods for detecting cracks and other anomalies in pipelines. Techniques such as principal component analysis, neural networks, and wavelet transforms have been explored for feature extraction and pattern recognition. Deep learning has gained significant attention for its ability to solve complex problems in various domains. For example, CNN-based transfer learning models have been successfully applied to classify microseismic event waveforms [32,33]. These studies highlight the effectiveness of transfer learning in handling data scarcity, which can be adapted to pipeline leak detection scenarios with limited leak data by utilizing pre-trained models. A hybrid deep learning and transfer learning approach for aerosol retrieval [34] showcased the potential of integrating domain-specific preprocessing with deep learning frameworks. Similarly, our work incorporates domain-specific preprocessing, namely continuous wavelet transform (CWT) scalograms and Gaussian filtering, to enhance signal clarity and feature extraction. Attention U-Net architectures for self-potential inversion tasks [35] demonstrated how attention mechanisms can enhance model focus on critical data regions. This inspires potential future extensions of our pipeline monitoring framework to include attention layers for focusing on key time-frequency regions of AE scalograms.
CNNs excel at extracting spatial features from data, but they do not capture temporal dependencies effectively. This limitation affects the ability to fully recognize patterns in continuous sequential data, such as AE signals in pipeline diagnostics. To address this issue, the CNN is combined with a long short-term memory (LSTM) network. LSTMs are specifically designed to capture long-range dependencies and temporal sequences in data, making them well suited to analyzing time-series information. By combining CNN with LSTM, the feature extraction process is enhanced, using CNN for spatial pattern recognition and LSTM for temporal pattern recognition. This hybrid approach enhances the precision and efficiency of leak detection in pipelines, providing a more comprehensive analysis compared to using CNN alone. The latent spaces derived from the combined CNN and LSTM models are further optimized using a genetic algorithm (GA) to accurately assess the health state of the pipeline.
The key contributions of this study are outlined below.
A Gaussian filter is used to enhance variations in color intensity due to energy change across different scales and frequencies in AE scalograms.
A deep learning framework is introduced, combining enhanced AE scalograms with CNN-LSTM models and a genetic algorithm (GA) for feature optimization, with the goal of improving the identification of pipeline operating conditions.
The proposed approach is validated using real pipeline data, showcasing its effectiveness across varying leak scenarios, fluid types, and pressure conditions.
The remainder of this work is organized as follows: Section 2 outlines the proposed model and the sequential DL models for leak identification. The experimental setup is discussed in Section 3. Section 4 comprises the results, while Section 5 consists of the discussion. Finally, the conclusion and future directions of this study are described in Section 6.
2. Proposed Methodology
The proposed approach begins with the collection of AE signals from the pipeline, followed by a series of preprocessing, feature extraction, and classification steps, ultimately leading to an accurate assessment of the pipeline’s health condition. This comprehensive workflow ensures that critical signal characteristics are preserved and analyzed effectively.
Figure 1 visually represents the step-by-step flow of the proposed methodology, providing a clear overview of the processes involved. The detailed steps are outlined below:
Step I: From the pipeline, the AE signals are collected during both leak and non-leak conditions.
Step II: TD AE signals are converted into images utilizing the CWT. These images use various colors to represent different energy intensities, illustrating how energy levels vary across different time and frequency ranges.
Step III: CWT images are pre-processed using a Gaussian filter to smooth and reduce noise in the scalogram images. This filtering enhances the clarity of the key features, making it easier to identify potential leaks.
Step IV: To extract detailed spatial and temporal features from the enhanced scalograms, a hybrid model combining CNN and LSTM was used. The CNN component effectively captures spatial features, such as changes in energy at specific levels, including variations in AE amplitude. Meanwhile, the LSTM component is adept at capturing temporal dependencies, providing information about the sequence of events in the AE signal, such as AE intensity over time and frequency distribution. These extracted characteristics are then used for further analysis to determine the distinct characteristics of pipeline leaks.
Step V: A feature vector is generated by integrating spatial and temporal features extracted from enhanced scalogram images using a hybrid CNN-LSTM model. These extracted features are passed to a genetic algorithm, which selects the most important ones. Based on these refined features, pipeline leaks are detected, and the health state of the pipeline is determined using a fully connected layer.
2.1. Continuous Wavelet Transform
CWT is a mathematical tool used to examine non-stationary signals by decomposing them into time-frequency components. It provides a representation of the signal at multiple scales and resolutions, making it particularly useful for detecting transient features. The CWT uses a source wavelet function to transform a TD signal into the time-frequency domain. This source, often known as the “mother” wavelet, is typically a brief, oscillating time-domain signal. The decomposition process starts with the mother wavelet and divides the complex signal into coefficients that are scaled and translated according to certain parameters [36]. The CWT is computed by integrating the signal against scaled and shifted copies of the wavelet function across the entire time domain. The CWT of a signal X(t) is defined mathematically as follows:
$$W(s,\tau) = \frac{1}{\sqrt{|s|}} \int_{-\infty}^{\infty} X(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt$$
where ψ is the mother wavelet and * denotes complex conjugation, the wavelet’s center frequency and window length are determined by the scale s, and its position in the time domain is indicated by τ. Smaller scales correspond to higher frequencies and reveal finer detail, while larger scales correspond to lower frequencies and provide general information about the signal.
The analytic Morlet wavelet (“amor”) was selected as the basis function for CWT due to its ability to balance time and frequency resolution effectively [37]. Its Gaussian envelope minimizes spectral leakage, which is essential for analyzing transient AE signals. This choice ensures precise localization of leak-related features in the time-frequency domain. The frequency range of interest was set to [1 Hz, 500 kHz] to cover all relevant spectral components of AE signals. By defining this range, the CWT algorithm determines the scales automatically, removing the need to explicitly set decomposition levels and ensuring a suitable representation of the AE signal features.
CWT generates a 2D transformation matrix when it is applied to real AE data from a pipeline. Each row of this matrix corresponds to a scale, and each column corresponds to a translation (time position) of the pipeline AE signal. These two-dimensional matrices can be displayed as AE scalograms, which are images whose color intensities represent the wavelet energy levels across time and frequency and thus reflect variations in the pipeline’s operating conditions.
Different energy regions that correlate to variations in pipeline conditions are clearly seen in the AE images. Figure 2a,b show scalogram images of pipelines working under normal and defective conditions, respectively.
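The transform described in this subsection can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `morlet_cwt`, the wavelet parameter `w0 = 6`, the toy 100 Hz test tone, and the 20 to 200 Hz frequency grid are all hypothetical and are not the settings used in this study, which relied on the "amor" wavelet over [1 Hz, 500 kHz].

```python
import numpy as np

def morlet_cwt(x, fs, freqs, w0=6.0):
    """Direct CWT with a complex Morlet wavelet (illustrative, not optimized)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    coeffs = np.empty((len(freqs), n), dtype=complex)
    for k, f in enumerate(freqs):
        s = w0 / (2.0 * np.pi * f)               # scale whose center frequency is f
        half = int(np.ceil(4.0 * s * fs))        # keep +/- 4 scale widths of the wavelet
        half = min(half, (n - 1) // 2)           # kernel must not be longer than the signal
        t = np.arange(-half, half + 1) / fs
        psi = np.pi ** -0.25 * np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        # W(s, tau) = 1/sqrt(s) * integral of x(t) * conj(psi((t - tau)/s)) dt,
        # written as a convolution with the time-reversed conjugate wavelet.
        kernel = np.conj(psi)[::-1] / np.sqrt(s)
        coeffs[k] = np.convolve(x, kernel, mode="same") / fs
    return coeffs

# Toy signal (assumption): a 100 Hz tone sampled at 1 kHz for one second.
fs = 1000.0
t = np.arange(0, 1.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 100.0 * t)
freqs = np.linspace(20.0, 200.0, 37)

# Scalogram: rows are scales (labelled by frequency), columns are time positions.
scalogram = np.abs(morlet_cwt(signal, fs, freqs)) ** 2
peak_freq = freqs[scalogram.mean(axis=1).argmax()]
```

Each row of `scalogram` holds the wavelet energy at one scale and each column corresponds to one time position, which is the matrix that is rendered as a color image.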
2.2. Gaussian Filter
A Gaussian filter uses a Gaussian function, which is defined by its mean and standard deviation and has a bell-like shape. The amount of smoothing applied to the CWT images can be modified by varying the standard deviation [32]. In this study, a Gaussian filter was used to increase the quality of CWT images by minimizing noise and smoothing the images.
The 2D digital Gaussian filter can be mathematically represented as:
$$G(x,y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$
where σ represents the standard deviation of the Gaussian filter (σ² is its variance). The filter kernel size, typically ranging from −1 to 1 for both x and y, is chosen by excluding values less than five percent of the kernel’s maximum value [38].
This filter works by convolving the CWT image with a Gaussian function, giving more weight to nearby pixel values and less to those farther away. This process effectively minimizes minor, random variations in pixel energy while maintaining the essential features and overall shape of the image. By tuning the standard deviation, a suitable amount of smoothing can be applied, making the images clearer and easier to analyze.
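As a minimal sketch of this filtering step, the snippet below builds a truncated 2D Gaussian kernel and applies it to an image. The function names, the choice σ = 1, the edge padding, and the random test image are assumptions made for illustration; only the five-percent truncation rule comes from the text.

```python
import numpy as np

def gaussian_kernel(sigma, cutoff=0.05):
    """2D Gaussian kernel, dropping taps below `cutoff` times the peak value."""
    # Largest integer radius at which the 1D profile is still >= cutoff.
    radius = int(np.floor(sigma * np.sqrt(-2.0 * np.log(cutoff))))
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))  # peak value is 1 here
    g[g < cutoff] = 0.0                                     # five-percent rule
    return g / g.sum()                                      # weights sum to 1

def smooth(image, sigma):
    """Convolve a 2D image with the kernel, replicating edge pixels (assumption)."""
    k = gaussian_kernel(sigma)
    r = k.shape[0] // 2
    padded = np.pad(image, r, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out

# Hypothetical noisy "scalogram" used only to show the smoothing effect.
rng = np.random.default_rng(0)
noisy = rng.normal(size=(32, 32))
smoothed = smooth(noisy, sigma=1.0)
```

Increasing `sigma` widens the kernel and strengthens the smoothing, at the cost of blurring fine time-frequency detail.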
Figure 3 shows the CWT scalograms of normal and leak images after applying a Gaussian filter.
This method utilized the Gaussian filter’s capability to maintain the key features of CWT images while reducing noise, thereby improving the visual quality and reliability of the images for subsequent processing and analysis. This approach is known for its effectiveness in noise reduction and image enhancement, making it a valuable tool in the initial stages of CWT image analysis.
2.3. CNN and LSTM Hybrid Structure
In this study, a hybrid model that integrates CNN and LSTM networks to extract features from pipeline data was used. This section outlines the individual components of CNN and LSTM and then describes their combined structure, specifically designed for the feature extraction task.
2.3.1. Convolutional Neural Network
A CNN is a specialized deep learning model designed primarily for image processing tasks, as illustrated in Figure 4. It efficiently captures spatial hierarchies of features by applying a series of convolutional layers, which use learnable filters to detect local patterns such as edges, textures, and shapes within the input data. This automated feature extraction process allows CNNs to identify intricate relationships in image data, making them highly effective for tasks involving visual pattern recognition and classification. A CNN is primarily composed of convolutional, pooling, and activation layers. The central component is the convolutional layer, which performs the convolution operation essentially as an inner product between sections of the input data and a filter matrix. This process is important for extracting feature information from the input data. By utilizing different convolutional kernels, a variety of features can be extracted, with the size of these kernels playing a significant role in determining the features that are captured. The mathematical representation of the convolution operation is as follows:
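$$x_{j}^{\gamma} = f\!\left(\sum_{i=1}^{M} x_{i}^{\gamma-1} * w_{ij}^{\gamma} + b_{j}^{\gamma}\right)$$
where γ denotes the current layer, $x_{j}^{\gamma}$ is the $j$-th eigenmatrix of the current layer, f(∙) is the activation function, $x_{i}^{\gamma-1}$ is the data element of the $(\gamma-1)$-th layer, $M$ is the number of kernels, $w_{ij}^{\gamma}$ is the weight matrix of the corresponding convolution kernel, and $b_{j}^{\gamma}$ is the bias matrix.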
The CNN architecture for the proposed model is described in Table 1. The pooling layer performs downsampling by sliding a window over the input data and selecting the maximum or average value within that window. The activation function used in this model is the rectified linear unit (ReLU), which is defined as:
$$f(x) = \max(0, x)$$
The ReLU activation function is applied to the output of each convolutional layer in a CNN. It is a non-linear function that allows all positive inputs to pass unchanged while mapping negative inputs to zero. This activation introduces essential non-linearity into the model, enabling it to learn complex patterns and relationships in the data. Additionally, ReLU helps to mitigate the vanishing gradient problem, a common issue with other activation functions like Sigmoid or Tanh, thereby improving the training efficiency and convergence speed of deep neural networks.
The CNN architecture was designed with a focus on computational efficiency. Convolutional layers with small kernel sizes (3 × 3) were used to effectively extract spatial features while minimizing computational costs. Max-pooling layers (2 × 2) were employed to reduce feature map dimensions progressively, significantly lowering computational overhead. The numbers of filters (32, 64, and 128) were chosen based on empirical experiments to balance feature extraction capability and computational complexity. The activation functions, ReLU for the CNN and tanh for the LSTM, were selected for their effectiveness in capturing non-linear relationships and temporal dependencies, respectively. Training was conducted with mini-batch processing, utilizing GPU acceleration and early stopping to optimize training speed.
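To make the three building blocks concrete, the following sketch applies one 3 × 3 filter, ReLU, and 2 × 2 max pooling to a small array. It is a single-channel toy under stated assumptions: the helper names, the 6 × 6 input, and the hand-picked identity kernel are hypothetical, and "valid" (no padding) convolution is assumed. As in most deep learning frameworks, the kernel is applied without flipping.

```python
import numpy as np

def relu(x):
    # Positive values pass unchanged; negative values become zero.
    return np.maximum(x, 0.0)

def conv2d_valid(image, kernel, bias=0.0):
    """Slide one kernel over a 2D image with no padding and add a bias."""
    kh, kw = kernel.shape
    h, w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + h, j:j + w]
    return out + bias

def max_pool2x2(x):
    """Non-overlapping 2 x 2 max pooling; an odd trailing row/column is dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(36.0).reshape(6, 6)      # toy input (assumption)
kernel = np.zeros((3, 3))
kernel[1, 1] = 1.0                          # identity kernel, chosen for easy checking
feature_map = max_pool2x2(relu(conv2d_valid(image, kernel)))
print(feature_map)                          # [[14. 16.] [26. 28.]]
```

A trained layer would hold 32, 64, or 128 learned kernels instead of this single fixed one, producing one feature map per kernel.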
2.3.2. Long Short-Term Memory
LSTM is a type of recurrent neural network (RNN) specifically designed to process long sequences. The LSTM architecture for the proposed model is described in Table 1. Unlike traditional RNNs, LSTM can handle long-term dependencies more effectively by incorporating a gate mechanism and cell state [34]. The network consists of three key gates: the forget gate, input gate, and output gate, which are responsible for controlling the flow of information through the network. These gates work together to manage memory retention, input updates, and output generation, enabling LSTMs to effectively capture long-term dependencies in sequential data. Figure 5 provides a visual representation of this architecture, illustrating how these gates interact to regulate the information flow across different time steps.
1. The forget gate controls which information from the previous cell state should be discarded or retained. It evaluates the importance of past information using a sigmoid activation function and is mathematically expressed as:
$$f_t = \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right)$$
where $W_f$ is the weight matrix, $b_f$ is the bias vector, $h_{t-1}$ is the output of the previous unit, and $x_t$ is the current input.
2. The input gate controls which new information is added to the cell state, computed as:
$$i_t = \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right)$$
Here, $W_i$ and $W_C$ are weight matrices, $b_i$ and $b_C$ are bias vectors, and $\tilde{C}_t$ represents the candidate cell state.
3. The cell state combines the previous cell state information with the new one. The cell state is updated using:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
where $C_{t-1}$ is the previous cell state and $\odot$ denotes element-wise multiplication.
4. The output gate determines which information from the cell state should be output as the next hidden state. The output gate is given by:
$$o_t = \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \odot \tanh(C_t)$$
where $W_o$ and $b_o$ are the weight matrix and bias vector. The hidden state $h_t$ is the output of the LSTM unit.
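The four gate computations above can be traced in a single time step with NumPy. This is a minimal sketch rather than the trained layer: the function `lstm_step`, the dictionary layout of the weights, and the toy sizes (3 inputs, 4 hidden units) are assumptions for illustration, and the weights are random.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W[g]: (hidden, hidden + inputs); b[g]: (hidden,)."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])     # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat         # cell state update (element-wise)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)                 # hidden state
    return h_t, c_t

# Toy dimensions and random weights (assumptions, not the paper's 64-unit layer).
inputs, hidden = 3, 4
rng = np.random.default_rng(0)
W = {g: rng.normal(scale=0.1, size=(hidden, hidden + inputs)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(5, inputs)):     # run five time steps
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the cell state is carried forward and only rescaled by the forget gate, information from early time steps can persist across many later steps.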
The CNN-LSTM model processes AE-CWT images through a hybrid architecture that combines spatial feature extraction and temporal pattern recognition. The workflow begins with the CNN component, where the input images are passed through three convolutional layers to extract spatial features. The first convolutional layer applies 32 filters of size 3 × 3, followed by a 2 × 2 max-pooling layer, reducing the feature map size from 654 × 873 to 327 × 436. In the second convolutional layer, 64 filters of size 3 × 3 are applied, and another 2 × 2 max-pooling layer further reduces the feature map to 162 × 217. The third convolutional layer employs 128 filters of size 3 × 3, followed by a final 2 × 2 max-pooling layer, which compresses the output to 80 × 107. The resulting 3D feature map is then flattened into a 1D vector of size 1,095,680 and passed into an LSTM layer with 64 units. The LSTM layer captures temporal dependencies within the sequential data using a tanh activation function, enabling the model to recognize time-dependent patterns essential for detecting pipeline faults effectively.
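The feature-map sizes quoted above can be checked with simple arithmetic. The text does not state the padding of each convolution, so the sketch below assumes one combination that reproduces the reported numbers: "same" padding in the first convolution, "valid" (no padding) in the other two, and 2 × 2 pooling that rounds down. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, padding="valid"):
    # "same" keeps the size; "valid" shrinks it by kernel - 1.
    return size if padding == "same" else size - kernel + 1

def pool_out(size, window=2):
    # Non-overlapping pooling with floor division.
    return size // window

def trace_shapes(h, w, paddings=("same", "valid", "valid")):
    """Return the (height, width) after each conv + pool block."""
    shapes = []
    for p in paddings:  # padding choice per block is an assumption
        h = pool_out(conv_out(h, padding=p))
        w = pool_out(conv_out(w, padding=p))
        shapes.append((h, w))
    return shapes

shapes = trace_shapes(654, 873)
flat_size = shapes[-1][0] * shapes[-1][1] * 128   # 128 filters in the last block
print(shapes)      # [(327, 436), (162, 217), (80, 107)]
print(flat_size)   # 1095680
```

The flattened length 80 × 107 × 128 = 1,095,680 matches the vector size passed to the LSTM layer.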
This hybrid CNN-LSTM architecture combines the spatial feature extraction capabilities of the CNN with the temporal sequence modeling strengths of the LSTM. This synergy allows the model to capture intricate spatial patterns while maintaining an understanding of time-dependent relationships, making it highly effective for pipeline fault detection and diagnosis.
2.4. Genetic Algorithm
GA is an optimization and search technique inspired by the principles of natural selection and evolution. It mimics biological processes such as reproduction, mutation, crossover, and selection to efficiently explore and optimize complex solution spaces. By iteratively evolving a population of potential solutions, GA identifies optimal or near-optimal outcomes for challenging problems. This evolutionary approach enables GA to handle non-linear, multi-dimensional, and highly constrained optimization tasks, making it a powerful tool for solving complex engineering and computational problems.
The idea of “survival of the fittest”, which comes from Charles Darwin’s theory of evolution, is the foundation of GA. This strategy makes use of the essential processes of crossover, mutation, and selection, all of which are useful for building robustness and accomplishing global optimization [35].
The first step in the GA process is to initialize a population of candidate solutions, with each candidate representing a unique subset of features. The GA assesses each candidate’s fitness over several iterative generations by measuring the accuracy of a fully connected model trained on those features. In this study, the population size was set to 20 and the number of generations to 10. The classification accuracy observed on the test dataset acts as the benchmark fitness for evaluating performance throughout the optimization process. The algorithm uses the following steps to evolve the population and improve the solutions continually:
Selection: The GA selects the best-performing candidates from the current population based on their fitness scores. These candidates are more likely to pass on their features to the next generation.
Crossover: Pairs of selected candidates are combined to produce offspring. This crossover operation involves mixing the feature subsets of two parents to create new feature subsets, promoting diversity in the population.
Mutation: To maintain genetic diversity and prevent premature convergence, the GA introduces random changes to some feature subsets. This mutation operation ensures that the search space is thoroughly explored and helps in discovering potentially better feature combinations.
The pipeline health monitoring process is improved by the GA’s selection of the most important features, which guarantees that the model is efficient in spotting leaks in the pipeline. The fully connected layers of the model are subsequently trained using the top-performing feature subset from the last generation, improving the precision and dependability of the leak detection procedure.
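The selection, crossover, and mutation loop can be sketched as a GA over binary feature masks. The population size (20) and number of generations (10) follow the text; everything else is an assumption for illustration: truncation selection, single-point crossover, a 5% mutation rate, elitism, and above all the toy fitness function, which stands in for the classification accuracy of the fully connected model.

```python
import random

def run_ga(fitness, n_features, pop_size=20, generations=10,
           mutation_rate=0.05, seed=0):
    """Evolve binary masks (1 = keep feature) and return the best one found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - 1:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)       # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = [best] + children                       # elitism keeps the best mask
        best = max(pop, key=fitness)
    return best

# Hypothetical fitness: reward three "informative" features, penalize mask size.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

best_mask = run_ga(toy_fitness, n_features=12)
selected = [i for i, keep in enumerate(best_mask) if keep]
```

In the actual workflow, evaluating `fitness` would mean training and scoring a classifier on the selected CNN-LSTM features, which is far more expensive than this toy score.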
2.5. Fully Connected Layer
In the fully connected layer, each neuron is directly connected to every neuron in both the preceding and succeeding layers. This layer performs a weighted sum of the input data, followed by the application of an activation function to introduce non-linearity. This non-linear transformation allows the network to learn intricate relationships in the data, making fully connected layers essential for the final classification step.
In the proposed model, the genetic algorithm (GA) selects the most relevant features from the combined output of the CNN-LSTM network, ensuring that only the most informative features are passed into the fully connected layers.
As described in Table 2, the features first pass through a dense layer with ReLU activation. This activation function enables the layer to learn complex and non-linear feature representations, enhancing the model’s ability to capture subtle patterns in the data. The output from this layer is then fed into the final fully connected layer, which employs a sigmoid activation function. The sigmoid activation converts the output into probability scores corresponding to the target classes, enabling the model to make a final classification decision with high accuracy and confidence.
This architecture ensures an efficient flow of information, combining optimized feature selection from GA with the powerful representation capabilities of dense layers, resulting in robust and reliable fault classification performance.
4. Results
The arrangement of training and validation data plays a crucial role in evaluating the effectiveness of the proposed method. During the training phase, data corresponding to a leak size of 1 mm under fluid pressures of 13 and 18 bars were utilized, while the evaluation phase employed data from varying pressure levels and leak sizes to ensure a comprehensive assessment. The dataset comprised 1080 samples, evenly divided between non-leak samples and leak samples. To construct and validate the model, 80% of the dataset was randomly allocated for training, while the remaining 20% was reserved for validation. Class imbalance was addressed by adjusting the loss function to assign higher importance to the minority class and using stratified k-fold cross-validation to ensure balanced splits during training and testing phases.
To maintain consistency and reliability, the experiments were repeated 10 times. The model’s convergence was examined by varying the number of training epochs (50, 100, and 150 epochs), and optimal accuracy was observed between 70 and 100 epochs. Early stopping was applied to prevent overfitting by halting training if the validation loss did not improve for 10 consecutive epochs. Additionally, a dropout rate of 0.5 was introduced in the fully connected layers to prevent over-reliance on specific features. Furthermore, a 5-fold cross-validation strategy was implemented to validate the model’s robustness across different data subsets. These combined strategies ensured a balanced, efficient, and reliable training process, resulting in a robust and generalizable fault diagnosis model suitable for real-world applications.
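The early-stopping rule described above (halt after 10 consecutive epochs without improvement in validation loss) can be expressed as a short helper. The function name and the example loss curve are hypothetical; only the patience of 10 comes from the text.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training halts, or None if it never does."""
    best = float("inf")
    stale = 0                       # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None

# Hypothetical curve: improves for 3 epochs, then stays flat at a worse value.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
print(early_stopping_epoch(losses))   # 13
```

With a patience of 10, training on this curve stops at epoch 13, ten epochs after the best loss at epoch 3.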
Metrics like precision, accuracy, recall, and F1 score were employed to evaluate the efficacy of the suggested method. These metrics give an accurate measure of the classification algorithm’s efficiency and data classification accuracy. Equations (13)–(16) are the specific formulas that were used to calculate these measurements.
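$$\mathrm{Precision} = \frac{TP_A}{TP_A + FP_A}$$
$$\mathrm{Accuracy} = \frac{\sum_{A} TP_A}{N}$$
$$\mathrm{Recall} = \frac{TP_A}{TP_A + FN_A}$$
$$\mathrm{F1\ score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where ‘$FP_A$’, ‘$FN_A$’, and ‘$TP_A$’ represent false positive, false negative, and true positive outcomes for class A, respectively. A false positive ($FP_A$) occurs when a sample is incorrectly classified as belonging to class A when it does not. A true positive ($TP_A$) indicates the correct identification of samples that genuinely belong to class A. On the other hand, a false negative ($FN_A$) happens when samples that actually belong to class A are misclassified as belonging to another class.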
The total number of samples in class A is represented by the sum of $TP_A$ and $FN_A$ and denoted as $N_A$. The total number of samples misclassified as belonging to class A is the sum of $FP_A$ and the difference between the total number of data samples ($N$) and $N_A$. Here, $N$ signifies the total number of data samples in the testing dataset.
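The four metrics can be computed directly from the true-positive, false-positive, and false-negative counts. The sketch below uses the standard definitions (precision = TP/(TP+FP), recall = TP/(TP+FN), F1 as their harmonic mean, accuracy as the fraction of correctly classified samples), which are assumed to correspond to Equations (13)–(16); the function name and the eight-sample label lists are hypothetical.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for class `positive`, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Toy labels (assumption): 4 leak and 4 normal samples, with one error in each class.
y_true = ["leak"] * 4 + ["normal"] * 4
y_pred = ["leak", "leak", "leak", "normal", "leak", "normal", "normal", "normal"]
metrics = per_class_metrics(y_true, y_pred, positive="leak")
print(metrics)   # every metric equals 0.75 for this toy case
```

Since F1 is the harmonic mean of precision and recall, it always lies between the two, which is a quick sanity check when reading reported results.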
To evaluate the effectiveness of the proposed approach, a comparison was conducted with three other models designed for similar tasks. The first model (CWT-CNN), developed by Li et al. [39], employs a CNN trained on CWT images extracted from AE signals under a similar experimental setup. The second model (FFT-CNN), proposed by Masoumeh Rahimi et al. [40], utilizes FFT images as input to a CNN for feature extraction and classification. The third model (STFT-CNN) integrates STFT images with a CNN, leveraging time-frequency representations for fault detection. These models combine different signal representations with CNN architectures, providing a benchmark to assess the performance and robustness of the proposed method across multiple preprocessing techniques and feature extraction strategies.
5. Discussion
The proposed CNN-LSTM hybrid model, applied to AE data from industrial fluid pipelines, demonstrated outstanding performance, achieving precision, accuracy, F1 score, and recall values of 99.71%, 99.69%, 99.82%, and 99.75%, respectively, as presented in Table 5. These results highlight the model’s superiority over reference models, including CWT-CNN, FFT-CNN, and STFT-CNN, in terms of classification accuracy. The model’s enhanced performance stems from its ability to integrate spatial and temporal features effectively. The CNN component extracts spatial features from enhanced CWT scalograms, capturing intricate energy variations in the AE signals. Meanwhile, the LSTM component excels in modeling temporal dependencies, identifying meaningful sequential patterns within the data. Additionally, the inclusion of a genetic algorithm (GA) refines the extracted features, ensuring that only the most relevant and discriminative features are selected for classification. This hybrid architecture enables a seamless combination of spatial and temporal feature extraction, significantly enhancing the model’s capacity to differentiate fault conditions accurately. The integration of these advanced techniques contributes to the model’s robust performance across various experimental conditions and performance metrics, establishing it as a highly reliable solution for fault detection and classification in industrial pipeline systems.
Li et al. [39] collected acoustic signals from a gas pipeline system with small, synthetic leaks and applied deep learning techniques for leak detection. Their approach introduced controlled artificial leaks, exposing the system to diverse acoustic signatures under predefined conditions. These deliberately created inputs were important for training the model to recognize patterns associated with small leaks under conditions that approximate real-world operation. The system used these inputs to iteratively optimize its performance, improving robustness and adaptability by addressing edge cases through expert feedback. By transforming the acoustic signals into the frequency domain and applying a 1D CNN model, the method learned discriminative features for small-leak detection. However, noise in the acoustic signals degraded performance, resulting in an accuracy of 85.18%. Despite this limitation, the approach was selected for comparison because of its compatibility with our experimental setup. The performance metrics in Table 5 reflect the application of this reference technique to our dataset.
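The frequency-domain step of this reference technique can be sketched as follows. This is only an illustration of turning one signal frame into a magnitude spectrum that a 1D CNN could take as input; the frame length, sampling rate, Hann window, and test tone are assumed values, the function name `magnitude_spectrum` is hypothetical, and the network itself is omitted. A direct DFT is used to keep the sketch dependency-free, whereas a practical pipeline would use an FFT routine.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """One-sided magnitude spectrum of a Hann-windowed frame (direct DFT)."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    x = [s * w for s, w in zip(frame, window)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc))
    return spectrum

FS_HZ = 1000.0     # assumed sampling rate
FRAME_LEN = 256    # assumed frame length
TONE_HZ = 125.0    # synthetic tone standing in for a leak-related component

frame = [math.sin(2 * math.pi * TONE_HZ * i / FS_HZ) for i in range(FRAME_LEN)]
spec = magnitude_spectrum(frame)

# Locate the strongest frequency bin and convert its index to hertz.
peak_bin = max(range(len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * FS_HZ / FRAME_LEN
print(peak_bin, peak_hz)
```

Each frame yields a vector of `FRAME_LEN // 2 + 1` magnitudes, and a 1D CNN would convolve along this frequency axis.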
Rahimi et al. [40] employed a DL approach for leak detection using data collected with a hydrophone from a leaking plastic tank. Their study systematically compared signal preprocessing techniques in the time, frequency, and time-frequency domains, with each preprocessed signal subsequently passed to a convolutional neural network (CNN) for feature extraction and classification. The study found that the FFT-CNN approach outperformed the other preprocessing methods in detecting leaks. To ensure a fair and consistent comparison, the same methodology was applied to our pipeline dataset: signals underwent similar preprocessing in the time, frequency, and time-frequency domains before being processed by a CNN model. The results obtained on our dataset were recorded and analyzed, allowing a direct comparison with the findings of Rahimi et al. to evaluate the relative performance of each method in the context of pipeline fault detection.
The proposed model was also compared with a TFD method, specifically the STFT-CNN. In this approach, AE signals are transformed from the TD into the TFD using the STFT, and the resulting representations are fed into a CNN trained to extract features that indicate leaks. The method thus combines the STFT representation with the pattern recognition capabilities of the CNN. For a fair comparison, the same dataset was used, and the STFT-CNN achieved an accuracy of 93.05%. The lower accuracy is primarily attributed to information loss caused by the windowing effect in the STFT, whose fixed window length reduces its ability to capture transient signal variations accurately. This limitation affects the TFD resolution and results in lower fault classification performance than the proposed model.
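The windowing trade-off mentioned above can be made concrete with a short calculation. The sketch below assumes a 1 MHz sampling rate, a 50-sample transient, and two window lengths with 50% overlap; none of these values are taken from the experiments, and the helper names are hypothetical. It reports the time and frequency resolution of each window and the STFT frames that a short transient overlaps.

```python
def stft_resolution(fs_hz, window_len):
    """Return (time span of one window in s, frequency bin spacing in Hz)."""
    return window_len / fs_hz, fs_hz / window_len

def frames_touched(burst_start, burst_len, window_len, hop, n_samples):
    """Indices of STFT frames whose window overlaps the burst."""
    n_frames = 1 + (n_samples - window_len) // hop
    hit = []
    for m in range(n_frames):
        start, end = m * hop, m * hop + window_len
        if start < burst_start + burst_len and burst_start < end:
            hit.append(m)
    return hit

FS_HZ = 1_000_000.0                  # assumed sampling rate
N_SAMPLES = 4096                     # assumed record length
BURST_START, BURST_LEN = 2000, 50    # assumed transient (50 microseconds long)

for window_len in (1024, 64):
    hop = window_len // 2
    dt, df = stft_resolution(FS_HZ, window_len)
    frames = frames_touched(BURST_START, BURST_LEN, window_len, hop, N_SAMPLES)
    # Time interval covered by every window that contains part of the burst.
    smear_s = (frames[-1] * hop + window_len - frames[0] * hop) / FS_HZ
    print(window_len, dt, df, frames, smear_s)
```

Under these assumptions, the 1024-sample window resolves frequency to about 977 Hz but spreads the 50-microsecond transient over roughly 2 ms of analysis windows, while the 64-sample window confines it to 0.16 ms at the cost of 15.6 kHz frequency bins. A single STFT must commit to one of these compromises, whereas the CWT varies its effective window length with scale.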
Compared with the methods described above, the proposed method achieves higher classification accuracy. Figure 10 and Figure 11 show the confusion matrices and t-SNE visualizations that illustrate this improvement, which is mainly attributable to the method’s greater consistency and precision in identifying both leak states and normal pipeline operating conditions.
6. Conclusions
This study introduced an approach for pipeline leak detection based on deep learning (DL) techniques. AE signals were collected from a pipeline system and transformed into CWT images to capture essential time-frequency features. A hybrid DL framework integrating CNN and LSTM models was developed to extract both spatial and temporal features from these images. To further improve feature relevance and classification accuracy, a genetic algorithm (GA) was employed for feature selection so that only the most discriminative features were retained. These optimized features were then fed into a fully connected layer for pipeline health classification. The proposed method achieved an accuracy of 99.69% in leak detection, outperforming the compared approaches and indicating its robustness and reliability. The scientific significance of this research lies in its integration of AE signal processing, time-frequency analysis, and DL techniques into an accurate and scalable solution for pipeline leak detection. The approach also holds practical value for industries that rely on pipelines as critical infrastructure, offering a reliable and efficient tool for real-time monitoring and maintenance.
Future work will address precise leak localization, for example by accurately extracting leak-related AE events and applying time difference of arrival (TDOA) techniques. Building on this, the integration of hydraulic behavior analysis will be explored, incorporating physical models such as Bernoulli’s principle and pressure-loss equations. These advancements will enable a comprehensive framework that not only detects leaks with high accuracy but also localizes them precisely while accounting for the hydraulic dynamics of pipeline systems. Together, these developments aim to enhance operational safety, reduce environmental and economic impacts, and contribute to a broader understanding of pipeline health monitoring.
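As a rough illustration of the time difference of arrival idea named above, the sketch below localizes a synthetic event between two sensors. It is a generic textbook formulation, not a method developed in this study. With sensors A and B separated by a distance L, a leak at distance d from A, and a wave speed v, the arrival times are t_A = d/v and t_B = (L - d)/v, so d = (L + v(t_A - t_B))/2. The wave speed, sensor spacing, sampling rate, and burst shape are all assumed values, and the function names are hypothetical.

```python
import math

def burst(n, onset, decay=8.0, period=10.0):
    """Synthetic AE-like event: a decaying sinusoid starting at sample `onset`."""
    return [math.exp(-(i - onset) / decay) * math.sin(2 * math.pi * (i - onset) / period)
            if i >= onset else 0.0 for i in range(n)]

def tdoa_lag(sig_a, sig_b, max_lag):
    """Lag (in samples) maximizing sum_i sig_a[i + lag] * sig_b[i].

    The result estimates t_A - t_B; it is negative when the event
    reaches sensor A before sensor B.
    """
    n = len(sig_a)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)
        val = sum(sig_a[i + lag] * sig_b[i] for i in range(lo, hi))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def leak_position(lag, fs_hz, wave_speed, spacing):
    """Distance of the leak from sensor A, from d = (L + v * (t_A - t_B)) / 2."""
    dt = lag / fs_hz
    return 0.5 * (spacing + wave_speed * dt)

FS_HZ = 10_000.0      # assumed sampling rate
WAVE_SPEED = 1000.0   # assumed propagation speed in m/s
SPACING = 10.0        # assumed sensor spacing L in m
TRUE_POS = 3.0        # simulated leak position, measured from sensor A, in m

n = 200
onset_a = round(TRUE_POS / WAVE_SPEED * FS_HZ)
onset_b = round((SPACING - TRUE_POS) / WAVE_SPEED * FS_HZ)
sig_a, sig_b = burst(n, onset_a), burst(n, onset_b)

lag = tdoa_lag(sig_a, sig_b, max_lag=100)
estimate = leak_position(lag, FS_HZ, WAVE_SPEED, SPACING)
print(lag, estimate)
```

In this noise-free setting the cross-correlation peak recovers the simulated position. With real AE data, the estimate would depend on how well the leak-related events are isolated and on how accurately the propagation speed is known, and its resolution is limited to v/(2 fs) per sample of lag.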