3.1. Symptoms of Leak Presence
AE activity exists in pipelines even when they operate in a healthy state. This activity may come from mechanical sources, such as pumps or particles in the flow hitting the pipeline wall, or from hydraulic sources, such as pressure pulses at vortices in the fluid inside the pipeline [12]. Let $n_i$ ($i = 1, 2$) be the AE signals acquired by sensors 1 and 2 in the normal state, with mean 0 and variances $N_1 \approx N_2 \approx N > 0$. When a small leak occurs in the testbed pipeline, it creates a small disturbance in the flow around the leak, which introduces a new AE source into the system. The previous source $n_i$ can then be regarded as background noise, and this noise is assumed to be uncorrelated with the signal from the leak. The AE measurements from the sensors in this scenario can therefore be modeled as $z_i = s_i + n_i$ ($i = 1, 2$), where $s_i$ is the leak AE signal received by sensor $i$. Let $Z_i$ and $S_i$ denote the variances of $z_i$ and $s_i$, respectively. Because $n_i$ and $s_i$ are uncorrelated, $Z_i = S_i + N$. Setting $g = Z_2/Z_1$ then gives:
$$g = \frac{Z_2}{Z_1} = \frac{S_2 + N}{S_1 + N} \tag{1}$$
If the background noise $n_i$ ($i = 1, 2$) is band-limited white noise, then its variance is $N$ over its entire frequency range. Next, consider the measurement model when the measurement consists of the background noise $n_i$ and the leak signal $s_i$ ($i = 1, 2$) at a frequency $\omega$. The leak noise in fact has a broadband spectrum; however, all of its frequency components behave identically with respect to the leak phenomenon.
According to the power-law attenuation characteristic of AE [17], $S_1$ and $S_2$ are related by:
$$S_1 = S_2 e^{-\alpha d} \tag{2}$$
In (2), $\alpha$ is the attenuation coefficient of wave propagation, which depends on frequency, $d$ is the distance between sensors 1 and 2, and $S_i$ ($i = 1, 2$) refers to the leak signal at frequency $\omega$.
Substituting (2) into (1), abbreviating $r = S_2/N$, which is the signal-to-noise ratio measured by sensor 2 at frequency $\omega$, and writing $\beta = e^{-\alpha d}$, transforms (1) as follows:
$$g(r) = \frac{r + 1}{\beta r + 1} \tag{3}$$
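Next, take the partial derivative of $g$ with respect to $r$:
$$\frac{\partial g}{\partial r} = \frac{1 - \beta}{(\beta r + 1)^2} \tag{4}$$
Since $0 < \beta < 1$ for $\alpha d > 0$, then $\partial g/\partial r > 0$. As a result, $g(r)$ is a monotonically increasing function of $r$: if $r_1 \neq r_2$, then $g(r_1) \neq g(r_2)$. A normal state has $r = 0$ at every frequency $\omega$, so that $g(0) = 1$, whereas an abnormal state always has $r \neq 0$; thus, the function $g(r)$ is applicable to leak detection.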
Figure 5 shows the dependence of $g(r)$ on $r$ for different values of $\beta$ at a particular frequency. All the curves $g(r)$ increase from their normal-state value as $r$ increases.
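The behavior described for Figure 5 can be checked numerically. The following is a minimal sketch, assuming the form $g(r) = (r + 1)/(\beta r + 1)$ that follows from $Z_i = S_i + N$ with $S_1 = \beta S_2$; the function name `g_of_r`, the SNR grid, and the $\beta$ values are illustrative choices and are not taken from the experiments.

```python
import numpy as np

def g_of_r(r, beta):
    """Variance ratio g(r) = (r + 1) / (beta * r + 1), assumed form for 0 < beta < 1."""
    r = np.asarray(r, dtype=float)
    return (r + 1.0) / (beta * r + 1.0)

# Illustrative values only (assumptions, not measured data).
r = np.linspace(0.0, 10.0, 101)   # signal-to-noise ratio at sensor 2
betas = (0.2, 0.5, 0.8)           # attenuation factors between the sensors

curves = {beta: g_of_r(r, beta) for beta in betas}

for beta, g in curves.items():
    increasing = bool(np.all(np.diff(g) > 0))
    print(f"beta={beta}: g(0)={g[0]:.3f}, g(10)={g[-1]:.3f}, increasing={increasing}")
```

Every curve starts at $g(0) = 1$ (the normal state) and stays below $1/\beta$, so a smaller $\beta$ leaves more room for $g$ to separate the leak state from the normal state.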
3.2. Robustness of g(r) in Leak Manifestation
This section investigates the reliability of leak manifestation using the function $g(r)$ when the level of background noise increases. Gaussian noise is added to the signals to emulate the presence of additional noise. Suppose that noise $\Delta n$ with mean 0 and variance $\Delta N$ is added to the background noise while the leak signal remains the same. The background noise $n$ is then replaced by $n' = n + \Delta n$, its variance becomes $N' = N + \Delta N$, and $r$ is replaced by $r' = S_2/N'$.
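Setting $\gamma = \Delta N/N$ produces $r' = r/(1 + \gamma)$, which is a function of the two variables ($r$, $\gamma$). Consider the partial derivative of $r'$ with respect to $\gamma$:
$$\frac{\partial r'}{\partial \gamma} = -\frac{r}{(1 + \gamma)^2} \tag{5}$$
Now, the function $g$ is replaced by:
$$g' = g(r') = \frac{r' + 1}{\beta r' + 1} = \frac{r + 1 + \gamma}{\beta r + 1 + \gamma} \tag{6}$$
and its partial derivative with respect to $\gamma$ is given by:
$$\frac{\partial g'}{\partial \gamma} = -\frac{(1 - \beta)\, r}{(\beta r + 1 + \gamma)^2} \tag{7}$$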
If $0 < \beta < 1$ and $r > 0$, then $g'$ decreases when $\gamma$ increases. In other words, the discrimination quality of $g'$ becomes poorer as the intensity of the background noise increases. This characteristic is shared by $r'$; however, $g'$ is more reliable than $r'$ because the decline in $g'$ is smaller than the decline in $r'$.
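Denoting $\Delta_1 = \partial r'/\partial \gamma$ and $\Delta_2 = \partial g'/\partial \gamma$, and taking the ratio $\Delta_2/\Delta_1$ produces:
$$\frac{\Delta_2}{\Delta_1} = \frac{(1 - \beta)(1 + \gamma)^2}{(\beta r + 1 + \gamma)^2} \tag{8}$$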
If the parameter $\beta$ in (8) is selected suitably, then $\Delta_2/\Delta_1 < 1$. As a result, $g(r')$ varies more slowly than $r'$ as $\gamma$ increases. Consequently, if the background noise increases to some extent, the variable $r'$ can no longer discriminate the leak, whereas the function $g(r')$ still provides enough differentiation.
In (8), if $\beta$ approaches 0, the ratio $\Delta_2/\Delta_1$ converges to 1, so $g$ varies in the same way as $r$ when the background noise changes, and $g$ is no longer robust. In contrast, (3) reveals that if $\beta$ approaches 1, $g$ converges to 1 for any $r$, which means that $g$ does not manifest any abnormal state of the system. Hence, the parameter $\beta$ should be chosen to trade off between these two cases so that both high sensitivity and reliability can be achieved.
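The trade-off in $\beta$ can be illustrated numerically. The sketch below assumes $\gamma = \Delta N/N$, $r' = r/(1 + \gamma)$, and $g' = (r' + 1)/(\beta r' + 1)$, and compares a finite-difference estimate of $\Delta_2/\Delta_1$ with the closed form $(1 - \beta)(1 + \gamma)^2/(\beta r + 1 + \gamma)^2$. The function names and the operating point ($r = 4$, $\gamma = 0.5$) are hypothetical.

```python
import numpy as np

def r_prime(r, gamma):
    """SNR after the noise variance grows from N to N * (1 + gamma)."""
    return r / (1.0 + gamma)

def g_prime(r, gamma, beta):
    """g evaluated at the degraded SNR r'."""
    rp = r_prime(r, gamma)
    return (rp + 1.0) / (beta * rp + 1.0)

def sensitivity_ratio(r, gamma, beta):
    """Closed-form ratio of the gamma-derivatives of g' and r' (assumed form)."""
    return (1.0 - beta) * (1.0 + gamma) ** 2 / (beta * r + 1.0 + gamma) ** 2

# Hypothetical operating point, chosen only for illustration.
r, gamma, h = 4.0, 0.5, 1e-6

for beta in (0.01, 0.3, 0.6, 0.99):
    # Central finite differences with respect to gamma.
    d1 = (r_prime(r, gamma + h) - r_prime(r, gamma - h)) / (2 * h)
    d2 = (g_prime(r, gamma + h, beta) - g_prime(r, gamma - h, beta)) / (2 * h)
    print(f"beta={beta}: numeric={d2 / d1:.4f}, "
          f"analytic={sensitivity_ratio(r, gamma, beta):.4f}")
```

Under these assumptions the ratio stays below 1 for every $0 < \beta < 1$: it approaches 1 as $\beta \to 0$ (no robustness gain) and approaches 0 as $\beta \to 1$ (where $g$ also loses its sensitivity to the leak).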
3.3. Detection Procedures
This paper proposes an algorithm to detect a small leak in a pipeline based on the $g(r)$ function, because it can indicate the presence of a leak, as described in the previous sections. Figure 6 shows a generic framework for leak detection using two approaches: direct AE-based and $g(r)$-based.
The AE signals from sensors 1, 2, and 3 are the inputs to the leak detection framework. In the direct AE-based method, there is no $g(r)$-construction block: after the AE signals are divided into frames, the frames are passed directly to the feature extraction block. In the $g(r)$-based method, feature extraction is carried out after the $g(r)$-construction process is complete.
3.3.1. Frame Division
The recorded AE signals are segmented into a series of frames by the frame division block. The AE waves travel different distances from the leak to the sensors, so their arrivals are lagged. As a result, frames with the same index in different channels do not correspond exactly to each other. Moreover, at the detection stage, the position of the leak is unknown, and so is the time of arrival of the signals at the sensors. One way to deal with this problem is to select a reasonable frame size.
Figure 7 uses an example of two signals $s_1$ and $s_2$ to explain the method. The figure marks the lag time between the two signals and their frame size (in time). Because of the lag, the lagged part of signal $s_1$ cannot be correlated with any part of signal $s_2$ in the same frame index $i$, because it has already propagated in frame ($i - 1$) of $s_2$. Thus, the frame size is defined as the lag time plus $\zeta$, where $\zeta$ is an amount of time that extends the frame. The larger $\zeta$ is, the smaller the lag is relative to the remainder of the frame, which reduces the effect of the lag on the correlation.
Next, the lag time is calculated. The location of the leak in the pipeline is unknown; however, the leak lies somewhere within the tested pipeline. Under this condition, the maximum lag time can be calculated with the following equations, and the result is used to determine a reasonable frame size.
In (10), C is the wave speed. According to [19], AE signals can propagate in the fluid in the frequency range of 20 kHz to 80 kHz, in addition to propagating through the pipe wall at high frequencies. Furthermore, the AE signals of leaks originate from flow turbulence and the interaction of particles at the leak point. Thus, AE signals might contain both kinds of propagation. Because the wave speed in water is lower than in solid materials [20], the value of C should be calculated for propagation in water. The wave speed can be calculated as follows [21, 22]:
where the quantities involved are the volumetric compressibility modulus and the density of the liquid inside the pipeline, the thickness and inner diameter of the pipe, a factor related to the pipe supporting condition, and Poisson's ratio.
Thus, the frame size is calculated as:
In (14), $\zeta$ must be chosen so that the lagged part of the signals is not very large compared with the rest of the signal.
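The frame-size rule can be sketched in a few lines. This is only an illustration under stated assumptions: the maximum lag is taken as the sensor spacing divided by the wave speed (the worst case in which the leak sits next to one sensor), and the numbers used (2 m spacing, 1480 m/s for water, a 10 ms extension, 1 MHz sampling) are hypothetical and are not the testbed parameters.

```python
import math

def max_lag_time(sensor_spacing_m, wave_speed_mps):
    """Worst-case arrival lag (assumption: leak located right next to one sensor)."""
    return sensor_spacing_m / wave_speed_mps

def frame_size(sensor_spacing_m, wave_speed_mps, zeta_s, sample_rate_hz):
    """Frame duration = maximum lag + extension zeta, plus its length in samples."""
    duration_s = max_lag_time(sensor_spacing_m, wave_speed_mps) + zeta_s
    n_samples = math.ceil(duration_s * sample_rate_hz)
    return duration_s, n_samples

# Hypothetical values, chosen only to illustrate the calculation.
duration_s, n_samples = frame_size(
    sensor_spacing_m=2.0,     # distance between the two sensors
    wave_speed_mps=1480.0,    # approximate sound speed in water
    zeta_s=0.01,              # extension time zeta
    sample_rate_hz=1_000_000,
)
lag_fraction = max_lag_time(2.0, 1480.0) / duration_s
print(f"frame: {duration_s * 1e3:.3f} ms, {n_samples} samples, "
      f"lag fraction: {lag_fraction:.1%}")
```

The printed lag fraction shows how much of each frame may be misaligned between channels; increasing $\zeta$ lowers this fraction at the cost of longer frames.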
3.3.2. g(r)-Construction
The quantity $g(r)$ in Section 3.1 is formulated as a ratio of variances at a single frequency of the leak signal. In this section, a $g(r)$ vector is constructed from the signals of sensors 1 and 2. The $g(r)$ vector contains the ratios of the amplitudes over all the frequencies (Figure 8).
In Figure 8, the time-domain signals are converted to the frequency domain by the fast Fourier transform (FFT), and only their amplitudes are used as the operands of the divider at every frequency. After the transformation, we have a new signal in the form of a $g(r)$ vector containing information about leakage symptoms.
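The construction step can be sketched with NumPy as follows. This is a minimal illustration rather than the exact processing chain of Figure 8: the function name `g_vector`, the choice of dividing the sensor 2 amplitudes by the sensor 1 amplitudes, the small `eps` guard against division by zero, and the synthetic test frames are all assumptions.

```python
import numpy as np

def g_vector(frame_1, frame_2, eps=1e-12):
    """Per-frequency amplitude ratio of two time-aligned frames (sensor 2 / sensor 1)."""
    amp_1 = np.abs(np.fft.rfft(frame_1))   # amplitude spectrum, sensor 1
    amp_2 = np.abs(np.fft.rfft(frame_2))   # amplitude spectrum, sensor 2
    return amp_2 / (amp_1 + eps)           # eps avoids division by zero

# Synthetic frames (assumption): one tone that is twice as strong at sensor 2,
# plus weak independent background noise on each channel.
rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
tone = np.sin(2 * np.pi * 50 * t / n)      # falls exactly on FFT bin 50
frame_2 = 1.0 * tone + 0.01 * rng.standard_normal(n)
frame_1 = 0.5 * tone + 0.01 * rng.standard_normal(n)

g = g_vector(frame_1, frame_2)
print(g.shape, round(float(g[50]), 2))
```

Because the vector is built from amplitudes, each entry corresponds to the square root of a variance (power) ratio; squaring the entries would give the power-ratio form used in Section 3.1.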
3.3.3. Feature Extraction
In this study, the three features given in Table 3 are used to compare the performance of the direct AE-based method with that of the $g(r)$-based one. These features are selected because of their effectiveness in both the time and frequency domains.
3.3.4. Classification
The k-nearest neighbors (KNN) algorithm is a popular classifier used to identify instances belonging to different classes during diagnosis. The theory of KNN is presented clearly in [18, 23]. This paper uses a KNN-based classifier to solve a binary classification problem, i.e., whether a sample belongs to the normal or the abnormal condition of the pipeline.
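For illustration, a binary KNN decision can be sketched with plain NumPy. This is not the implementation used in the experiments: the function name `knn_predict`, the Euclidean distance, the value $k = 3$, and the synthetic three-dimensional feature vectors (standing in for the three features of Table 3) are all assumptions.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Majority vote among the k nearest training samples (labels: 0 normal, 1 leak)."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=int)
    query_x = np.asarray(query_x, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest samples
    votes = train_y[nearest].sum(axis=1)         # number of "leak" votes per query
    return (2 * votes > k).astype(int)           # use an odd k to avoid ties

# Synthetic feature vectors (assumption): two well-separated clusters.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 0.5, size=(20, 3))
leak = rng.normal(3.0, 0.5, size=(20, 3))
train_x = np.vstack([normal, leak])
train_y = np.array([0] * 20 + [1] * 20)

queries = np.array([[0.1, -0.2, 0.0], [2.9, 3.1, 3.2]])
predictions = knn_predict(train_x, train_y, queries, k=3)
print(predictions)
```

In practice the features are usually normalized before distances are computed, since KNN is sensitive to the scale of each feature.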