1. Introduction
As ocean exploration deepens, acoustic localization is being used in a growing number of underwater applications [1, 2, 3]. Typically, an underwater acoustic localization system comprises a platform and several sensors. The sensors are organized into a network with a certain geometric configuration and communicate with the platform through acoustic signals. The platform location can be determined precisely by measuring the parameters of the direct sound signals.
However, underwater acoustic fields are complicated and multipath effects are severe [4, 5, 6], so the received signal is a superposition of a series of pulses. Effectively distinguishing the direct acoustic signal from these overlapping pulses is crucial for underwater acoustic localization as well as for back-end data processing. Many studies have addressed this identification problem; they may be broadly categorized as follows, according to the approaches and techniques employed.
In the first category, the correlation between received pulses is used to identify the direct pulse. The studies in [7, 8, 9] constructed the orthogonal correlation function between the received signals and the source signals and identified the correlation peaks to estimate the direct pulse and its parameters. Xia et al. [10] constructed a spatial spectral function using the MUSIC algorithm, exploiting the orthogonality of the noise and signal subspaces to select direct signals. Nevertheless, this approach typically requires the centralized processing of unprocessed data via underwater acoustic communications, which is not conducive to real-time data processing.
In the second category, the direct signal is separated from the overlapping multipath signals. The studies in [11, 12, 13, 14, 15] used different methods to suppress the multipath effect and obtain the direct signal, such as multi-dimensional matched filters, orthogonal correlation, and energy peak searching. Additionally, methods such as channel equalization, time-reversal mirrors, and homomorphic filtering can be employed to suppress multipath signals within the overlapped pulses, which facilitates the identification of the direct pulse.
The third category relies on the parametric characteristics of the pulses. The primary method used in engineering applications is the artificial expert system, which establishes predetermined criteria based on project requirements and conditions, as introduced in [16, 17]. Nevertheless, the manually assigned weights of these criteria lack precision and are unsuitable for the complicated and dynamic underwater environment.
Recently, machine learning (ML), a class of optimization methods from the statistical perspective, has gained significant attention in underwater acoustics [18, 19, 20, 21]. The decision tree (DT) technique is a commonly used and efficient ML classifier that builds the tree from feature characteristics [22, 23, 24]. The work in [25] investigated an identification method using the DT technique that relies on independent judgment of the direct signal, abbreviated to S-DT. However, in S-DT methods, the independent judgments restrict the use of effective information and lead to poor adaptability to complex underwater environments, especially when sharp interference or false alarm pulses exist. Hence, using features from multiple sensors may be a vital aid to enhancing performance. Nevertheless, signal detection and parameter estimation are performed independently at each sensor, so the parameters cannot be directly employed for constructing decision trees.
To address the issue above, this work explores a precise identification method using DT that integrates fusion information from multiple sensors. Firstly, the characteristics of pulse parameters in ideal underwater acoustic multipath channels are investigated and their models are established. Secondly, the construction of fusion features is studied. The pulse parameters of different sensors cannot be used directly because detection is independent at each sensor. Hence, by exploring the correlation between the direct signals and positioning performance, a cost function is constructed using the geometric intersection principle, and the fusion features of different pulses are calculated through positioning performance feedback. Subsequently, the pulse parameter features are preprocessed to mitigate the impact of differing magnitudes and units on data processing. Then, the decision tree is constructed from these features to identify direct signals precisely and efficiently. Finally, the efficacy and robustness of the proposed approach are confirmed by simulation and by processing field test data. We abbreviate the proposed method, which relies on multiple feature fusion based on localization feedback, as MFF-LF.
The structure of the paper is as follows: Section 2 examines the pulse characteristics in the underwater environment; Section 3 introduces the DT algorithm; Section 4 investigates the MFF-LF method; Section 5 presents the simulation analysis; Section 6 presents the field test data processing; Section 7 discusses the limitations of MFF-LF; Section 8 provides the conclusions.
2. Pulse Characteristics in Underwater Environment
The underwater acoustic channel can be conceptualized as a slowly time-varying coherent multipath channel. In an ideal channel, the sound speed is uniform and all reflections are approximately specular. Aside from the direct signal, the received signals also contain surface reflections, seabed reflections, and secondary reflections between the surface and the seafloor. Based on ray acoustics [26], with ray bending disregarded, the ideal multipath channel model is depicted in Figure 1.
The source signal
is sent from the transmitter and travels via several paths to reach the receiver. The signal received can be mathematically represented as:
In Equation (1),
represents the ambient noise and
denotes the channel impulse response function. Under ideal channel conditions,
can be expressed as follows [
25]:
In Equation (2),
represents the total number of transmission paths.
and
denote the amplitude and delay, respectively.
denotes the Doppler coefficient of the signal. Let the sound speed underwater be denoted by
and let
denote the radial velocity between the transmitter and receiver. Then, we have:
By substituting Equation (2) and Equation (3) into Equation (1) and subsequently arranging the terms, we may derive the formula for the received signal.
Under a multipath channel, the signal received within the signal cycle is a superposition of direct and reflected signals, as shown in Equation (4). The parameters vary, including amplitude, delay, Doppler shift, pulse width, and so on.
The remainder of this section examines the parameter features of CW ranging pulses in the context of underwater acoustic positioning; these features serve as the basis for the MFF-LF algorithm developed later.
The delay is calculated by the ratio of signal propagation range to the sound speed. Taking one-time reflection as an example, the sound propagation paths are shown in
Figure 2. In the figure, D and
refer to the horizontal and the linear distance between the transmitter and receiver, respectively. H represents the depth of the water.
and
, respectively, represent the vertical depth from the water surface to the transmitter and receiver.
and
denote the vertical depth of the transmitter and receiver to the bottom of the water, separately.
The calculation of the time delay is given by Equation (5) to Equation (7). For direct signals, there is:
The phenomenon of reflection is examined in two scenarios. If the number of reflection times is odd, the delay will be computed by the following equation:
where
and
denote the time delay for surface reflection and seabed reflection, separately.
denotes the number of reflection times.
If reflection times are even, then
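As a rough numerical illustration of the delay relations in this section, the following Python sketch computes the delays of the direct path and of the two single-reflection paths with the mirror-image construction, which holds for a uniform sound speed and specular reflections. The function name `path_delays` and its argument names are assumptions made for this sketch. The depths are those used later in the simulation section, while the 300 m horizontal range is an arbitrary choice.

```python
import math

def path_delays(horizontal_range, water_depth, tx_depth, rx_depth, sound_speed=1500.0):
    """Return delays (s) of the direct path and the two single-reflection paths.

    Assumes a uniform sound speed and specular reflections, so each reflected
    path equals a straight line to a mirror image of the receiver.
    """
    # Vertical offsets seen by the direct ray and by the two mirror images.
    dz_direct = rx_depth - tx_depth
    dz_surface = rx_depth + tx_depth                      # image above the surface
    dz_bottom = 2.0 * water_depth - rx_depth - tx_depth   # image below the bottom
    offsets = (("direct", dz_direct), ("surface", dz_surface), ("bottom", dz_bottom))
    return {name: math.hypot(horizontal_range, dz) / sound_speed for name, dz in offsets}

delays = path_delays(horizontal_range=300.0, water_depth=200.0, tx_depth=10.0, rx_depth=120.0)
for name, tau in delays.items():
    print(f"{name:8s} {tau * 1e3:7.2f} ms")
```

In this geometry the direct pulse arrives first, followed by the surface reflection and then the bottom reflection, which is why the relative delay is a useful feature for identification.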
Using the ray acoustics for the analysis, the pulse amplitude can be obtained by the following equation for the direct signal:
Similar to the process of analyzing the time delay, the pulse amplitude of the reflected sound is discussed in different cases. For an odd number of reflection times,
where
and
denote the reflection coefficients and
denotes the number of reflection times. If reflection times are even, then
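The amplitude relations can be sketched in the same way. The example below assumes spherical spreading only (amplitude inversely proportional to path length) and multiplies by one reflection coefficient per bounce, using the surface and bottom coefficients of −1 and 0.2 quoted in the simulation section. The function `path_amplitude` and its parameters are illustrative names, not the paper's notation.

```python
import math

def path_amplitude(path_length, n_surface=0, n_bottom=0,
                   r_surface=-1.0, r_bottom=0.2, source_amplitude=1.0):
    """Amplitude of one arrival: spherical spreading times reflection losses."""
    spreading = source_amplitude / path_length
    return spreading * (r_surface ** n_surface) * (r_bottom ** n_bottom)

# Same illustrative geometry as before: range 300 m, depth 200 m,
# transmitter at 10 m, receiver at 120 m.
D, H, zt, zr = 300.0, 200.0, 10.0, 120.0
direct = path_amplitude(math.hypot(D, zr - zt))
surface = path_amplitude(math.hypot(D, zr + zt), n_surface=1)
bottom = path_amplitude(math.hypot(D, 2 * H - zr - zt), n_bottom=1)
print(direct, surface, bottom)
```

The negative sign of the surface arrival reflects the phase reversal at the water surface; a detector that measures only the envelope would see its magnitude.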
If the relative motion exists between the transmitter and receiver, the received signal will exhibit a linear compression (or stretching) on the time scale.
Figure 3 shows a schematic diagram of a signal received at the front and back edges in the presence of relative motion. Here,
indicates the arrival time of the pulse front and
indicates the arrival time of the pulse back edge.
indicates the velocity of the platform (receiver). The width of the emitted pulse is
. The width of the received pulse is
.
Due to the fact that
, the width
can be calculated as
For a reflected signal, taking the surface reflection as an example,
Since
, then the width of the reflected pulse can be written as
The Doppler shift occurs when there is a relative motion between the transmitter and the receiver, which causes changes in the received signals on both the time scale and in the waveform, specifically resulting in a shift in the signal’s frequency.
According to the analysis of the pulse width above, the Doppler shift of direct signals and the one-time reflections can be obtained as Equations (16) and (17).
where
denotes the frequency of the transmitted CW signal.
Overall, by studying the pulse characteristics in ideal acoustic channels, different parameter models have been constructed, which is of great significance for subsequent research on the accurate identification of ranging direct signals. This research will examine the identification approach based on DT, utilizing the pulse characteristics investigated above.
3. The DT Algorithm
This section focuses on the fundamentals of the DT algorithm. The CART decision tree [27, 28] is used in this research and the Gini Index is employed to determine the classification attributes.
Let the sample set be defined as
.
denotes the number of sample classifications (
).
represents the proportion of the
k-th class of samples to the sample set
. The purity of the sample set
can be measured by the value
, which is expressed as follows:
Equation (18) gives the probability that two samples drawn at random from the sample set belong to different classes. The purity of the sample set increases as this value decreases.
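For concreteness, the Gini value described here can be computed as one minus the sum of squared class proportions, which is the standard CART definition. The helper name `gini` below is an assumption made for this sketch.

```python
from collections import Counter

def gini(labels):
    """Gini value of a label collection: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini([1, 1, 1, 1]))   # pure set -> 0.0
print(gini([1, 0, 1, 0]))   # evenly mixed two-class set -> 0.5
```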
Define the attribute set
.
denotes the number of characteristic attributes. Then, the Gini coefficient of attribute
can be expressed as Equation (19).
In Equation (19),
denotes the set of the samples characterized by the attribute
. The optimal segmentation attributes
are determined as follows:
Branches are generated for each value of the optimal attribute and are then evaluated recursively until the leaf nodes are formed, which completes the tree.
4. The MFF-LF Method
As the previous section shows, the selection of feature attributes greatly affects the performance of DT. This section therefore investigates the MFF-LF method in two respects: the construction of fusion features based on localization feedback using multi-sensor information, and the preprocessing of feature parameters. The underwater localization scenario is shown in Figure 4. Within each signal transmission period, the platform emits CW pulses, and the sensors receive the signal and measure the pulse parameters. Let
denote the position of the sensors,
denote the target position, and
the sound speed. Then, according to the geometric intersection localization principle, there is the following relationship:
In Equation (21),
denotes the Euclidean norm. Equation (21) is written in matrix form; expressed in coordinates, it becomes:
4.1. Construction of Fusion Feature
Equation (22) can be solved with an optimization algorithm such as the Differential Evolution method or the Gauss–Newton method. Whichever technique is chosen, the core issue is to establish the cost function, as shown below.
Equation (23) shows that, when the function
takes the minimum value, the target position
takes the optimum value. Theoretically, if there are no measurement errors, then
The specific formula expression in
varies depending on the selected criteria. If the minimum mean square error criterion is selected, then
can be expressed as follows:
When identifying direct signals, if the correct direct pulse from every sensor is selected for the subsequent position solution, the cost function attains its minimum value. Otherwise, a wrong identification increases the value of the cost function because the resulting geometric intersection is infeasible. Thus, in a given signal cycle, the pulse that minimizes the cost function is more likely to be the direct pulse than any other pulse.
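The effect described here can be illustrated with a small sketch. It assumes a two-dimensional geometry and a cost equal to the sum of squared differences between geometric ranges and delay-derived ranges, which is one plausible reading of the minimum mean square error criterion. The sensor layout, the target position, and the name `range_cost` are hypothetical.

```python
import math

def range_cost(position, sensors, delays, sound_speed=1500.0):
    """Sum of squared differences between geometric and measured ranges."""
    return sum(
        (math.dist(position, s) - sound_speed * tau) ** 2
        for s, tau in zip(sensors, delays)
    )

sensors = [(0.0, 0.0), (500.0, 0.0), (500.0, 500.0), (0.0, 500.0)]   # hypothetical layout
target = (250.0, 250.0)
direct = [math.dist(target, s) / 1500.0 for s in sensors]            # error-free direct delays

mixed = list(direct)
mixed[0] += 0.020   # pretend sensor 1 picked a reflection arriving 20 ms late

print(range_cost(target, sensors, direct))   # essentially zero
print(range_cost(target, sensors, mixed))    # close to 900 (a 30 m range error, squared)
```

The sketch only evaluates the cost at the true position; a solver would minimize it over the position, but with four sensors and two unknowns an inconsistent delay generally still leaves a nonzero residual.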
Building on this idea, we next investigate how to use the multi-sensor information to create fusion features based on localization feedback, which is beneficial for improving the performance of DT.
Let the pulses of each sensor be sorted sequentially, sensor by sensor, and let the number of pulses received by each sensor within the signal period be denoted as
. Then, choosing one pulse from each sensor to create a group set
and taking four sensors
as an example, if the serial numbers are marked as
, then
is as follows:
where
denotes the delay of the
-th pulse detected by sensor 1#. Let
denote the sensor number. Then, set
has a total of
combinations, which form the set
, as seen in
Figure 5.
The cost function
is built according to the minimum mean square error criterion.
According to Equation (28), the combination that minimizes the cost function is the correct pulse set, whose i-th element is the correct direct signal of sensor i#. Because the pulses of the different sensors interact through the fusion, further processing is required before this set can be obtained.
For each combination in
, Equation (28) will give the calculation of the function values. The number of values is
. Different combinations correlate to varying values of
. It is necessary to determine the role of a certain pulse, for example
. According to the combinations in
Figure 5,
appears in more than one set. All combinations containing
should be extracted from
, and then form a new collection
, seen as
Figure 6.
Combined with
Figure 6, the formula for summarizing the aforementioned categorization procedure is as shown in Equation (29).
The pulse combinations form a complete set whose elements are mutually exclusive. The collection in Equation (29) is the complete set of combinations containing the p-th pulse detected by sensor i# within the signal period.
Then, the sum values of the cost function corresponding to the elements in the set
are given by Equation (30), where
is seen as the fusion feature of the pulse
.
From Equation (30), a new feature for the CW pulse is built from the multi-sensor fusion information and the performance feedback of underwater positioning. The fusion feature of a pulse captures the collaborative influence of the parameter information from the other sensors. From the perspective of decision tree construction, the available pulse feature information increases by one dimension. Overall, the fusion feature links the pulses of otherwise independent sensors, which increases the effective information available for direct pulse identification.
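The construction described in this subsection can be sketched as follows. The example enumerates one pulse per sensor with `itertools.product`, minimizes a squared range-residual cost for each combination, and adds that minimum to the fusion feature of every pulse in the combination. Several choices are assumptions of this sketch rather than the paper's implementation: the two-dimensional geometry, the hand-written Gauss–Newton solver with step halving, the sensor layout, and the reflection offsets of 12 ms and 35 ms.

```python
import itertools
import math

SOUND_SPEED = 1500.0  # m/s, assumed known

def cost(pos, sensors, delays):
    """Sum of squared range residuals for one candidate position."""
    return sum((math.dist(pos, s) - SOUND_SPEED * t) ** 2
               for s, t in zip(sensors, delays))

def min_cost(sensors, delays, iterations=50):
    """Minimize the cost over a 2-D position with backtracking Gauss-Newton steps."""
    x = sum(s[0] for s in sensors) / len(sensors)   # start at the sensor centroid
    y = sum(s[1] for s in sensors) / len(sensors)
    best = cost((x, y), sensors, delays)
    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for (sx, sy), t in zip(sensors, delays):
            d = max(math.hypot(x - sx, y - sy), 1e-9)
            jx, jy = (x - sx) / d, (y - sy) / d      # Jacobian row of the range
            r = d - SOUND_SPEED * t                  # range residual
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            g1 += jx * r
            g2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break
        dx = -(a22 * g1 - a12 * g2) / det            # solve the 2x2 normal equations
        dy = -(a11 * g2 - a12 * g1) / det
        step = 1.0
        while step > 1e-4:                           # halve the step until the cost drops
            trial = cost((x + step * dx, y + step * dy), sensors, delays)
            if trial < best:
                x, y, best = x + step * dx, y + step * dy, trial
                break
            step *= 0.5
        else:
            break                                    # no improving step: stop iterating
    return best

def fusion_features(sensors, pulse_delays):
    """Sum the minimized cost over all combinations that contain each pulse."""
    features = [[0.0] * len(p) for p in pulse_delays]
    for combo in itertools.product(*(range(len(p)) for p in pulse_delays)):
        delays = [pulse_delays[i][p] for i, p in enumerate(combo)]
        j = min_cost(sensors, delays)
        for i, p in enumerate(combo):
            features[i][p] += j
    return features

sensors = [(0.0, 0.0), (500.0, 0.0), (500.0, 500.0), (0.0, 500.0)]   # hypothetical layout
target = (200.0, 300.0)
pulses = []
for s in sensors:
    direct = math.dist(target, s) / SOUND_SPEED
    pulses.append([direct, direct + 0.012, direct + 0.035])          # direct + two reflections

features = fusion_features(sensors, pulses)
for i, row in enumerate(features, start=1):
    print(f"sensor {i}#:", [round(v, 1) for v in row])
```

Following the reasoning of this section, the pulse with the smallest fusion feature at each sensor is the most plausible direct pulse. The number of combinations grows as the product of the pulse counts, which is consistent with the extra processing time reported for the method.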
4.2. Preprocessing Method of Features
For the identification of direct signals, the decision tree produces a classification result that indicates whether a pulse is a direct signal. The input of the tree is the attribute information of the pulses, whose parameters differ in magnitude and units. Data preprocessing is essential for eliminating the effect of these differing magnitudes and enhancing the performance of the decision tree.
First, continuous values must be handled. Since feature attributes such as delay, amplitude, width, and Doppler shift are continuous rather than drawn from a finite set of values, they cannot be used directly for node partitioning of the decision tree. Thus, this work uses the bisection approach to discretize the continuous attributes. The procedure is as follows:
The values of continuous attribute
in the sample set
are listed in decreasing order and are stored as
.Then, the partition nodes are obtained by using the bisection technique for discrete processing.
Using Equation (19), the Gini index is calculated for each candidate partition node, and the node with the smallest Gini index is used as the optimal discrete division point for the attribute.
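A minimal sketch of this bisection procedure is given below. It takes the midpoints between consecutive distinct attribute values as candidate partition nodes and keeps the one with the smallest weighted Gini value of the two resulting subsets. The function names and the sample values are hypothetical; the sketch sorts in ascending order, which yields the same midpoints as a descending sort.

```python
from collections import Counter

def gini(labels):
    """Gini value of a label collection."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, gini_index) of the best binary split of one attribute."""
    ordered = sorted(set(values))
    candidates = [(a + b) / 2.0 for a, b in zip(ordered, ordered[1:])]
    best = None
    for t in candidates:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        # Weighted Gini value of the two branches produced by threshold t.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical preprocessed delays; label 1 marks a direct pulse.
sample_delays = [0.00, 0.02, 0.15, 0.18, 0.40]
sample_labels = [1, 1, 0, 0, 0]
print(best_split(sample_delays, sample_labels))
```

When an attribute has a single distinct value there is no candidate node, and the function returns `None`.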
Subsequently, the pulse parameters are preprocessed to make the decision tree more robust and improve its performance. Based on the pulse characteristics investigated in Section 2, the preprocessing procedure is as follows:
When constructing the decision tree, the relative relationship between the delays of different pulses is more significant than their absolute values. Thus, for each sensor, the delay information is obtained by the following differencing:
where
denotes the transmit period of the signal, and
is the minimum value of the delay within the transmit period.
Since the acoustic range of the direct signals varies, the energy loss differs and so does the magnitude of the pulse amplitude. Therefore, the amplitude is normalized for each sensor:
where
indicates the maximum amplitude within the transmit period.
The Doppler shift is normalized as follows:
In the equation, the Doppler shift is normalized using the frequency of the transmitted signal and the frequency of the received pulse signal within the transmit period.
For the fusion features, it is the relative size that matters. Therefore, the following differencing is applied to the fusion features:
where
denotes the minimum fusion feature value among the pulses from the same sensor within the transmit period.
As with the time delay, the relative relationship between the widths of different pulses is more significant than their absolute values. Thus, for each sensor, the pulse width information is obtained by the following differencing:
where
is the minimum value of the pulse width within the transmit period.
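The preprocessing steps above can be summarized in a short sketch. The exact expressions are assumptions inferred from the surrounding descriptions: the delay is taken relative to the per-sensor minimum and scaled by the transmit period, the amplitude is divided by the per-sensor maximum, the Doppler shift is expressed relative to the transmitted frequency, and the width and fusion feature are taken relative to their per-sensor minima. The function name, dictionary keys, and sample values are hypothetical.

```python
def preprocess(pulses, period, f0):
    """Normalize the raw features of the pulses one sensor detected in one period.

    `pulses` is a list of dicts with keys delay, amplitude, width, freq, fusion.
    """
    tau_min = min(p["delay"] for p in pulses)
    amp_max = max(p["amplitude"] for p in pulses)
    width_min = min(p["width"] for p in pulses)
    fusion_min = min(p["fusion"] for p in pulses)
    return [
        {
            "delay": (p["delay"] - tau_min) / period,    # assumed form: relative delay
            "amplitude": p["amplitude"] / amp_max,       # assumed form: peak-normalized
            "width": p["width"] - width_min,
            "doppler": (p["freq"] - f0) / f0,            # assumed form: relative shift
            "fusion": p["fusion"] - fusion_min,
        }
        for p in pulses
    ]

raw = [
    {"delay": 0.040, "amplitude": 0.80, "width": 0.0100, "freq": 10002.0, "fusion": 3.5},
    {"delay": 0.052, "amplitude": 0.40, "width": 0.0101, "freq": 10001.0, "fusion": 9.0},
]
normalized = preprocess(raw, period=0.1, f0=10000.0)
for row in normalized:
    print(row)
```

With this scaling, the earliest pulse at each sensor has zero relative delay and the strongest pulse has unit amplitude, so the tree compares pulses within a period rather than absolute measurements.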
Finally, it is also necessary to determine the label of the decision tree used for indicating the direct signals. In the MFF-LF method, the label of direct sound is recorded as 1, while the non-direct sound is recorded as 0.
In summary, the MFF-LF method proposed in this paper is shown schematically in Figure 7.
5. Simulation Analysis
In this section, the received signals are simulated by computer, based on the attribute characteristics and the acoustic propagation theory studied in Section 2, to verify the performance of the MFF-LF method under different conditions.
First, the sound field is assumed to conform to ray acoustic theory, with only specular reflections from the water surface and the bottom taken into account. There is no absorption attenuation in the channel, only spherical spreading attenuation. The sound velocity is uniform throughout the channel, and the background noise is Gaussian white noise.
Next, the simulation procedure for the pulse parameters is given as follows: the width of the CW signals is 10 ms; the frequency
is taken as 10 kHz; the sound speed
is 1500 m/s; the depth of water, the transmitter, and the receiver are 200 m, 10 m, and 120 m, respectively; the reflection coefficient of the water surface is taken as −1 and that of the water bottom is 0.2; and the maximum number of reflections is set as 2. Coordinates of the sensors are set as shown in
Table 1.
Assuming that the platform moves along the direction of a heading angle of 30° from the initial position (250, 250) m with a constant speed of 2 m/s, the platform’s maneuvering situation is shown in
Figure 8.
Under the conditions of the above simulation parameter settings, the values of delay, amplitude, width, and Doppler shift of the 500-frame received pulse are shown in
Figure 9.
Let
represent the total number of received signals.
and
denote the numbers of correct identifications for direct signals and reflected signals, respectively. The accuracy rate used as the performance evaluation metric is then:
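One plausible reading of this metric, assuming it is the share of all received pulses (direct and reflected) that are classified correctly, is sketched below. The function name and the example counts are hypothetical.

```python
def accuracy_rate(n_direct_correct, n_reflected_correct, n_total):
    """Fraction of received pulses whose class was identified correctly."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_direct_correct + n_reflected_correct) / n_total

# Hypothetical counts: 900 received pulses, 6 of them misjudged.
print(f"{accuracy_rate(298, 596, 900):.2%}")   # 99.33%
```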
Next, the performance of the MFF-LF method is investigated and compared with that of the S-DT method. In the simulation, the delay, amplitude, width, Doppler shift, and fusion feature of the pulses are selected as the parameter set of the decision tree.
According to Equation (28), the construction of the fusion feature of the MFF-LF method is based on localization performance feedback, and is therefore affected by measurement errors of delay, sound velocity
, and sensor position errors
. The error parameters are set as shown in
Table 2. The other parameters are set as above.
Of the 500 frames of data, the first 200 frames are used as training data for the DT method and the last 300 frames are used as test data. The results for the different sensors are given in Table 3.
Table 3 shows that, overall, the MFF-LF algorithm proposed in this research identifies direct signals accurately and improves on S-DT. With the simulation settings above, the MFF-LF method distinguishes direct from reflected sounds perfectly for S2 and S4, whereas S1 and S3 have four and two incorrect pulse judgments, respectively. In contrast, the S-DT method fails to pick the direct signal correctly in 14, 22, 7, and 13 frames for the four sensors, respectively. A review of the results shows that, in specific frames, both algorithms misjudge the reflected sound nearest to the direct sound pulse, which increases errors in the subsequent data processing.
Next, we will further analyze the effect of measurement error variation on the performance of the MFF-LF method. It is assumed that
,
, and
vary from 0 to 2 ms, 0 to 5 m/s, and 0 to 4 m, respectively.
Figure 10 presents the trend of the variation.
Analyzing
Figure 10, it can be seen that the MFF-LF algorithm exhibits the highest level of sensitivity to variations in
compared to other measurement errors, as indicated by the three graphs. With the gradual increase of
, the accuracy rate of various sensors decreases to varying extents. Taking S1 as an example, if there is no
, the algorithm can correctly distinguish between direct sound and reflected sound with 100% precision. However, when
is 2 ms, the accuracy decreases to 95.27%. Then, upon studying
Figure 10b,c, it can be seen that the MFF-LF algorithm is less affected by
and
. The algorithm can maintain an accuracy of 98.46% even when they reach 5 m/s and 4 m, respectively.
To summarize, the simulation in this section uses data that follow the theoretical models to validate the feasibility of the proposed MFF-LF algorithm. Furthermore, it demonstrates that the identification accuracy of MFF-LF is higher than that of the S-DT method.
6. Field Test Data Processing
To further verify the performance of the MFF-LF algorithm, measured data from a field test are used to assess its robustness in comparison with the S-DT method.
The data in this section are derived from the cooperative high-frame-rate positioning system for underwater targets. The test was conducted on 24 October 2015 at Songhua Lake, Jilin Province. The water depth was 60 m and the average sound velocity was 1460 m/s according to the hydrological conditions measured on that day. The system consists of four buoys and cooperative targets, as is shown in
Figure 11. GPS antennas are mounted on the top of buoys and targets, which can be used to determine their positions in real time. Clock synchronization is maintained between the target and each buoy. The signal transmission period was 0.1 s. In each period, the target transmitted a CW pulse with a width of 10 ms. Buoys detected the signal and measured the corresponding parameters.
Four buoys were placed in a rectangular distribution to form a measurement area with coordinates 1#
m, 2#
m, 3#
m, and 4#
m. The depth of the transmitter on the target was 3 m. The trajectory of target is shown in
Figure 12.
Of the 400 frames of data, the first 150 frames are used as training data for the DT method and the last 250 frames are used as test data. The measured delays of the buoys are shown in Figure 13.
Figure 13 shows that, owing to the multipath channel, each buoy receives several pulses within the period, including direct sound and reflected sound. The performance of the MFF-LF method is then analyzed and compared with that of the S-DT method.
Table 4 shows the accuracy rate of the two methods and the root mean square error (RMSE) of localization.
Figure 14 shows the calculation results of the target trajectory.
Table 4 shows that the MFF-LF algorithm achieves a higher identification accuracy rate than the S-DT algorithm for every buoy. The processing time of the MFF-LF method is longer than that of the S-DT method because constructing the fusion features takes time. This time cost can be tolerated because it brings an improvement in direct sound recognition. Figure 14 further shows that the trajectories solved by the MFF-LF algorithm closely align with the theoretical values, with a localization RMSE of 2.7009 m, whereas that of the S-DT approach is 4.8734 m. Hence, the MFF-LF algorithm has performance advantages in terms of localization accuracy.
The field test further verifies the feasibility of the MFF-LF algorithm and its robustness in a field environment.
7. Discussion of the Limitations of MFF-LF
This section addresses the main limitation of the proposed method when applied to the issue of identifying the direct signals.
The key idea of the MFF-LF method is to establish the fusion feature based on the feedback of the localization performance. It is evident that the performance of MFF-LF will be affected by localization. We will analyze the limitations in terms of the following aspects.
Based on the investigation, measurement errors are one of the main factors affecting positioning accuracy, which further influence the calculation of fusion features. According to the results of the sensitivity analysis in the simulation, as measurement errors increase, the identification accuracy rate decreases to a certain extent. Hence, this algorithm has requirements for the measurement accuracy of parameters like time delay, sound speed, and sensor positions.
In underwater acoustic localization, the localization error varies with spatial position. Therefore, the performance feedback is also affected by the spatial position of the platform (receiver).
According to the processing times of the two methods shown in Table 4 for the field test data, extra time is needed to construct the fusion features. This time cost of the MFF-LF method can be tolerated because it brings an improvement in direct sound recognition.
8. Conclusions
Under the influence of the underwater acoustic multipath channel, the signal is distorted during transmission. The superposition of direct and reflected signals blends the received pulses, which complicates the identification of direct signals for acoustic ranging. This paper studies this problem and proposes an identification method. The following conclusions are obtained:
Through analyzing the pulse parameters under the influence of multipath channels, this paper explores the correlation between parameter characteristics and positioning performance. Multi-sensor fusion features are constructed based on the principle of geometric intersection. This feature enhances the available dimensions of parameters and increases the effective information of direct sound pulses, which is beneficial for improving the performance of the decision tree.
The computer simulation calculates the features of the received pulse parameters based on the models studied in this paper. The simulation is used to verify the viability of the MFF-LF method in solving the issue of the identification of direct CW signals. The processing of field test data provides additional evidence of the feasibility of the MFF-LF method in handling complex real-world environments.
Through a comparison with the S-DT method, the proposed method can significantly boost the accuracy of direct signal identification and therefore improve the accuracy of the subsequent localization solution.