1. Introduction
The wide field electromagnetic method (WFEM) is an important geophysical method: a controlled-source frequency-domain electromagnetic method developed in China with completely independent intellectual property rights [1]. With a complete theoretical system and mature instruments, WFEM overcomes the weak and random signals of the natural-source electromagnetic method and improves the signal-to-noise ratio and resolution by using periodic and pseudo-random signals as field sources [2]. WFEM improves work efficiency and anti-interference ability in the field, avoids the weak signals caused by observing only in the “far region”, organically integrates the “transition region” and the “far region”, and significantly increases the observation scope and detection depth. Geoelectric information at multiple frequencies can be sent and received at one time. WFEM defines the wide field apparent resistivity for the whole region by retaining the high-order terms in the calculation formula, and only one component needs to be observed to obtain the electric field curve and the apparent resistivity curve [3,4].
Electromagnetic noise is increasingly intense in modern cities and working areas. To a certain extent, noise limits the further development of the electromagnetic method, so most researchers in the field are concerned with denoising techniques. WFEM data are likewise disturbed by noise, and denoising is a prerequisite for using the collected data. In addition, high-quality WFEM data are the foundation of inversion calculation and geological interpretation. To improve the longitudinal resolution and exploration effect of WFEM detection technology, it is necessary to strengthen the research on WFEM data denoising methods. The traditional WFEM data processing method works in the frequency domain, which can improve data quality when interference is slight. When most frequency points are distorted by persistent strong noise, these frequency-domain denoising methods rely on the selection of the power spectrum, resulting in an unreliable data processing effect [5,6,7]. The signal processing methods in the time domain, in turn, all process the collected electromagnetic data as a whole [8,9,10], which improves the data quality to a certain extent but lacks a step that distinguishes signal from noise. Therefore, how to effectively eliminate WFEM noise with a new method is one of the key technical problems to be solved urgently. Data quality is evaluated comprehensively by the smoothness of the curve and the absence of abnormal changes at the frequency points. The WFEM 7-frequency wave data include the 7-0/7-1/7-2/7-3/7-4/7-5 frequency groups. In this paper, we focus on the data analysis of the 7-2 and 7-3 frequency groups, which correspond to 64, 32, 16, 8, 4, 2, 1 Hz and 48, 24, 12, 6, 3, 1.5, 0.75 Hz, respectively [11].
Here, the time domain feature information consists of time variables used to describe the signal waveform. Frequency domain feature analysis observes signal features through the frequency spectrum. Time-frequency domain features can represent multiple statistical values of the sample data to be tested. The fusion of multi-domain features can accurately characterize the details of WFEM data and quantitatively describe the difference between signal and noise. The support vector machine (SVM) is a generalized linear classifier that classifies binary data through supervised learning [12]. Its decision boundary is the maximum-margin hyperplane solved from the learning samples. The SVM usually solves two-class problems by establishing a hyperplane that separates positive and negative examples as far as possible [13]. When the WFEM data only need to be divided into signal and noise, the SVM is a suitable classification algorithm. However, the penalty factor and kernel parameter are the main factors that affect the SVM classification results [14,15]. The grey wolf optimizer (GWO) is a new type of swarm intelligence optimization algorithm [16]. The GWO searches for optima by simulating the social hierarchy and hunting behavior of grey wolves in nature. The algorithm divides a population into four social levels, and the individuals in the population represent solutions of the optimization problem [17]. However, the GWO algorithm converges slowly and easily falls into a local optimum, which makes the global optimum difficult to obtain. In this paper, we propose a new method to improve the GWO algorithm, which uses neighborhood search and location sharing to enhance the balance between local and global searches, maintain diversity and improve convergence speed.
In this paper, we propose a novel WFEM signal-noise identification method based on multi-domain features and an improved grey wolf optimizer support vector machine (IGWO-SVM). We constructed a WFEM sample library; extracted the peak-to-peak value and pulse factor as time domain features, the mean frequency as a frequency domain feature, and the wavelet singular entropy as a time-frequency domain feature; and analyzed the signal and noise features of the WFEM data. The IGWO algorithm was used to search for the best parameters of the SVM, which learned the features of the sample library and trained the data model. The results were compared with those of K-means clustering, fuzzy C-means (FCM) clustering and the K-nearest neighbor (KNN) algorithm. Then, the IGWO-SVM data model was used to directly remove the identified WFEM noise and retain the WFEM signal for data reconstruction. Finally, the digital coherence technique was used to extract the spectrum amplitude of the reconstructed data at the effective frequency points. We compared the convergence of multiple intelligent optimization algorithms and optimized SVM models. The results confirm that the fusion of multi-domain features and the IGWO-SVM can accurately and quickly recognize WFEM signals. We applied the proposed method to simulation experiments and measured WFEM data for validation. The electric field curves became more stable, and data quality was improved. The satisfactory performance in the applications and discussion verifies the effectiveness of the design and optimization method.
Note that the aim of this paper is to achieve high-precision WFEM signal-noise identification. The contributions of this paper are summarized as follows:
(1) The principles of the multi-domain features and the improved grey wolf optimizer are introduced, and the convergence of various optimization methods is illustrated.
(2) Four optimized SVM algorithms are quantified to demonstrate the advantages of the proposed method. Meanwhile, the K-means clustering, FCM clustering, KNN classification method and PSO-SVM method are compared.
(3) The validity of the proposed method is verified in many experiments and on measured WFEM data.
The remainder of this paper is arranged as follows: Section 2 introduces the multi-domain features, and the principle and convergence of the grey wolf optimizer and the improved grey wolf optimizer. Section 3 presents the experiments and results that illustrate the effectiveness of the proposed method. Section 4 and Section 5 show the applications and discussions in the measured WFEM data, respectively. Section 6 summarizes and highlights the major contributions of this paper.
2. Methodology
Because a pseudo-random signal is used as the transmitting source, WFEM data will inevitably be affected by electromagnetic noise, which distorts the signal waveform and changes the electric field value. The normal WFEM signal should have a pseudo-random waveform, which is simple, regular and easy to identify, so feature extraction and intelligent identification are well suited to separating WFEM signal from noise. Therefore, the fusion of multi-domain features and the IGWO-SVM was applied to WFEM signal identification. The proposed method operates on the time-series waveform. First, we extracted the peak-to-peak value and the pulse factor in the time domain, the mean frequency in the frequency domain and the wavelet singular entropy in the time-frequency domain, and used them to analyze the WFEM signal and noise features. Then, we compared the convergence of several intelligent optimization algorithms and SVM parameter optimization methods. Finally, the multi-domain features and the IGWO-SVM were used for WFEM signal-noise identification. The multi-domain features and the IGWO algorithm are introduced next.
2.1. Multi-Domain Features
Feature extraction is the process of extracting the feature information of an object by computer. It is mainly used in image processing, signal processing and machine learning to describe information. Multi-domain features are extracted from the time domain, the frequency domain and the time-frequency domain. In this paper, we focus on the peak-to-peak value and pulse factor in the time domain, the mean frequency in the frequency domain, and the wavelet singular entropy in the time-frequency domain.
The peak-to-peak value is the difference between the maximum and minimum values of the signal, which is expressed as follows:
$$X_{pp} = \max_{1 \le i \le N} x(i) - \min_{1 \le i \le N} x(i) \tag{1}$$
The pulse factor is the peak of the signal divided by its mean absolute value and can also indicate whether the signal contains an instantaneous spike. It is calculated as follows:
$$I = \frac{\max_{1 \le i \le N} |x(i)|}{\frac{1}{N}\sum_{i=1}^{N} |x(i)|} \tag{2}$$
where $x(i)$ is the time domain signal and $N$ is the length of the signal. The dimensional feature value changes with the difference between WFEM signal and noise, while the dimensionless index shows the noisy state of the electromagnetic data more directly. Therefore, dimensional and dimensionless features are used together.
Frequency domain analysis applies the Fourier transform to the time domain signal. The frequency components and the time domain signal are interrelated and complement each other, and the frequency domain representation is more concise. Frequency domain features are extracted from the signal spectrum obtained by FFT. Among them, the mean frequency is expressed as follows:
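The time and frequency domain features described above can be sketched in Python. The peak-to-peak value and pulse factor follow the verbal definitions in the text. The mean frequency is computed here as the amplitude-weighted average of the DFT bin frequencies, which is an assumption about the definition and not necessarily the formula used in this paper. The function names, the sampling rate and the test record are hypothetical.

```python
import cmath
import math

def peak_to_peak(x):
    """Difference between the maximum and minimum sample."""
    return max(x) - min(x)

def pulse_factor(x):
    """Peak absolute value divided by the mean absolute value."""
    mean_abs = sum(abs(v) for v in x) / len(x)
    return max(abs(v) for v in x) / mean_abs

def mean_frequency(x, fs):
    """Amplitude-weighted mean of the one-sided DFT bin frequencies.

    ASSUMPTION: this is one common definition of "mean frequency";
    the paper's own formula may differ.
    """
    n = len(x)
    num = 0.0
    den = 0.0
    for k in range(n // 2 + 1):
        # Naive DFT bin; adequate for short illustrative records.
        coeff = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        amp = abs(coeff)
        num += (k * fs / n) * amp
        den += amp
    return num / den

# Hypothetical record: a clean 8 Hz sine sampled at 64 Hz for one second.
fs = 64
clean = [math.sin(2 * math.pi * 8 * i / fs) for i in range(fs)]

# The same record with a single impulsive disturbance added.
spiky = list(clean)
spiky[10] += 5.0
```

On this toy record the spike leaves most samples untouched but raises both the peak-to-peak value and the pulse factor, which is the kind of contrast between signal and noise that the time domain features are meant to capture.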
Wavelet singular entropy is the most typical feature in the time-frequency domain [18]. Based on the theory of singular value decomposition, the wavelet coefficient matrix obtained from the wavelet transform of the signal is decomposed into a series of singular values, which reflect the basic features of the original coefficient matrix. The uncertainty of the singular value set is analyzed with the statistical properties of information entropy, which gives a definite measure of the complexity of the original signal.
The singular value decomposition (SVD) of any $m \times n$ order matrix $A$ can be expressed as follows:
$$A = U \Lambda V^{T} \tag{4}$$
where $U$ and $V$ are orthogonal matrices of $m \times m$ order and $n \times n$ order, respectively. $\Lambda$ is the diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_r)$, whose non-negative diagonal elements are arranged in descending order and are the singular values of matrix $A$. SVD can represent the $m \times n$ order matrix $A$ of rank $r$ as the sum of $r$ $m \times n$ order submatrices of rank 1. In this way, the SVD of the wavelet transform coefficient matrix reflects the time-frequency distribution features of the signal.
To quantitatively describe the frequency components and distribution features of the signal, the wavelet singular entropy is defined as follows:
$$WSE = \sum_{i=1}^{r} \Delta p_i, \qquad \Delta p_i = -\frac{\lambda_i}{\sum_{j=1}^{r} \lambda_j} \log \frac{\lambda_i}{\sum_{j=1}^{r} \lambda_j} \tag{5}$$
where $\Delta p_i$ is the incremental wavelet singular entropy of the $i$th nonzero singular value $\lambda_i$, and $r$ is the number of nonzero singular values. The simpler the signal being analyzed, the more concentrated the energy is in a few modes, and the smaller the wavelet singular entropy. Conversely, the more complex the signal, the more dispersed the energy, and the larger the wavelet singular entropy.
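The entropy step of the wavelet singular entropy can be illustrated with a minimal Python sketch. It assumes the singular values of the wavelet coefficient matrix are already available (the wavelet transform and SVD are not shown) and uses the natural logarithm, which is an assumption. The function name and the two sets of singular values are hypothetical.

```python
import math

def singular_entropy(singular_values):
    """Entropy of the normalized nonzero singular values.

    ASSUMPTION: natural logarithm; zero singular values are skipped.
    """
    nonzero = [s for s in singular_values if s > 0]
    total = sum(nonzero)
    entropy = 0.0
    for s in nonzero:
        p = s / total              # share of this singular value
        entropy -= p * math.log(p)  # incremental entropy of this value
    return entropy

# Energy concentrated in one mode (a "simple" signal).
concentrated = singular_entropy([10.0, 0.1, 0.1, 0.1])

# Energy spread evenly over four modes (a "complex" signal).
dispersed = singular_entropy([2.5, 2.5, 2.5, 2.5])
```

The evenly spread case reaches the maximum possible value for four modes, while the concentrated case stays well below it, matching the qualitative behavior described above.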
2.2. Grey Wolf Optimizer
Inspired by the predation behavior of grey wolves, Mirjalili et al. proposed the grey wolf optimizer (GWO) algorithm [16]. By simulating the predation behavior of grey wolves, the GWO performs optimization based on the mechanism of pack cooperation [19]. The GWO algorithm is characterized by a simple structure with few parameters to adjust, ease of implementation, adaptive convergence factors and an information feedback mechanism. It can balance local optimization and global search, so it performs well in terms of precision and convergence speed.
Grey wolves encircle prey during a hunt, and the encircling behavior can be modeled as follows:
$$\vec{D} = |\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)| \tag{6}$$
$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D} \tag{7}$$
Equation (6) is the distance between an individual and the prey, and Equation (7) is the location update of the grey wolf, where $t$ is the current iteration, $\vec{A}$ and $\vec{C}$ denote coefficient vectors, $\vec{X}_p$ is the position vector of the prey, and $\vec{X}$ indicates the position vector of the grey wolf. The vectors $\vec{A}$ and $\vec{C}$ are calculated as follows:
$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad \vec{C} = 2\vec{r}_2 \tag{8}$$
where the components of $\vec{a}$ are linearly decreased from 2 to 0 over the course of iterations and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0,1]. The hunt is usually guided by the $\alpha$ wolves, that is, the leaders, followed by the $\beta$ and $\delta$ wolves, which can also occasionally participate in hunting. However, in the search space, we have no idea about the location of the optimum solution.
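Thus, the hunt, in which the hunters move toward the prey or solution over the provided search space, is the main approach of the GWO algorithm. To simulate the hunting behavior of grey wolves, we assume that $\alpha$ (the best candidate solution), $\beta$ and $\delta$ have better knowledge about the potential location of the prey. Thus, we save the three best solutions obtained so far and oblige the other search agents to update their positions according to the positions of the best search agents. The mathematical representation of such hunts is as follows:
$$\vec{D}_\alpha = |\vec{C}_1 \cdot \vec{X}_\alpha - \vec{X}|, \quad \vec{D}_\beta = |\vec{C}_2 \cdot \vec{X}_\beta - \vec{X}|, \quad \vec{D}_\delta = |\vec{C}_3 \cdot \vec{X}_\delta - \vec{X}| \tag{9}$$
$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad \vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \quad \vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta \tag{10}$$
$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{11}$$
where $\vec{D}_\alpha$, $\vec{D}_\beta$ and $\vec{D}_\delta$ represent the distances between the current candidate grey wolf and the $\alpha$, $\beta$ and $\delta$ wolves, respectively. $\vec{X}_\alpha$, $\vec{X}_\beta$ and $\vec{X}_\delta$ are the positions of $\alpha$, $\beta$ and $\delta$, respectively. $\vec{C}_1$, $\vec{C}_2$ and $\vec{C}_3$ are random vectors, and $\vec{X}$ is the position of the current grey wolf. $\vec{A}_1$, $\vec{A}_2$ and $\vec{A}_3$ are random vectors. The $\omega$ wolves, considered to be the remaining possible solutions in the pack, follow the other solutions and update themselves with the three best solutions as expressed in Equation (11). $\vec{X}(t+1)$ is the final position of the $\omega$ wolves.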
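The hunting update described above can be sketched as a small, self-contained Python routine. This is a minimal illustration of the standard GWO loop under stated assumptions, not the implementation used in this paper: positions are clamped to the search bounds, the three leaders are kept as the best solutions found so far, and the sphere function, the bounds and the seed are hypothetical choices for demonstration.

```python
import random

def sphere(x):
    """Hypothetical test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def gwo(objective, dim, lower, upper, n_wolves=10, max_iter=100, seed=0):
    """Minimal grey wolf optimizer sketch (minimization, n_wolves >= 3)."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    leaders = []   # three best solutions obtained so far
    history = []   # best objective value per iteration
    for t in range(max_iter):
        leaders = sorted(leaders + wolves, key=objective)[:3]
        history.append(objective(leaders[0]))
        a = 2.0 * (1.0 - t / max_iter)  # decreases linearly from 2 toward 0
        moved = []
        for x in wolves:
            new_x = []
            for d in range(dim):
                estimates = []
                for leader in leaders:  # alpha, beta, delta
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - x[d])       # distance to leader
                    estimates.append(leader[d] - A * D)  # leader-guided move
                value = sum(estimates) / 3.0             # average of the three
                new_x.append(min(max(value, lower), upper))  # ASSUMPTION: clamp
            moved.append(new_x)
        wolves = moved
    leaders = sorted(leaders + wolves, key=objective)[:3]
    return leaders[0], objective(leaders[0]), history

best, value, history = gwo(sphere, dim=5, lower=-10.0, upper=10.0)
```

Because the leaders are retained across iterations, the recorded best value never increases, which makes the convergence curve easy to inspect.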
2.3. Improved Grey Wolf Optimizer
In the GWO, the search process is guided by the three best wolves in each iteration, which produces strong convergence toward these wolves [20]. However, the GWO suffers from a lack of population diversity, an imbalance between exploitation and exploration, and premature convergence [21]. Neighborhood search and location sharing are used to improve the grey wolf optimizer, yielding the IGWO, which enhances the global optimization ability and avoids premature convergence [22].
The IGWO algorithm benefits from a new movement strategy, namely a dimensional-learning based hunting (DLH) search strategy, which is inherited from the individual hunting behavior of wolves in nature. DLH uses different methods to construct a neighborhood for each wolf, and neighboring information can be shared among wolves. Dimension learning used for the DLH search strategy enhances the balance between local and global searches and maintains diversity.
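In the DLH search strategy, each dimension of the new position of a wolf $X_i(t)$ is calculated by learning from its different neighbors and from a wolf randomly selected from the top three wolves ($\alpha$, $\beta$ and $\delta$). First, a radius $R_i(t)$ is calculated using the Euclidean distance between the current position $X_i(t)$ and the candidate position $X_{i\text{-}GWO}(t+1)$ as follows:
$$R_i(t) = \| X_i(t) - X_{i\text{-}GWO}(t+1) \| \tag{12}$$
The neighborhood of $X_i(t)$, denoted by $N_i(t)$, is constructed as follows:
$$N_i(t) = \{ X_j(t) \mid D_i(X_i(t), X_j(t)) \le R_i(t),\; X_j(t) \in Pop \} \tag{13}$$
where $N_i(t)$ is the neighborhood with respect to the radius $R_i(t)$, $D_i$ is the Euclidean distance between $X_i(t)$ and $X_j(t)$, and $Pop$ is the population.
Once the neighborhood of $X_i(t)$ is constructed, multi-neighbor learning is performed as follows:
$$X_{i\text{-}DLH,d}(t+1) = X_{i,d}(t) + rand \times \left( X_{n,d}(t) - X_{r,d}(t) \right) \tag{14}$$
where $X_{n,d}(t)$ is the $d$th dimension of a random neighbor $X_n(t)$ selected from $N_i(t)$, $X_{r,d}(t)$ is the $d$th dimension of a random wolf $X_r(t)$ chosen from the $\alpha$, $\beta$ and $\delta$ wolves, and $rand$ is a random number in [0,1].
The new position of $X_i(t)$ is then selected and updated as follows:
$$X_i(t+1) = \begin{cases} X_{i\text{-}GWO}(t+1), & f(X_{i\text{-}GWO}) < f(X_{i\text{-}DLH}) \\ X_{i\text{-}DLH}(t+1), & \text{otherwise} \end{cases} \tag{15}$$
where $f$ is the fitness function.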
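The DLH step described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the random wolf is drawn from the three leaders, as the text states, the GWO candidate is replaced by a simple stand-in (the mean of the leaders), and the function names, objective and population are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math
import random

def dlh_candidate(population, i, x_gwo, leaders, rng):
    """Build the DLH candidate for wolf i from its neighborhood."""
    x_i = population[i]
    # Radius: Euclidean distance between current position and GWO candidate.
    radius = math.dist(x_i, x_gwo)
    # Neighborhood: all wolves within the radius (always contains x_i itself).
    neighbors = [x_j for x_j in population if math.dist(x_i, x_j) <= radius]
    candidate = []
    for d in range(len(x_i)):
        x_n = rng.choice(neighbors)  # random neighbor
        x_r = rng.choice(leaders)    # random wolf among alpha, beta, delta
        candidate.append(x_i[d] + rng.random() * (x_n[d] - x_r[d]))
    return candidate

def select_position(x_gwo, x_dlh, objective):
    """Keep whichever candidate has the better (lower) fitness."""
    return x_gwo if objective(x_gwo) < objective(x_dlh) else x_dlh

def sphere(x):
    """Hypothetical test objective."""
    return sum(v * v for v in x)

rng = random.Random(1)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(6)]
leaders = sorted(population, key=sphere)[:3]
# ASSUMPTION: stand-in for the GWO candidate, here the mean of the leaders.
x_gwo = [sum(w[d] for w in leaders) / 3.0 for d in range(3)]
x_dlh = dlh_candidate(population, 0, x_gwo, leaders, rng)
chosen = select_position(x_gwo, x_dlh, sphere)
```

Since the selection keeps the better of the two candidates, a wolf never moves to a position worse than the one the plain GWO update would have produced.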
To verify the optimization performance of the IGWO algorithm, we use four benchmark functions to compare convergence accuracy. The IGWO is compared with the GWO, the particle swarm optimization (PSO) [23], the multi-verse optimizer (MVO) [24], the moth-flame optimization (MFO) [25], the artificial bee colony (ABC) algorithm [26], the sine cosine algorithm (SCA) [27] and the imperialist competitive algorithm (ICA) [28]. Note that the population size is 10, and the maximum number of iterations is 100. Figure 1 shows the convergence comparison for the four benchmark functions.
From Figure 1, we can see that the solution accuracy and convergence speed of the IGWO are better than those of the other intelligent optimization algorithms for the same population size and number of iterations. The convergence curves show that the IGWO algorithm has clear advantages in optimization ability and stability on the benchmark functions, escapes local optima more readily and achieves a stronger global optimization ability.
5. Discussions
The electromagnetic method is an important geophysical exploration method, which mainly includes the natural source electromagnetic method and the artificial source electromagnetic method. Compared to the natural source electromagnetic method, the artificial source electromagnetic method overcomes the weak and random signal in the natural field source and further improves the signal-to-noise ratio and resolution of signal. Incidentally, the field source is mainly composed of periodic square waves and pseudo-random signals. With the development of modern industry and technology, electromagnetic interference has become stronger and stronger, and noise suppression has always been a key problem for many electromagnetic workers, restricting the development of technical methods to a certain extent. The WFEM uses an artificial field source with a very powerful signal transmitter. In actual experiments, the observed signals were inevitably affected by various types of strong interference. To improve the longitudinal resolution and exploration effect of the WFEM detection technology, it is necessary to strengthen the research on the denoising method of the WFEM data. Therefore, how to effectively eliminate the noisy data of WFEM by using the new method is one of the key technical problems that urgently needs to be solved.
In recent years, time domain and frequency domain processing methods have been proposed for WFEM data processing, but techniques for recognizing WFEM signal and noise from the time domain waveform have rarely been proposed. Therefore, a WFEM signal-noise identification method based on multi-domain features and the IGWO-SVM has been proposed in this paper. We first introduced the characteristic parameters used to describe WFEM signal and noise and the improved intelligent optimization algorithm, and compared the convergence of multiple intelligent algorithms (Figure 1) to provide effective parameters for optimizing SVM classification. In the experiment, we introduced the signal and noise types and their frequency spectra in the WFEM sample library (Figure 2). At the same time, we conducted parameter selection and performance comparison of the four optimized SVM algorithms on the sample library signals, highlighting the advantages of the IGWO-SVM (Table 1 and Table 2). We further used clustering and classification algorithms to divide the sample library (Figure 3). To verify the identification effect, we conducted a comparison and quantitative analysis of the simulated synthetic data (Figure 4 and Table 3). In the applications, a noise-free measured site was selected, artificial noise was added, and the electric field curves were compared (Figure 5). When the measured data were affected by noise, the proposed method identified the noise with high accuracy and reconstructed a high-quality effective signal (Figure 6). The effectiveness of the proposed method was further verified by comparing the electric field curves before and after data processing (Figure 7).
In summary, the proposed method remedies the insufficient signal-noise identification of existing methods, reduces excessive denoising of valid signals, and improves data quality. Applying the results of this paper to the inversion and interpretation of geophysical data will be the focus of further research.
6. Conclusions
A novel WFEM signal identification method has been developed, which uses multi-domain feature parameters to analyze the WFEM signal and noise features and applies the IGWO-SVM to identify signal and noise, while reconstructing high-quality WFEM data.
The proposed method has been validated through the feature extraction of the sample library signals, the convergence performance of the IGWO, the optimal parameter search ability and classification effect of the IGWO-SVM, and the analysis of the simulated and measured WFEM data. The results show that WFEM signal and noise can be accurately identified. The reconstructed signal and its spectral information conform to the essential features of WFEM pseudo-random data, and the electric field curve is also more stable. The proposed method lays a foundation for feature extraction, improved intelligent optimization and WFEM signal processing. However, when the distinction between signal and noise becomes fuzzy or complex, or the data are subject to persistent strong interference, how to identify and denoise with high precision will be the focus of future research.