1. Introduction
Seismic surveys are designed to maintain a consistent grid of sources and receivers. In practice, however, regularly sampled surveys are rare because of logistical obstacles and economic constraints. These limitations lead to large shot and receiver sampling intervals, which produce poorly and irregularly sampled seismic data along the spatial coordinates, with gaps containing no recorded traces. Seismic reconstruction methods can be divided into four main classes: signal processing-based methods, wave equation-based methods, machine learning-based methods, and rank reduction-based methods.
Most of the methods in the signal processing-based category are multidimensional and use prediction filters [1,2,3] or transform domains such as the Fourier transform [4,5,6], Radon transform [7,8], or curvelet transform [9]. Some hybrid techniques combine the Fourier transform with prediction error filters [10] to improve interpolation beyond aliasing. Vassallo et al. [11] proposed multichannel interpolation by matching pursuit (MIMAP), which interpolates highly aliased data without requiring prior information about the linearity of seismic events or the seismic wavefield model. Ghaderpour [12] proposed a multichannel anti-leakage least-squares spectral analysis (MALLSSA) method that incorporates the spatial gradients of seismic data into the anti-leakage least-squares spectral analysis (ALLSSA) framework to regularize seismic data beyond aliasing.
Wave-equation-based algorithms perform an implicit migration–demigration pair. Stolt [13] introduced a reconstruction algorithm that regularizes a dataset by mapping it to physical space using migration–demigration. Fomel [14] proposed a finite-difference offset-continuation filter.
In recent years, machine learning has attracted much attention in geophysical studies, and deep learning (DL) in particular has had a strong influence on signal and image processing. However, although very powerful, machine learning for geophysical problems is not yet mature, and applications often ignore the complications and subtleties of real data processing. Wang et al. [15] proposed a ResNet block design to interpolate aliased seismic data. Siahkoohi et al. [16] used generative adversarial networks (GANs) to reconstruct heavily sub-sampled seismic data, giving up linearity assumptions in favor of an adaptive non-linear model. Convolutional neural networks (CNNs) have been applied to the reconstruction of missing traces in pre-stack seismic images [17]. A new framework for training artificial neural networks (ANNs) was presented by Mikhailiuk and Faul [18] to restore corrupted multidimensional seismic signals. Wang et al. [19] utilized an eight-layer residual learning network (ResNet) with improved back-propagation for interpolation.
The key assumption of rank reduction-based methods is that linear events in a clean seismic dataset are low-rank in the time and frequency domains, whereas noise and missing traces increase the rank of the data [20]. In the Fourier domain, rank reduction algorithms reduce the rank of Hankel/Toeplitz matrices generated from frequency slices. The singular spectrum analysis (SSA) method, developed by Sacchi [21], Trickett and Burroughs [22], Oropeza and Sacchi [23], and Bayati and Siahkoohi [24], works by rank reduction of the Hankel matrix with an iterative algorithm in the frequency domain. Gao et al. [25] extended the SSA method to higher-dimensional seismic data and called it multichannel singular spectrum analysis (MSSA). Naghizadeh and Sacchi [26] proposed interpolating regularly missing traces with a de-aliased MSSA method. Kreimer et al. [27] developed an algorithm for pre-stack data via low-rank tensor completion, and Kreimer and Sacchi [28] used the higher-order SVD for rank reduction of the pre-stack seismic data tensor. Ely et al. [29] utilized a statistical test to control the complexity of regularization by tuning a single regularization parameter. Kumar et al. [30] developed rank-reduction techniques using matrix completion (MC) for seismic data reconstruction. Rekapali et al. [31] studied the application of multi-channel time slice singular spectrum analysis (MTSSSA) for 3D seismic data de-noising in the time domain and discussed the selection of the input processing window length. Siahshahr et al. [32] studied a damped OptShrink that selects the rank of the block Hankel matrix without knowing the final rank. Carozzi and Sacchi [33] proposed a method called I-MSSA that interpolates seismic data at its exact coordinates, overcoming the errors introduced by rough binning.
One of the necessary assumptions of rank-reduction-based techniques is that the Hankel matrix generated from clean and complete seismic data containing plane waves is low-rank, with rank equal to the number of plane waves present in the data. However, gaps and noise increase the rank, breaking this assumption. When dealing with data from complex geologies, rank selection becomes difficult. One way to deal with this problem is to apply the algorithm in local windows. However, it is often hard to find a proper window size, because it is difficult to decide whether the structure in a local window is linear, and equally difficult to approximate the rank of each window. Choosing the wrong rank leads to failure: if the rank is overestimated, a significant amount of residual noise remains; if it is underestimated, signal components are lost and the result is distorted.
In the presence of noise, when the signal-to-noise ratio is low, data reconstructed by TSVD tend to contain a significant amount of residual noise, because part of the noise projects onto the signal subspace. Nadakuditi [34] introduced an algorithm that provides a weighted approximation, where the weights shrink and threshold the singular values. The method mitigates the effect of rank overestimation and, if the rank is properly estimated, better recovers the weak components of the signal subspace. Chen et al. [35] introduced a damping operator that shrinks the singular values containing significant amounts of residual noise. Their method achieved better results than the Cadzow rank-reduction method [22,23] in the reconstruction of highly noisy, incomplete 5-D data. In this paper, we propose a method that automatically selects a rank for each local window. Wu and Bai [36] proposed such a method for 2-D data; we extend it to 3-D data and propose an adaptive weighting rank-reduction (AWRR) method that selects the rank of the block Hankel matrix automatically to reconstruct and denoise 3-D seismic data simultaneously. The strategy is to choose the second cutoff in the singular-value spectrum of the block Hankel matrix. We tested the efficiency of the proposed method by applying it to both synthetic and real seismic data.
2. Materials and Methods
Consider a block of 3-D seismic data $D(t, x, y)$ in the $t$–$x$–$y$ domain with $N_t \times N_x \times N_y$ samples, where $t = 1, \dots, N_t$, $x = 1, \dots, N_x$, and $y = 1, \dots, N_y$. In the frequency domain, the data are represented as $D(f, x, y)$, where $f$ denotes temporal frequency. Each frequency slice of the data at a given frequency $f$ can be represented by the following matrix:

$$\mathbf{D}(f) = \begin{bmatrix} D(1,1) & D(1,2) & \cdots & D(1,N_x) \\ D(2,1) & D(2,2) & \cdots & D(2,N_x) \\ \vdots & \vdots & \ddots & \vdots \\ D(N_y,1) & D(N_y,2) & \cdots & D(N_y,N_x) \end{bmatrix}. \quad (1)$$

To avoid notational confusion, let us omit the argument $f$. Then, a Hankel matrix is constructed from each inline of $\mathbf{D}$; for the $i$th inline, the Hankel matrix is:

$$\mathbf{H}_i = \begin{bmatrix} D(i,1) & D(i,2) & \cdots & D(i,N_x-m+1) \\ D(i,2) & D(i,3) & \cdots & D(i,N_x-m+2) \\ \vdots & \vdots & \ddots & \vdots \\ D(i,m) & D(i,m+1) & \cdots & D(i,N_x) \end{bmatrix}. \quad (2)$$
At this step, we have $N_y$ Hankel matrices, one for each inline. The multichannel singular spectrum analysis (MSSA) then constructs a block Hankel matrix $\mathbf{M}$ from the Hankel matrices $\mathbf{H}_i$:

$$\mathbf{M} = \begin{bmatrix} \mathbf{H}_1 & \mathbf{H}_2 & \cdots & \mathbf{H}_{N_y-n+1} \\ \mathbf{H}_2 & \mathbf{H}_3 & \cdots & \mathbf{H}_{N_y-n+2} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{H}_n & \mathbf{H}_{n+1} & \cdots & \mathbf{H}_{N_y} \end{bmatrix}. \quad (3)$$

The size of $\mathbf{M}$ is $mn \times (N_x - m + 1)(N_y - n + 1)$, where $m = \lfloor N_x/2 \rfloor + 1$ and $n = \lfloor N_y/2 \rfloor + 1$. The integers $m$ and $n$ are chosen to make the Hankel matrices $\mathbf{H}_i$ and the block Hankel matrix $\mathbf{M}$ square, or close to square. As we can see, including the two spatial dimensions of the 3-D cube for each frequency can make the block Hankel matrix very large.
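As an illustration, the Hankel and block Hankel construction can be sketched in a few lines of Python (the helper names `hankel_matrix` and `block_hankel` are ours, not from the paper):

```python
import numpy as np

def hankel_matrix(row, m):
    """Hankel matrix of one inline of a frequency slice: m x (len(row) - m + 1)."""
    n_cols = len(row) - m + 1
    return np.array([row[i:i + n_cols] for i in range(m)])

def block_hankel(D, m, n):
    """Block Hankel matrix M from a 2-D frequency slice D (ny x nx).

    Each inline (row) of D gives an m x (nx - m + 1) Hankel matrix H_i;
    the H_i are then arranged in Hankel order along the crossline axis.
    """
    ny, nx = D.shape
    H = [hankel_matrix(D[i], m) for i in range(ny)]           # one Hankel matrix per inline
    block_cols = ny - n + 1
    rows = [np.hstack([H[i + j] for j in range(block_cols)])  # Hankel arrangement of blocks
            for i in range(n)]
    return np.vstack(rows)

# Example: a 6 x 6 frequency slice, m = n = 4 (close-to-square choice)
rng = np.random.default_rng(0)
D = rng.standard_normal((6, 6))
M = block_hankel(D, m=4, n=4)
print(M.shape)  # (16, 9): (m*n) x ((nx-m+1)*(ny-n+1))
```

Even for this tiny slice, the block Hankel matrix is larger than the slice itself, which illustrates why the method becomes memory-intensive for realistic cube sizes.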
In seismic data, the observed data can be written as

$$\mathbf{D}_{obs} = \mathbf{T} \circ \mathbf{D}_0 + \mathbf{N}, \quad (4)$$

where $\mathbf{D}_{obs}$ represents the observed seismic data, $\mathbf{D}_0$ the full noiseless data, $\mathbf{N}$ the random noise and the residuals, and $\mathbf{T}$ the sampling matrix formed of zeros and ones ($\circ$ denotes element-wise multiplication). Using the MSSA algorithm, we can write the block Hankel matrix of the observed data as:

$$\mathbf{M}_{obs} = \mathbf{M}_0 + \mathbf{M}_N, \quad (5)$$

where $\mathbf{M}_0$ represents the low-rank block Hankel matrix of the desired signal and $\mathbf{M}_N$ denotes the noise component, which includes the gaps as well. In MSSA, the decomposition of the block Hankel matrix in the rank-reduction step using TSVD is:

$$\mathbf{M}_{obs} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{H} = \sum_{i} \sigma_i \mathbf{u}_i \mathbf{v}_i^{H}. \quad (6)$$
If one knows the desired rank of the Hankel matrix, the desired signal is estimated by:

$$\hat{\mathbf{M}} = \sum_{i=1}^{k} \sigma_i \mathbf{u}_i \mathbf{v}_i^{H}, \quad (7)$$

where $\hat{\mathbf{M}}$ is the estimated signal and $k$ is the predefined rank of the Hankel matrix, equal to the number of linear events in each local window.
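The rank-$k$ truncated SVD estimate described above amounts to a few lines of numpy (a sketch; the name `tsvd` is ours):

```python
import numpy as np

def tsvd(M, k):
    """Rank-k truncated-SVD approximation: keep the k largest singular triplets."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k]

# A rank-2 "signal" matrix plus weak noise: truncating at k = 2 suppresses
# most of the noise while keeping the signal.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))  # rank-2 signal
M = A + 0.01 * rng.standard_normal((20, 15))                     # noisy observation
A_hat = tsvd(M, 2)
print(np.linalg.norm(A_hat - A) / np.linalg.norm(A))  # small relative error
```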
Let us investigate the effect of linear events on the singular values in each frequency.
Figure 1a shows the cube of 3-D data.
Figure 1b indicates an inline of the data and
Figure 1c represents a slice of data in a cross-line direction.
We investigate the singular-value distribution of the data by converting it to the $f$–$x$–$y$ domain and generating the block Hankel matrix for each frequency slice.
Figure 2a shows the block Hankel matrix for the frequency slice of 20 Hz.
Figure 2b displays the block Hankel matrix generated in the frequency slice of 50 Hz. According to
Figure 2a,b, it is clear that the block Hankel matrix for the frequency of 20 Hz is smoother than the one for 50 Hz. This indicates that block Hankel matrices at higher frequencies require a higher rank for recovery than those at lower frequencies.
The singular-value spectrum of the data’s block Hankel matrices over the full frequency range is presented in Figure 3, which shows the first 15 singular values. The figure illustrates that the number of nonzero singular values of the block Hankel matrix equals the number of linear events at low frequencies but increases with frequency.
Similarly,
Figure 4a represents the bar plot of the normalized singular values for the frequency slice of 20 Hz.
Figure 4b shows the zoomed image for the first 20 singular values.
Figure 4c shows the bar plot of the singular values for the frequency slice of 50 Hz.
Figure 4d is the zoomed image for the first 20 singular values. From the figures, one can see that for the frequency of 20 Hz, the maximum difference in energy between two adjacent singular values occurs at the third singular value, which equals the number of linear events. However, for the frequency slice of 50 Hz, there is a second group of nonzero singular values, meaning that more singular values must be kept to recover all the frequencies completely.
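The link between linear events and the number of significant singular values can be verified numerically. In the idealized noise-free sketch below (a synthetic two-event gather with geometry and a 30 Hz Ricker wavelet chosen by us, not taken from the paper), each frequency slice is a sum of two complex exponentials in space, so its Hankel matrix has exactly two significant singular values:

```python
import numpy as np

# Two linear events in a small t-x gather: in the f-x domain each frequency
# slice is a sum of two complex exponentials, so its Hankel matrix has rank 2.
nt, nx, dt = 256, 24, 0.004
t = np.arange(nt) * dt
ricker = lambda tau: (1 - 2 * (np.pi * 30 * tau)**2) * np.exp(-(np.pi * 30 * tau)**2)
d = np.zeros((nt, nx))
for p, t0 in [(0.002, 0.3), (-0.001, 0.6)]:            # slope (s/trace), intercept (s)
    for ix in range(nx):
        d[:, ix] += ricker(t - t0 - p * ix)

D = np.fft.rfft(d, axis=0)                             # to the f-x domain
freqs = np.fft.rfftfreq(nt, dt)
m = nx // 2 + 1                                        # close-to-square Hankel matrix
row = D[np.argmin(abs(freqs - 30.0))]                  # slice near the dominant frequency
H = np.array([row[i:i + nx - m + 1] for i in range(m)])
s = np.linalg.svd(H, compute_uv=False)
print(np.sum(s > 0.01 * s[0]))  # prints 2: one significant singular value per event
```

With noise added or with curved events, the tail of the spectrum lifts and the count grows, which is exactly the behavior discussed above.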
To analyze the effect of each singular value on the energy of each frequency for the data presented in
Figure 1, the data were decomposed into their singular matrices and recovered with each singular value from 1 to 15 separately. Then, the energy per frequency (power spectrum) of the data recovered with each singular value was calculated. In
Figure 5, the red line represents the energy per frequency, and the blue line shows the estimated mean frequency of the power spectrum.
Figure 5a is the result of recovering data with the first singular value.
Figure 5b shows the energy per frequency for the recovered data with the second singular value.
Figure 5c is the energy per frequency for the recovered data, including just the third singular value. The estimated mean frequency for these first three singular values is 77 Hz.
Figure 5d–f represent the energy per frequency for the recovered data, including just the 4th, 5th, and 6th singular values, respectively. The estimated mean frequency for these singular values is 90 Hz.
Figure 5g–i indicates the energy per frequency for the recovered data, including just the 7th, 8th and 9th singular values, respectively. The estimated mean frequency for these singular values is 88 Hz.
Figure 5j–l show the energy per frequency for the recovered data, including just the 10th, 11th, and 12th singular values, respectively. It is clear that most of the useful signal energy is recovered with the first three singular values. Nevertheless, the shift in the mean frequency from 77 to 90 Hz indicates that the higher frequencies of the data leak into the later singular values. We conclude from this test that, despite the residual errors remaining in the presence of additive noise, the rank of the block Hankel matrix must be chosen greater than the number of linear events in each processing window of frequency and space.
To find the best rank, i.e., the one that minimizes residual errors, we tested various datasets containing different numbers of events with different slopes and amplitudes. The best results are obtained when the rank is chosen from the second cutoff of the singular-value spectrum instead of the first.
In all tested data, the first cutoff is visible as a distinct change in the energy of the singular values. However, sometimes there is no abrupt change in amplitude at the second cutoff, especially in the presence of a high level of random noise, so finding the second cutoff can be challenging when the signal quality is poor. By carefully examining where the first and second cutoffs occur in clean and complete data, we conclude that there is a linear relationship between them. The second cutoff can be estimated as follows:
where $\sigma_i$ is the $i$th singular value of the block Hankel matrix in each frequency slice. The operator in Equation (8) finds the rank at the point where the two following singular values become more scattered; its output is the second cutoff of the singular-value spectrum of the block Hankel matrix in each frequency slice. This second cutoff can be regarded as the optimal rank of the block Hankel matrix, i.e., the rank that minimizes the Frobenius-norm difference between the approximated and the exact signal components.
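As a rough illustration of cutoff picking (not the paper's exact operator, whose definition is given by Equation (8)), one can rank the drops between consecutive normalized singular values and take the two largest as the first and second cutoffs:

```python
import numpy as np

def cutoffs(s, n_cut=2):
    """Illustrative stand-in for a cutoff operator: return the ranks (1-based)
    of the n_cut largest drops between consecutive normalized singular values."""
    s = np.asarray(s, float) / s[0]
    drops = s[:-1] - s[1:]                        # energy drop after each rank
    return np.sort(np.argsort(drops)[::-1][:n_cut] + 1)

# Spectrum with a strong first cutoff at rank 3 and a weaker second at rank 6
s = np.array([10.0, 9.5, 9.0, 2.0, 1.8, 1.6, 0.4, 0.35, 0.3, 0.25])
print(cutoffs(s))  # [3 6]
```

A detector of this kind degrades gracefully when the second drop is weak, which mirrors the difficulty described above for noisy data.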
Substituting Equation (8) into the rank-reduction step gives:
By applying Equation (9) in the rank-reduction step of MSSA, the rank-reduced block Hankel matrix is reconstructed. The next step is averaging the anti-diagonals of the recovered block Hankel matrix, which recovers the signal in the Fourier domain for each frequency slice. The rank-reduction step reconstructs the missing traces gradually and can therefore be combined with an iterative algorithm that replaces the missing traces with the reconstructed ones. Such an iterative algorithm is a practical approach for simultaneous random noise attenuation and seismic data amplitude reconstruction. The algorithm is written as follows:
where $f$ is the temporal frequency, $\mathbf{1}$ is a matrix of ones, $a$ is a weight factor that decreases linearly with the iterations, $\mathcal{F}$ denotes the Fourier transform, and $\mathcal{F}^{-1}$ the inverse Fourier transform. $\mathcal{H}$ is the Hankelization operator that generates the block Hankel matrix, $\mathcal{R}$ is the rank-reduction operator, and $\mathcal{A}$ is the averaging anti-diagonal operator.
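A minimal sketch of this reinsertion loop for the simpler 2-D (single spatial axis) case is given below, with our own helper names and a linearly decreasing weight; the 3-D version would use the block Hankel matrix instead of the Hankel matrix:

```python
import numpy as np

def hankelize(row, m):
    """m x (len(row) - m + 1) Hankel matrix of one frequency slice."""
    return np.array([row[i:i + len(row) - m + 1] for i in range(m)])

def average_antidiagonals(H):
    """Recover a spatial series from a Hankel matrix by anti-diagonal averaging."""
    m, c = H.shape
    out = np.zeros(m + c - 1, dtype=H.dtype)
    for k in range(m + c - 1):
        out[k] = np.mean([H[i, k - i] for i in range(max(0, k - c + 1), min(m, k + 1))])
    return out

def reinsertion(d_obs, T, rank, n_iter=10):
    """Iterative rank reduction with reinsertion of the observed traces.

    d_obs : (nt, nx) gather with zeros at the missing traces
    T     : (nx,) sampling vector of ones (observed) and zeros (missing)
    """
    nt, nx = d_obs.shape
    m = nx // 2 + 1                               # close-to-square Hankel matrix
    d = d_obs.copy()
    for it in range(n_iter):
        a = 1.0 - it / max(n_iter - 1, 1)         # weight decreasing linearly to 0
        D = np.fft.rfft(d, axis=0)                # t-x -> f-x domain
        for f in range(D.shape[0]):
            H = hankelize(D[f], m)
            U, s, Vh = np.linalg.svd(H, full_matrices=False)
            Hr = (U[:, :rank] * s[:rank]) @ Vh[:rank]   # rank-reduction step
            D[f] = average_antidiagonals(Hr)
        d_rec = np.fft.irfft(D, n=nt, axis=0)
        d = a * d_obs + (1 - a * T) * d_rec       # reinsert the observed traces
    return d

# Demo: one linear event, three traces removed, rank-1 reconstruction
nt, nx = 64, 16
t = np.arange(nt)
d_true = np.zeros((nt, nx))
for ix in range(nx):
    d_true[:, ix] = np.exp(-((t - (20 + ix)) / 3.0) ** 2)
T = np.ones(nx)
T[[3, 7, 12]] = 0.0
d_obs = d_true * T
d_rec = reinsertion(d_obs, T, rank=1)
```

At observed traces the update mixes the original data with the filtered estimate; at missing traces it keeps the reconstruction, so the gaps are filled progressively over the iterations.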
Selecting the rank using Equation (9) recovers all the signal components. One advantage of this rank-reduction method is that it is adaptive and data-driven: no parameter needs to be set for the rank-reduction step in each processing window. Moreover, choosing the second cutoff instead of the first leads to a better reconstruction of the high frequencies. However, selecting large ranks leaves more residual errors in the recovered data, which is why a weighting operator is needed to reduce the effect of noise on the projected signal components while still recovering the higher frequencies.
We are looking for a weighting operator to adjust the singular values of the block Hankel matrix and obtain the best estimate of the desired signal. Nadakuditi [34] proposed an algorithm for low-rank matrix denoising that can be summarized as follows: compute the SVD of the signal-plus-noise Hankel matrix; choose $k$, the best effective rank that can represent the signal; for each of the leading $k$ singular values, compute a weight from the $D$-transform of the remaining (noise-only) singular values; and form the result as the weighted sum of the corresponding rank-one components.
In this algorithm, $D(z)$ is computed as:

$$D(z) = \left[\frac{1}{n}\,\mathrm{tr}\!\left(z\left(z^{2}\mathbf{I}-\boldsymbol{\Theta}\boldsymbol{\Theta}^{H}\right)^{-1}\right)\right]\left[\frac{1}{m}\,\mathrm{tr}\!\left(z\left(z^{2}\mathbf{I}-\boldsymbol{\Theta}^{H}\boldsymbol{\Theta}\right)^{-1}\right)\right], \quad (11)$$

where $D(\cdot)$ represents the $D$-transform, $\mathrm{tr}(\cdot)$ denotes the trace operator of the input, and $\boldsymbol{\Theta}$ contains the noise-only singular values of the $m \times n$ block Hankel matrix. $D'(z)$, the derivative of $D(z)$ with respect to $z$, is obtained by differentiating Equation (11) with the product rule.
The $D$-transform describes how the distribution of the singular values of a sum of independent matrices is related to the distributions of the singular values of the individual matrices [37]. Benaych-Georges and Nadakuditi [38] showed that the principal singular values and vectors of a large matrix can be separated into the singular values of the signal matrix and the $D$-transform of the limiting noise-only singular-value distribution. From Equation (11), the weighting operator can be written as:

$$w_i = -2\,\frac{D(\sigma_i)}{D'(\sigma_i)}, \qquad i = 1,\dots,k. \quad (14)$$
We can substitute the weighting operator obtained from Equation (14) into Equation (9) to enhance the results of the rank-reduction step as:
where the result is the weighted, reduced-rank block Hankel matrix. Missing traces can then be interpolated completely by applying the iterative algorithm. This adaptive weighting rank-reduction (AWRR) method selects the rank of the block Hankel matrix automatically while reducing the effect of the residual errors.
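The weighting step can be illustrated with a small numerical sketch in the spirit of Nadakuditi's OptShrink (square-matrix case, empirical $D$-transform, numerical derivative; this simplification is ours, not the authors' implementation). The weights come out positive and strictly below the observed singular values, i.e., they shrink rather than truncate:

```python
import numpy as np

def optshrink_weights(s, rank):
    """Weights w_i = -2 D(s_i) / D'(s_i), with the D-transform estimated from
    the noise-only singular values (square-matrix case, numerical derivative).
    Illustrative sketch in the spirit of Nadakuditi's OptShrink."""
    theta = s[rank:]                                    # noise-only singular values
    D = lambda z: np.mean(z / (z**2 - theta**2))**2     # empirical D-transform
    eps = 1e-6
    Dp = lambda z: (D(z + eps) - D(z - eps)) / (2 * eps)   # numerical D'(z)
    return np.array([-2 * D(z) / Dp(z) for z in s[:rank]])

# Rank-2 signal plus weak noise: the weighted estimate shrinks the leading
# singular values instead of keeping them unchanged as TSVD does.
rng = np.random.default_rng(2)
n = 40
U0, _ = np.linalg.qr(rng.standard_normal((n, 2)))
V0, _ = np.linalg.qr(rng.standard_normal((n, 2)))
X = (U0 * [6.0, 5.0]) @ V0.T                            # rank-2 signal
M = X + 0.05 * rng.standard_normal((n, n))              # noisy observation
U, s, Vh = np.linalg.svd(M)
w = optshrink_weights(s, 2)
X_hat = (U[:, :2] * w) @ Vh[:2]                         # weighted rank-2 estimate
print(w < s[:2])  # [ True  True]: both weights shrink the observed singular values
```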
The weighting operator satisfies the equation below [34]:
To understand the effect of the weighting operator on the singular values of the block Hankel matrices, the predefined rank-reduction method (TRR) and the weighting rank-reduction method (WRR) with predefined rank were applied to a block Hankel matrix.
Figure 6a shows clean and complete 3-D synthetic data having four linear events.
Figure 6b indicates the same 3-D data with added random noise and missing traces.
Figure 6c,d show slices of the data in the inline direction for the cubes of Figure 6a,b, respectively. Figure 6e,f are the corresponding slices of the cubes of Figure 6a,b in the crossline direction.
Figure 7 shows the block Hankel matrices of the data in Figure 6 for the frequency slice of 20 Hz. Figure 7a is the block Hankel matrix of the clean and complete data. Figure 7b corresponds to the block Hankel matrix of the data with added noise and missing traces. Figure 7c shows the block Hankel matrix after applying the predefined rank = 12 for ten iterations. Figure 7d indicates the block Hankel matrix after applying the weighting operator to adjust the singular values using TSVD with predefined rank = 12 after ten iterations. We can see from Figure 7 that the block Hankel matrix of the WRR method is smoother than the result of applying the predefined rank = 12 alone after ten iterations.
Figure 8 corresponds to the spectrum of the first 25 singular values of the block Hankel matrices of
Figure 6.
Figure 8a represents the singular-value spectrum of the block Hankel matrix of the clean and complete data for the frequency of 20 Hz; the first three singular values indicate the useful signal related to the coherent events in the data. Figure 8b shows the singular-value spectrum of the noisy and incomplete data, and Figure 8c the singular-value spectrum of the noisy and incomplete data after applying TSVD with rank = 12.
Figure 8d indicates the singular-value spectrum of the noisy and incomplete data after applying TSVD with rank = 12 and the weighting operator. We can see that the singular spectrum of the data after applying the weighting operator is much closer to the spectrum of the desired signal in
Figure 8a.
Figure 9 corresponds to the block Hankel matrices of the same data at the constant frequency slice of 50 Hz. Figure 9a shows the block Hankel matrix of the clean and complete data, and Figure 9b is the block Hankel matrix of the data with added noise and a portion of the traces removed. Figure 9c displays the recovered block Hankel matrix after applying TRR with rank = 4 for ten iterations. Figure 9d indicates the block Hankel matrix after applying WRR with predefined rank = 4. Figure 9e represents the recovered block Hankel matrix of the adaptive rank-reduction (ARR) method, and Figure 9f is the block Hankel matrix recovered by the weighted ARR (AWRR) method. Generally, the results of AWRR and WRR are smoother than the outputs of TRR and ARR, while the results of ARR and AWRR preserve more details than those of TRR and WRR.
Figure 10 corresponds to the spectrum of the first 25 singular values of the block Hankel matrices of
Figure 9.
Figure 10a represents the singular-value spectrum of the block Hankel matrix of the clean and complete data for the frequency of 50 Hz. The abrupt drop in energy between the third and the fourth singular values relates to the useful signal; however, further nonzero singular values remain that are also related to the useful signal.
Figure 10b is the singular-value spectrum of the block Hankel matrix of noisy and incomplete data.
Figure 10c shows the singular-value spectrum of the noisy and incomplete data after TRR with rank = 4.
Figure 10d displays the singular-value spectrum of the noisy and incomplete data after applying WRR with rank = 4.
Figure 10e depicts the singular-value spectrum after applying ARR, and Figure 10f indicates the singular-value spectrum of the noisy and incomplete data after the implementation of the AWRR method. From these figures, one can see that the result of AWRR is the most comparable to the corresponding spectrum of the desired signal in Figure 10a.