The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data

Wan, Tingli; Li, Tonglin; Kang, Xinze; Zhang, Rongzhe

doi:10.3390/min14111095

Open AccessArticle

The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data

College of Geo-Exploration Sciences and Technology, Jilin University, Changchun 130012, China

^*

Author to whom correspondence should be addressed.

Minerals 2024, 14(11), 1095; https://doi.org/10.3390/min14111095

Submission received: 6 October 2024 / Revised: 22 October 2024 / Accepted: 27 October 2024 / Published: 29 October 2024

(This article belongs to the Special Issue New Methods and Technologies for Mineral Geological and Geophysical Exploration in China, 2nd Edition)

Download

Browse Figures

Versions Notes

Abstract

:

Three-dimensional direct current resistivity inversion is vital for mineral exploration, offering detailed electrical property distribution data of subsurface resources. To overcome the traditional algorithm’s reliance on initial models, its tendency to become stuck in local optima, and its low inversion resolution, this paper introduces an SDM-based 3D direct current resistivity inversion method. The Supervised Descent Method (SDM) is primarily used to solve optimization problems in nonlinear least squares, effectively capturing subsurface structural details and identifying electrical anomalies through the integration of machine learning and gradient descent techniques, thereby precisely revealing the complex electrical characteristics underground. Moreover, this study incorporates smooth regularization to enhance the stability and reliability of the inversion results. The paper demonstrates the feasibility and generalization of the SDM in 3D resistivity inversion through two sets of model calculations. A comparative analysis with traditional methods further proves the advantages of the SDM algorithm in improving inversion resolution and efficiency. Finally, applying the SDM algorithm to the Xiagalaoyi River mining area in Heilongjiang Province fully proves its optimized data processing capabilities and sensitivity to complex geological structures, providing a more precise and rapid technical approach for mineral resource exploration and assessment.

Keywords:

machine learning; supervised descent method (SDM); DC resistivity; mineral exploration

1. Introduction

Direct current resistivity inversion technology plays a significant role in the field of mineral resource exploration. It infers the geological structure and distribution of rocks and minerals underground by measuring the resistivity distribution on the surface or underground, providing key information for detecting potential mineral resources [1,2]. By leveraging the differences in resistivity of various geological bodies, direct current resistivity inversion can identify geological anomalies associated with mineral resources, which is of great significance for guiding drilling and assessing mineral resources. This technology, with its penetrating power into the deep crust, effectively addresses the challenges of deep ore deposit detection that traditional exploration methods struggle to reach, opening up new horizons for discovering hidden large ore deposits. As exploration demands become increasingly complex and the requirements for precision increase, two-dimensional inversion technology has certain limitations [3]. Therefore, three-dimensional inversion has become key to improving exploration efficiency. Compared to two-dimensional technology, three-dimensional inversion can construct more detailed three-dimensional geological models, allowing exploration personnel to examine underground targets in a comprehensive and multi-angle manner. This leads to a more accurate definition of ore body boundaries, the assessment of mineral reserves, and even revealing the extension direction of mineralization belts.

Traditional direct current resistivity inversion typically employs deterministic inversion algorithms, which generally obtain resistivity models by minimizing the residuals between the observed data and the forward modeling responses. Regularization techniques are often introduced during inversion to suppress the ill-posedness of the inverse problem, stabilizing the optimization process by imposing constraints on the model and data [4,5]. Inversion algorithms are commonly divided into linear and nonlinear categories. Currently, linear inversion algorithms hold a dominant position in practice. Petlon successfully applied ridge regression in 1978 to transform nonlinear problems into linear ones, providing a powerful tool for subsequent researchers [6]. Following this, Petrick and others used the alpha method to achieve three-dimensional resistivity inversion, significantly improving inversion efficiency [7]. Park [8] and Sasaki [9] developed three-dimensional direct current resistivity inversion based on the finite element method. Wu Xiaoping and others conducted forward and inverse studies on three-dimensional direct current resistivity inversion using the conjugate gradient method [10]. Liu Yang conducted research on 3D direct current resistivity forward and inverse modeling using unstructured grids with topography [11]. Liu Bin and others proposed an inequality-based least-squares method, enhancing the accuracy of three-dimensional direct current resistivity inversion [12]. Yang proposed a 3D resistivity inversion method based on the combination of deformed hexahedral meshes and unstructured tetrahedral meshes [13]. However, due to the inherently nonlinear nature of direct current resistivity inversion, linear algorithms have issues such as high dependence on initial values, susceptibility to local optima, and difficulty in flexibly incorporating prior knowledge. In contrast, nonlinear algorithms can alleviate these issues with their strong global search capabilities and have achieved good results in solving direct current resistivity inverse problems, such as genetic algorithms [14], simulated annealing algorithms [15], and ant colony algorithms [16]. However, these algorithms have relatively complex computational processes, requiring a significant amount of time and computational resources, leading to lower computational efficiency.

In recent years, machine learning techniques, represented by deep learning, have been widely used to solve various geophysical inversion problems [17,18]. These techniques involve constructing a training set offline to extract feature information and establish a mapping relationship between data and models, which is then used for online prediction [19]. This algorithm has opened up new avenues for introducing prior information to enhance the accuracy and convergence speed of inversion [20]. In the field of electrical methods, Liu designed a grid for surface resistivity inversion, ERSInvNe, based on U-Net, which combines deep weighting functions and smoothness constraints, enabling the network to quickly obtain effective inversion results [21]. Vu implemented three-dimensional surface resistivity inversion based on deep learning [22].

However, despite the certain successes of machine learning techniques in direct current resistivity inversion, these methods still have certain limitations. Firstly, in most inversion scenarios, there is insufficient prior information to construct a complete training set, and it is difficult to ensure consistency between the model spaces for training and prediction. This makes inversion methods that overly rely on training data face issues with insufficient generalization capabilities when dealing with the complex diversity of geological structures. Second, data-driven algorithms that do not take into account the physical distribution patterns often result in models that lack physical significance during prediction [23,24]. Therefore, there is still a need for a solution that combines data information with physical modeling principles to enhance the accuracy and efficiency of 3D direct current resistivity inversion.

The Supervised Descent Method (SDM) was initially proposed by Xiong for solving optimization problems in computer vision [25]. Subsequently, Guo applied it to one-dimensional Transient Electromagnetic (TEM) and two-dimensional Magnetotelluric (MT) inversion problems in the field of geophysics, demonstrating the feasibility of this method in this field [26,27]. Zhang further validated the effectiveness of SDM in handling three-dimensional gravity inversion issues [28]. Inspired by these developments, we have applied the SDM method to three-dimensional direct current resistivity inversion. During the offline training phase, a set of coefficient matrices is calculated using linear regression methods, which represent the descent directions in the optimization process. In the online prediction phase, these descent directions, along with the residuals of the current data, are used for iterative inversion to achieve online reconstruction of the model. Compared to previous inversion algorithms, the advantage of the Supervised Descent Method (SDM) is that it can flexibly integrate prior information through the construction of a training set during the offline phase, thereby enhancing the resolution of the inversion. In the online phase, this method uses pre-calculated gradient matrices for inversion, avoiding the direct solution of the inverse of the Jacobian and Hessian matrices, which improves the computational efficiency of the inversion. Moreover, the descent direction matrix obtained from the training set includes more global information during prediction, which helps to avoid the problem of falling into local minima. Compared with purely data-driven inversion methods, the model update amount in the prediction phase of SDM is obtained by the data residual weighted-descent direction, and it also incorporates the laws of physical modeling, ensuring the physical significance of the reconstructed model and enhancing the generalization capability and robustness of the algorithm, reducing the dependence of inversion results on the true model. To address the ill-posedness of the resistivity inversion problem, we have introduced smoothing regularization techniques in the SDM prediction phase to enhance the stability of the algorithm and prevent the occurrence of singular matrices.

The innovation of this paper lies in the fact that, although the SDM algorithm has been widely applied in one-dimensional and two-dimensional inversions in the field of geophysics, there is a relative lack of exploration in three-dimensional direct current electrical method inversion. In view of the increasing demand for three-dimensional inversion technology in the current exploration field, this paper innovatively extends the application of the SDM algorithm to three-dimensional direct current electrical method inversion, aiming to improve the accuracy and efficiency of the inversion. To this end, we designed three sets of model experiments for verification and applied the algorithm to the actual case of the Xiagala Ao Yi River mining area in Heilongjiang Province. The results confirmed the accuracy and effectiveness of the SDM algorithm in three-dimensional inversion.

2. Theoretical Foundation

2.1. Traditional Deterministic Direct Current Resistivity Inversion

In traditional deterministic direct current resistivity inversion methods, it is common to transform nonlinear problems into linear ones and then solve them using iterative methods.

This paper takes the Gauss–Newton method as an example.

Based on the Tikhonov regularization scheme, the objective function for three-dimensional direct current resistivity inversion can be represented in the following form:

Φ (m) = \frac{1}{2} {‖W_{d} (f (m) - d_{o b s})‖}^{2} + \frac{β}{2} {‖W_{m} (m - m_{r e f})‖}^{2}

(1)

where

f (m)

is the forward response of the model

m

,

W_{d}

is the observation data weight matrix,

d_{o b s}

represents the observed data,

β

is the regularization factor,

W_{m}

is the model regularization weighting matrix, and

m_{r e f}

is the reference model.

By linearizing Equation (1), we obtain the following:

Φ (m) = \frac{1}{2} {‖W_{d} ((f (m) + \frac{\partial d}{\partial m} \cdot Δ m) - d_{o b s})‖}^{2} + \frac{β}{2} {‖W_{m} (m - m_{r e f})‖}^{2}

(2)

To minimize Equation (2), we will take the derivative of the equation with respect to

m

and set it to zero. This yields the following updated formula:

(J^{T} {W_{d}}^{T} W_{d} J + β W_{m}^{T} W_{m}) \cdot Δ m = - (J^{T} {W_{d}}^{T} W_{d} (f (m) - d_{o b s}) + β W_{m}^{T} W_{m} (m - m_{r e f}))

(3)

where

H = J^{T} {W_{d}}^{T} W_{d} J + β {W_{m}}^{T} W_{m} g = J^{T} {W_{d}}^{T} W_{d} (f (m) - d_{o b s}) + β W_{m}^{T} W_{m} (m - m_{r e f})

The following can be obtained:

H \cdot Δ m = - g

(4)

where

J = \frac{\partial f}{\partial m}

is the first-order derivative of

f (m)

, known as the Jacobian matrix; H is an approximation of the Hessian matrix; and g is the gradient of the objective function. Given that

J

is a large, dense matrix, direct computation can be very time-consuming and require a significant amount of storage. To mitigate this, the conjugate gradient method can be employed to transform the computation into calculating the product of a vector with

J

and another vector with

J

.

By solving for the model correction term in Equation (4) and then using “line search” to update the model, the following model expression for the

k + 1

th iteration is obtained:

m_{k + 1} = m_{k} + α Δ m

(5)

where

α

is the linear search parameter, which can be used to control the magnitude of the update step, and it is typically taken as 1.

When using this method for inversion, the inversion results are greatly influenced by the initial model. If the initial model is close to the desired solution, only a few iterations are needed. A significant difference may lead to multiple iterations or non-convergence. Therefore, this method still has its limitations.

2.2. Supervised Descent Method for Direct Current Resistivity Inversion

Different from traditional deterministic inversion algorithms, the inversion based on the Supervised Descent Method is a nonlinear optimization approach that integrates machine learning with gradient descent. This algorithm consists of two phases: offline training and online prediction.

By extremizing the data-fitting term

{‖W_{d} (f (m) - d)‖}_{2}^{2}

in the objective function, the model update can be obtained, with the expression being as follows:

Δ m = (J^{T} J)^{- 1} J^{T} Δ d = K Δ d

(6)

where

Δ m = m - m_{0}

and

Δ d = d - f (m_{0})

. Based on the model update from Equation (6), the problem of minimizing the data-fitting term can be transformed into solving a least-squares problem.

Φ_{d} (m) = {‖Δ m - K Δ d‖}_{2}^{2}

(7)

From Equation (6), it can be seen that there is a mapping relationship

K = (J^{T} J)^{- 1} J^{T}

between the data residual

Δ d

and the model update quantity

Δ m

, which is the descent direction matrix and is related to the Jacobian matrix and the Hessian matrix. As can be seen from the formula, in traditional resistivity inversion,

K

often needs to be calculated through the Jacobian matrix at point A. In addition,

K

contains only local feature information of the objective function and relies on a suitable initial model to obtain the optimal inversion solution.

This paper employs an inversion algorithm based on SDM to construct N training models

\{m_{T}^{i}\} (i = 1, 2, 3 \dots N)

with prior information, where the superscript

i

indicates the number of the training model. From Equation (6), the following can be obtained:

Δ m_{T}^{i} = \tilde{K} Δ d_{T}^{i}

(8)

where

Δ m_{T}^{i} = m_{T}^{i} - m_{0}

and

Δ d_{T}^{i} = f (m_{T}^{i}) - f (m_{0})

. Using the method of Lagrange multipliers, the objective function can be written as

Φ_{d} (Δ m, \tilde{K}) = {‖Δ m - \tilde{K} Δ d‖}_{2}^{2} + λ \sum_{i = 1}^{N} {‖Δ m_{T}^{i} - \tilde{K} Δ d_{T}^{i}‖}_{2}^{2}

(9)

According to the above equation, it is known that

\tilde{K}

is obtained from a series of training models through learning. Compared with traditional algorithms, the descent direction matrix obtained in this way contains more global information, avoiding the problem of easily falling into local minima.

To improve computational efficiency and save memory consumption, Equation (9) can be divided into two stages, offline training and online prediction, that is,

{Φ_{d}}^{T} (\tilde{K}) = \sum_{i = 1}^{N} {‖Δ m_{T}^{i} - \tilde{K} Δ d_{T}^{i}‖}_{2}^{2}

(10)

In the offline training phase, by minimizing Equation (10), the descent gradient matrix

\tilde{K}

is obtained.

{Φ_{d}}^{P} (Δ m) = {‖Δ m - \tilde{K} Δ d‖}_{2}^{2}

(11)

In the prediction phase, the model is updated using the

\tilde{K}

obtained from the offline phase.

Offline training

Given a set of training models

M_{T}

and an initial model

M_{0}

, as follows:

M_{T} = [\begin{matrix} {m_{T}}^{1 T} \\ {m_{T}}^{2 T} \\ {m_{T}}^{3 T} \\ \cdot \\ \cdot \\ \cdot \\ {m_{T}}^{N T} \end{matrix}], M_{0} = [\begin{matrix} {m_{0}}^{1 T} \\ {m_{0}}^{2 T} \\ {m_{0}}^{3 T} \\ \cdot \\ \cdot \\ \cdot \\ {m_{0}}^{N T} \end{matrix}]

calculate their forward responses

D_{T}

and

D_{0}

to construct the training set, which includes the following:

D_{T} = [\begin{matrix} f (m_{T}^{1})^{T} \\ f (m_{T}^{2})^{T} \\ f (m_{T}^{3})^{T} \\ \cdot \\ \cdot \\ \cdot \\ f (m_{T}^{N})^{T} \end{matrix}], D_{0} = [\begin{matrix} f (m_{0}^{1})^{T} \\ f (m_{0}^{2})^{T} \\ f (m_{0}^{3})^{T} \\ \cdot \\ \cdot \\ \cdot \\ f (m_{0}^{N})^{T} \end{matrix}]

By minimizing Equation (10), a descent direction

\tilde{K}

can be obtained. For simplicity, the solution is directly integrated into the following formula:

\underset{\tilde{K}}{\arg \min} (\sum_{i = 1}^{N} {‖Δ M - \tilde{K} Δ D‖}_{2}^{2})

(12)

where

Δ M = M_{T} - M_{0}

and

Δ D = D_{T} - D_{0}

. However, due to the nonlinearity of the problem, it is not possible to establish a mapping relationship between model residuals and data residuals with a single

\tilde{K}

. We need multiple iterations to obtain a series of descent direction matrices

{\tilde{K}}_{k}

. Therefore, the above formula can be rewritten as

\underset{{\tilde{K}}_{k}}{\arg \min} (\sum_{i = 1}^{N} {‖Δ M_{k} - {\tilde{K}}_{k} Δ D_{k}‖}_{2}^{2})

(13)

where

Δ M_{k} = [\begin{matrix} Δ {m_{k}}^{1 T} \\ Δ {m_{k}}^{2 T} \\ Δ {m_{k}}^{3 T} \\ \cdot \\ \cdot \\ \cdot \\ Δ {m_{k}}^{N T} \end{matrix}], Δ D_{k} = [\begin{matrix} Δ {d_{k}}^{1 T} \\ Δ {d_{k}}^{2 T} \\ Δ {d_{k}}^{3 T} \\ \cdot \\ \cdot \\ \cdot \\ Δ {d_{k}}^{N T} \end{matrix}]

where

Δ m_{k}^{i} = m_{T}^{i} - m_{k}^{i}

and

Δ d_{k}^{i} = f (m_{T}^{i}) - f (m_{k}^{i})

, the superscript

i

denotes the index number of the training model, and the subscript

k

denotes the

k

-th iteration.

In the

k

-th iteration, the least-squares method is used to solve Equation (13) to obtain the average descent direction matrix:

{\tilde{K}}_{k} = (Δ {D_{k}}^{T} Δ D_{k})^{- 1} Δ D_{k} Δ M_{k}

(14)

In actual computation, the data residuals

Δ D_{k}

decrease rapidly with the convergence of iterations, and

Δ D_{k}

can easily become a singular matrix after a few iterations. Therefore, to enhance the stability of the solution process, based on the Levenberg–Marquardt algorithm, Equation (14) can be written in the following form:

{\tilde{K}}_{k} = (Δ {D_{k}}^{T} Δ D_{k} + μ_{k} I)^{- 1} Δ D_{k} Δ M_{k}

(15)

where

μ_{k}

is the regularization factor, primarily used to stabilize the solution, and its value is generally very small.

After calculating the

{\tilde{K}}_{k}

of the

k

-th iteration, the model update for the

k + 1

-th iteration can be obtained as follows:

M_{k + 1} = M_{k} + {\tilde{K}}_{k} Δ D_{k}

(16)

During the offline training process, we also need to record the model fit discrepancy and the data fit discrepancy for each iteration.

R M S m = \frac{1}{N} \sum_{i = 1}^{N} \frac{{‖Δ m_{k}^{i}‖}_{2}}{{‖m_{T}^{i}‖}_{2}}, R M S d = \frac{1}{N} \sum_{i = 1}^{N} \frac{{‖Δ d_{k}^{i}‖}_{2}}{{‖f (m_{T}^{i})‖}_{2}}

(17)

Repeat the aforementioned iterative process, and when

R M S m

and

R M S d

are sufficiently small or no longer decrease, or when the set maximum number of iterations is reached, the training concludes. The inversion process flowchart for the offline phase is shown in Figure 1.

The recorded descent directions

{\tilde{K}}_{1}, {\tilde{K}}_{2}, {\tilde{K}}_{3} \dots {\tilde{K}}_{k}

are stored for subsequent online prediction calculations. From the formula, we can see that during the iteration from the initial model to the training model,

Δ m

and

Δ d

with different fitting degrees will train different

\tilde{K}

. Therefore, we save the

\tilde{K}

from each iteration during the offline training phase to ensure that the objective function can descend in a direction that better fits the current fitting degree during the online prediction phase. Moreover, the independence of

{\tilde{K}}_{k}

from the model m parameters also means that this method does not require a high degree of similarity between the training model and the true model, that is, it has strong generalization capabilities.

B.: Online prediction

In this stage, we use the data residuals

Δ d

and

{\tilde{K}}_{k}

obtained from the training phase to update the predictive model. When predicting, we choose the same initial model as in the training phase, and the model update expression for the

k

-th iterative prediction is as follows:

m_{k} = m_{k - 1} + {\tilde{K}}_{k} (d - f (m_{k - 1}))

(18)

Due to the ill-conditioning and non-uniqueness of the 3D DC resistivity inversion problem, directly updating the model with the previous formula would result in false anomalies and an unreasonable fit discrepancy. Therefore, by reconstructing the online prediction objective function and incorporating a model-smoothing regularization term, the objective function for the

k

-th iteration can be expressed as follows:

Φ (m_{k}) = {‖m_{k} - m_{k - 1} - {\tilde{K}}_{k} (d - f (m_{k - 1}))‖}_{2}^{2} + λ \cdot {‖W_{m} (m_{k} - m_{r e f})‖}_{2}^{2}

(19)

By extremizing Equation (19), the model expression for the

k

-th iteration can be obtained as follows:

m_{k} = {[Ι + λ \cdot {W_{m}}^{T} W_{m}]}^{- 1} [{\tilde{K}}_{k} (d - f (m_{k - 1})) + m_{k - 1} + λ \cdot {W_{m}}^{T} W_{m} (m_{r e f})]

(20)

In this paper, the regularization factor

λ

is taken to be 0.1.

The introduction of the regularization term endows the predictive model in Equation (20) with certain physical significance, accurately describing the connection between the data and the model, and also gives the algorithm a higher generalization capability. Compared to the large system of linear equations in the Gaussian–Newton inversion method, the matrix

[Ι + λ \cdot {W_{m}}^{T} W_{m}]

in Equation (20) is a sparse matrix that can be quickly inverted, which improves the precision and efficiency of the inversion.

Furthermore, the SDM inversion does not rely on depth-weighting matrices or sensitivity-weighting matrices, reducing the bias caused by the uncertainty of the weighting coefficients. The flowchart for the prediction phase is shown in Figure 2.

3. Results of Algorithm Validation

3.1. Synthetic Data Example

In this section, three sets of experiments were designed to verify the effectiveness of the SDM in three-dimensional direct current resistivity inversion. This paper employs the RESINVM3D program [29], which is based on the finite volume method, to perform forward calculations on the model and generate synthetic data for constructing the training set. All experiments ensure that the theoretical model does not overlap with the training model.

3.1.1. Feasibility Analysis

To verify the feasibility of the SDM algorithm in direct current resistivity inversion and to discuss the impact of regularization constraints on inversion results, we designed two experiments in the prediction phase, one with smooth regularization and one without. In the experiments, we performed inversion using a model with measurements of 17 m × 17.5 m × 13.4 m and an inversion area divided into 34 × 35 × 16 cells. The transmitting electrodes and receiving electrodes of this model are arranged on the surface. A total of 96 electrodes are deployed, among which 38 electrodes are used only as receiving electrodes, and the remaining 58 electrodes are used as both transmitting electrodes and receiving electrodes. The model involves 41 current pairs, and each current pair records 93 independent data points. A total of 3813 data points are generated during the entire measurement process [29]. In Figure 3, we show the arrangement of the electrodes when the first current pair is supplying power.

The theoretical model for the experiment is shown in Figure 4, where we have buried two rectangular anomalies in the inversion area with resistivities of 20

Ω \cdot m

and 1000

Ω \cdot m

, respectively, and the background resistivity of the model is 200

Ω \cdot m

. During the training phase, we designed the following training models: One to three resistive blocks were buried in a uniform stratum, with the size of the resistive bodies being random but all rectangular, and their positions did not overlap. The resistivity ranges were 50–100

Ω \cdot m

and 500–1000

Ω \cdot m

, with a background resistivity of 200

Ω \cdot m

. The resistive bodies were randomly distributed underground and did not touch the model boundaries. A total of 3500 training models were generated for the experiment, and corresponding training sets were constructed.

The experiment used a homogeneous semi-space with a background value of 200

Ω \cdot m

as the initial training model. A total of five iterations were performed during the training phase to obtain a set of descent direction matrices

{\tilde{K}}_{k}

. In the prediction phase, this matrix was used for real-time online prediction, and the resulting slice images are shown in Figure 5. The model residuals of the inversion results are recorded in Table 1.

Upon examining the slice images and comparing the three cross-sectional views in Figure 2, it can be observed that, while both sets of result cross-sections are capable of identifying anomalies, Figure 5b, with the introduction of smooth regularization technology, not only reduces local false anomalies but also makes the boundaries of the anomalies clearer and the property values more accurate compared to Figure 5c. Furthermore, the comparison of model residuals in Table 1 further supports the viewpoint that incorporating regularization during the prediction phase can significantly enhance the resolution of the results. This improvement can balance the contradiction between data fitting and model smoothing, effectively reducing the generation of local false anomalies. This not only enhances the stability of the algorithm but also strengthens the reliability of the solution.

3.1.2. Generalization Capability Analysis

To analyze the inversion capability of the SDM algorithm for new samples, we designed this set of experiments to verify the algorithm’s dependency on prior information and test its generalization ability. The model size, grid division, and transmission–reception devices in this experiment are consistent with the previous one. In this group of experiments, we designed two theoretical models as shown in Figure 6. Model A involves burying a low-resistivity rectangular anomaly with a resistivity of 20

Ω \cdot m

in the inversion area, while Model B involves burying a high-resistivity diagonal-step anomaly with a resistivity of 1000

Ω \cdot m

in the inversion area; both have a background resistivity of 200

Ω \cdot m

.

During the offline training phase, we constructed a set of training models: 1–4 rectangular anomalies were buried in a uniform stratum, with their positions and sizes being random but not touching or overlapping with the model boundaries. The resistivity values of the anomalies ranged from 30 to 80

Ω \cdot m

, with a background resistivity of 200

Ω \cdot m

, and a total of 3500 training models were generated. Figure 7 shows some training samples. The anomalies in the training models are similar to those in Model A and are all low resistivity, thus indicating that this training set contains stronger prior information for Model A.

Then, in the prediction phase, the trained

{\tilde{K}}_{k}

matrices were used to perform inversions on the two theoretical models. The experiment used a homogeneous semi-space with a background value of 200

Ω \cdot m

as the initial model, and the results are shown in Figure 8.

The inversion results are shown in Figure 8, and it can be observed that using the same training set to obtain

{\tilde{K}}_{k}

for prediction, whether for Model A with strong prior information or Model B with weak prior information, highly similar inversion results to the true model can be achieved. Despite the absence of high-resistivity anomaly information in the training set, the inversion can still accurately recover the underground high-resistivity anomaly structure, with both shape and burial location essentially matching the true model. This indicates that the strength of prior information in the training set does not significantly affect the accuracy of the inversion results. From the above experiments, it can be concluded that the three-dimensional direct current resistivity inversion algorithm based on the SDM method is not highly dependent on prior information in the training set, which gives the algorithm good generalization capabilities.

When discussing the generalization of SDM, we should pay attention to one issue: the size of the descent direction matrix

{\tilde{K}}_{k}

is related to the grid partitioning and acquisition device of the trained model. This means that only when there are changes in the grid partitioning or device array of the new model do we need to adjust the training set and retrain

{\tilde{K}}_{k}

. However, if these conditions remain unchanged, the existing

{\tilde{K}}_{k}

can be directly used to predict new models with weak prior information and obtain effective inversion results. On this basis, it is meaningful to discuss the generalization of SDM.

3.1.3. Comparison Analysis of Gauss–Newton Method and SDM

In this set of experiments, we conducted a comparative analysis of the traditional Gauss–Newton method and the SDM method for three-dimensional direct current resistivity inversion in terms of resolution and efficiency. We designed a theoretical model of a diagonal-step anomaly as shown in Figure 9, with the model size being 300 m × 300 m × 112 m, dividing the inversion area into 22 × 22 × 16 rectangular prism elements, with the height of the elements increasing gradually with depth. The model’s transmitting and receiving electrodes are always located on the ground, with the center of the inversion area as the midpoint, and there are seven sets of positive and negative electrodes with a spacing of 40–160 m along the X and Y axes, all of which are located at grid nodes. In the theoretical model, the background resistivity is 200

Ω \cdot m

, and the resistivity of the anomaly is 1000

Ω \cdot m

(in this set of experiments, we take the logarithm base e of the resistivity values). In this model, the transmitting and receiving electrodes are still located on the ground surface. The experiment first performs a forward calculation on the model and then uses both algorithms to perform inversions on the model. During the training phase of the SDM, the following training models were generated according to the experimental requirements: 1–3 resistivity anomalies were buried in the inversion area, with each anomaly’s position and size being random but not touching or overlapping with the model boundaries. The resistivity values were randomly selected from the range of 20 to 1000

Ω \cdot m

. A total of 3500 training models were generated for the experiment, forming the training set.

Both sets of algorithms use a homogeneous semi-space with a background value of 200

Ω \cdot m

as the initial model, and both iterate five times during the inversion phase. The inversion model residuals and times are recorded in Table 2, respectively. The inversion results are shown in Figure 10.

First, we compare the inversion resolution of the two methods. Under the same conditions, whether in terms of structural morphology or anomaly amplitude, the slice in Figure 10b is more accurate than the slice in Figure 10c, better reflecting the contrast and outline of the true model. Furthermore, comparing the model residuals in Table 2 between the two sets of experiments, the RMSm value of the SDM inversion results is smaller, further proving that the SDM algorithm has higher resolution when applied to three-dimensional direct current resistivity inversion compared to the deterministic Gauss–Newton method.

Secondly, we analyze the computational efficiency of the two algorithms. From the time recorded in Table 2, we can see that without considering the time spent in the training phase, the SDM algorithm takes less time for a single inversion, achieving nearly a five-fold speedup. This is because the Gauss–Newton method requires solving a large system of linear equations (Equation (3)) during inversion, which is less computationally efficient and takes longer. In contrast, although the SDM inversion also requires solving a system of equations (Equation (20)), the solution is much simpler, greatly improving the inversion efficiency. Moreover, the SDM algorithm has good generalization capabilities during inversion. When predicting different structural models with the same grid division, we can use the same set of descent directions

{\tilde{K}}_{k}

without the need for repeated training, saving a significant amount of time. Therefore, the SDM algorithm shows higher computational efficiency compared to the Gauss–Newton method.

Thus, in this set of tests, we can conclude that in terms of both inversion accuracy and computational efficiency, the SDM inversion algorithm has more advantages than the traditional Gauss–Newton inversion.

3.2. Data Processing Results and Analysis of Actual Measurements

To verify the practicality of the SDM direct current resistivity inversion algorithm, this paper applies the algorithm to the data processing and interpretation of a lead–zinc mine area in the upper reaches of the Xiagalaiaoyi River, Hu Zhong District, Greater Khingan Range, northwestern Heilongjiang Province, as shown in Figure 11. According to the geological map, the study area is rich in metallic minerals, and the surrounding rocks are mainly composed of marble, diorite, hornfels, skarn, and other lithologies. Parts A and B of the map are the locations of two known skarn-type ore deposits [30].

The experimental data collection was carried out using a pseudo-tripolar sounding device. In the mining area, a total of 32 survey lines and 31 power electrodes were laid out. This paper mainly selected 12 horizontal and vertically parallel survey lines, along with 13 power electrodes. The survey lines were spaced 40 m apart, with 12 measurement points on each line, and the measurement points were also spaced 40 m apart. The layout of the power electrodes and receiving electrodes is shown in Figure 12, where the yellow points represent the locations of the receiving electrodes and the blue points represent the locations of the power electrodes. Power electrode A is laid out at an “infinite distance” perpendicular to the main survey profile of the study area, and electrode B is perpendicular and parallel to the survey lines. The inversion area is sized 920 m × 1000 m × 500 m, dividing the underground space of the inversion target area into 23 × 25 × 12 grid cells, with each cell measuring 40 × 40 × 40 m. The initial model for inversion is a homogeneous semi-space with a background resistivity of 1500

Ω \cdot m

.

In the SDM inversion, we designed a set of training models based on the size of the inversion area: 1–4 anomalies were randomly buried in the inversion area, with variable sizes and random positions. The resistivity values were randomly selected from the ranges of 10–200 and 6000 to 10,000

Ω \cdot m

(referencing the electrical properties of rocks in the mining area), generating a total of 3000 training models to construct the corresponding training set. Five iterations were performed in the offline phase to train and obtain a set of

{\tilde{K}}_{k}

matrices for prediction.

The model fitting error obtained from the inversion is 0.0935. Figure 13 is the profile of the resistivity inversion results along the X direction, and Figure 14 is the resistivity slice diagram of the inversion results at different depths along the z direction. From the inversion result profile, it can be observed that there is a notably high-resistivity area to the north of the study area, which is analyzed to be marble and schist. Similarly, a distinct high-resistivity area is also present to the south, which is inferred to be the country rock of granodiorite. In the slice diagram, A and B are the locations of the ore deposits identified on the geological map. Upon examining the slice map, a distinct high-resistivity area is observed to the south and southeast of the study area, which is analyzed to be the wall rock of granodiorite; combined with the geological map, it can be seen that there are marble and schist wall rocks in the central and northeast parts of the study area, all of which exhibit high-resistivity characteristics during the rock sampling phase, and this is reflected in the inversion slice map; a significant low-resistivity area is observed in the central part, near the contact zone between the marble and the rock body. Based on the formation mechanism of skarn-type polymetallic ores, it is inferred that there is a lead–zinc ore deposit here, and its location closely matches the known ore deposit A; a clear low-resistivity area appears to the north of the study area, which is inferred to be a skarn-type ore body formed by the contact between marble and granodiorite, and the location of this ore body is consistent with the known ore deposit B.

By comparing the inversion results with the geological information of the area, we found that the algorithm proposed in this paper produces inversion results that are quite close to the actual geological conditions. It can effectively delineate the regional location of known ore bodies and accurately define the boundaries between ore bodies and surrounding rocks.

4. Discussion

This paper addresses the issues present in traditional inversion algorithms by proposing a three-dimensional direct current resistivity inversion algorithm based on the Supervised Descent Method (SDM). By integrating machine learning with conventional gradient descent, the approach not only ameliorates the traditional algorithms’ difficulty in integrating prior information and their tendency to fall into local minima but also prevents the limitations of pure data-driven algorithms, which suffer from poor applicability and insufficient generalization capabilities. Through three sets of model calculations, this paper analyzes the performance and advantages of the SDM algorithm. The experimental results indicate that introducing smooth regularization technology during the SDM prediction phase helps to reduce local false anomalies and achieve more accurate inversion results. Moreover, the SDM algorithm is less dependent on prior information in the training data, demonstrating good generalization ability. A comparative analysis with the traditional Gauss–Newton method further proves that the SDM algorithm has better computational efficiency and inversion resolution when applied to three-dimensional direct current resistivity inversion. The application results in actual geological data show that the algorithm of this paper can effectively delineate the location of ore deposits and distinguish the boundaries between ore bodies and surrounding rocks, further illustrating the effectiveness and practicality of the SDM algorithm.

Author Contributions

R.Z. and T.W. designed the concept and algorithm. T.W. completed the implementation of the algorithm and the writing of the paper and supported the writing review and editing. T.L. and X.K. provided theoretical guidance and suggestions for the revision of the paper. T.L. provided the funding support and assistance necessary for the writing of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Fund from SinoProbe Laboratory (grant no.: Sinoprobe Lab 202403).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Legault, J.M.; Carriere, D.; Petrie, L. Synthetic model testing and distributed acquisition DC resistivity results over an unconformity uranium target from the Athabasca Basin, northern Saskatchewan. Geophysics 2008, 27, 46–51. [Google Scholar] [CrossRef]
Lu, J. 3D Electrical Resistivity Inversion and Imaging Technology for Coal Mine Water-Containing/Water-Conductive Structures. J. China Coal Soc. 2016, 41, 1–10. [Google Scholar] [CrossRef]
Wiwattanachang, N.; Giao, P.H. Monitoring crack development in fiber concrete beam by using electrical resistivity imaging. J. Appl. Geophys. 2011, 75, 294–304. [Google Scholar] [CrossRef]
Tarantola, A.; Valette, B. Generalized nonlinear inverse problems solved using the least squares criterion. Rev. Geophys. 1982, 20, 219–232. [Google Scholar] [CrossRef]
Tikhonov, A.N.; Arsenin, V.Y. Solutions of Ill-Posed Problems; VH Winston and Sons: Washington, DC, USA, 1977. [Google Scholar]
Pelton, W.H.; Rijo, L.; Swift, C.M. Inversion of Two-Dimensional Resistivity and Induced-Polarization Data. Geophysics 1978, 43, 788–803. [Google Scholar] [CrossRef]
Petrick, W.R., Jr.; Sill, W.R.; Ward, S.H. Three-Dimensional Resistivity Inversion Using Alpha Centers. Geophysics 1981, 46, 1148–1162. [Google Scholar] [CrossRef]
Park, S.K.; Van, G.P. Inversion of Pole-Pole Data for 3-D Resistivity Structure Beneath Arrays of Electrodes. Geophysics 1991, 56, 951–960. [Google Scholar] [CrossRef]
Sasaki, Y.; Saito, H. 3D Resistivity Inversion Using the Finite-Element Method. Geophysics 1994, 59, 131–140. [Google Scholar] [CrossRef]
Wu, X.-P.; Xu, G.-M. Study on 3-D Resistivity Inversion Using Conjugate Gradient Method. Chin. J. Geophys. 2000, 43, 420–427. [Google Scholar]
Liu, Y. Study of 3D Resistivity Modeling and Inversion Using Unstructured Grids and Their Applications. Ph.D. Thesis, University of Science and Technology of China, Hefei, China, 2016. [Google Scholar]
Liu, B.; Li, S.C.; Li, S.; Wang, J.; Zhang, Q.S.; Wang, J.; Li, X.; Zhang, Q.S. 3D resistivity inversion with inequality constraints using the least squares method and its algorithm optimization. Chin. J. Geophys. 2012, 55, 260–268. [Google Scholar]
Yang, H.; Li, T.; Zhang, R.; Dong, X.; Zhuang, Y. 3-D Joint Inversion of DC Resistivity and Time-Domain Induced Polarization With Structural Constraints in Undulating Topography. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5920412. [Google Scholar] [CrossRef]
Liu, B.; Li, S.C.; Nie, L.C.; Wang, J.; Zhang, Q.S. 3D Resistivity Inversion Using an Improved Genetic Algorithm Based on Control Method of Mutation Direction. J. Appl. Geophys. 2012, 87, 1–8. [Google Scholar] [CrossRef]
Santos, F.A.M.; Sultan, S.A.; Represas, P.; Sorady, A.L.E. Joint Inversion of Gravity and Geoelectrical Data for Groundwater and Structural Investigation: Application to the Northwestern Part of Sinai, Egypt. Geophys. J. Int. 2006, 165, 705–718. [Google Scholar] [CrossRef]
Zhang, L.-Y.; Liu, H.-F. The Application of ABP Method in High-Density Resistivity Method Inversion. Chin. J. Geophys. 2011, 54, 64–71. [Google Scholar]
Li, S.; Liu, B.; Ren, Y.; Chen, Y.; Yang, S.; Wang, Y.; Jiang, P. Deep-Learning Inversion of Seismic Data. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2135–2149. [Google Scholar] [CrossRef]
Puzyrev, V. Deep Learning Electromagnetic Inversion with Convolutional Neural Networks. Geophys. J. Int. 2019, 218, 817–832. [Google Scholar] [CrossRef]
Moghadas, D.; Behroozmand, A.A.; Christiansen, A.V. Soil Electrical Conductivity Imaging Using a Neural Network-Based Forward Solver; Applied to Large-Scale Bayesian Electromagnetic Inversion. J. Appl. Geophys. 2020, 176, 104012. [Google Scholar] [CrossRef]
Massa, A.; Oliveri, G.; Salucci, M.; Anselmi, N.; Rocca, P. Learning-by-Examples Techniques as Applied to Electromagnetics. J. Electromagn. Waves Appl. 2018, 32, 516–541. [Google Scholar] [CrossRef]
Liu, B.; Jiang, P.; Guo, Q.; Wang, C. Deep Learning Inversion of Electrical Resistivity Data. IEEE Trans. Geosci. Remote Sens. 2020, 58, 5715–5728. [Google Scholar] [CrossRef]
Vu, M.T.; Jardani, A. Convolutional Neural Networks with SegNet Architecture Applied to Three-Dimensional Tomography of Subsurface Electrical Resistivity: CNN-3D-ERT. Geophys. J. Int. 2020, 225, 1319–1331. [Google Scholar] [CrossRef]
Colombo, D.; Turkoglu, E.; Li, W.; Sandoval-Curiel, E.; Rovetta, D. Physics-Driven Deep-Learning Inversion with Application to Transient Electromagnetics. Geophysics 2021, 86, E209–E224. [Google Scholar] [CrossRef]
Wang, W.; Yang, F.; Ma, J. Velocity Model Building with a Modified Fully Convolutional Network. In SEG Technical Program Expanded Abstracts; SEG Library: Houston, TX, USA, 2018; pp. 2086–2090. [Google Scholar]
Xiong, X.; De La Torre, F. Supervised Descent Method and Its Applications to Face Alignment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 532–539. [Google Scholar]
Guo, R.; Li, M.; Fang, G.; Yang, F.; Xu, S.; Abubakar, A. Application of Supervised Descent Method to Transient Electromagnetic Data Inversion. Geophysics 2019, 84, E225–E237. [Google Scholar] [CrossRef]
Guo, R.; Li, M.; Yang, F.; Xu, S. Application of Supervised Descent Method for 2D Magnetotelluric Data Inversion. Geophysics 2020, 85, WA53–WA65. [Google Scholar] [CrossRef]
Zhang, R.; He, H.; Dong, X.; Li, T.; Liu, C.; Kang, X. Application of Supervised Descent Method for 3-D Gravity Data Focusing Inversion. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–10. [Google Scholar] [CrossRef]
Pidlisecky, A.; Haber, E.; Knight, R. RESINVM3D: A 3D Resistivity Inversion Package. Geophysics 2007, 72, H1–H10. [Google Scholar] [CrossRef]
Li, S.-P.; Li, T.-L.; Zheng, J.; Chen, H.-B.; Zhang, Z.-Y. Application of 3D inversion of gravity magnetism electric method in upper mining area of Xiagalai Aoyi River. Glob. Geol. 2020, 39, 437–443. [Google Scholar]

Figure 1. Offline training flowchart of the SDM DC inversion.

Figure 2. Online prediction flowchart of the SDM DC inversion.

Figure 3. Schematic diagram of the measuring point arrangement when the first current pair of the model is powered. AB are current electrodes, and MN are potential electrodes.

Figure 4. True resistivity model.

Figure 5. Resistivity inversion result slice images: (a) true model; (b) inversion results with regularization; (c) inversion results without regularization.

Figure 6. True resistivity model. (a) Model A (strong prior information); (b) Model B (weak prior information).

Figure 7. Partial training model samples.

Figure 8. Resistivity inversion result slice images: (a) true Model A; (b) inversion result of Model A; (c) true Model B; (d) inversion result of Model B.

Figure 9. True resistivity model.

Figure 10. Resistivity inversion result slice images: (a) true model; (b) SDM inversion result slice; (c) Gauss–Newton method inversion result slice.

Figure 11. Geological map of the study area. (A and B are the locations of two known skarn-type ore deposits) 0. Unconformity boundaries; 1. Tuff, welded tuff, brecciated tuff; 2. Angstroms, rhyolitic angstroms; 3. Rhyolite, rhyolitic welded tuff, andesitic rhyolite; 4. Quartzite, sandstone; 5. Schistose acid sandstone, schistoserhyolite; 6. Schistose neutral volcanic rock, andesite; 7. Slate, phyllite, chlorite phyllite, quartzite; 8. Thick-layered dolomite, locally interbedded with thin-layered dolomite; 9. Hornfels; 10. Silicified limestone; 11. Gabbroic vein; 12. Gabbro; 13. Gabbro vein; 14. Hornblende anorthosite; 15. Granite porphyry; 16. Plagiogranite; 17. Medium- to medium-fine-grained granodiorite; 18. Diorite porphyry; 19. Iron deposit; 20. Andesite; 21. Lead–zinc deposit; 22. Granite.

Figure 12. Schematic layout of measurement points in the study area.

Figure 13. Profile of resistivity inversion results along the X direction.

Figure 14. Resistivity cross-sections at different depths (A and B are the locations of two known skarn-type ore deposits). (a) −250 m; (b) −300 m; (c) −350 m; (d) −400 m.

Table 1. RMSm comparison of inversion results.

	Incorporating Smooth Regularization	Without Regularization
RSMm	0.0441	0.0998

Table 2. Comparison of inversion results and time consumption between the two methods.

	SDM	Gaussian–Newton
RSMm	0.0296	0.3801
Time cost(s)	39	1561

The table records the time taken during the inversion phase.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wan, T.; Li, T.; Kang, X.; Zhang, R. The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data. Minerals 2024, 14, 1095. https://doi.org/10.3390/min14111095

AMA Style

Wan T, Li T, Kang X, Zhang R. The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data. Minerals. 2024; 14(11):1095. https://doi.org/10.3390/min14111095

Chicago/Turabian Style

Wan, Tingli, Tonglin Li, Xinze Kang, and Rongzhe Zhang. 2024. "The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data" Minerals 14, no. 11: 1095. https://doi.org/10.3390/min14111095

APA Style

Wan, T., Li, T., Kang, X., & Zhang, R. (2024). The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data. Minerals, 14(11), 1095. https://doi.org/10.3390/min14111095

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

The Application of the Supervised Descent Method in the Inversion of 3D Direct Current Resistivity Data

Abstract

1. Introduction

2. Theoretical Foundation

2.1. Traditional Deterministic Direct Current Resistivity Inversion

2.2. Supervised Descent Method for Direct Current Resistivity Inversion

3. Results of Algorithm Validation

3.1. Synthetic Data Example

3.1.1. Feasibility Analysis

3.1.2. Generalization Capability Analysis

3.1.3. Comparison Analysis of Gauss–Newton Method and SDM

3.2. Data Processing Results and Analysis of Actual Measurements

4. Discussion

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI