Research on Oil and Gas-Bearing Zone Prediction and Identification Based on the SVD–K-Means Algorithm—A Case Study of the WZ6-1 Oil-Bearing Structure in the Beibu Gulf Basin, South China Sea

Chen, Zhilong; Wang, Renyi; Xu, Biao; Zhu, Jianghang

doi:10.3390/en17225771


School of Petrochemical Engineering and Environment, Zhejiang Ocean University, Zhoushan 316022, China

* Author to whom correspondence should be addressed.

Energies 2024, 17(22), 5771; https://doi.org/10.3390/en17225771

Submission received: 21 October 2024 / Revised: 11 November 2024 / Accepted: 18 November 2024 / Published: 19 November 2024

(This article belongs to the Section H: Geo-Energy)


Abstract


The WZ6-1 oil-bearing structure in the Beibu Gulf Basin of the South China Sea has well-developed faults with significant variations in fault sealing capacity, resulting in a complex and highly variable distribution of oil, gas, and water, and a limited understanding of hydrocarbon accumulation patterns. Traditional methods, such as single seismic attributes and linear fusion of multiple seismic attributes, have proven ineffective in identifying and predicting oil and gas-bearing areas in this region, leading to five unsuccessful wells. Through comprehensive analysis of drilled wells and seismic data, six types of horizon seismic attributes were selected. A novel approach for predicting oil-bearing zones was proposed, employing SVD–K-means nonlinear clustering for multiple seismic attribute fusion. The application results indicate the following: ① Singular value decomposition (SVD) not only reduces the correlation redundancy among seismic attribute data variables, but also enables data dimensionality reduction and noise suppression, decreasing ambiguity in prediction results and enhancing reliability. ② The K-means nonlinear clustering method facilitates the nonlinear fusion of multiple seismic attribute parameters, effectively uncovering the nonlinear features of the underlying relationship between seismic attributes and reservoir oil-bearing characteristics, thereby amplifying the hydrocarbon information within the seismic attribute variables. ③ Compared to K-means, SVD–K-means demonstrates superior performance across all metrics, with an 18.4% increase in the SC coefficient, a 57.8% increase in the CH index, and a 24.7% improvement in the DB index. ④ The results of oil-bearing zone prediction using the SVD–K-means algorithm align well with the drilling outcomes in the study area and correspond to the geological patterns of hydrocarbon enrichment in this region. This has been confirmed by the high-yield industrial oil flow obtained from the newly drilled WZ6-1-A3 well. The SVD–K-means algorithm for predicting oil and gas-bearing zones provides a new approach for predicting hydrocarbon-rich areas in complex fault block structures with limited drilling and poor-quality seismic data.

Keywords:

seismic attributes; K-means algorithm; singular value decomposition; petroliferous area prediction

1. Introduction

Against the backdrop of growing global energy demand, the development and utilization of offshore oil and gas fields have become crucial means to meet this demand [1,2]. However, offshore oil field development faces numerous challenges, the most significant of which is the high cost of offshore drilling combined with a relatively low number of wells. Under these conditions, seismic attribute-based methods for predicting oil and gas-bearing zones within strata have become the most widely used and effective geophysical approaches. Traditional prediction methods, such as single seismic attribute techniques like amplitude attributes and instantaneous frequency attributes [3,4], have proven effective in many cases. However, because the seismic attribute response is a cumulative result of multiple factors within the strata and is influenced by geological complexity, hydrocarbon occurrence, and seismic data acquisition and processing, single-attribute-based predictions often suffer from significant ambiguity. The second type of method involves multi-attribute linear integration, such as principal component analysis (PCA) and the weighted sum method (WSM) [5,6]. These methods are algorithmically simple and integrate seismic attribute information that reflects the hydrocarbon potential of the strata from multiple perspectives, thereby enhancing the prediction accuracy for oil and gas-bearing zones. However, because the relationship between hydrocarbon potential and seismic attributes is often complex and nonlinear, these methods are limited in their ability to address the nonlinear response between strata hydrocarbon content and seismic attributes, which constrains their applicability.

With the advancement of computer hardware and software capabilities and the rapid development of machine learning technology, machine learning techniques have been successfully applied to the prediction of hydrocarbon-bearing zones in strata. Currently, machine learning methods for predicting hydrocarbon-bearing zones can be broadly categorized into two types: unsupervised classification prediction and supervised classification prediction. The main supervised classification methods include random forest (RF) [7,8], logistic regression (LR) [9,10], and support vector machine (SVM) [11,12]. Major unsupervised classification methods include the expectation–maximization algorithm (EM) [13,14], self-organizing maps (SOM) [15,16], and K-means clustering analysis [17,18]. Supervised and unsupervised classifications are suited to different application scenarios. Supervised classification is suitable for multi-well areas and relies on well-logging interpretation parameters as well as oil testing and trial production results. Unsupervised classification methods are generally applicable for predictions in areas with fewer wells [19,20]. Since seismic information is a nonlinear composite response to multiple factors, such as stratigraphic undulations, rock framework, fluid types, seismic acquisition noise, and processing methods, selecting an appropriate unsupervised classification method generally requires comprehensive consideration of the geological characteristics of the target area, the quality of the seismic data, and the well–seismic response relationships.

Due to the low exploration level and limited number of wells in the WZ6-1 structural area of the Beibu Gulf Basin in the South China Sea, combined with challenges such as well-developed faults, fragmented structures, complex and variable oil, gas, and water distribution, and poor seismic data quality, the understanding of the hydrocarbon-bearing zone distribution in this area is limited, resulting in five unsuccessful wells. The objectives for further exploration in this area remain unclear, creating an urgent need for an effective oil-bearing zone prediction method that suits the complex geological characteristics and seismic data conditions of this region, thereby providing technical support for future exploration and development decisions. This paper proposes a method based on the SVD–K-means algorithm for predicting hydrocarbon-bearing zones, which has yielded promising results. First, six types of horizon seismic attributes were selected based on the well–seismic response characteristics of the study area. The seismic attributes were then preprocessed using singular value decomposition (SVD), and the K-means unsupervised nonlinear clustering method was applied to predict hydrocarbon-bearing zones. This approach not only eliminated redundant correlations among different seismic attributes and achieved dimensionality reduction and noise suppression, but also enhanced the nonlinear relationship between multiple seismic attributes and the hydrocarbon content of the strata. Additionally, it significantly improved the convergence stability of the K-means algorithm, reduced computation, and increased the effectiveness and reliability of the oil-bearing zone prediction results. Based primarily on the results of this prediction, an exploratory well was drilled, yielding a high-production industrial oil flow. This not only confirmed the reliability of the prediction results, but also identified a promising new target area for ongoing hydrocarbon exploration and development in this region.

2. Geologic Background of the Study Area

As shown in Figure 1, the WZ6-1 oil-bearing structure in the Beibu Gulf Basin of the South China Sea is an anticline complicated by faults, divided by two nearly east–west trending faults into three regions: the northern block, central block, and southern block. The southern block is segmented by several near north–south trending radial faults into four fault blocks of varying sizes, named S1, S2, S3, and S4 from west to east. The primary target reservoir layer is the W3IV oil formation of the Oligocene Weizhou Group, consisting of interbedded sandstone and mudstone deposited in a fan delta front environment. The W3IV reservoir has favorable properties, with an average porosity of 26% and permeability of 1120 × 10⁻³ μm². This structure is highly faulted, with uncertain fault-sealing properties, leading to a complex and variable distribution of oil, gas, and water, and limited understanding of the hydrocarbon accumulation patterns. Five wells were drilled in the northern and central blocks, located at higher structural positions, but they were unsuccessful, with no oil-bearing layers encountered. After further evaluation, an exploratory well, WZ6-1S-1, was drilled in the lower structural position in the southern block, specifically in the S2 fault block, which resulted in a successful oil discovery.

After oil was confirmed through drilling in the S2 fault block in the lower structural position of the southern block, discussions were raised about whether the adjacent fault blocks, S1, S3, and S4, might also contain oil, given their similar geological conditions to S2. To assess the oil-bearing potential of the three fault blocks adjacent to S2, six seismic attributes along the strata, closely associated with hydrocarbon presence, were selected based on previous studies and well–seismic response characteristics in this area. These attributes include arc length, root mean square (RMS) amplitude, dominant frequency, energy half-decay, bandwidth, and instantaneous phase, and are used to predict hydrocarbon-bearing zones based on seismic attributes. As shown in Figure 2, if traditional single seismic attribute methods are used for oil-bearing zone prediction, the results reveal that the spatial distribution of the six seismic attributes is chaotic, with no discernible pattern. Additionally, there is almost no difference in the seismic attribute characteristics between the five unsuccessful wells drilled at higher structural positions and the oil-bearing S2 fault block confirmed by drilling in the lower southern block (WZ6-1S-1). It is evident that the traditional single seismic attribute prediction method for oil-bearing zones is entirely ineffective in this area and cannot address whether the S1, S3, and S4 fault blocks adjacent to S2 contain oil.

In general, a single seismic attribute often struggles to reveal the hydrocarbon potential of strata that may be implicitly represented within seismic attributes. This is because seismic attributes are a nonlinear composite response to various factors such as strata properties and acquisition/processing effects, rather than an independent response to the hydrocarbon potential of strata. This leads to issues with ambiguity, making it challenging to accurately identify hydrocarbon-bearing zones. Furthermore, a single seismic attribute generally reflects only one physical characteristic of the strata, limiting the information it may contain regarding hydrocarbon potential. Using a linear fusion method for multiple seismic attribute parameters [21,22] allows for a more comprehensive integration of seismic attribute variables related to hydrocarbon potential, reducing ambiguity and thereby improving the reliability of prediction results. However, due to the complex nonlinear response relationship between hydrocarbon potential and seismic attributes, along with the subjectivity in determining the weights of multiple seismic attribute variables, the applicability of linear fusion methods is limited, making it difficult to meet the prediction requirements in complex geological conditions for hydrocarbon-bearing strata. This paper applies the K-means nonlinear clustering method based on multiple seismic attribute parameter data to achieve nonlinear fusion of seismic attributes. The goal is to uncover the potential nonlinear relationship between seismic attributes and hydrocarbon potential in order to assess the hydrocarbon-bearing potential of the three fault blocks adjacent to S2.

3. K-Means Model Prediction Method

K-means is an unsupervised clustering machine learning algorithm [23,24]. After the number of clusters K is predefined, K initial cluster center points are selected at random, and each sample data point is typically assigned to the nearest cluster center based on Euclidean distance. Suppose the data x ∈ R^{N×M} consist of N samples (rows), each described by M types of seismic attribute data. Randomly select K initial cluster centers {c₁, c₂, …, c_K}; the Euclidean distance d_ci between the k-th data sample of x and the i-th cluster center is:

d_{ci} = \sqrt{\sum_{j = 1}^{M} {(x_{kj} - c_{ij})}^{2}}

(1)

In Equation (1), x_kj represents the j-th seismic attribute of the k-th sample (row) of x, and c_ij represents the value of the j-th dimension of the i-th cluster center. Based on the Euclidean distance d_ci, each data sample is assigned to the cluster whose center c_i is nearest. The mean of the data samples within each cluster then gives a new set of cluster centers, calculated as follows:

c_{i} = \frac{1}{n} \sum_{j = 1}^{n} x_{j}

(2)

In Equation (2), n represents the number of data samples in the i-th cluster, x_j is the j-th data sample in that cluster, and c_i denotes the newly formed cluster center for the i-th cluster. The iteration terminates, yielding the final classification result, when the within-cluster sum of squared errors (SSE) no longer changes, that is, when it has converged. SSE is calculated as follows:

\mathrm{SSE} = \sum_{i = 1}^{K} \sum_{x \in c_{i}} {| d_{ci} |}^{2}

(3)
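The iteration described by Equations (1) to (3) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in the study: the function names `euclidean` and `kmeans`, the `max_iter` and `seed` parameters, and the toy data are all assumptions introduced here.

```python
import math
import random


def euclidean(x, c):
    """Distance between one sample x and one centre c (cf. Equation (1))."""
    return math.sqrt(sum((xj - cj) ** 2 for xj, cj in zip(x, c)))


def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-means on a list of samples; returns (labels, centres, sse)."""
    rng = random.Random(seed)
    centres = [list(s) for s in rng.sample(samples, k)]  # random initial centres
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centre.
        new_labels = [
            min(range(k), key=lambda i: euclidean(x, centres[i])) for x in samples
        ]
        if new_labels == labels:  # assignments stable, so SSE no longer changes
            break
        labels = new_labels
        # Update step: each centre becomes the mean of its members (Equation (2)).
        for i in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == i]
            if members:  # keep the old centre if a cluster happens to be empty
                centres[i] = [sum(col) / len(members) for col in zip(*members)]
    # Within-cluster sum of squared errors (Equation (3)).
    sse = sum(euclidean(x, centres[lab]) ** 2 for x, lab in zip(samples, labels))
    return labels, centres, sse


# Hypothetical toy data: two obvious groups of two-attribute samples.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centres, sse = kmeans(toy, k=2)
```

In this sketch each row of `samples` plays the role of one sample of x, so a seismic application would pass one tuple of attribute values per trace or grid cell.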

Before performing K-means clustering, an appropriate number of clusters, K, is selected based on the needs of the specific research question. Choosing an overly small K value may result in confusion between clusters, leading to the loss of significant data characteristics, while an overly large K value may cause excessive subdivision and overfitting. Typically, metrics such as the within-cluster SSE and the Davies–Bouldin (DB) index are used to evaluate the clustering effectiveness of the K-means algorithm to determine an appropriate K value. The DB index calculation relies on the distance between cluster centers and the dispersion of samples within clusters; a smaller DB index value indicates a better clustering result:

\mathrm{DBI} = \frac{1}{K} \sum_{i = 1}^{K} \max_{j \neq i} \left( \frac{σ_{i} + σ_{j}}{d (c_{i}, c_{j})} \right)

(4)

In Equation (4), σ_i represents the average distance between samples in the i-th cluster and its cluster center, and d(c_i,c_j) denotes the distance between the cluster centers of the i-th and j-th clusters.
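As a hedged illustration of Equation (4), the DB index can be computed directly from the samples, their cluster labels, and the cluster centers. The function name `davies_bouldin` and the toy data are assumptions made for this sketch, which also assumes at least two clusters and that every cluster has at least one member.

```python
import math


def davies_bouldin(samples, labels, centres):
    """Davies-Bouldin index following Equation (4); smaller is better."""
    k = len(centres)
    # sigma_i: mean distance from the members of cluster i to its centre.
    sigma = []
    for i in range(k):
        dists = [math.dist(x, centres[i]) for x, lab in zip(samples, labels) if lab == i]
        sigma.append(sum(dists) / len(dists))
    total = 0.0
    for i in range(k):
        # Largest similarity ratio of cluster i against any other cluster j.
        total += max(
            (sigma[i] + sigma[j]) / math.dist(centres[i], centres[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: two compact clusters on a line.
toy = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
toy_labels = [0, 0, 1, 1]
toy_centres = [(1.0, 0.0), (11.0, 0.0)]
dbi = davies_bouldin(toy, toy_labels, toy_centres)
```

For the toy data, σ₁ = σ₂ = 1 and the centers are 10 apart, so each ratio is (1 + 1)/10 and the index is 0.2.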

Based on previous research, and using the well–seismic response characteristics of the WZ6-1 oil-bearing structural area in the Beibu Gulf Basin of the South China Sea, six selected horizon seismic attributes were applied to a K-means clustering analysis. This approach aimed to uncover the implicit nonlinear response patterns between multiple seismic attribute parameters and reservoir hydrocarbon content, thereby achieving the goal of predicting oil-bearing zones. The appropriate value of K is generally determined based on the curve characteristics of the sum of squared errors (SSE) and Davies–Bouldin (DB) index with respect to the K value (Figure 3). Typically, as the K value increases, the SSE value gradually decreases, but the rate of decrease slows down. The inflection point or “elbow” in the curve is often considered an appropriate K value [25,26]. A smaller DB index value indicates greater similarity within each cluster and greater differentiation between clusters, signifying better clustering performance. As shown in Figure 3, when the K value is set to 4, it is located at the inflection point of the SSE curve and the minimum point of the DB index curve. Therefore, setting the number of clusters to four is optimal. Additionally, the K-means algorithm requires setting the number of random initializations, as each of the K clusters has a corresponding initial center point, which is randomly selected to compute its Euclidean distance to data samples. Fewer random initializations may lead to suboptimal solutions due to unsuitable initial center point selection, while more random initializations increase computational demand, particularly with large datasets like seismic data, significantly extending algorithm runtime. Thus, an appropriate number of random initializations must be chosen. Through repeated testing and parameter tuning, the number of initializations was set to 10, and the number of clusters to 4, yielding the best overall clustering performance.
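The K-selection procedure described above, an SSE curve over candidate K values with several random initializations per K, might be sketched as follows. This is a minimal sketch under stated assumptions: `lloyd`, `best_sse`, the `n_init` default of 10 (mirroring the setting reported above), and the toy data are illustrative names and values, not the authors' code.

```python
import math
import random


def lloyd(samples, k, rng, max_iter=100):
    """One K-means run from random initial centres; returns (sse, labels)."""
    centres = rng.sample(samples, k)
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: math.dist(x, centres[i])) for x in samples
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        for i in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centre
                centres[i] = tuple(sum(col) / len(members) for col in zip(*members))
    sse = sum(math.dist(x, centres[lab]) ** 2 for x, lab in zip(samples, labels))
    return sse, labels


def best_sse(samples, k, n_init=10, seed=0):
    """Lowest SSE found over n_init random initialisations."""
    rng = random.Random(seed)
    return min(lloyd(samples, k, rng)[0] for _ in range(n_init))


# Hypothetical toy data: three tight, well-separated groups.
toy = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx, dy in [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5)]
]
# SSE for K = 1..6; the "elbow" is read off this curve.
sse_curve = {k: best_sse(toy, k) for k in range(1, 7)}
```

Keeping the best of several runs is one common way to reduce the sensitivity to initial centers discussed above; the cost grows linearly with `n_init`, which is the trade-off the text describes for large seismic datasets.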

Using the selected six horizon seismic attribute data types and the optimized K-means algorithm parameters determined from previous experiments, a multi-parameter K-means clustering analysis was performed on the study area, with results shown in Figure 4. When combined with the actual drilling results in the study area, the predictions, as compared to single seismic attribute methods (Figure 2), reveal certain patterns of hydrocarbon accumulation: ① Overall, the high structural position in the northern block is predominantly classified as blue (Class IV) to green (Class III), the central block in the high structural position is mainly green (Class III) to yellow (Class II), and the low structural position in the southern block is primarily red (Class I). This distribution suggests a general trend: the northern block at the high structural position is non-oil-bearing → the central block at the high structural position shows some hydrocarbon indication → the southern block at the low structural position is oil-bearing. ② The wells WZ6-1-2 and WZ6-1-3 in the high structural position of the northern block, along with WZ6-1-1 in the central block, are all dry wells. The prediction results place them in the blue (Class IV) or green (Class III) regions. ③ The wells WZ6-1-A1h and WZ6-1-A2h in the central block at the high structural position are also dry wells. The prediction places them in the yellow (Class II) region. Although both wells showed some hydrocarbon indication during drilling, electric logging interpreted them as water-bearing layers. ④ The high-yield oil well WZ6-1S-1 in the S2 fault block in the low structural position of the southern block is predicted to fall within the red (Class I) region. It is inferred that the red (Class I) region of the K-means clustering multi-parameter seismic attribute fusion is likely most closely associated with hydrocarbon-bearing strata. 
Additionally, the high structural positions of the adjacent S3 and S4 fault blocks also fall within the red (Class I) region, suggesting potential hydrocarbon-bearing zones. ⑤ Outside the trap of the S2 fault block, which has been confirmed as oil-bearing by the WZ6-1S-1 well, there exists a large red (Class I) region in the low structural position. However, this appears to contradict the general hydrocarbon accumulation pattern.

This issue may be related to the correlation among the input seismic attribute variables (x), as the determination of the Euclidean distance d_ci and the iterative cluster center c_i, according to Equations (1) and (2), depends on the input seismic attribute variables (x). If there is redundancy among the input seismic attribute variables, it may cause shifts in the new cluster centers c_i generated through iteration and introduce significant deviations in the computed Euclidean distance d_ci. This dual bias ultimately leads to greater error in the classification results using the K-means algorithm. Furthermore, based on Equations (3) and (4), the clustering effectiveness indicators SSE and DB index are also affected by redundancy among the input seismic attribute variables.
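A small worked example may make this sensitivity concrete. It uses the extreme case of redundancy, an attribute that is an exact copy of another, and shows that the copied attribute then carries more weight in the Euclidean distance. The two samples and their values are hypothetical and chosen only for easy arithmetic.

```python
import math

# Two hypothetical samples described by two independent attributes.
a, b = (0.0, 0.0), (3.0, 4.0)
d_original = math.dist(a, b)  # sqrt(9 + 16)

# Add a redundant attribute: an exact copy of attribute 1.
a_red = (a[0], a[0], a[1])
b_red = (b[0], b[0], b[1])
d_redundant = math.dist(a_red, b_red)  # sqrt(9 + 9 + 16)

# Share of the squared distance contributed by attribute 1 (and its copy).
share_before = (b[0] - a[0]) ** 2 / d_original ** 2
share_after = 2 * (b[0] - a[0]) ** 2 / d_redundant ** 2
```

Attribute 1 accounts for 9/25 of the squared distance before duplication and 18/34 afterwards, so cluster assignments lean toward whatever the redundant attributes measure.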

This paper proposes preprocessing the multiple seismic attribute data with SVD, which effectively resolves this issue. Because SVD is an orthogonal decomposition, it is commonly employed to address data redundancy. In the decomposition, the larger singular values correspond to the main informational features embedded in the data, while the smaller singular values correspond to noise interference. The SVD technique therefore not only reduces redundancy among seismic attribute data variables, but also achieves data dimensionality reduction and suppresses noise interference. Singular value decomposition was performed on the six selected horizon seismic attributes in the study area to obtain the singular values, and an appropriate number of them was retained to reconstruct the seismic attribute data. The K-means algorithm is then applied to the SVD-reconstructed seismic attribute data for predictive classification; this is referred to as the SVD–K-means clustering method. This approach overcomes the issues encountered when multi-parameter seismic attribute data are used directly for K-means predictive classification, and it has achieved satisfactory results in predicting oil-bearing zones in the WZ6-1 structure.

4. SVD–K-Means Model Prediction Method

SVD can decompose a matrix into the product of three matrices: left singular vectors, singular values, and right singular vectors.

Assuming there are M types of seismic attribute categories and N samples, forming an N × M matrix x, then matrix x can be decomposed according to the singular value decomposition as follows:

x = U Σ V^{T} = σ_{1} u_{1} v_{1}^{T} + σ_{2} u_{2} v_{2}^{T} + \cdots + σ_{R} u_{R} v_{R}^{T}

(5)

In Equation (5), U = {u₁, u₂,…, u_N} is an N × N orthogonal matrix, V = {v₁, v₂,…, v_M} is an M × M orthogonal matrix, and Σ is an N × M diagonal matrix; R ≤ min(N, M) is the rank of x. The singular values σ_i, arranged in descending order along the diagonal of Σ, satisfy σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_R. The matrices U and V can be derived from the eigenvectors of xx^T and x^T x, respectively, and each singular value σ_i is the square root of the corresponding nonzero eigenvalue of xx^T and x^T x. A larger singular value represents a higher amount of information energy. The number of singular values to retain is determined from the cumulative contribution rate of the singular values, which controls the amount of information retained [27,28]. This is calculated as follows:

P_{r} = \frac{\sum_{i = 1}^{r} σ_{i}}{\sum_{i = 1}^{R} σ_{i}}

(6)

In Equation (6), r represents the number of retained singular values, and P_r denotes the cumulative contribution rate. Typically, the optimal cumulative contribution rate is determined through experimental analysis based on the specific problem, to effectively reduce data redundancy, achieve dimensionality reduction, and suppress noise.
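The truncation implied by Equations (5) and (6) can be sketched with NumPy. This is an assumption-laden illustration rather than the study's workflow: the function name `svd_reconstruct`, the `target_rate` parameter, and the synthetic attribute matrix are all introduced here, and no attribute scaling is applied because the text does not describe any.

```python
import numpy as np


def svd_reconstruct(x, target_rate):
    """Keep the fewest leading singular values whose cumulative contribution
    rate (Equation (6)) reaches target_rate, then rebuild x from them."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)  # s is sorted descending
    rates = np.cumsum(s) / np.sum(s)                  # P_r for r = 1, 2, ...
    r = int(np.searchsorted(rates, target_rate)) + 1  # smallest r with P_r >= target
    r = min(r, len(s))
    x_rec = (u[:, :r] * s[:r]) @ vt[:r, :]            # truncated form of Equation (5)
    return x_rec, r, rates


# Synthetic stand-in for an N x 6 attribute matrix: a rank-2 signal plus noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6))
attrs = signal + 0.05 * rng.normal(size=(200, 6))
attrs_rec, r, rates = svd_reconstruct(attrs, target_rate=0.88)
```

The reconstructed matrix has the same shape as the input, so it can be passed to a K-means routine unchanged; only its effective rank is reduced.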

Performing SVD on the six types of seismic attributes yields a singular value distribution curve, as shown in Figure 5, which generally forms an “L” shape. The magnitude of the singular values reflects the amount of information energy contained and is positively correlated with the information content. Larger singular values reflect the main information characteristics, while smaller singular values indicate noise interference. Discarding these smaller singular values and reconstructing the seismic attribute data results in limited loss of the primary information characteristics reflected in the reconstructed data. Based on the seismic response characteristics of six drilled wells in the study area, repeated testing found that retaining the top three singular values, with a cumulative contribution rate of 88.1%, yielded the best prediction results for hydrocarbon zones using the SVD–K-means clustering method.

The results of the oil-bearing zone prediction using the SVD–K-means clustering method are shown in Figure 6, demonstrating good predictive performance: ① The southern S2 fault block, which has been confirmed as oil-bearing by the WZ6-1S-1 well, remains in the red oil-bearing zone (Class I). Compared to the K-means prediction result based directly on seismic attribute data (Figure 4), the red (Class I) oil-bearing area has significantly contracted towards the higher structural position of the S2 fault block trap, almost aligning with the S2 fault block’s trap area, which is consistent with hydrocarbon accumulation patterns. ② The locations of the five dry wells in the high structural positions (WZ6-1-1, WZ6-1-2, WZ6-1-3, WZ6-1-A1h, and WZ6-1-A2h) are all within the yellow (Class II) or green (Class III) regions. The prediction results are consistent with the drilling outcomes. ③ A rolling development well, WZ6-1-A3, was deployed in the high structural position of the S3 fault block within the red (Class I) region, yielding successful results: a total oil layer thickness of 15.8 m was encountered, and the production rate reached 150 m³/d after commissioning. ④ The high structural position of the S4 fault block in the southern block is also within the red (Class I) region, suggesting it as a potential target area for future exploration and development.

As shown in Figure 6, the results of predicting hydrocarbon zones using the SVD–K-means clustering method not only align with drilling results, but also match the geological patterns of hydrocarbon enrichment in the study area, as confirmed by the high-production industrial oil flow achieved from the newly drilled well WZ6-1-A3. Why, then, are the results of predicting hydrocarbon zones using the K-means algorithm directly on seismic attribute data (Figure 4) less satisfactory? The following analysis explores the causes of this outcome based on clustering effectiveness indicators in the K-means algorithm.

The SC (silhouette) coefficient, the CH (Calinski–Harabasz) index, and the DB index are commonly used to evaluate and compare the effectiveness of clustering algorithms [29,30]. The SC coefficient measures the compactness of samples within their own clusters and their separation from other clusters. The SC value ranges from −1 to 1, with values closer to 1 indicating better clustering effectiveness. The specific calculation formula is as follows:

SC_{i} = \frac{b_{i} - a_{i}}{\max (a_{i}, b_{i})}

(7)

In Equation (7), a_i represents the average distance between sample i and the other samples within the same cluster, while b_i represents the average distance between sample i and the samples in the nearest other cluster. The average SC value of all samples is used as the overall clustering effectiveness SC. The CH index is used to evaluate the distance between cluster centers and the dispersion of samples within clusters, with the calculation formula as follows:

\mathrm{CH} = \frac{B}{W} \cdot \frac{N - K}{K - 1}

(8)

In Equation (8), B represents the between-cluster sum of squared errors, W represents the within-cluster sum of squared errors, N is the total number of samples, and K is the number of clusters. A higher CH value indicates better clustering results. The DB index of Equation (4) not only provides a basis for selecting the K value, but can also be used to evaluate clustering effectiveness.
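For illustration, Equations (7) and (8) can be evaluated with a short plain-Python sketch. The function names `silhouette_mean` and `calinski_harabasz` and the toy data are assumptions. The silhouette part follows the standard definition, in which b_i is the mean distance to the nearest other cluster and a sample that is alone in its cluster scores 0. At least two clusters are assumed.

```python
import math


def silhouette_mean(samples, labels):
    """Mean silhouette coefficient over all samples (Equation (7))."""
    ids = sorted(set(labels))
    scores = []
    for i, (x, own) in enumerate(zip(samples, labels)):
        # Mean distance from sample i to the members of each cluster (excluding itself).
        mean_dist = {}
        for c in ids:
            d = [
                math.dist(x, y)
                for j, (y, lab) in enumerate(zip(samples, labels))
                if lab == c and j != i
            ]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[own]
        if a is None:  # sample is alone in its cluster: score taken as 0
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != own)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def calinski_harabasz(samples, labels):
    """CH index (Equation (8)): between/within dispersion, scaled by (N-K)/(K-1)."""
    n = len(samples)
    ids = sorted(set(labels))
    k = len(ids)
    overall = [sum(col) / n for col in zip(*samples)]
    between = within = 0.0
    for c in ids:
        members = [x for x, lab in zip(samples, labels) if lab == c]
        centre = [sum(col) / len(members) for col in zip(*members)]
        between += len(members) * math.dist(centre, overall) ** 2
        within += sum(math.dist(x, centre) ** 2 for x in members)
    return (between / within) * (n - k) / (k - 1)


# Hypothetical toy data: two compact clusters on a line.
toy = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
toy_labels = [0, 0, 1, 1]
sc = silhouette_mean(toy, toy_labels)
ch = calinski_harabasz(toy, toy_labels)
```

For the toy data, B = 100 and W = 4, so CH = 25 × (4 − 2)/(2 − 1) = 50, and the mean silhouette is (9/11 + 7/9)/2 ≈ 0.798. The pairwise-distance loop is quadratic in the number of samples, so a large seismic grid would need a vectorized or sampled variant.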

Due to the randomness of the initial cluster centers in the K-means clustering algorithm, each clustering result exhibits some variation. Therefore, this study uses the average values of the three evaluation metrics from multiple clustering results for comparative evaluation. As shown in Table 1 and Table 2, the clustering effectiveness of the K-means and SVD–K-means methods is compared using the average values of the three metrics across eight clustering results. The results indicate that SVD–K-means outperforms K-means across all metrics, with an 18.4% improvement in the SC coefficient, a 57.8% increase in the CH index, and a 24.7% improvement in the DB index. The superior clustering effectiveness of SVD–K-means explains its better performance in hydrocarbon zone prediction.

5. Conclusions

The reservoir of the WZ6-1 oil-bearing structure in the study area consists of interbedded sandstone and mudstone deposits from a fan delta front, with a highly heterogeneous spatial distribution of sand bodies. Faulting is extensive, resulting in fragmented structures with significant and poorly constrained variations in fault sealing between fault blocks. The distribution characteristics of oil, gas, and water are complex and variable, with limited understanding of hydrocarbon accumulation mechanisms, making it challenging to predict oil and gas-bearing zones, which has led to five unsuccessful wells. Seismic attributes are not simply a linear response to the single factor of interest, the hydrocarbon potential of the strata, but rather a nonlinear composite response influenced by various factors including the strata properties, acquisition, and processing. As a result, the use of conventional single seismic attributes or linear fusion of multiple seismic attribute variables for predicting hydrocarbon-bearing zones in this area has proven ineffective: the results not only correlate poorly with drilled well data, but also fail to align with the fundamental geological patterns of hydrocarbon accumulation. The application of the SVD–K-means clustering method proposed in this paper for predicting oil-bearing zones has yielded positive results. Not only does it align well with the drilling results, but it also corresponds to the geological patterns of hydrocarbon accumulation in the study area. This has been further confirmed by the high-yield industrial oil flow obtained from the newly drilled WZ6-1-A3 well, providing crucial technical support for subsequent exploration and development decisions and offering valuable insights for predicting hydrocarbon-bearing zones under similar complex geological conditions.

Author Contributions

Conceptualization, R.W.; methodology, Z.C.; validation, B.X.; formal analysis, J.Z.; investigation, Z.C.; resources, R.W.; data curation, Z.C.; writing—original draft preparation, Z.C.; writing—review and editing, R.W.; visualization, B.X.; supervision, R.W.; project administration, R.W.; funding acquisition, R.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Natural Science Foundation of Zhejiang Province (No. LY20D020002).

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Well location map of the Weizhou Formation (W3IV) in the WZ6-1 structure.

Figure 2. Horizon seismic attribute maps of the Weizhou Formation (W3IV) in the WZ6-1 structure. (a) Energy half-time; (b) Instantaneous phase; (c) Dominant frequency; (d) Bandwidth.

Figure 3. Line charts for the SSE elbow method and the DB index method. (a) SSE elbow method; (b) DB index method.

Figure 4. K-means prediction map of the Weizhou Formation (W3IV) in the WZ6-1 structure.

Figure 5. Singular value distribution curve from the SVD of the Weizhou Formation (W3IV) attribute data in the WZ6-1 structure.

Figure 6. SVD–K-means prediction map of the Weizhou Formation (W3IV) in the WZ6-1 structure.

Table 1. Evaluation indices for eight K-means clustering runs.

Index	1	2	3	4	5	6	7	8	Average
SC	0.317	0.316	0.314	0.313	0.317	0.317	0.314	0.314	0.315
CH	3324.512	3462.897	3463.157	3462.821	3426.737	3426.850	3462.992	3463.156	3436.64
DB	1.043	1.160	1.164	1.169	1.155	1.158	1.166	1.164	1.147

Table 2. Evaluation indices for eight SVD–K-means clustering runs.

Index	1	2	3	4	5	6	7	8	Average
SC	0.372	0.376	0.373	0.372	0.374	0.373	0.372	0.372	0.373
CH	5422.135	5422.179	5422.054	5420.614	5420.614	5422.353	5422.266	5422.107	5421.79
DB	0.922	0.921	0.922	0.915	0.915	0.920	0.921	0.922	0.920
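The averages in Tables 1 and 2, and the relative changes between them, can be checked with a few lines of arithmetic. The sketch below uses only the tabulated values; the names `kmeans_runs`, `svd_kmeans_runs`, `reported_avg`, and `pct_change` are illustrative. Higher SC and CH values and lower DB values indicate better clustering.

```python
from statistics import mean

# Per-run values copied from Table 1 (K-means) and Table 2 (SVD-K-means).
kmeans_runs = {
    "SC": [0.317, 0.316, 0.314, 0.313, 0.317, 0.317, 0.314, 0.314],
    "CH": [3324.512, 3462.897, 3463.157, 3462.821,
           3426.737, 3426.850, 3462.992, 3463.156],
    "DB": [1.043, 1.160, 1.164, 1.169, 1.155, 1.158, 1.166, 1.164],
}
svd_kmeans_runs = {
    "SC": [0.372, 0.376, 0.373, 0.372, 0.374, 0.373, 0.372, 0.372],
    "CH": [5422.135, 5422.179, 5422.054, 5420.614,
           5420.614, 5422.353, 5422.266, 5422.107],
    "DB": [0.922, 0.921, 0.922, 0.915, 0.915, 0.920, 0.921, 0.922],
}
# "Average" column of each table: (K-means, SVD-K-means).
reported_avg = {
    "SC": (0.315, 0.373),
    "CH": (3436.64, 5421.79),
    "DB": (1.147, 0.920),
}

def pct_change(new, base):
    """Relative change of `new` with respect to `base`, in percent."""
    return 100.0 * (new - base) / base

for index, (km_avg, svd_avg) in reported_avg.items():
    # Recompute the run averages to confirm the tabulated Average column.
    km_mean = mean(kmeans_runs[index])
    svd_mean = mean(svd_kmeans_runs[index])
    change = pct_change(svd_avg, km_avg)
    print(f"{index}: mean K-means = {km_mean:.3f}, "
          f"mean SVD-K-means = {svd_mean:.3f}, change = {change:+.1f}%")
```

With the K-means average as the baseline, SC rises by about 18.4% and CH by about 57.8%, and DB falls by about 19.8%. The 24.7% DB improvement quoted in the abstract is reproduced when the same difference is divided by the SVD–K-means average (0.920) instead of the K-means average.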

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
