MSIMRS: Multi-Scale Superpixel Segmentation Integrating Multi-Source Remote Sensing Data for Lithology Identification in Semi-Arid Area

Lu, Jiaxin; Li, Liangzhi; Wang, Junfeng; Han, Ling; Xia, Zhaode; He, Hongjie; Bai, Zongfan

doi:10.3390/rs17030387


by Jiaxin Lu ¹, Liangzhi Li ²,*, Junfeng Wang ³, Ling Han ², Zhaode Xia ⁴, Hongjie He ⁵ and Zongfan Bai ²

¹ School of Geological Engineering and Geomatics, Chang’an University, Xi’an 710064, China

² School of Land Engineering, Chang’an University, Xi’an 710064, China

³ School of Civil Engineering, Xi’an University of Architecture & Technology, Xi’an 710055, China

⁴ School of Earth Science and Resources, Chang’an University, Xi’an 710064, China

⁵ Department of Geography and Environmental Management, University of Waterloo, Waterloo, ON N2L 3G1, Canada

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(3), 387; https://doi.org/10.3390/rs17030387

Submission received: 20 November 2024 / Revised: 15 January 2025 / Accepted: 21 January 2025 / Published: 23 January 2025


Abstract:

Lithology classification stands as a pivotal research domain within geological Remote Sensing (RS). In recent years, extracting lithology information from multi-source RS data has become an inevitable trend. Various classification image primitives yield distinct outcomes in lithology classification. The current research on lithology classification utilizing RS data has predominantly concentrated on pixel-level classification, which suffers from a long classification time and high sensitivity to noise. In order to explore the application potential of superpixel segmentation in lithology classification, this study proposed the Multi-scale superpixel Segmentation Integrating Multi-source RS data (MSIMRS), and conducted a lithology classification study in Duolun County, Inner Mongolia Autonomous Region, China combining MSIMRS and the Support Vector Machine (MSIMRS-SVM). In addition, pixel-level K-Nearest Neighbor (KNN), Random Forest (RF) and SVM classification algorithms, as well as deep-learning models including Resnet50 (Res50), Efficientnet_B8 (Effi_B8), and Vision Transformer (ViT) were chosen for a comparative analysis. Among these methods, our proposed MSIMRS-SVM achieved the highest accuracy in lithology classification in a typical semi-arid area, Duolun County, with an overall accuracy and Kappa coefficient of 92.9% and 0.92. Moreover, the findings indicate that incorporating superpixel segmentation into lithology classification resulted in notably fewer fragmented patches and significantly improved the visualization effect. The results showcase the application potential of superpixel primitives in lithology information extraction within semi-arid areas.

Keywords:

multi-source remote sensing data; lithology identification; multi-scale superpixel; semi-arid area

1. Introduction

Geological Remote Sensing (RS) has proven to be an effective method for conducting high-precision geological mapping [1]. Geologists can retrieve valuable geological information from various RS images acquired from different platforms. This information is essential for geological surveys and mapping, particularly in regions characterized by challenging natural environments and limited communication and transportation resources [2,3]. Lithology, which represents the primary composition of the Earth’s shallow surface, is a fundamental geological component. Lithology classification using RS techniques has emerged as a significant research focus within the field of geological RS [4].

At present, RS data acquired from various platforms and sources has provided a wealth of data for automatic lithology classification [5,6,7]. Optical and Synthetic Aperture Radar (SAR) RS data, as two important data sources in the RS field, offer distinct advantages for lithology classification and identification. Over time, several well-established methodological systems have emerged for lithology classification and identification based on RS data, including the early band ratio [8] and spectral angle mapping [9], as well as some methods with dimensionality reduction as the core idea, such as independent component analysis [10], matched filtering [11], and Principal Component Analysis (PCA) [12]. However, the above methods mainly focus on the geological application adopting optical RS data. Optical RS technology plays a crucial role in lithology classification by virtue of its abundant spectral information, while it is affected by weather conditions and faces challenges in obtaining complete lithology information within vegetated areas [4,13,14]. In contrast, SAR exhibits all-weather, all-day capabilities, and the properties of penetrating dry sand and vegetation on the Earth’s surface [15,16,17]. Specifically, polarimetric SAR data capture abundant polarization information of ground objects. It can be utilized to extract polarization, texture, and other features, which have been widely used in the classification of surface environmental elements [18,19]. However, there is a lack of research on lithology classification based on polarimetric SAR data [20].

Advancements in intelligent computing technology and computational power offer opportunities for combining machine learning (ML) with RS data for lithology classification [21,22]. For lithological classification based on optical RS data combining ML, El Fels and El Ghorfi [23] used Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data to obtain the optimal lithology classification accuracy of 90.46% with a Kappa coefficient of 0.84 using the Regularized Discriminant Analysis method. Using a Support Vector Machine (SVM) classifier, He et al. [24] employed Worldview-2 RS images and combined texture features with terrain features to successfully identify Lujavrite, achieving an overall recognition accuracy of 89.57%. Khan et al. [25] integrated ASTER, Landsat-8, and Sentinel-2 data, and applied various machine-learning (ML) algorithms to identify subclasses within a given lithology. The optimized method using Random Forest (RF) achieved the highest accuracy of 96.36%. Wang et al. [26] identified the shallow granite bodies in the Himalayas based on ASTER and Sentinel-2A data, using a hybrid approach including image fusion, metric learning, and RF, and effectively recognized seven target lithological units with an Overall Accuracy (OA) of 85.75%. Abdelkader et al. [27] employed an SVM classifier to classify thirteen types of lithologies by combining Sentinel-2 and ASTER data. The classification achieved an impressive OA of 91.2% with a Kappa coefficient of 0.90.

Compared with the above studies, there are fewer studies on lithology classification combining SAR data and ML. Guo et al. [20] extracted a total of 10 parameters including backscattering and polarization features from Sentinel-1 SAR data to carry out wavelet transform, and then used the random forest classifier to conduct lithology classification in the East Tianshan Mountains in western China, while the OA only reached 55.6%. Wang et al. [28] developed a lithology classification method using L-band and C-band SAR data acquired by Spaceborne Imaging Radar-C (SIR-C) system in Qinghe, Xinjiang, China. The method employed a stacked sparse self-coder approach based on superpixel segmentation. The classification achieved an impressive OA of 98.9%.

Recently, one study used both optical and polarimetric SAR data for lithology classification, which inspired us to use multi-source data in lithology classification. In particular, Lu et al. [29] conducted lithology classification using GaoFen-2 (GF-2), GF-3, Sentinel-2A, and ASTER data. They first performed feature extraction based on vegetation suppression in RS images within a semi-arid area. Next, they applied feature selection and utilized an improved particle swarm optimization algorithm in combination with the SVM classifier for lithology classification, which achieved an OA of 92.1% by utilizing the optimal combination of 35 features. Although the methods mentioned above achieve high performance in lithology classification, they are mainly pixel-level methods, which suffer from high computing resource requirements, long classification times, and high sensitivity to noise.

Beyond pixel-level methods, lithology classification has also been attempted on superpixels, a concept proposed by Ren and Malik in the domain of image segmentation [30]. Image segmentation is a fundamental process in computer vision that involves partitioning an image into multiple non-overlapping regions or segments, and superpixel segmentation is a technique within this domain [31]. A superpixel refers to an irregular collection of pixels, composed of a series of pixels with similar features and adjacent positions in the original image, with specific properties lying between a single pixel and a complete image object [30]. By elevating the level of image processing from individual pixels to regions defined by superpixels, superpixel segmentation technology significantly reduces the complexity of image post-processing and provides great convenience for subsequent image analysis tasks. Although superpixel segmentation technology has been widely utilized in computer vision tasks such as target recognition [32], target tracking [33,34], image segmentation [35,36], and pose estimation [37,38], its exploration in lithology classification is limited. In the literature, we could only find Wang et al. [28], where superpixels are employed for lithology classification, and that study does not employ multi-source RS data. In addition, previous superpixel segmentation algorithms are mainly designed for single image segmentation, which makes it difficult to meet the application requirements of segmentation and classification using multi-source RS data in geological RS [39,40]. Therefore, there is a lack of research on lithology classification at the superpixel level using multi-source RS data, which is the research gap recognized in this paper.

Thus, this paper proposed the Multi-scale superpixel Segmentation algorithm Integrating Multi-source RS data (MSIMRS). To demonstrate the enhanced lithology identification effect using superpixel segmentation in semi-arid areas, a new experiment was conducted by combining MSIMRS and the SVM classifier (MSIMRS-SVM), and utilizing a 35-dimensional optimal feature combination of the study area obtained by Lu et al. [29].

The contributions of the paper are as follows:

(1): We proposed a new algorithm, MSIMRS, for superpixel segmentation fusing multi-source RS data;
(2): We developed a new lithology classification method based on superpixel and multi-source RS data.

In this paper, the study area, the data, and the proposed MSIMRS algorithm are introduced in detail in Section 2. All the experimental results are presented in Section 3, including segmentation and lithology classification results. The related discussion and conclusion are given in Section 4 and Section 5, respectively.

2. Materials and Methods

2.1. Study Area and Data

2.1.1. Study Area

The study area, located in Duolun County in the Inner Mongolia Autonomous Region, China, is characterized by semi-arid conditions. It spans the longitude range of 116°15′–116°30′E and the latitude range of 42°00′–42°20′N, as shown in Figure 1. The average annual temperature is 3.2 °C and the average annual precipitation is 367.1 mm [41].

The geological conditions of the study area are relatively complex, with rock formations dominated by volcanic lava and volcaniclastic rocks. Figure 2 depicts the geologic schematic map of the study area, which reveals that the main lithological types include trachyte, rhyolite, conglomerate, sandstone, tuff, and others. Faults and fractured zones are well-developed within the area, making the rocks highly susceptible to weathering. Due to the widespread distribution of loose sandy sediments on the surface, they are easily displaced by wind, resulting in a relatively small overall exposed area of rocks within the region, with better rock outcrops observed in the northwest direction.

In terms of vegetation cover, the study area is located at the juncture of forests, grasslands, and sandy lands, characterized by a rich variety of vegetation types that are widely but unevenly distributed in space. This diversity poses significant interference in lithology classification based on remote sensing data within the area. The main vegetation types include temperate grassland vegetation, meadow vegetation, marsh vegetation, sandy land vegetation, and others.

2.1.2. Experimental Data

The experimental data for our study area were sourced from four different satellite platforms: GF-2, Sentinel-2A, ASTER, and GF-3. The band information utilized in the experiment is presented in Table 1. The reason for selecting these specific bands was to leverage the complementary advantages of optical and SAR data given the available data conditions. Additionally, we aimed to cover a spectral range that spans from the visible band to the thermal infrared band, containing as many spectral bands as possible and prioritizing the highest resolution data for the same band across different platforms. The 35-dimensional features used are shown in Table 2. By manually selecting control points, the registration correction between RS images from different data sources was achieved using a polynomial transformation model, and the resampling output was subsequently performed using the cubic convolution interpolation method. All feature images were resampled to 4 m. The optical images and the SAR backscattering images were processed by mean filtering and the water cloud model, respectively, to suppress vegetation [4]. The texture features were obtained from the gray level co-occurrence matrix, and the polarization features were obtained by the $H / A / \alpha$ decomposition [42] and AnYang decomposition algorithms [43].

In July 2022, we conducted field investigations at 276 sites within the study area. These investigation sites were placed based on the principle of using points to represent areas, covering both the outcropping and vegetation-covered areas of different types of geological units, with an average distance of approximately 1000 m between sites. The classification samples were derived by fine labeling that combined field investigation, available geological data, and high-resolution satellite images. Seven Quaternary lithologies, eight rocks or rock combinations, and three ground objects were selected for classification. All the labeled samples covered 534,407 pixels at a spatial resolution of 4 m/pixel.

2.2. Improved Multi-Scale Superpixel Segmentation Algorithm

2.2.1. Clustering Criterion Integrating Multi-Source RS Data

The classical Simple Linear Iterative Clustering (SLIC) algorithm addresses the superpixel segmentation for an individual image by assigning clustering centers solely based on Commission Internationale de l’Eclairage (CIE) Lab color attributes and spatial locations [44,45]. As an improvement, Figure 3 depicts the proposed MSIMRS superpixel clustering criterion to meet the application requirements of multi-source RS data.

Firstly, the seed point initialization and gradient minimization process were carried out. The distance $S$ between two adjacent superpixels was calculated as follows:

$$S = \sqrt{N / K} \tag{1}$$

In Equation (1), $N$ was the total number of pixels, $K$ was the preset number of superpixels, and the search space of each seed point was $2S \times 2S$.

Subsequently, the feature images utilized for classification were categorized into two groups according to optical and radar data sources, and then the Principal Component Analysis (PCA) was conducted on them, respectively. The Principal Component (PC) feature images that contributed cumulatively more than 90% were employed for superpixel segmentation to diminish information redundancy and improve segmentation efficiency. The superpixel clustering criterion was also applicable to a single data source. It only needed to perform PCA on all bands or all backscattering images of the single data source and then take PC feature images with a cumulative contribution of more than 90% as the segmentation input data.
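As an illustration of the component selection described above, the following minimal sketch keeps the leading PCs of one source group until their cumulative contribution exceeds 90%. The function name `pcs_for_segmentation`, the SVD-based PCA, and the min-max normalization are assumptions made for this example; the paper does not state how the PC images were normalized.

```python
import numpy as np

def pcs_for_segmentation(features, min_cumulative=0.90):
    """Select leading PCs of one source group (hypothetical helper).

    features: (n_pixels, n_features) array, one row per pixel.
    Returns the min-max normalized PC scores and their cumulative contributions.
    """
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Singular values of the centered data give the variance of each PC.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of PCs whose cumulative contribution exceeds the threshold.
    k = int(np.searchsorted(cumulative, min_cumulative, side="right")) + 1
    k = min(k, cumulative.size)
    scores = centered @ vt[:k].T
    # Assumed normalization: rescale every PC image to [0, 1].
    low = scores.min(axis=0)
    span = scores.max(axis=0) - low
    span[span == 0] = 1.0
    return (scores - low) / span, cumulative[:k]
```

In the multi-source case, such a routine would be applied once to the optical feature images and once to the radar feature images, and the selected PC images of both groups would then be stacked as the segmentation input.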

The distance measurement equations for clustering were as follows:

$$d_{p} = \sqrt{\sum_{m = 1}^{n} (p_{m, i} - p_{m, j})^{2}} \tag{2}$$

$$d_{l} = w_{s} \frac{\sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2}}}{S} \tag{3}$$

$$D_{p, l} = \sqrt{(d_{p})^{2} + (d_{l})^{2}} \tag{4}$$

In Equations (2)–(4), $d_{p}$ represented the distance in PC feature space, $n$ was the number of PC features extracted from the images used for segmentation, $p_{m, i}$ and $p_{m, j}$ represented the pixel values of pixel $i$ and pixel $j$ on the normalized PC feature numbered $m$, respectively, $d_{l}$ represented the spatial distance after normalization, $w_{s}$ was the spatial distance coefficient, $x_{i}$ and $y_{i}$ represented the pixel position, $S$ was the distance between two adjacent seed points, and $D_{p, l}$ was the joint distance referencing the PC features and the spatial position information.
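A minimal numerical sketch of Equations (1)–(4) is given below. The function names `seed_interval` and `joint_distance` are hypothetical and chosen only for this illustration; positions and $S$ are assumed to be expressed in the same unit (pixels).

```python
import numpy as np

def seed_interval(n_pixels, k_superpixels):
    """Equation (1): spacing S between adjacent seed points."""
    return float(np.sqrt(n_pixels / k_superpixels))

def joint_distance(pc_i, pc_j, xy_i, xy_j, S, w_s):
    """Equations (2)-(4): joint distance between pixel i and pixel j."""
    pc_i = np.asarray(pc_i, dtype=float)
    pc_j = np.asarray(pc_j, dtype=float)
    # Equation (2): Euclidean distance in the normalized PC feature space.
    d_p = np.sqrt(np.sum((pc_i - pc_j) ** 2))
    # Equation (3): spatial distance, normalized by S and weighted by w_s.
    d_l = w_s * np.hypot(xy_i[0] - xy_j[0], xy_i[1] - xy_j[1]) / S
    # Equation (4): combination of both terms.
    return float(np.hypot(d_p, d_l))
```

A small $w_{s}$ lets the PC features dominate the clustering, whereas a large $w_{s}$ favors compact, regularly shaped superpixels.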

2.2.2. Single-Scale Superpixel Segmentation

The proposed segmentation procedure for a single scale is shown in Figure 4.

Firstly, the number of superpixels was determined, and the seed point initialization and the gradient minimization were completed. Then, iterative clustering was conducted based on the aforementioned clustering criterion. When the variation of clustering centers between adjacent iterations is less than the set threshold $T_{1}$, the iterations can be stopped. The variation of clustering centers between adjacent iterations was calculated as follows:

$$
A_{i} = \begin{bmatrix}
pn_{1, 1} - po_{1, 1} & pn_{1, 2} - po_{1, 2} & \dots & pn_{1, n} - po_{1, n} & w_{s} (xn_{1, 1} - xo_{1, 1}) / S & w_{s} (yn_{1, 1} - yo_{1, 1}) / S \\
pn_{2, 1} - po_{2, 1} & pn_{2, 2} - po_{2, 2} & \dots & pn_{2, n} - po_{2, n} & w_{s} (xn_{2, 1} - xo_{2, 1}) / S & w_{s} (yn_{2, 1} - yo_{2, 1}) / S \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
pn_{t, 1} - po_{t, 1} & pn_{t, 2} - po_{t, 2} & \dots & pn_{t, n} - po_{t, n} & w_{s} (xn_{t, 1} - xo_{t, 1}) / S & w_{s} (yn_{t, 1} - yo_{t, 1}) / S
\end{bmatrix} \tag{5}
$$

$$C_{i} = \frac{\|A_{i}\|_{2}}{\sqrt{t}} \tag{6}$$

In Equations (5) and (6), $A_{i}$ was the difference matrix between the old and new clustering centers up to iteration $i$, $t$ was the total number of clustering centers, $n$ was the total number of PC features used for segmentation, $pn_{t, n}$ and $po_{t, n}$ were the pixel values of the clustering center numbered $t$ on the PC image numbered $n$ obtained from iterations $i$ and $i - 1$, respectively, $w_{s}$ was the spatial distance coefficient, $S$ was the distance between two adjacent seed points, $(xn_{t, 1}, yn_{t, 1})$ and $(xo_{t, 1}, yo_{t, 1})$ were the positions of the clustering center numbered $t$ obtained from iterations $i$ and $i - 1$, respectively, and $C_{i}$ was the change value of the clustering centers up to iteration $i$.
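The stopping rule of Equations (5) and (6) can be sketched as follows. The helper name `center_change` is hypothetical, and the sketch assumes that the matrix norm in Equation (6) is the Frobenius norm, which is what `numpy.linalg.norm` returns by default for a 2-D array; under that assumption $C_{i}$ is the root-mean-square displacement of the clustering centers in the joint feature space.

```python
import numpy as np

def center_change(new_centers, old_centers, S, w_s):
    """Change value C_i of the clustering centers between two iterations.

    Each centers array has shape (t, n + 2): n PC values followed by x and y.
    """
    new_centers = np.asarray(new_centers, dtype=float)
    old_centers = np.asarray(old_centers, dtype=float)
    # Equation (5): row-wise difference between new and old centers.
    A = new_centers - old_centers
    # The two position columns are scaled by w_s / S, as in Equation (5).
    A[:, -2:] *= w_s / S
    t = A.shape[0]
    # Equation (6), with the norm assumed to be the Frobenius norm.
    return float(np.linalg.norm(A) / np.sqrt(t))
```

Iteration would stop once `center_change(...)` falls below the threshold $T_{1}$.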

Finally, to address issues such as multi-connectivity and excessively small superpixel sizes in the clustering results, post-processing of the clustering was conducted by enhancing connectivity. This involved traversing a $2S \times 2S$ region centered around each initialized seed point and processing the accompanying connected domains that share the same label as the main connected domain. The label values of the 4-neighborhood points of the contour points of each accompanying connected domain were counted, and the pixels of the connected domain to be processed were updated with the label value that the contour points touched the most.
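The relabeling idea behind this post-processing step can be sketched as below. This is a simplified illustration rather than the procedure used in the paper: it scans the whole label image instead of a $2S \times 2S$ window around each seed point, it treats the largest 4-connected component of each label as the main connected domain, and the function name `enforce_connectivity` is hypothetical.

```python
from collections import Counter, deque

import numpy as np

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-neighborhood offsets

def enforce_connectivity(labels):
    """Merge secondary components of each label into their most-touched neighbor."""
    labels = np.asarray(labels).copy()
    h, w = labels.shape
    seen = np.zeros((h, w), dtype=bool)
    components = {}  # label -> list of components, each a list of (row, col)
    for r0 in range(h):
        for c0 in range(w):
            if seen[r0, c0]:
                continue
            lab = int(labels[r0, c0])
            queue, pixels = deque([(r0, c0)]), []
            seen[r0, c0] = True
            while queue:  # breadth-first flood fill of one component
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in NEIGHBORS:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < h and 0 <= cc < w and not seen[rr, cc]
                            and labels[rr, cc] == lab):
                        seen[rr, cc] = True
                        queue.append((rr, cc))
            components.setdefault(lab, []).append(pixels)
    for lab, comps in components.items():
        comps.sort(key=len, reverse=True)  # largest component is kept as is
        for pixels in comps[1:]:
            votes = Counter()
            for r, c in pixels:  # count foreign labels touching the contour
                for dr, dc in NEIGHBORS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] != lab:
                        votes[int(labels[rr, cc])] += 1
            if votes:
                target = votes.most_common(1)[0][0]
                for r, c in pixels:
                    labels[r, c] = target
    return labels
```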

2.2.3. MSIMRS Superpixel Segmentation

After clarifying the process of single-scale superpixel segmentation, the process flow for MSIMRS algorithm was illustrated in Figure 5. First, the segmentation scales were determined, and the single-scale superpixel segmentation flow described in Section 2.2.2 was employed to conduct the initial superpixel segmentation at the largest scale. Then, the distance indicator between all the pixels within a superpixel and their corresponding clustering center was calculated using the formula below:

$$
B_{j} = \begin{bmatrix}
pp_{1, 1} - pc_{j, 1} & pp_{1, 2} - pc_{j, 2} & \dots & pp_{1, n} - pc_{j, n} & w_{s} (xp_{1, 1} - xc_{j, 1}) / S & w_{s} (yp_{1, 1} - yc_{j, 1}) / S \\
pp_{2, 1} - pc_{j, 1} & pp_{2, 2} - pc_{j, 2} & \dots & pp_{2, n} - pc_{j, n} & w_{s} (xp_{2, 1} - xc_{j, 1}) / S & w_{s} (yp_{2, 1} - yc_{j, 1}) / S \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
pp_{z, 1} - pc_{j, 1} & pp_{z, 2} - pc_{j, 2} & \dots & pp_{z, n} - pc_{j, n} & w_{s} (xp_{z, 1} - xc_{j, 1}) / S & w_{s} (yp_{z, 1} - yc_{j, 1}) / S
\end{bmatrix} \tag{7}
$$

$$D_{j} = \frac{\|B_{j}\|_{2}}{\sqrt{z}} \tag{8}$$

In Equations (7) and (8), $B_{j}$ was the difference matrix between all the pixels and the clustering center of the superpixel numbered $j$, $z$ was the total number of the pixels contained in the superpixel numbered $j$, $n$ was the total number of the PC features used for segmentation, $pp_{z, n}$ was the pixel value of the pixel numbered $z$ in the PC image numbered $n$, $pc_{j, n}$ was the pixel value of the clustering center numbered $j$ in the PC image numbered $n$, $w_{s}$ was the spatial distance coefficient, $S$ was the distance between two adjacent seed points, $xp_{z, 1}, xc_{j, 1}, yp_{z, 1}, yc_{j, 1}$ represented the position of the pixel numbered $z$ and the clustering center in the current superpixel numbered $j$, and $D_{j}$ was the distance indicator from all the pixels in the superpixel numbered $j$ to their clustering center.

When the distance indicator of a superpixel is less than the set threshold $T_{2}$, the superpixel will not be processed further in subsequent multi-scale segmentation steps. The other superpixels, whose distance indicators exceed this threshold, will continue to be segmented. The process iterates cyclically to complete the segmentation of the required regions at all scales.
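The per-superpixel test of Equations (7) and (8) can be sketched as follows. The names `distance_indicator` and `needs_finer_scale` are hypothetical, the matrix norm is again assumed to be the Frobenius norm, and a superpixel whose indicator exactly equals $T_{2}$ is arbitrarily treated as needing further segmentation because the text does not specify that case.

```python
import numpy as np

def distance_indicator(pixel_pcs, pixel_xy, center_pc, center_xy, S, w_s):
    """Distance indicator D_j of one superpixel.

    pixel_pcs: (z, n) PC values of its pixels; pixel_xy: (z, 2) positions.
    center_pc: (n,) PC values of its clustering center; center_xy: (2,) position.
    """
    pc_diff = np.asarray(pixel_pcs, dtype=float) - np.asarray(center_pc, dtype=float)
    xy_diff = np.asarray(pixel_xy, dtype=float) - np.asarray(center_xy, dtype=float)
    # Equation (7): PC differences followed by the scaled position differences.
    B = np.hstack([pc_diff, w_s * xy_diff / S])
    z = B.shape[0]
    # Equation (8), with the norm assumed to be the Frobenius norm.
    return float(np.linalg.norm(B) / np.sqrt(z))

def needs_finer_scale(d_j, T2):
    """True when the superpixel should be segmented again at the next scale."""
    return not d_j < T2
```

Homogeneous superpixels therefore stay large, while heterogeneous ones are passed on to the next, finer scale.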

Finally, the same connectivity-enhancing post-processing as in Section 2.2.2 was applied to the segmentation result to finalize the MSIMRS segmentation.

2.3. Evaluation of Segmentation Effect

The segmentation results were evaluated in terms of the visualization effect, the segmentation time, the number of segmentation iterations, the number of superpixels, and a quantitative evaluation index of segmentation accuracy. The quantitative evaluation index was as follows:

$$Acc_{seg} = \frac{1}{n} \sum_{i = 1}^{n} \frac{Max_{i}}{All_{i}} \tag{9}$$

In Equation (9), $Acc_{seg}$ was the superpixel segmentation accuracy, whose value ranged from 0 to 1, and the closer it was to 1, the better the segmentation effect was. $n$ was the total number of superpixels in the segmentation result, $Max_{i}$ was the number of pixels belonging to the lithological type with the largest proportion in the superpixel numbered $i$, and $All_{i}$ was the total number of pixels contained in the superpixel numbered $i$.
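Equation (9) can be computed directly from a superpixel label map and a reference lithology map, as in the sketch below. The function name `segmentation_accuracy` is hypothetical, and the example assumes that every pixel carries a non-negative integer class label; how unlabeled pixels were handled in the paper is not stated.

```python
import numpy as np

def segmentation_accuracy(superpixel_labels, class_labels):
    """Mean share of the dominant lithological class over all superpixels."""
    sp = np.asarray(superpixel_labels).ravel()
    cls = np.asarray(class_labels).ravel()
    ratios = []
    for sp_id in np.unique(sp):
        members = cls[sp == sp_id]
        # Max_i: pixel count of the most frequent class; All_i: superpixel size.
        counts = np.bincount(members)
        ratios.append(counts.max() / members.size)
    return float(np.mean(ratios))
```

For two superpixels of four pixels each, one containing three pixels of a single class and the other being pure, the index is (3/4 + 4/4) / 2 = 0.875.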

2.4. Experimental Settings

We used the MSIMRS-SVM method to carry out the lithology identification experiment in the Duolun study area. The parameters of the MSIMRS algorithm were optimally determined in a 1000 × 1000-pixel region by comprehensively comparing the segmentation visualization effect, segmentation time, number of iterations, superpixel number, segmentation accuracy, and appropriate weighted results between statistical indicators with significant differences. Finally, the spatial distance coefficient $w_{s}$ was set to 0.03, the clustering center change threshold $T_{1}$ was set to $10^{-3}$, and the distance indicator threshold $T_{2}$ was set to 0.23. Four segmentation scales (320 m, 160 m, 80 m, and 40 m) were used. The segmentation scales were determined based on specific application scenarios, actual ground feature distributions, and the general rules of geological survey. The minimum superpixel side length was set to be less than the minimum side length of a geological body unit, as required by the general rules of geological mapping at a scale of 1:50,000, which specifies 50 m. The maximum superpixel side length was close to the size of the Quaternary lithological unit with the largest number and area in our study area. Each dimension feature value of each superpixel was represented by the average value of all the pixels within the superpixel in each feature image.
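The superpixel-level feature vectors described at the end of this paragraph can be obtained by averaging each feature image over the pixels of every superpixel. The sketch below is one possible way to do this; the function name `superpixel_mean_features` is hypothetical, and the labels are assumed to be contiguous integers starting at 0 so that every label owns at least one pixel.

```python
import numpy as np

def superpixel_mean_features(features, labels):
    """Average every feature image over the pixels of each superpixel.

    features: (H, W, F) stack of feature images; labels: (H, W) integer map.
    Returns an (n_superpixels, F) array, one feature vector per superpixel.
    """
    flat_labels = np.asarray(labels).ravel()
    flat_feats = np.asarray(features, dtype=float).reshape(flat_labels.size, -1)
    n_sp = int(flat_labels.max()) + 1
    counts = np.bincount(flat_labels, minlength=n_sp)
    sums = np.zeros((n_sp, flat_feats.shape[1]))
    # Accumulate the feature vectors of all pixels sharing a label.
    np.add.at(sums, flat_labels, flat_feats)
    return sums / counts[:, None]
```

With the 35 feature images of the study area, each superpixel would thus be described by a 35-dimensional vector that is passed to the classifier in place of the individual pixels.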

In addition, we selected the pixel-level K-Nearest Neighbor (KNN) [46], Random Forest (RF) [47], and SVM [48] classification algorithms, as well as deep-learning models including Resnet50 (Res50) [49], Efficientnet_B8 (Effi_B8) [50], and Vision Transformer (ViT) [51], for comparative analysis. We also combined the MSIMRS algorithm with the KNN and RF classification algorithms for comparative analysis, named MSIMRS-KNN and MSIMRS-RF, respectively.

For all pixel-level classification algorithms, in our study area, 0.5% of the samples for each category were chosen as the training set at random, totaling 2671 samples, and the remaining samples were used for verification. All the fitness values were determined by the five-fold cross-validation of the classifiers on the training set.

For the three deep-learning models, we annotated 1800 regions of 96 × 96 pixels on the optimal feature images, covering 18 categories with 100 images per category. The annotated dataset was subsequently divided into training and testing sets in equal proportions across all categories, yielding a total of 1440 training images and 360 testing images.

For evaluating the classification results, we used indicators such as overall accuracy (OA), Kappa coefficient (K), precision (P), and recall (R), calculated as follows:

$$OA = \frac{N_{c}}{N_{a}} \times 100\% \tag{10}$$

$$K = \frac{OA - l}{1 - l}, \quad l = \frac{\sum_{i = 1}^{C} a_{i} \times b_{i}}{N_{a}^{2}} \tag{11}$$

$$P = \frac{t}{t + f} \tag{12}$$

$$R = \frac{t}{t + n} \tag{13}$$

In Equations (10) and (11), $N_{c}$ was the number of samples correctly classified for all the categories, $N_{a}$ was the total number of samples, $C$ was the number of categories, $a_{i}$ was the number of real samples for each category, and $b_{i}$ was the number of predicted samples for each category; in Equation (11), OA is taken as a fraction rather than a percentage. In Equations (12) and (13), $t$ was the number of samples in which the predicted and actual categories were consistent, $f$ was the number of samples that were actually not of a certain class but were predicted to be of that class, and $n$ was the number of samples that were actually of a certain class but were predicted to be of another class.
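As a worked illustration of Equations (10)–(13), the sketch below derives all four indicators from a confusion matrix. The function name `classification_metrics` and the row/column convention (rows are actual classes, columns are predicted classes) are assumptions of this example, and OA is returned as a fraction so that it can be used directly in Equation (11).

```python
import numpy as np

def classification_metrics(confusion):
    """OA, Kappa, and per-class precision and recall from a confusion matrix.

    confusion[i, j]: number of samples of actual class i predicted as class j.
    """
    cm = np.asarray(confusion, dtype=float)
    n_a = cm.sum()
    correct = np.diag(cm)          # t in Equations (12) and (13), per class
    oa = correct.sum() / n_a       # Equation (10), as a fraction
    a = cm.sum(axis=1)             # real samples per class
    b = cm.sum(axis=0)             # predicted samples per class
    l = np.sum(a * b) / n_a ** 2   # chance agreement in Equation (11)
    kappa = (oa - l) / (1 - l)
    precision = correct / b        # t / (t + f)
    recall = correct / a           # t / (t + n)
    return oa, kappa, precision, recall
```

For the two-class matrix `[[40, 10], [20, 30]]`, this gives OA = 0.7, $l$ = 0.5, and a Kappa coefficient of 0.4, with precisions of 40/60 and 30/40 and recalls of 0.8 and 0.6.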

3. Results

3.1. Visualization Comparison

Figure 6 presents the single-scale segmentation result and the MSIMRS segmentation result in a 1000 × 1000-pixel region of our study area, using the optimal segmentation parameters detailed in Section 2.4. Additionally, it displays the classification results corresponding to these two segmentation methods. The base maps in Figure 6a,b are RGB composite images combining short-wave infrared, near-infrared, and red bands. Statistics revealed that the multi-scale segmentation result in Figure 6b comprised 8241 superpixels, whereas the single-scale segmentation result in Figure 6a included 9985 superpixels. Compared to the single-scale result, the multi-scale segmentation result had approximately 17.5% fewer superpixels, which effectively enhanced the efficiency of subsequent lithology classification. In addition, as shown in Figure 6c,d, the lithology classification results based on single-scale segmentation and MSIMRS segmentation reveal that the differences between the two mainly lie in the edge regions of larger superpixel blocks. Notably, the lithology boundaries corresponding to MSIMRS segmentation are more accurate. Overall, MSIMRS segmentation demonstrates superior efficiency and effectiveness in lithology classification compared to single-scale segmentation.

Figure 7 presents the lithology classification maps for our study area using different classification algorithms. From these maps, it is evident that the classification results obtained using our MSIMRS-SVM method exhibit a more consistent correspondence with the RS imagery than those from other algorithms. Specifically, the classification results from the KNN, RF, and (standalone) SVM algorithms contained more fragmented patches. This was due to the fact that these pixel-level classification methods, when processing image data, primarily focused on individual pixels or comparisons of local pixel features, lacking the utilization of global features. Consequently, numerous fine-grained patches emerged in the classification results, reducing the accuracy and stability of the classification process. In contrast, the compared deep-learning models were capable of automatically learning and extracting high-level features from images, providing a more abstract representation of the image content and thereby reducing the occurrence of fine-grained patches. Furthermore, the superpixel-level classification with MSIMRS integrated pixels with similar color and texture features to form larger, more regular superpixels, which resulted in a reasonable reduction in fine-grained patches while capturing and preserving image details.

Figure 8 compares the lithologic classification results of the whole study area obtained by the SVM algorithm and MSIMRS-SVM algorithm. In terms of consistency with the geologic schematic map of the study area shown in Figure 2, the lithology classification result obtained by the MSIMRS-SVM algorithm shown in Figure 8b was superior to that obtained by the SVM algorithm shown in Figure 8a. Moreover, it was obvious that the boundaries of the lithological distribution map obtained by the proposed method in Figure 8b were clearer, the speckled phenomenon was reduced, and the classification visualization effect was significantly improved compared with the SVM algorithm.

3.2. Accuracy Comparison

Table 3 presents the OAs and Kappa coefficients of the lithology classification results obtained using various algorithms in our study area.

According to the statistical results, KNN and Res50 exhibited relatively lower classification accuracies, with OAs of 77.8% and 77.4%, respectively, and both possessing a Kappa coefficient of 0.74. In contrast, SVM, Effi_B8, and ViT algorithms demonstrated higher classification accuracies, with OAs surpassing 80%. Notably, the proposed MSIMRS-SVM method achieved the highest classification accuracy in the Duolun study area, boasting an OA of 92.9% and a Kappa coefficient of 0.92.

Figure 9 presents the precision and recall statistics for different lithological types obtained by various algorithms in our study area. As shown in Figure 9, in the Duolun study area, the precision and recall for lithological types 1 to 7 corresponding to Quaternary lithologies were superior to those for lithological types 8 to 15 corresponding to other rock lithologies. Among the different algorithms, KNN and RF exhibited the worst performance in terms of precision and recall across all lithological types. Among the three deep-learning models, Res50 achieved the highest precision for various lithological types, and ViT achieved the highest recall for various lithological types.

Compared to other classification algorithms, our MSIMRS-SVM algorithm achieved the highest classification precision and recall. Even for lithological type 10, which exhibited the lowest precision, it still managed to reach 72.4%. Furthermore, with the exception of the 15th lithology category, which had a recall rate of 56%, all other lithology categories exhibited recall rates higher than 76%. The results obtained through the MSIMRS-SVM algorithm demonstrated significant advantages in lithology classification within the semi-arid area.

4. Discussion

This paper conducted lithology classification experiments in the typical semi-arid area of Duolun County, characterized by limited rock exposures and extensive vegetation coverage within complex geographical and geological conditions. The experiments compared our proposed MSIMRS-SVM classification method against six other algorithms: KNN, RF, SVM, Res50, Effi_B8, and ViT. In arid areas with simpler geographical and geological conditions, optical remote sensing data alone can yield satisfactory lithology classification accuracy [52]. However, to accommodate the complex geological and geographical conditions of the semi-arid area, we integrated GF-2, Sentinel-2A, ASTER, and GF-3 remote sensing data, leveraging the rich spectral information of optical remote sensing and the penetration capability of radar remote sensing. In terms of overall classification accuracy, the MSIMRS-SVM algorithm achieved the highest OA and Kappa coefficient in the Duolun area.

However, the classification accuracies of different methods varied significantly for various lithological types and other ground objects in our study area. As illustrated in Figure 9, the MSIMRS-SVM classification method exhibited the highest classification accuracy among the different methods. The geological conditions in the study area are complex, with diverse vegetation types and uneven coverage. The rocks are prone to weathering, and large areas of Quaternary deposits or vegetation cover obscure rock outcrops, rendering the lithological information blurred on remote sensing images. This blurriness increases the difficulty of lithology classification, as classification algorithms need to accurately identify and distinguish between the characteristics of different lithologies.

In pixel-level classification algorithms, the KNN, RF, and SVM primarily focus on individual pixels or comparisons of local pixel features. However, in our study area with complex geological conditions and extensive vegetation cover, the classification accuracy of these algorithms is difficult to guarantee. This is because a single pixel may not fully reflect the true characteristics of different lithologies, and comparisons of local pixel features can also be influenced by interfering factors such as surrounding vegetation and soil.

When the sample dataset is limited, deep-learning models such as Res50, Effi_B8, and ViT also face challenges in lithology classification tasks. Although these models possess powerful feature extraction and classification capabilities, their performance largely depends on the quantity and quality of training data. In areas with complex geological conditions, lithological classification tasks often face issues of insufficient training data or inaccurate labels. This limits the generalization ability of the models, making it difficult for them to accurately capture the complex textures, colors, shapes, and spatial relationships of lithologies in the study area.

In contrast, superpixel-level classification methods, typically based on finer image segmentation and feature extraction, can more accurately capture both the local and global features of lithologies and are more suitable for processing lithological images with complex textures and structures. Experimental results have shown that combining the proposed MSIMRS segmentation algorithm with pixel-level classification methods including KNN, RF, and SVM improves the accuracy of lithology classification. In particular, the best classification results were obtained when combining it with the SVM. This is because the SVM classifier excels in performance with small samples and generalization ability [53]. The SVM distinguishes between samples of different categories by finding an optimal hyperplane, making the MSIMRS-SVM classification strategy more effective for lithological classification tasks with limited sample coverage areas and complex textures and structures.
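To make the idea of a superpixel primitive concrete, the sketch below shows one simple way to turn per-pixel feature vectors into per-superpixel feature vectors by averaging within each segment. It is not the MSIMRS segmentation itself: the segment map is assumed to be given, and the function name, the toy segment map, and the two-dimensional toy features are all hypothetical.

```python
from typing import Dict, List, Sequence


def superpixel_mean_features(
    segments: Sequence[Sequence[int]],
    features: Sequence[Sequence[Sequence[float]]],
) -> Dict[int, List[float]]:
    """Average the feature vectors of all pixels sharing a segment id."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for segment_row, feature_row in zip(segments, features):
        for segment_id, vector in zip(segment_row, feature_row):
            if segment_id not in sums:
                sums[segment_id] = [0.0] * len(vector)
                counts[segment_id] = 0
            for d, value in enumerate(vector):
                sums[segment_id][d] += value
            counts[segment_id] += 1
    return {s: [v / counts[s] for v in sums[s]] for s in sums}


# Hypothetical 2 x 3 image with two superpixels (ids 0 and 1).
toy_segments = [
    [0, 0, 1],
    [0, 1, 1],
]
# Hypothetical 2-D feature vector per pixel, with some pixel-level noise.
toy_features = [
    [[1.0, 10.0], [3.0, 10.0], [8.0, 2.0]],
    [[2.0, 10.0], [9.0, 2.0], [10.0, 2.0]],
]

toy_means = superpixel_mean_features(toy_segments, toy_features)
print(toy_means)
```

Each superpixel then contributes a single, less noisy sample to the classifier, and every pixel in the segment receives the same label. This is one plausible reason why a superpixel-based result shows fewer isolated patches than a pixel-level result, and it also reduces the number of samples the classifier must process.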

When analyzing the types to be classified, we found that ground objects including construction, roads, and water bodies achieved the highest accuracy, followed by the Quaternary lithologies. The lowest classification accuracy was observed for rock or rock combination lithologies. This disparity can be traced back to the inherently distinct characteristics of each lithology and ground object type. Moreover, lithology is more prone to mislabeling than other ground objects [54].

The geometric, spectral, backscattering, and texture features of construction, roads, and water bodies were regular and prominent, which can be recognized by visual interpretation. These features were reliably and accurately sampled, resulting in high precision and recall values.

Compared to construction, roads, and water bodies, the Quaternary lithologies appeared less regular in the various feature images, yet they covered a broader area and were more continuous. The samples of Quaternary lithologies selected through a combination of field investigation and visual interpretation were deemed more reliable than those of rock or rock combination lithologies, leading to a high classification accuracy. Notably, the MSIMRS-SVM classification method achieved the highest accuracy in classifying the Quaternary lithologies.

The classification accuracy of the rock or rock combination lithologies was the lowest, and their precision was significantly higher than their recall for all three methods. Specifically, when using the MSIMRS-SVM method, the average precision for these lithologies was 85.59%, whereas the average recall was only 78.72%, a gap of 6.87 percentage points. The low classification accuracy of the rock or rock combination lithologies can be attributed to the unique geological conditions of the study area. The lithologies predominantly consisted of easily weathered volcanic lava and pyroclastic rocks, which had limited outcrops and differed only minimally across the various feature images. Consequently, distinguishing between these lithologies and selecting samples for them proved challenging. Because recall reflects, to a degree, the quality of the samples, future studies should prioritize improving sample reliability [55].

Compared with existing research on lithology classification based on superpixel segmentation of polarimetric SAR images [28], our research draws on more data sources and provides a multi-scale superpixel segmentation scheme that integrates multi-source RS data to adapt to the geographical and geological conditions of semi-arid areas. Meanwhile, our proposed MSIMRS-SVM classification method is superior to the other compared classification algorithms in the visualization of classification results and in classification accuracy and efficiency, and has the best comprehensive classification performance, indicating that multi-scale superpixel segmentation integrating multi-source RS data has application potential for lithology classification in semi-arid areas. The findings presented in this paper are expected to significantly alleviate the burden of field surveys for lithology classification in semi-arid areas. Nonetheless, to ensure the reliability of the classification results when extending these findings to other semi-arid locales, limited field investigations remain essential to assist in labeling the classification samples, and the accuracy requirements for all related fieldwork must be high.

Although our research results can provide an effective reference for geologists working in semi-arid areas, the experimental design of this paper still has some shortcomings. In terms of data, other RS data such as hyperspectral, S- and L-band SAR, and topographic data could be adopted to further optimize the lithology classification results in semi-arid areas. In terms of computational efficiency, our method shows a notable advantage in the classification stage, reducing the classification time by a factor of nearly 100 compared with existing methods, according to our statistics. However, the segmentation phase requires more time, with the segmentation running time of MSIMRS in the study area reaching 144.49 s/km². In recognition of this limitation, we will further optimize the segmentation program in the future to enhance computational efficiency. In addition, lithology classification in other areas with more complex geographical and geological conditions could be further studied.
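As a rough illustration of what the reported segmentation cost of 144.49 s/km² implies in practice, the short Python sketch below scales it linearly to an area of interest. The linear scaling, the function name, and the 100 km² example area are assumptions made here for illustration; actual running time will depend on hardware and scene content.

```python
# Segmentation running time of MSIMRS reported in the text, in seconds per km².
SEGMENTATION_SECONDS_PER_KM2 = 144.49


def estimate_segmentation_seconds(
    area_km2: float, rate: float = SEGMENTATION_SECONDS_PER_KM2
) -> float:
    """Linearly extrapolate the segmentation time for a given area (assumption)."""
    if area_km2 < 0:
        raise ValueError("area_km2 must be non-negative")
    return area_km2 * rate


# Hypothetical mapping block of 100 km² (not the size of the study area).
example_seconds = estimate_segmentation_seconds(100.0)
example_hours = example_seconds / 3600.0
print(f"{example_seconds:.0f} s, about {example_hours:.1f} h")
```

Under these assumptions, a 100 km² block would need roughly four hours of segmentation time, which illustrates why optimizing the segmentation program is identified as future work.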

5. Conclusions

In this paper, a new lithology classification method named MSIMRS-SVM was proposed, and lithology classification experiments were carried out in the Duolun study area. The proposed MSIMRS-SVM method was compared with KNN, RF, SVM, Res50, Effi_B8, and ViT classification algorithms. The key research conclusions are as follows:

(1): Compared with other algorithms, especially the pixel-level KNN, RF, and SVM classification algorithms, the fragmented patches in the lithology classification results obtained by MSIMRS-SVM were significantly reduced. This improvement led to a clearer representation of the lithology classes.
(2): In our study area, Duolun County, the proposed MSIMRS-SVM method obtained the highest lithology classification accuracy, with an OA of 92.9% and a Kappa coefficient of 0.92. Compared with the ViT model, which performed best among the other algorithms in the comparison, the OA was 6.5 percentage points higher. The experimental results demonstrated the reliability of our proposed MSIMRS-SVM lithology classification method.
(3): The proposed MSIMRS-SVM classification method exhibits the best comprehensive classification performance and shows a certain application potential in lithology classification in semi-arid areas. This method can provide a more reliable technical reference for geological survey workers. In the future, on the basis of existing data sources, we will integrate SAR data from other bands, topographic data, and their derived features to carry out lithology classification research, and devote ourselves to exploring the lithology classification of regions with more complex geographical and geological environmental conditions.

Author Contributions

Conceptualization, J.L. and L.H.; methodology, J.L.; software, J.W. and L.L.; validation, J.L., L.L. and J.W.; formal analysis, J.L.; investigation, J.L., Z.X. and Z.B.; resources, L.H.; data curation, J.L. and L.L.; writing—original draft preparation, J.L.; writing—review and editing, H.H.; visualization, J.L.; funding acquisition, L.H. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Technology of the People’s Republic of China, grant number 91-Y50G32-9001-22/23, the National Natural Science Foundation of China, grant number 42171348 and the Postdoctoral Fellowship Program of CPSF, grant number GZC20232065.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Pereira, J.; Pereira, A.J.S.C.; Gil, A.; Mantas, V.M. Lithology Mapping with Satellite Images, Fieldwork-Based Spectral Data, and Machine Learning Algorithms: The Case Study of Beiras Group (Central Portugal). Catena 2023, 220, 106653.
2. Han, W.; Li, J.; Wang, S.; Zhang, X.; Dong, Y.; Fan, R.; Zhang, X.; Wang, L. Geological Remote Sensing Interpretation Using Deep Learning Feature and an Adaptive Multisource Data Fusion Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14.
3. Wang, S.; Huang, X.; Han, W.; Li, J.; Zhang, X.; Wang, L. Lithological Mapping of Geological Remote Sensing via Adversarial Semi-Supervised Segmentation Network. Int. J. Appl. Earth Obs. Geoinf. 2023, 125, 103536.
4. Lu, J.; Han, L.; Zha, X.; Li, L. Lithology Classification in Semi-Arid Areas Based on Vegetation Suppression Integrating Microwave and Optical Remote Sensing Images: Duolun County, Inner Mongolia Autonomous Region, China. Geocarto Int. 2022, 37, 1–24.
5. Shebl, A.; Abdellatif, M.; Hissen, M.; Ibrahim Abdelaziz, M.; Csámer, Á. Lithological Mapping Enhancement by Integrating Sentinel 2 and Gamma-Ray Data Utilizing Support Vector Machine: A Case Study from Egypt. Int. J. Appl. Earth Obs. Geoinf. 2021, 105, 102619.
6. Pan, T.; Zuo, R.; Wang, Z. Geological Mapping via Convolutional Neural Network Based on Remote Sensing and Geochemical Survey Data in Vegetation Coverage Areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 3485–3494.
7. Huo, Y.; Cheng, X.; Lin, S.; Zhang, M.; Wang, H. Memory-Augmented Autoencoder with Adaptive Reconstruction and Sample Attribution Mining for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5518118.
8. Alifu, H.; Tateishi, R.; Johnson, B. A New Band Ratio Technique for Mapping Debris-Covered Glaciers Using Landsat Imagery and a Digital Elevation Model. Int. J. Remote Sens. 2015, 36, 2063–2075.
9. Mohan, M.; Meyyappan, M. Mapping of Mafic-Ultramafic Rocks in SMUC-SGT, India Using ASTER & Sentinel-2A Satellite Images. Remote Sens. Appl. Soc. Environ. 2022, 28, 100826.
10. Comon, P. Independent Component Analysis, A New Concept? Signal Process. 1994, 36, 287–314.
11. Wang, R.; Lin, J.; Zhao, B.; Li, L.; Xiao, Z.; Pilz, J. Integrated Approach for Lithological Classification Using ASTER Imagery in a Shallowly Covered Region-The Eastern Yanshan Mountain of China. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 4791–4807.
12. Lu, Y.; Yang, C.; Meng, Z. Lithology Discrimination Using Sentinel-1 Dual-Pol Data and SRTM Data. Remote Sens. 2021, 13, 1280.
13. Cheng, X.; Huo, Y.; Lin, S.; Dong, Y.; Zhao, S.; Zhang, M.; Wang, H. Deep Feature Aggregation Network for Hyperspectral Anomaly Detection. IEEE Trans. Instrum. Meas. 2024, 73, 5033016.
14. Cheng, X.; Zhang, M.; Lin, S.; Zhou, K.; Zhao, S.; Wang, H. Two-Stream Isolation Forest Based on Deep Features for Hyperspectral Anomaly Detection. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5504205.
15. Ren, B.; Ma, S.; Hou, B.; Hong, D.; Chanussot, J.; Wang, J.; Jiao, L. A Dual-Stream High Resolution Network: Deep Fusion of GF-2 and GF-3 Data for Land Cover Classification. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102896.
16. Zhang, C.; Feng, Y.; Hu, L.; Tapete, D.; Pan, L.; Liang, Z.; Cigna, F.; Yue, P. A Domain Adaptation Neural Network for Change Detection with Heterogeneous Optical and SAR Remote Sensing Images. Int. J. Appl. Earth Obs. Geoinf. 2022, 109, 102769.
17. Ling, J.; Wei, S.; Gamba, P.; Liu, R.; Zhang, H. Advancing SAR Monitoring of Urban Impervious Surface with a New Polarimetric Scattering Mixture Analysis Approach. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103541.
18. Dong, H.; Zhang, L.; Zou, B. Exploring Vision Transformers for Polarimetric SAR Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
19. Chen, L.; Cai, X.; Xing, J.; Li, Z.; Zhu, W.; Yuan, Z.; Fang, Z. Towards Transparent Deep Learning for Surface Water Detection from SAR Imagery. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103287.
20. Guo, S.; Yang, C.; He, R.; Li, Y. Improvement of Lithological Mapping Using Discrete Wavelet Transformation from Sentinel-1 SAR Data. Remote Sens. 2022, 14, 5824.
21. Shirmard, H.; Farahbakhsh, E.; Müller, R.D.; Chandra, R. A Review of Machine Learning in Processing Remote Sensing Data for Mineral Exploration. Remote Sens. Environ. 2022, 268, 112750.
22. Han, W.; Zhang, X.; Wang, Y.; Wang, L.; Huang, X.; Li, J.; Wang, S.; Chen, W.; Li, X.; Feng, R.; et al. A Survey of Machine Learning and Deep Learning in Remote Sensing of Geological Environment: Challenges, Advances, and Opportunities. ISPRS J. Photogramm. Remote Sens. 2023, 202, 87–113.
23. El Fels, A.E.A.; El Ghorfi, M. Using Remote Sensing Data for Geological Mapping in Semi-Arid Environment: A Machine Learning Approach. Earth Sci. Inform. 2022, 15, 485–496.
24. He, L.; Lyu, P.; He, Z.; Zhou, J.; Hui, B.; Ye, Y.; Hu, H.; Zeng, Y.; Xu, L. Identification of Radioactive Mineralized Lithology and Mineral Prospectivity Mapping Based on Remote Sensing in High-Latitude Regions: A Case Study on the Narsaq Region of Greenland. Minerals 2022, 12, 692.
25. Khan, M.F.A.; Muhammad, K.; Bashir, S.; Din, S.U.; Hanif, M. Mapping Allochemical Limestone Formations in Hazara, Pakistan Using Google Cloud Architecture: Application of Machine-Learning Algorithms on Multispectral Data. ISPRS Int. J. Geo-Inf. 2021, 10, 58.
26. Wang, Z.; Zuo, R.; Dong, Y. Mapping of Himalaya Leucogranites Based on ASTER and Sentinel-2A Datasets Using a Hybrid Method of Metric Learning and Random Forest. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 1925–1936.
27. Abdelkader, M.A.; Watanabe, Y.; Shebl, A.; El-Dokouny, H.A.; Dawoud, M.; Csámer, Á. Effective Delineation of Rare Metal-Bearing Granites from Remote Sensing Data Using Machine Learning Methods: A Case Study from the Umm Naggat Area, Central Eastern Desert, Egypt. Ore Geol. Rev. 2022, 150, 105184.
28. Wang, W.; Ren, X.; Zhang, Y.; Li, M. Deep Learning Based Lithology Classification Using Dual-Frequency Pol-SAR Data. Appl. Sci. 2018, 8, 1513.
29. Lu, J.; Han, L.; Liu, L.; Wang, J.; Xia, Z.; Jin, D.; Zha, X. Lithology Classification in Semi-Arid Area Combining Multi-Source Remote Sensing Images Using Support Vector Machine Optimized by Improved Particle Swarm Algorithm. Int. J. Appl. Earth Obs. Geoinf. 2023, 119, 103318.
30. Ren, X.; Malik, J. Learning a Classification Model for Segmentation. Proc. IEEE Int. Conf. Comput. Vis. 2003, 1, 10–17.
31. Ma, W.; Qu, J.; Wang, L.; Zhang, C.; Yang, A.; Zhang, Y. Pellet Image Segmentation Model of Superpixel Feature-Based Support Vector Machine in Digital Twin. Appl. Soft Comput. 2024, 151, 111083.
32. Ni, J.C.; Luo, Y.; Wang, D.; Liang, J.; Zhang, Q. Saliency-Based SAR Target Detection via Convolutional Sparse Feature Enhancement and Bayesian Inference. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5202015.
33. Zhan, J.; Zhao, H.; Zheng, P.; Wu, H.; Wang, L. Salient Superpixel Visual Tracking with Graph Model and Iterative Segmentation. Cognit. Comput. 2021, 13, 821–832.
34. Han, W.; Lekamalage, C.K.L.; Huang, G.-B. Efficient Joint Model Learning, Segmentation and Model Updating for Visual Tracking. Neural Netw. 2022, 147, 175–185.
35. Mi, L.; Chen, Z. Superpixel-Enhanced Deep Neural Forest for Remote Sensing Image Semantic Segmentation. ISPRS J. Photogramm. Remote Sens. 2020, 159, 140–152.
36. Ma, F.; Zhang, F.; Xiang, D.; Yin, Q.; Zhou, Y. Fast Task-Specific Region Merging for SAR Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16.
37. Cao, S.; Yu, Y.; Guan, H.; Peng, D.; Yan, W. Affine-Function Transformation-Based Object Matching for Vehicle Detection from Unmanned Aerial Vehicle Imagery. Remote Sens. 2019, 11, 1708.
38. Liu, Y.; Wang, Q.; Zhuang, Y.; Hu, H. A Novel Trail Detection and Scene Understanding Framework for a Quadrotor UAV with Monocular Vision. IEEE Sens. J. 2017, 17, 6778–6787.
39. Xiang, D.; Zhang, F.; Zhang, W.; Tang, T.; Guan, D.; Zhang, L.; Su, Y. Fast Pixel-Superpixel Region Merging for SAR Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2020, 59, 9319–9335.
40. Yang, Z.; Niu, H.; Wang, X.; Huang, L.; Yang, K. An Unsupervised Semantic Segmentation Method That Combines the ImSE-Net Model with SLICm Superpixel Optimization. Int. J. Digit. Earth 2024, 17, 2341970.
41. Wu, R.; Tang, H.; Lu, Y. Exploring Subjective Well-Being and Ecosystem Services Perception in the Agro-Pastoral Ecotone of Northern China. J. Environ. Manag. 2022, 318, 115591.
42. Cloude, S.R.; Pottier, E. An Entropy Based Classification Scheme for Land Applications of Polarimetric SAR. IEEE Trans. Geosci. Remote Sens. 1997, 35, 68–78.
43. An, W.; Cui, Y.; Yang, J. Three-Component Model-Based Decomposition for Polarimetric SAR Data. IEEE Trans. Geosci. Remote Sens. 2010, 48, 2732–2739.
44. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2281.
45. Li, L.; Han, L.; Hu, H.; Liu, Z.; Cao, H. Standardized Object-Based Dual CNNs for Very High-Resolution Remote Sensing Image Classification and Standardization Combination Effect Analysis. Int. J. Remote Sens. 2020, 41, 6635–6663.
46. Mishra, A.; Sharma, A.; Patidar, A.K. Evaluation and Development of a Predictive Model for Geophysical Well Log Data Analysis and Reservoir Characterization: Machine Learning Applications to Lithology Prediction. Nat. Resour. Res. 2022, 31, 3195–3222.
47. Peng, X.; He, G.; She, W.; Zhang, X.; Wang, G.; Yin, R.; Long, T. A Comparison of Random Forest Algorithm-Based Forest Extraction with GF-1 WFV, Landsat 8 and Sentinel-2 Images. Remote Sens. 2022, 14, 5296.
48. Zeng, L.; Li, T.; Huang, H.; Zeng, P.; He, Y.; Jing, L.; Yang, Y.; Jiao, S. Identifying Emeishan Basalt by Supervised Learning with Landsat-5 and ASTER Data. Front. Earth Sci. 2023, 10, 1–9.
49. Dawson, H.L.; John, C.M. Object Detection Algorithms to Identify Skeletal Components in Carbonate Cores. Mar. Pet. Geol. 2024, 167, 106965.
50. Shen, A.; Chen, R.; Zhu, Y.; Hu, R. Segmentation of Multi-Organ Functional Tissue Units Using UNet-EfficientNet-B8. In Proceedings of the 2023 4th International Symposium on Artificial Intelligence for Medicine Science, Chengdu, China, 20–22 October 2023; Association for Computing Machinery: New York, NY, USA, 2024; pp. 232–235.
51. Xue, Z.; Tan, X.; Yu, X.; Liu, B.; Yu, A.; Zhang, P. Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification. IEEE Trans. Image Process. 2022, 31, 3095–3110.
52. Chen, Y.; Wang, Y.; Zhang, F.; Dong, Y.; Song, Z.; Liu, G. Remote Sensing for Lithology Mapping in Vegetation-Covered Regions: Methods, Challenges, and Opportunities. Minerals 2023, 13, 1153.
53. Sun, H.; Wang, L.; Liu, H.; Sun, Y. Hyperspectral Image Classification with the Orthogonal Self-Attention ResNet and Two-Step Support Vector Machine. Remote Sens. 2024, 16, 1010.
54. Zhu, X.; Zhang, H.; Zhu, R.; Ren, Q.; Zhang, L. Classification with Noisy Labels through Tree-Based Models and Semi-Supervised Learning: A Case Study of Lithology Identification. Expert Syst. Appl. 2024, 240, 122506.
55. Ircio, J.; Lojo, A.; Mori, U.; Malinowski, S.; Lozano, J.A. Minimum Recall-Based Loss Function for Imbalanced Time Series Classification. IEEE Trans. Knowl. Data Eng. 2023, 35, 10024–10034.

Figure 1. The location map of the study area.

Figure 2. Geologic schematic map of the study area.

Figure 3. The superpixel clustering criterion integrating multi-source RS data.

Figure 4. Flowchart of single-scale superpixel segmentation.

Figure 5. Flowchart of the MSIMRS algorithm.

Figure 6. Segmentation results in the region of 1000 × 1000 pixels in our study area: (a) single-scale segmentation result; (b) MSIMRS segmentation result; (c) classification result corresponding to single-scale segmentation; (d) classification result corresponding to MSIMRS segmentation; and (e) legend.

Figure 7. Classification maps of different algorithms in our study area.

Figure 8. Lithology classification results of the whole study area through different algorithms: (a) SVM; (b) MSIMRS-SVM; and (c) legend.

Figure 9. Accuracy statistics of different lithology categories in our study area: (a) precision (%); and (b) recall (%). The category numbers are the same as those in the legend in Figure 7.

Table 1. Experimental data source information list of our study area.

Satellite Platforms	Download Source	Time	Number of Scenes	Band or Polarization Mode	Resolution (m)
GF-2	http://www.sasclouds.com/chinese/home, accessed on 21 August 2014	2 June 2020	6	Panchromatic band	1
GF-2		2 June 2020	6	Visible and near-infrared bands	4
Sentinel-2A	https://www.gscloud.cn/search, accessed on 29 June 2015	21 June 2020	1	Short-wave infrared bands	20
ASTER	https://search.earthdata.nasa.gov/search, accessed on 4 March 2000	17 October 2021	1	Thermal infrared bands	90
GF-3	http://www.sasclouds.com/chinese/home, accessed on 14 August 2016	7 August 2019; 23 April 2019	4	Full polarization	5

Table 2. 35-dimensional optimal features for classification experiment of our study area.

Satellite Platforms	Feature Description	Total Number
GF-2	Panchromatic, green, red, and near-infrared bands	4
GF-2	Texture features corresponding to the panchromatic image, including mean, homogeneity, dissimilarity, and entropy	4
Sentinel-2A	Short-wave infrared bands	2
ASTER	Thermal infrared band 1 and thermal infrared band 3	2
GF-3	HH, HV, and VH polarization backscattering	3
	The polarization features from $H / A / α$ decomposition: mean eigenvalue of the polarization coherence matrix, polarization scattering entropy and average scattering angle	3
	The polarization features from AnYang decomposition: surface scattering and volume scattering	2
	Texture features corresponding to HH polarization, including mean, dissimilarity, entropy, and correlation	4
	Texture features corresponding to HV polarization, including mean, dissimilarity, and correlation	3
	Texture features corresponding to VH polarization, including variance, homogeneity, entropy, and correlation	4
	Texture features corresponding to VV polarization, including mean, variance, homogeneity, and correlation	4

Table 3. The OAs and Kappa coefficients for different algorithms in our study area.

	KNN	RF	SVM	Res50	Effi_B8	ViT	MSIMRS-KNN	MSIMRS-RF	MSIMRS-SVM
OA (%)	77.8	78.2	82.2	77.4	83.9	86.4	82.5	89.6	92.9
Kappa	0.74	0.75	0.79	0.74	0.82	0.85	0.79	0.88	0.92
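
The effect of adding MSIMRS segmentation to each conventional classifier can be read directly from Table 3. The short Python sketch below transcribes the OA row of the table and computes the differences in percentage points; the variable and function names are illustrative assumptions, and only the numbers come from the table.

```python
# OA values (%) transcribed from Table 3.
oa_percent = {
    "KNN": 77.8, "RF": 78.2, "SVM": 82.2,
    "Res50": 77.4, "Effi_B8": 83.9, "ViT": 86.4,
    "MSIMRS-KNN": 82.5, "MSIMRS-RF": 89.6, "MSIMRS-SVM": 92.9,
}


def superpixel_gain(table: dict, base: str) -> float:
    """OA gain, in percentage points, of MSIMRS-<base> over pixel-level <base>."""
    return round(table["MSIMRS-" + base] - table[base], 1)


gains = {base: superpixel_gain(oa_percent, base) for base in ("KNN", "RF", "SVM")}

# Best algorithm that does not use MSIMRS segmentation, and the margin to MSIMRS-SVM.
baselines = {name: oa for name, oa in oa_percent.items() if not name.startswith("MSIMRS-")}
best_baseline = max(baselines, key=baselines.get)
margin = round(oa_percent["MSIMRS-SVM"] - baselines[best_baseline], 1)

print(gains)
print(best_baseline, margin)
```

With these values, the gain is 4.7 percentage points for KNN, 11.4 for RF, and 10.7 for SVM, and MSIMRS-SVM exceeds the best algorithm without MSIMRS segmentation (ViT) by 6.5 percentage points.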

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
