1. Introduction
Remote sensing has the advantages of a large detection range and fast data acquisition. The use of remote sensing for geological mapping can shorten the production cycle of geological maps. Optical sensors and satellites such as ASTER, Landsat-8, and Sentinel-2 are now widely used for large-scale geological mapping due to the typical characteristic spectral profiles of different rocks or minerals [
1,
2,
3]. Hewson et al. [
4] generated maps of carbonate and ferrous iron content from ASTER short-wave infrared (SWIR) data and quartz content from thermal infrared (TIR) data. Almalki et al. [
5] used band ratios to highlight differences between bands of Landsat-8 to map the geology of hard-to-reach islands. Albert et al. [
6] combined Sentinel-2 and the shuttle radar topography mission (SRTM) digital elevation model (DEM) data to produce geological maps of arid regions. Several studies have used more than one type of optical remote sensing data [
7]. Pour et al. [
8] used ASTER and Landsat-8 data to map lithology and alteration minerals in Antarctica. Sekandari et al. [
9] used band ratios and principal component analysis (PCA) methods to extract Pb–Zn mineralization with multiple optical remote sensing data. Different data provide multiple wavelengths and information related to rocks or minerals.
Synthetic aperture radar (SAR) data are less commonly used for geological mapping. As long wavelength data, SAR provides different texture information and scattering mechanisms than optical sensors. Pour et al. [
10] used ASTER data for mineral mapping. PALSAR data were used to detect linear structures (faults and fractures) and curvilinear structures (back-slope and dip). Choe et al. [
11] discussed the application of the roughness characteristics provided by SAR data in assisting the geological mapping of ASTER and Landsat-8 data. Pour et al. [
12] combined Hyperion and SAR data to enable the identification of hydrothermal alteration rocks by textural variations in outcrop and surface roughness. This is useful for exploring gold mineralization in Malaysia.
Suspended material in the air can affect the imaging of optical sensors, while SAR can penetrate clouds and fog to detect the surface directly. Although SAR has many advantages, rock-type discrimination using SAR data alone is less effective. SAR does not have as many bands as multispectral or hyperspectral sensors, so it is difficult to find parameters that are directly related to rocks or minerals. Lu et al. [
13] discussed the capability of Sentinel-1 in lithology discrimination, but only close to half of the samples could be correctly classified. Radford et al. [
14] compared the accuracy of SAR, gravity, and magnetic data for rock-type identification and found that gravity and magnetic data with lower resolution had stronger classification capability. The low accuracy of lithology discrimination shown in their studies may make it difficult to use SAR data alone for lithological mapping. Although the overall accuracy is improved by combining SAR with optical remote sensing data, the difference in the imaging time of multiple sensors affected the reproducibility of the conclusions.
Currently, methods to improve the accuracy of rock-type discrimination based on SAR data focus on (i) improvements in classification algorithms (e.g., improvements in deep-learning algorithms) [
15], (ii) using time series to filter features [
16], and (iii) using multi-frequency or fully polarized data [
15,
17]. With the development of artificial intelligence, the classification ability of classifiers has gradually improved. Recently, deep learning has been commonly used to improve the classification accuracy of remote sensing data [
18,
19]. Deep-learning algorithms require a large number of training samples, and many studies do not investigate the sample points in the field but select samples directly based on geological maps. This has a significant impact on the classification effectiveness due to the limited accuracy of the geological map and the fact that it does not show all rock types. Time series are commonly used in crop and sea ice classification [
20]. Although rock morphology hardly changes within several years, SAR is very sensitive to surface morphology, and different weather conditions may make rock units differently separable. Lu et al. [
16] extracted one year of VV/VH backscatter coefficients from Sentinel-1 images and improved the overall accuracy by 9% compared to a single image. Time series are applicable for open-access data such as Sentinel-1, but for most SAR data, this is costly. More studies have used fully polarized or multi-frequency data for classification [
21,
22]. Data with different polarizations feed back different scattering mechanisms, which is effective in rock identification. However, the use of multi-frequency SAR may face the same limitation of higher cost.
The dual-pol SAR data has only two bands, and the correlation between adjacent rows or columns of the image is strong, so the information available for extraction is limited. It is a meaningful task to mine as many features as possible from one image. We have used wavelet transform for microwave feature-extraction of rock units. The wavelet transform can send the signal from the spatial domain to the frequency domain with little correlation between the transformed components [
23]. The single-dimensional wavelet transform is often used to process vibrational signals and is commonly used in remote sensing to extract features of spectral curves. Yang et al. [
24] found that the high-frequency components of the wavelet-transformed spectral curve have a strong correlation with the feldspar content in rocks. Two-dimensional (2D) wavelet transforms are often used for image denoising, compression, or classification [
25]. SAR images often contain a lot of speckle noise, and wavelet-based denoising methods are widely used [
26,
27]. In image classification, the wavelet transform is often used for feature extraction [
28,
29]. Wavelet-transformed images are better for medical disease diagnosis and land-use classification [
30]. 2D discrete wavelet transform (DWT) produces high-frequency and low-frequency components uniquely characterizing the surface texture, which contributes significantly to identifying sea ice using remote-sensing images [
31,
32]. However, few studies have explored the potential of the 2D wavelet transform for lithological mapping using SAR data. Understanding the performance of different types of mother wavelet functions (MF) in rock-unit classification can help to reduce the experimental time in future studies [
33,
34,
35]. This method does not require multiple images, which improves the classification results while saving costs.
This paper aims to perform the lithological mapping ability of Sentinel-1 in the East Tianshan region of western China with the help of 2D DWT and a random forest classifier. The main objectives of this study are (i) to extract the backscattering features and decomposition features of different types of rock units based on Sentinel-1; (ii) to select the MF that is most suitable for rock type classification; and (iii) to accurately verify and robustly check the different temporal data.
3. Methodology
The method mainly consists of four parts:
(1) The preprocessing of SAR data: the extraction of polarization decomposition parameters and backscatter coefficients;
(2) 2D-DWT of the parameters;
(3) Random forest classification, the selection of MF, and a robustness test;
(4) Lithological mapping and the analysis of classification results.
3.1. Preprocessing of Sentinel-1 and Parameter Extraction
Preprocessing of Sentinel-1 and parameter extraction were implemented in Sentinel’s Application Platform (SNAP) from ESA and PolSARpro v5.1.3 [
38,
39]. PolSARpro v5.1.3 was developed by the Institute of Electronics and Telecommunications of Rennes of the University of Rennes 1 (Rennes, France). The extraction process of scattering parameters (backscatter coefficients) contains orbit correction, radiometric calibration, deburst, multilooking, speckle filtering, and terrain correction. The extraction process of the polarization decomposition parameters (H, A, α, Shannon entropy, λ1, and λ2) generates the complex matrix during radiometric calibration and implements the final decomposition in PolSARpro v5.1.3 [
40,
41]. The filter is selected as the Refined LEE filter, and the DEM used for terrain correction is the SRTM DEM [
42].
Figure 3 shows the flow chart of preprocessing. The ten parameters extracted from Sentinel-1 are shown in
Table 3. They describe the scattering mechanisms for different rock types.
3.2. Discrete Wavelet Transform
The wavelet transform is an ideal tool for signal time-frequency analysis [
23]. The schematic diagram of the 2D-DWT is shown in
Figure 4. The 2D-DWT deals with a 2D numerical discrete matrix, and the process of wavelet transformation includes wavelet decomposition and reconstruction. Each decomposition produces one low-frequency component (A) and three high-frequency components (horizontal (H), vertical (V), and diagonal (D)). The high-frequency component shows the image edges or noise, i.e., the areas where the grayscale values change sharply. The low-frequency information is the area where the grayscale changes slowly.
DWT in this study was implemented in the wavelet toolbox of Matlab R2021b, which was developed by The MathWorks Inc. (Natick, MA, USA) [
43,
44]. Usually, the selection of MF requires continuous attempts to finally select the most suitable MF. We selected MF with both double orthogonality and discrete properties. The Daubechies series, Symlet series, and Coiflet series wavelets were used to implement wavelet decomposition and reconstruction.
Table 4 shows the MF used in this study.
For each MF, we performed one-level decomposition and reconstruction for all parameters extracted from Sentinel-1, i.e., each image is transformed into four images. Even though DWT can carry out multi-level decomposition, only one level of decomposition is performed in this study due to the large number of MF.
3.3. Rock-Type Descriptors
μ and
l2 were used to describe the regional characteristics of each sample in the image with the following equations [
45]:
These descriptors are computed in all the sample regions, where M denotes the number of columns of each sample matrix, N denotes the number of rows of each sample matrix, and f (x,y) denotes the sample matrix.
3.4. Kruskal–Wallis Test
The Kruskal–Wallis (K–W) test was performed on all of the samples to explore the differences in the parameters after DWT and the ability to distinguish between different rock types [
34]. The results of the K–W test were combined with the accuracy of the classifier and used together as an evaluation criterion for classification effectiveness.
3.5. Evaluation of the Classification Accuracy and the Selection of MF
The random forest (RF) method was used to classify rock types. In the process of screening MF, 5-fold cross-validation tests were used to evaluate the classification performance. The RF algorithm used the package randomForest in R 4.0.5, developed by Andy Liaw and Matthew Wiener (Rahway, NJ, USA) [
46]. Overall accuracy, KAPPA coefficients, and F1 score were used to evaluate the classification effectiveness. These metrics evaluate the ability of different MF to improve the classification of rock units from both overall and category perspectives. Because of the differences between the metrics, we combined all of the metrics and selected the excellent MF.
The incidence angle of SAR may affect the final result. To demonstrate the robustness of the MF with great performance to the incidence angle, we selected 13 summer season images. The classification was performed separately using the original SAR parameters and the parameters based on DWT. It was observed whether the overall accuracy was improved after DWT.
3.6. Lithological Mapping
All pixels in the study area are classified using the parameters extracted from the optimal MF. Finally, the resulting map is compared with the geological map and Sentinel-2 data to analyze and summarize the advantages and disadvantages of lithological mapping using SAR data.
5. Discussion
The principle of lithology classification by optical and TIR remote sensing is that different minerals have typical reflectance or emissivity in different bands. Due to the long wavelength of microwaves, there is no strong correlation between parameters and minerals. The echo intensity of SAR in arid regions is correlated with surface roughness and dielectric constant [
47]. Backscatter coefficients, polarization decomposition features, and texture features have been mainly used for lithology classification in previous studies [
13]. In many studies, combining SAR data with DEM enhances the accuracy because different rocks have different morphologies, which may lead to different elevations, but this does not capture the ability of SAR data to distinguish rock types. To discuss the contribution of dual-pol SAR data in lithological mapping, we do not use elevation data.
The use of microwave data for lithological mapping is mainly based on the surface morphology of the rock and whether the rock has been mineralized. Nearly half of the study area is within the Gobi Desert, which is heavily weathered. Conversely to optical sensors, the sensitivity to clast sizes makes SAR more suitable for areas of the outcrop of sedimentary clastic rocks.
Unlike many studies that use SAR data for lithology discrimination, we performed lithological mapping after discussing the classification accuracy. This is one of the most important contributions of geology remote sensing. The F1 score of quaternary (CG) after DWT reached 0.682, which is an impressive accuracy in sediment subdivision. The F1-score for granitoids improved by 13% after DWT (MF: db13). However, 57% is not a very satisfactory overall accuracy, mainly because effusive rocks are almost completely unrecognized correctly. After DWT of some MF, effusive rocks are not even correctly identified at all. There are two possible reasons for this: compared with other rock types, the number of effusive rock samples is small. This may lead to the unbalanced classification of the RF classifier. This is a difficult problem to solve because the distribution of rocks on the surface is uneven, and effusive rocks are not marked in many small-scale geological maps. Another possibility is that effusive rocks have similar backscattering characteristics to other rock types. In the future, we will test 2D DWT in the area with clearer outcrops and explore the microwave scattering model of rock.
Most of the boundaries in the wavelet-transformed (MF: db13) classification results of the study area match the geological map. This indicates that one level of DWT can meet the requirements of large-scale lithological mapping. The granite bodies in the northwestern sector of the study area have the most excellent classification quality. There are two possible reasons for the high backscatter coefficients of the granite bodies. Lu et al. [
13] tested the dielectric constant of rocks in Xingcheng, China, and we show the dielectric constants of rocks at 5.4 GHz frequency in
Table 9. Although granites often have a high dielectric constant, this is not the main reason why they are easy to identify. Rocks in the Gobi and desert often have similar dielectric constants. The dielectric constants of rocks in such a region need to be further studied. Due to the fact the rocks in the study area are scattered as shown in
Table 1, the surface roughness may be the main reason for controlling the backscatter coefficients. More roughness experiments will be carried out in future fieldwork. We note that the granite bodies are classified with striated quaternary sediments and alluvium, which are common vein rocks in the study area. During the field investigation, we found that the vein-like distribution of intermediate and acidic rocks (e.g., pegmatites and porphyries) developed in granite and amphibolite areas (
Figure 9). Due to only a few vein rock outcrops along the survey route, we did not obtain enough samples, which resulted in different types of vein rocks not being distinguished. Nevertheless, the use of SAR data and DWT for vein-rock identification is a topic that can be explored in depth in the future.
Comparing
Figure 7 and
Figure 8, we can see that the resulting boundary is clearer and smoother after using DWT. These are the most important points to improve the accuracy of the geological map. The low-frequency component of DWT has a filtering effect, which makes the resulting map smoother. The high-frequency component reflects the abrupt change of the signal, which can enhance the detection of the rock-type boundary. Most of the high-frequency components ranked low in the K–W test
p-value ranking, which may be due to the small area of each sample, which is not enough to reflect high-frequency components. Obviously, for the classification of the whole study area, the high-frequency components are important. In addition, the ability of vein rocks to be better identified after DWT may be related to the role of the high-frequency components.
Two types of representative features are misclassified: man-made structures on the surface, and locations where significant elevation changes occur. Even though SAR has some penetration ability, it cannot penetrate artificial buildings. In the subsequent process, the surface man-made structures need to be masked off. The boundaries of the folds in the study area and the areas with significant elevation changes were misclassified as granitoids. To some extent, this is a new idea of fold recognition. While previous studies have used spectral information for fold identification, SAR provides roughness information to assist in the identification from another scale. Outside of folds, SAR is often used to identify tectonic phenomena such as faults. We did not perform the identification of faults because the richly developed vein rocks are easily confused with faults in SAR data.
In this study, we performed only one level of DWT. According to the wavelet transform principle, two-level decomposition or higher-level decomposition can yield more variables for classification. Some of the high-frequency components of the one-level decomposition ranked low in the K–W test, and we speculate that some of the components generated by the two-level decomposition may weaken the classification effect. Although the RF classifier is insensitive to features with weak divisibility, we have to consider the fitness of the method for other classifiers.
The traditional method of creating geological maps is to observe the lithology around the geological survey route and delineate similar lithology in the plan. This approach is often less accurate and limited by road conditions. The lithology discrimination method based on 2D DWT and SAR data provided in this study provide a new idea for lithological mapping in arid areas. The wavelet transform technique amplifies the capability of SAR in rock discrimination. Combining DWT with multi-frequency SAR data can be considered in future research to explore a more stable lithological mapping method.
6. Conclusions
Due to the penetrating nature of SAR data, it has been applied to rock-unit classification. However, limited by the low overall accuracy, few studies have directly used dual-pol SAR data for lithological mapping. In this study, we attempt to use DWT to transform the parameters of Sentinel-1 to improve the rock-classification accuracy.
We combined the backscatter coefficients and polarization decomposition parameters extracted from Sentinel-1 and applied the RF algorithm to classify the rock units. All the parameters were decomposed using 94 MF for one level of wavelet decomposition, and the overall accuracy, KAPPA coefficients, and F1 score were used to evaluate the classification effectiveness. The results show that the overall accuracy can be improved regardless of the MF used. Among them, the db13 wavelet has the best improvement for the classification of rock units and has some robustness. Quaternary (FG) are best classified with an F1 score of 0.682. In the resulting map, we find that DWT has greater potential for vein rock identification and SAR works well for the identification of sedimentary rocks with different grain sizes. These results make it possible to use dual-pol SAR data for lithological mapping. Future developments are expected to test the effectiveness of more transformed methods in improving lithological mapping. Moreover, further mosaiced products of Sentinel-1 data acquired at continental scales are needed to prove the capability of DWT in large-scale regional mapping.