Communication

Mapping Bedrock Outcrops in the Sierra Nevada Mountains (California, USA) Using Machine Learning

by
Apoorva Shastry
1,
Corina Cerovski-Darriau
2,3,*,
Brian Coltin
4,5 and
Jonathan D. Stock
2,4
1
Universities Space Research Association, Contractor to the U.S. Geological Survey, Moffett Field, CA 94035, USA
2
U.S. Geological Survey, Moffett Field, CA 94035, USA
3
U.S. Geological Survey, Golden, CO 80401, USA
4
NASA Ames Research Center, Moffett Field, CA 94035, USA
5
KBR Inc., Moffett Field, CA 94035, USA
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(3), 457; https://doi.org/10.3390/rs17030457
Submission received: 5 December 2024 / Revised: 20 January 2025 / Accepted: 27 January 2025 / Published: 29 January 2025
(This article belongs to the Special Issue Advances of Remote Sensing in Land Cover and Land Use Mapping)

Abstract

Accurate, high-resolution maps of bedrock outcrops can be valuable for applications such as models of land–atmosphere interactions, mineral assessments, ecosystem mapping, and hazard mapping. The increasing availability of high-resolution imagery can be coupled with machine learning techniques to improve regional bedrock outcrop maps. In the United States, the existing 30 m U.S. Geological Survey (USGS) National Land Cover Database (NLCD) tends to misestimate the extent of barren land, which includes bedrock outcrops. This impacts many calculations beyond bedrock mapping, including soil carbon storage, hydrologic modeling, and erosion susceptibility. Here, we tested whether a machine learning (ML) model could map exposed bedrock across the entire Sierra Nevada Mountains (California, USA) more accurately than NLCD. The ML model was trained to identify pixels that are likely bedrock from 0.6 m imagery from the National Agriculture Imagery Program (NAIP). First, we labeled exposed bedrock at twenty sites covering more than 83 km2 (0.13%) of the Sierra Nevada region. These labels were then used to train and test the model, which achieved 83% precision, 78% recall, and 90% overall accuracy in predicting bedrock. We used the trained model to map bedrock outcrops across the entire Sierra Nevada region and compared the ML map with the NLCD map. At the twenty labeled sites, we found that the NLCD barren land class, even though it includes more than just bedrock outcrops, accounted for only 41% and 40% of mapped bedrock from our labels and ML predictions, respectively. This substantial difference illustrates that ML bedrock models can have a role in improving land-cover maps, like NLCD, for a range of science applications.

1. Introduction

Advances in imaging Earth’s surface are making our ability to map surficial geology an attractive, and increasingly possible, reality. This is already being done for land cover, but only poorly for features of geologic interest, like bedrock and soil [1,2,3]. Since Landsat satellite imagery became widely available, a consortium of federal agencies has been maintaining the National Land Cover Database (NLCD), which classifies the 30 m resolution data into surface cover categories such as forests, wetlands, and barren land (e.g., [4]). These thematic categories are broad and focused on vegetation, so the most relevant category to geologists—barren land—includes soil cover and rock without distinction. Publicly available, sub-meter imagery for the United States presents an opportunity to improve the resolution and extent of surface cover maps—particularly for bedrock outcrops, as we tested here.
One could improve existing land cover maps by manually mapping bedrock outcrops using sub-meter imagery. While still interpretive, that map would arguably be the most accurate and useful but could introduce variability due to individual mapper interpretations and be prohibitively time-consuming over a large area (e.g., [5,6,7,8]). A less expensive and more efficient way to overcome these two challenges is to use machine learning techniques to capture the visual characteristics that a human expert sees and automate them at scales that would not be mappable by a single human [8]. Automated methods are increasingly common for classifying and making land cover maps (e.g., [9,10,11,12,13,14]), but their use for bedrock mapping is a newer application with limited examples [2,15,16,17,18,19]. The major challenge is successfully training an algorithm because of potentially large regional variations in soils, geology, and ecology. Coupling high-resolution imagery and expert-guided mapping with machine learning improves our ability to interpret large areas of the Earth’s surface, which could expand Quaternary mapping capacity as an entry point for mineral assessments, ecosystem mapping, and hazard mapping [19]. Better differentiation between bedrock outcrops and barren soil will in turn improve estimates of soil cover, which could improve carbon-cycling, hydrologic, landslide-susceptibility, and other models reliant on accurate soil maps [1,19]. Therefore, the goal of this study is to create a regional bedrock map using high-resolution imagery and machine learning.
In this paper we train DELTA (Deep Earth Learning, Tools, and Analysis), an open-source machine learning tool developed by NASA [20,21], to identify surface exposures of bedrock without including mobile, fragmental material (e.g., talus, boulders, etc.), or bare soil. We compare the DELTA accuracy to the existing U.S. Geological Survey (USGS) National Land Cover Database (NLCD). We use the Sierra Nevada Mountains, California (USA) as our study area. We test if we can produce an improved map of bedrock outcrops using expert mapping at local scales for the training input that captures natural geologic variability at a regional scale.

2. Data

2.1. Study Area

We mapped the exposed bedrock in the ~63,000 km2 Sierra Nevada region (Figure 1, red outline). The Sierra Nevada is a mountain range in the Western United States between the latitudes of 35.10°N and 39.80°N and the longitudes of 117.87°W and 120.85°W. The region experiences a Mediterranean climate with dry summers and annual precipitation between 500 mm and 2000 mm [X]. The mountains are predominantly high-albedo, glacially carved granitic rocks, with localized outcrops of volcanic, sedimentary, and metamorphic rocks [Y]. The vegetated areas include grasslands, chaparral, woodlands, subalpine forests, and alpine meadows [4].

2.2. Remote Sensing Images

We used publicly available 2016 National Agriculture Imagery Program (NAIP) imagery (Figure 1). NAIP regularly acquires aerial imagery during the agricultural growing seasons across the United States. The 1411 image tiles used have a spatial resolution of 0.6 m, with an 8-bit radiometric resolution in four spectral bands: blue, green, red, and near-infrared (NIR).

Training Data

We selected twenty sites that visually capture the regional variability of rock types, outcrop morphology, and vegetation cover to train and test the machine learning model (Figure 1). Eight existing labeled sites were available from an earlier effort [1]. We further revised those labels for improved accuracy and created an additional twelve reference sites where Petliak et al. [1] performed poorly, following the same protocol of using the landscape classification tool [22] and manual corrections to map “rock” and “not rock” from the 2016 NAIP imagery. All data are available in the associated data release [23]. In total, the twenty mapped areas cover 83.2 km2 (Figure 1). Within these areas, ~16.3 km2 was labeled rock (bedrock outcrops) and the remaining 66.9 km2 was labeled not rock.

2.3. National Land Cover Database

We compared our results to the NLCD [24,25] barren land class. NLCD provides land cover data at 30 m spatial resolution for the whole of the United States based on Landsat satellite data [4]. NLCD land cover data were classified from various temporal, spectral, spatial, and terrain data [26]. We used 2016 NLCD maps to directly compare with the mapping from the 2016 NAIP imagery. NLCD contains twenty distinct land cover classes, but bedrock is not one of them. Instead, outcrops are included in the barren class, which includes bedrock, desert pavement, scarps, talus, landslides, volcanic material, glacial debris, sand dunes, strip mines, gravel pits, and other accumulations of earthen material. The NLCD barren land class accounts for ~2035 km2 of the Sierra Nevada, or about 3% of the land cover in Figure 1.

3. Methods

3.1. Machine Learning Training and Implementation

The two key components for successfully running a machine learning model are: (1) creating accurate and representative reference input, and (2) splitting the input data into representative train and test subsets [1]. We used reference labels at twenty sites for bedrock outcrops across diverse rock types and rock appearances as our representative sample. We used a larger, fifteen-site subset, covering 65.9 km2, to sufficiently train the model. We used the remaining five sites, covering 17.3 km2, as the testing subset to calculate model accuracy. We ensured a balanced bedrock percentage across the training subset (~19%), the testing subset (~22%), and the entire dataset (~20%). In Figure 1, we show the sites used for training the machine learning model (dark purple) and the sites used for testing the machine learning model (pink).
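The balance check above can be sketched as follows. This is a minimal illustration with synthetic binary label rasters drawn at the stated bedrock percentages; the real labels come from the twenty mapped NAIP sites, and the variable names here are hypothetical.

```python
import numpy as np

# Synthetic stand-ins for the train/test label rasters (1 = bedrock,
# 0 = not bedrock), drawn at roughly the paper's class percentages.
rng = np.random.default_rng(0)
train_labels = (rng.random((1000, 1000)) < 0.19).astype(np.uint8)
test_labels = (rng.random((500, 500)) < 0.22).astype(np.uint8)

def bedrock_fraction(labels: np.ndarray) -> float:
    """Fraction of pixels labeled bedrock in a binary label raster."""
    return float(labels.mean())

print(f"train bedrock fraction: {bedrock_fraction(train_labels):.3f}")  # near 0.19
print(f"test bedrock fraction:  {bedrock_fraction(test_labels):.3f}")   # near 0.22
```

Checking that the class fraction is similar across the subsets and the full dataset guards against a split whose test metrics would not reflect regional performance.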
We used the open-source toolkit, DELTA [21], to train a neural network model to segment, or map, bedrock. A neural network is a machine learning technique that trains a computer to recognize patterns in data by analyzing training examples, in this case, 4-band NAIP imagery guided by our input reference labels. Neural networks have been shown to provide improved accuracy for image and pixel classification over traditional supervised methods [1,14]. We followed the protocol developed by Shastry et al. [20] to segment flood water using DELTA and adapted the neural network to identify bedrock outcrops using our fifteen training sites. The DELTA neural network is based on an existing network for spinal cord gray matter segmentation [27] that has previously worked well for the segmentation of binary features, such as the presence of water versus no water [20]. Further details on the model structure and parameters are included in Shastry et al. [20].
Once the model was trained, and before running it for the entire region, we validated the accuracy by comparing the model predictions to our five reserved testing sites. The model output gives the likelihood (from 0 to 255) that each pixel is bedrock, not a direct binary classification. We used the standard binary classification threshold of 50%, corresponding to a pixel value of 127: pixel values of 127 or higher are classified as rock, and values below 127 as not rock. This 50% threshold tends to strike a balance between false positives and false negatives but can be adjusted to shift the balance as needed. Hereafter, DELTA-predicted bedrock refers to pixel values of 127 or higher. Figure 2 shows a visual comparison of the testing labels (orange), DELTA-predicted bedrock (purple), areas of overlap (dark pink), and the statistics derived to evaluate the performance. Overlapping labels and DELTA predictions (dark pink) indicate agreement, or true positives (TP). Purple indicates areas where DELTA predicts bedrock but the reference labels do not, or false positives (FP). Orange indicates areas where DELTA missed bedrock that is present in the reference labels, or false negatives (FN). The white areas are not bedrock according to both our labels and the DELTA predictions, or true negatives (TN). These four combinations are generally visualized as a confusion matrix (Figure 2B). The performance of the trained model is evaluated by the commonly used statistics shown below the confusion matrix.
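The thresholding step above can be sketched in a few lines. This is a minimal illustration with a synthetic likelihood array; the real input is one 0–255 likelihood band per NAIP tile.

```python
import numpy as np

# Synthetic DELTA-style likelihood map: per-pixel bedrock likelihood
# encoded as an 8-bit value from 0 to 255.
likelihood = np.array([[0, 100, 126],
                       [127, 200, 255]], dtype=np.uint8)

# 50% threshold: pixel values of 127 or higher are classified as rock.
is_rock = likelihood >= 127
```

Raising the threshold trades false positives for false negatives, and vice versa, which is why the balance "can be adjusted as needed".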
Precision is the fraction of correctly predicted bedrock among all the predicted bedrock, or:
Precision = TP / (TP + FP)
where TP and FP are true and false positives. Precision accounts for overprediction by the neural network. Recall (or hit rate) is the fraction of predicted bedrock among “true” bedrock from reference labels. It is defined as:
Recall = TP / (TP + FN)
where TP is true positives and FN is false negatives. This accounts for underprediction by the network. The overall accuracy is defined as:
Accuracy = (TP + TN) / (TP + FP + FN + TN)
where TP and TN are true positives and negatives, and FP and FN are false positives and negatives.
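The three statistics above follow directly from the four confusion-matrix counts. A minimal sketch, using small synthetic prediction and reference arrays rather than the paper's rasters:

```python
import numpy as np

def confusion_counts(pred: np.ndarray, truth: np.ndarray):
    """Return (TP, FP, FN, TN) for boolean prediction and reference arrays."""
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))
    return tp, fp, fn, tn

# Synthetic pixels for illustration (True = bedrock).
pred = np.array([1, 1, 0, 0, 1, 0], dtype=bool)
truth = np.array([1, 0, 1, 0, 1, 0], dtype=bool)

tp, fp, fn, tn = confusion_counts(pred, truth)
precision = tp / (tp + fp)                    # overprediction penalty
recall = tp / (tp + fn)                       # underprediction penalty
accuracy = (tp + tn) / (tp + fp + fn + tn)    # overall agreement
```

With these six pixels, TP = 2, FP = 1, FN = 1, and TN = 2, giving precision, recall, and accuracy of 2/3 each.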
We implemented the trained machine learning model for the entire study area (Figure 1), iterating through the remaining 1411 NAIP image tiles to produce a probability map of likely bedrock for the entire region. For applications like mineral resource identification, the bedrock map can be improved by masking out urban areas; therefore, we removed areas classified as any of NLCD’s four “developed” classes to limit false positives in the final map.
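The urban masking step can be sketched as below. The arrays are synthetic and the sketch assumes the NLCD raster has been resampled and co-registered to the prediction grid; NLCD codes 21–24 are its four developed classes.

```python
import numpy as np

# NLCD developed classes: 21 (open space), 22 (low), 23 (medium), 24 (high).
DEVELOPED = [21, 22, 23, 24]

# Synthetic, co-registered rasters for illustration.
nlcd = np.array([[21, 42],
                 [24, 31]], dtype=np.uint8)      # NLCD class codes
bedrock = np.array([[True, True],
                    [True, False]])              # thresholded DELTA output

# Drop predicted bedrock wherever NLCD says the pixel is developed.
urban = np.isin(nlcd, DEVELOPED)
bedrock_masked = bedrock & ~urban
```

Here the two predictions falling on developed pixels (codes 21 and 24) are removed, leaving only the prediction on the evergreen-forest pixel (code 42).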

3.2. Comparison with NLCD

3.2.1. Sites with Reference Data

To estimate the accuracy of the various methods, we compared the NLCD barren class to our reference labels at the twenty sites used for training and validating the machine learning model, as well as the DELTA-predicted bedrock at those twenty sites. We calculated the precision, recall, and accuracy of both DELTA predictions and the NLCD barren class against our NAIP-derived reference labels, or “truth”. In addition, we also calculated the miss rate:
Miss rate = FN / (TP + FN) = 1 − Recall
where TP are true positives and FN are false negatives. We did this to quantify the amount of bedrock missed by both the DELTA predictions and the NLCD barren class.

3.2.2. Entire Sierra Nevada

Lastly, using the DELTA-predicted bedrock results for the entire Sierra Nevada, we compared the presence of bedrock with the NLCD barren class to estimate how much NLCD underestimates the amount of bedrock. We used the urban-masked DELTA prediction layer for this analysis. Since the NLCD barren class includes more than just bedrock outcrops, we only considered underprediction by NLCD and ignored any overprediction by NLCD.

4. Results and Discussion

4.1. Machine Learning Training

We evaluated the DELTA-trained machine learning model on the five test sites, shown in pink (Figure 1). Figure 2B shows the confusion matrix for validation. Against the testing subset, the model performed well, with an overall accuracy of 90%, a recall of 82%, and a precision of 75%. The comparatively low precision means the model somewhat overpredicts bedrock (Figure 3A). Qualitatively, we observed that these overpredicted areas mostly belonged to two groups: urban infrastructure and loose, fragmental rock. Therefore, removing urban areas from the DELTA-predicted bedrock results improves the accuracy, and the urban-masked map is used for the comparisons below.

4.2. Comparison with NLCD

4.2.1. Sites with Reference Data

Figure 3 shows the NAIP imagery and corresponding reference labels, DELTA-predicted bedrock, and the NLCD barren class at three sample locations from the twenty sites used for training and validating the machine learning model. At these three sites, the DELTA predictions very closely follow our reference labels. However, the NLCD barren class seems to heavily under- or over-predict bedrock in these three examples (Table 1).
We compared the performance of urban-masked DELTA predictions and the NLCD barren class against the reference labels, which were considered “truth”. Figure 3B shows the confusion matrix for the comparison between DELTA predictions and reference labels, and Figure 3C shows the same for the comparison between the NLCD barren class and reference labels. DELTA predictions compared to reference labels have an overall accuracy of 91%, recall of 84%, and precision of 72%. The miss rate for DELTA predictions is 16%. In contrast, when the NLCD barren labels are compared with the reference labels, recall is 18% and the miss rate is 82%. NLCD incorrectly predicts most bedrock as shrub/scrub (68%), evergreen forests (7%), and herbaceous (6%) at our twenty labeled sites. We tested changing the rock pixel threshold for the training sites and found it did not significantly change the success metrics, so we continued to use the traditional 50% threshold for simplicity. We do not report accuracy or precision for NLCD since the barren class contains more than just bedrock, so overprediction (false positives) cannot be accurately estimated.

4.2.2. Entire Sierra Nevada

Using the trained DELTA model, we mapped bedrock across the study area to produce a map of likely bedrock exposure that does not include talus or other fragmental rock cover (Figure 4; [23]). We used the urban-masked DELTA predictions of bedrock for comparison with the NLCD barren class and considered this masked layer as “truth” in the confusion matrix. When compared to the NLCD barren class, NLCD had a miss rate of 85%. Each of the NAIP image tiles took less than 30 s to process, compared to the estimated 4–12 h for a human expert to map each of the twenty reference sites.

4.3. Implications and Limitations

DELTA accurately differentiated in-place bedrock from bare soil or mobile rock and far outperformed NLCD, which missed 82% of bedrock. The high accuracy was likely due to the careful selection of the training areas and expert input into the mapping [8]. The reference sites were rigorously selected and mapped to represent the range of bedrock (lithology type, color, and texture) and vegetation covers present across the study area, and to bolster training where the previous model underperformed (e.g., dry grasslands and loose rock-dominated areas) [1]. While DELTA outperformed NLCD, loose rock-dominated areas, such as that shown in the middle row of Figure 3A, continued to have a lower accuracy (Table 1). Future work could test the portability of this DELTA model using validation sites in other regions to verify whether it could achieve similar accuracy without additional training. More testing would be needed to know how much training data are needed to achieve similar, or better, accuracy more broadly. We suspect that additional training would be necessary if the landscape includes visually different land cover or bedrock [28]. However, our initial study area did cover a wide range of ecosystems and visually different bedrock, and other deep-learning studies have shown good transferability (e.g., [14]). Therefore, the model may be transferrable to other mountainous regions with no to minimal additional training.
The model accuracy could possibly be improved with high-resolution Digital Elevation Model (DEM) data (slope, elevation, curvature) [2,7] or with additional imagery bands or spectral analysis (e.g., Normalized Difference Vegetation Index, NDVI) [29,30,31] to separate the non-rock class into more precise surface covers (e.g., water and vegetation types). However, DELTA performed remarkably well using only a single, readily available data input (NAIP) and twenty well-chosen training sites. We did find that DELTA overpredicted bedrock in urban areas, so masking with land-use maps, as performed in this study, is likely necessary for successful performance in regions with more buildings or infrastructure.
Many previous studies have shown that neural networks like DELTA are better at image classification than traditional supervised models like Random Forest (e.g., [1,14]). However, deep-learning models like DELTA do require high-performance computing resources, as they are more computationally expensive than traditional methods. The trade-off between computational resources and accuracy could therefore be further tested for large-scale applications. Improved accuracy in locating surficial exposures of bedrock, or inversely soil, has implications for applications ranging from mineral resource assessment to hazard mapping [1,19]. In natural landscapes, soil (or mobile regolith) exists where bedrock is not exposed. Therefore, accurately distinguishing between bedrock outcrops and barren soil greatly improves estimates of soil-covered landscapes, which are a key input for models of carbon storage, infiltration, and shallow-landslide susceptibility [1]. Land cover, particularly the presence of bedrock or soil, has implications for models used by scientists and decision-makers for land-use planning, conservation and resource allocation, and climate adaptation.

5. Conclusions

We explored the use of machine learning tools to map exposed bedrock across the Sierra Nevada Mountains. We trained the DELTA model using 2016 NAIP images, labeled as rock and not rock at 0.6 m resolution, using data from fifteen of the twenty sites. Reference labels from the remaining five sites were used for validation and reporting model performance. We found DELTA performed well, with 90% accuracy, 82% recall, and 75% precision. The DELTA map of bedrock outcrops outperformed the existing NLCD barren class, with a miss rate of only 16% compared to 82% for NLCD. When the NLCD barren class was compared with the DELTA bedrock map for the entire Sierra Nevada (Figure 1), the NLCD barren class missed 85% of bedrock, despite barren being a broader class that includes bedrock as well as fragmental rock and bare soil. Our DELTA-trained model predicts bedrock outcrops much more accurately over a wide study area (63,000 km2), even with minimal training data (83.2 km2), across a range of rock types and morphologies. We conclude that bedrock maps can be substantially improved using a combination of high-resolution imagery and machine learning, and that these improvements can benefit many land-cover-dependent models and allow Sierra-wide analyses to test uplift or landscape models (e.g., [32]). However, more model training with additional labeled images could be needed to accurately expand the model to other regions.

Author Contributions

Conceptualization, A.S., C.C.-D. and J.D.S.; methodology, A.S.; software, B.C.; validation, A.S.; formal analysis, A.S.; investigation, A.S. and C.C.-D.; resources, J.D.S.; data curation, A.S. and C.C.-D.; writing—original draft preparation, A.S. and C.C.-D.; writing—review and editing, A.S., C.C.-D., B.C. and J.D.S.; visualization, A.S.; supervision, C.C.-D. and J.D.S.; project administration, C.C.-D.; funding acquisition, J.D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the U.S. Geological Survey (USGS) Mineral Resources Program, National Cooperative Geologic Mapping Program, Landslide Hazards Program, and National Innovation Center and run on the USGS Tallgrass high-performance computer.

Data Availability Statement

Associated data are available at the USGS ScienceBase Data Repository: https://doi.org/10.5066/P9UQDIDE. The DELTA code is available to download from https://github.com/nasa/delta (accessed on 21 September 2021). The National Land Cover Database (NLCD) is available online at https://www.mrlc.gov (accessed on 27 January 2025). NAIP imagery is available to download from various online portals, including https://earthexplorer.usgs.gov/ (accessed on 27 January 2025).

Acknowledgments

We thank the USGS reviewer for their thoughtful comments during the internal review, and the four anonymous journal reviewers for their helpful feedback that greatly improved this manuscript. Additionally, we thank the USGS Tallgrass high-performance computing staff. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Conflicts of Interest

Author Brian Coltin was employed by the company KBR Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Petliak, H.; Cerovski-Darriau, C.; Zaliva, V.; Stock, J. Where’s the Rock: Using Convolutional Neural Networks to Improve Land Cover Classification. Remote Sens. 2019, 11, 2211. [Google Scholar] [CrossRef]
  2. Ganerød, A.J.; Bakkestuen, V.; Calovi, M.; Fredin, O.; Rød, J.K. Where Are the Outcrops? Automatic Delineation of Bedrock from Sediments Using Deep-Learning Techniques. Appl. Comput. Geosci. 2023, 18, 100119. [Google Scholar] [CrossRef]
  3. Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine Learning in Geosciences and Remote Sensing. Geosci. Front. 2016, 7, 3–10. [Google Scholar] [CrossRef]
  4. Wickham, J.; Stehman, S.V.; Sorenson, D.G.; Gass, L.; Dewitz, J.A. Thematic Accuracy Assessment of the NLCD 2016 Land Cover for the Conterminous United States. Remote Sens. Environ. 2021, 257, 112357. [Google Scholar] [CrossRef] [PubMed]
  5. Hillier, J.K.; Smith, M.J.; Armugam, R.; Barr, I.; Boston, C.M.; Clark, C.D.; Ely, J.; Fankl, A.; Greenwood, S.L.; Gosselin, L.; et al. Manual Mapping of Drumlins in Synthetic Landscapes to Assess Operator Effectiveness. J. Maps 2015, 11, 719–729. [Google Scholar] [CrossRef]
  6. Guzzetti, F.; Mondini, A.; Cardinali, M.; Fiorucci, F.; Santangelo, M.; Chang, K.-T. Landslide Inventory Maps: New Tools for an Old Problem. Earth-Sci. Rev. 2012, 112, 42–66. [Google Scholar] [CrossRef]
  7. Odom, W.; Doctor, D. Rapid Estimation of Minimum Depth-to-Bedrock from Lidar Leveraging Deep-Learning-Derived Surficial Material Maps. Appl. Comput. Geosci. 2023, 18, 100116. [Google Scholar] [CrossRef]
  8. van der Meij, W.M.; Meijles, E.W.; Marcos, D.; Harkema, T.T.L.; Candel, J.H.J.; Maas, G.J. Comparing Geomorphological Maps Made Manually and by Deep Learning. Earth Surf. Process. Landf. 2022, 47, 1089–1107. [Google Scholar] [CrossRef]
  9. Zhang, C.; Sargent, I.; Pan, X.; Li, H.; Gardiner, A.; Hare, J.; Atkinson, P.M. Joint Deep Learning for Land Cover and Land Use Classification. Remote Sens. Environ. 2019, 221, 173–187. [Google Scholar] [CrossRef]
  10. Zhang, P.; Ke, Y.; Zhang, Z.; Wang, M.; Li, P.; Zhang, S. Urban Land Use and Land Cover Classification Using Novel Deep Learning Models Based on High Spatial Resolution Satellite Imagery. Sensors 2018, 18, 3717. [Google Scholar] [CrossRef] [PubMed]
  11. Al-Najjar, H.A.H.; Kalantar, B.; Pradhan, B.; Saeidi, V.; Halin, A.A.; Ueda, N.; Mansor, S. Land Cover Classification from Fused DSM and UAV Images Using Convolutional Neural Networks. Remote Sens. 2019, 11, 1461. [Google Scholar] [CrossRef]
  12. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of Machine-Learning Classification in Remote Sensing: An Applied Review. Int. J. Remote Sens. 2018, 39, 2784–2817. [Google Scholar] [CrossRef]
  13. Wang, J.; Bretz, M.; Dewan, M.A.A.; Delavar, M.A. Machine Learning in Modelling Land-Use and Land Cover-Change (LULCC): Current Status, Challenges and Prospects. Sci. Total Environ. 2022, 822, 153559. [Google Scholar] [CrossRef] [PubMed]
  14. Heydari, S.S.; Mountrakis, G. Meta-Analysis of Deep Neural Networks in Remote Sensing: A Comparative Study of Mono-Temporal Classification to Support Vector Machines. ISPRS J. Photogramm. Remote Sens. 2019, 152, 192–210. [Google Scholar] [CrossRef]
  15. Leverington, D.W.; Moon, W.M. Landsat-TM-Based Discrimination of Lithological Units Associated with the Purtuniq Ophiolite, Quebec, Canada. Remote Sens. 2012, 4, 1208–1231. [Google Scholar] [CrossRef]
  16. Kahle, A.B.; Gillespie, A.R. Thermal Inertia Imaging: A New Geologic Mapping Tool. Geophys. Res. Lett. 1976, 3, 26–28. [Google Scholar] [CrossRef]
  17. Asano, Y.; Yamaguchi, Y.; Kodama, S. Geological Mapping by Thermal Inertia Derived from Long-Term Maximum and Minimum Temperatures in ASTER Data. Q. J. Eng. Geol. Hydrogeol. 2022, 56, 1–9. [Google Scholar] [CrossRef]
  18. Cracknell, M.J.; Reading, A.M. Geological Mapping Using Remote Sensing Data: A Comparison of Five Machine Learning Algorithms, Their Response to Variations in the Spatial Distribution of Training Data and the Use of Explicit Spatial Information. Comput. Geosci. 2014, 63, 22–33. [Google Scholar] [CrossRef]
  19. Scarpone, C.; Schmidt, M.G.; Bulmer, C.E.; Knudby, A. Semi-Automated Classification of Exposed Bedrock Cover in British Columbia’s Southern Mountains Using a Random Forest Approach. Geomorphology 2017, 285, 214–224. [Google Scholar] [CrossRef]
  20. Shastry, A.; Carter, E.; Coltin, B.; Sleeter, R.; McMichael, S.; Eggleston, J. Mapping Floods from Remote Sensing Data and Quantifying the Effects of Surface Obstruction by Clouds and Vegetation. Remote Sens. Environ. 2023, 291, 113556. [Google Scholar] [CrossRef]
  21. NASA. DELTA (Deep Earth Learning, Tools, and Analysis); NASA Ames Intelligent Robotics Group: Moffett Field, CA, USA, 2021.
  22. Buscombe, D.; Ritchie, A.C. Landscape Classification with Deep Neural Networks. Geosciences 2018, 8, 244. [Google Scholar] [CrossRef]
  23. Shastry, A.; Cerovski-Darriau, C. Data from “Mapping Bedrock Outcrops in the Sierra Nevada Mountains (California, USA) Using Machine Learning”: U.S. Geological Survey Data Release; U.S. Geological Survey: Reston, VA, USA, 2023. [CrossRef]
  24. Jin, S.; Homer, C.; Yang, L.; Danielson, P.; Dewitz, J.; Li, C.; Zhu, Z.; Xian, G.; Howard, D. Overall Methodology Design for the United States National Land Cover Database 2016 Products. Remote Sens. 2019, 11, 2971. [Google Scholar] [CrossRef]
  25. Yang, L.; Jin, S.; Danielson, P.; Homer, C.; Gass, L.; Bender, S.M.; Case, A.; Costello, C.; Dewitz, J.; Fry, J.; et al. A New Generation of the United States National Land Cover Database: Requirements, Research Priorities, Design, and Implementation Strategies. ISPRS J. Photogramm. Remote Sens. 2018, 146, 108–123. [Google Scholar] [CrossRef]
  26. Homer, C.; Dewitz, J.; Jin, S.; Xian, G.; Costello, C.; Danielson, P.; Gass, L.; Funk, M.; Wickham, J.; Stehman, S.; et al. Conterminous United States Land Cover Change Patterns 2001–2016 from the 2016 National Land Cover Database. ISPRS J. Photogramm. Remote Sens. 2020, 162, 184–199. [Google Scholar] [CrossRef] [PubMed]
  27. Perone, C.S.; Calabrese, E.; Cohen-Adad, J. Spinal Cord Gray Matter Segmentation Using Deep Dilated Convolutions. Sci. Rep. 2018, 8, 5966. [Google Scholar] [CrossRef] [PubMed]
  28. Rafique, M.U.; Zhu, J.; Jacobs, N. Automatic Segmentation of Sinkholes Using a Convolutional Neural Network. Earth Sp. Sci. 2022, 9, e2021EA002195. [Google Scholar] [CrossRef]
  29. Pei, J.; Wang, L.; Huang, N.; Geng, J.; Cao, J.; Niu, Z. Analysis of Landsat-8 OLI Imagery for Estimating Exposed Bedrock Fractions in Typical Karst Regions of Southwest China Using a Karst Bare-Rock Index. Remote Sens. 2018, 10, 1321. [Google Scholar] [CrossRef]
  30. Ruan, O.; Liu, S.; Zhou, X.; Luo, J.; Hu, H.; Yin, X.; Yuan, N. LANDSAT Multispectral Image Analysis of Bedrock Exposure Rate in Highly Heterogeneous Karst Areas through Mixed Pixel Decomposition Considering Spectral Variability. Land Degrad. Dev. 2023, 34, 2880–2895. [Google Scholar] [CrossRef]
  31. Chen, Y.; Wang, Y.; Zhang, F.; Dong, Y.; Song, Z.; Liu, G. Remote Sensing for Lithology Mapping in Vegetation-Covered Regions: Methods, Challenges, and Opportunities. Minerals 2023, 13, 1153. [Google Scholar] [CrossRef]
  32. Hahm, W.J.; Riebe, C.S.; Lukens, C.E.; Araki, S. Bedrock Composition Regulates Mountain Ecosystems and Landscape Evolution. Proc. Natl. Acad. Sci. USA 2014, 111, 3338–3343. [Google Scholar] [CrossRef]
Figure 1. Extent of the Sierra Nevada Mountains (USA) study area (red outline) on the 2016 National Agriculture Imagery Program (NAIP) imagery in North American Datum of 1983 (NAD 83) and Universal Transverse Mercator (UTM) Zones 10N and 11N. Inset map shows the study area location in California (USA). Squares show twenty locations of mapped data used to train (dark purple) and test (pink) the model.
Figure 2. (A) Example showing comparison between reference labels and Deep Earth Learning, Tools, and Analysis (DELTA) predictions of bedrock. Purple is overprediction (false positives (FP)), orange is underprediction (false negatives (FN)), and dark pink is both true and predicted (true positives (TP)). These results are used to calculate (B) the confusion matrix for the testing set and the corresponding statistics derived from comparing reference labels with DELTA predictions of bedrock.
Figure 3. (A) 2016 National Agriculture Imagery Program (NAIP) images showing three examples from the twenty reference sites (top: high-albedo, glacially smoothed granitic rocks, middle: abundant talus downslope of bedrock ridges, bottom: exposed, weathered granitic bedrock below a reservoir), corresponding reference labels (light blue), Deep Earth Learning, Tools, and Analysis (DELTA) predictions (dark blue), and National Land Cover Database (NLCD) Barren class (red), with resulting metrics in Table 1, (B) Confusion matrix of DELTA predicted bedrock with respect to reference labels, and (C) Confusion matrix of NLCD Barren class with respect to reference labels.
Figure 4. Deep Earth Learning, Tools, and Analysis (DELTA) predicted bedrock map (blue) compared to the National Land Cover Database (NLCD) Barren class (red) in the study area (black outline). The shaded relief basemap is derived from the USGS 30 m DEM. Inset shows enlarged DELTA and NLCD predictions from the central study area (black box). The overlay and resulting metrics show that NLCD vastly underpredicts exposed bedrock visible in high-resolution imagery (NAD83, UTM Zones 10–11N).
Table 1. Resulting metrics for the three reference sites shown in Figure 3A.
                	DELTA  	       	        	NLCD   	       	
Site            	Recall 	Precision	Accuracy	Recall 	Precision	Accuracy
Row 1 (SLT) *   	98%    	88%     	90%     	9%     	84%     	44%
Row 2 (FRS) *   	81%    	57%     	76%     	35%    	23%     	45%
Row 3 (TUO2) *  	40%    	36%     	99%     	0%     	0%      	99%
* Row number refers to image position in Figure 3. The name in parentheses refers to the naming convention used in the data repository [23].
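The per-site metrics in Table 1 and the testing-set statistics in Figure 2B all derive from pixel-wise confusion-matrix counts. A minimal sketch of that derivation is below; the function and the example counts are illustrative (chosen to roughly reproduce the reported 83% precision, 78% recall, and 90% accuracy), not the study's actual pixel tallies.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute (precision, recall, accuracy) from pixel-wise
    confusion-matrix counts for a binary bedrock classifier."""
    precision = tp / (tp + fp)          # of pixels predicted bedrock, fraction truly bedrock
    recall = tp / (tp + fn)             # of true bedrock pixels, fraction recovered
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # all correctly labeled pixels over all pixels
    return precision, recall, accuracy

# Illustrative counts only (not the paper's data):
p, r, a = classification_metrics(tp=780, fp=160, fn=220, tn=2840)
```

The same three formulas, applied per site, yield the DELTA and NLCD columns of Table 1; the NLCD Barren class scores low recall at most sites because it maps far less bedrock than is visible in the 0.6 m NAIP imagery.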