1. Introduction
Data gaps or voids (i.e., the absence of data) are a common problem in remote sensing data, including terrestrial laser scanning (TLS) 3D point cloud data. TLS point cloud data gaps can adversely affect subsequent point cloud-derived products, including digital surface models (DSMs), bare-earth digital elevation models (DEMs), triangulated surface meshes, and 3D solid models, among others. A point cloud data gap of significant size and extent provides no geometric or radiometric information to the chosen spatially continuous product; therefore, assumptions must be made to span the data gap, which inherently adds uncertainty to the derived product.
TLS data gaps stem from two primary sources (Figure 1): a line-of-sight obstacle resulting in an occlusion, and a dropout [1] resulting from a specular reflective or absorbent surface that prevents the energy of a given laser pulse from returning to the TLS instrument. The extent of 3D point cloud data gathered by a TLS instrument is limited to what is directly visible from the scanner; line-of-sight obstacles (e.g., topographic high points, trees) result in occlusions on the side of the obstacle opposite the scanner. Hence, a comprehensive survey of a complex site requires multiple scan positions from varied points of view to mitigate occlusions [2]. TLS data gaps can also stem from bodies of water [3] and other specular reflective or highly absorbent surfaces. If a laser pulse emitted by the scanner strikes such a surface and is reflected away from the scanner or absorbed, no return is detected, no point is recorded at that location in the scene, and a data gap results. Although dropouts can be difficult to distinguish from occlusions, the distinction between the two is important.
The relative proportion of occlusions in TLS data is often indicative of the survey quality. Recognizing that regions of the scanned scene occluded from one scan position are commonly visible from another location, a prevalence of occlusions in point cloud data can indicate an insufficient number of scans and/or poor scanner placement. Conversely, a prevalence of dropouts is not tied as closely to survey quality. Generally, the scanner operator cannot control the presence of specular reflective or absorbent surfaces, be they water puddles, ponds, large bodies of water, or glass, at certain incidence angles. Consequently, data gaps resulting from dropouts typically cannot be avoided by even the most careful and comprehensive of TLS surveys. In some cases, however, the timing of the survey, particularly in locations with tidal or seasonal fluctuations, can have a substantial influence on the presence of pooled water and can be considered.
To date, there is a lack of literature concerning the identification and classification of data gaps in TLS data and TLS-derived products. Existing work has explored data gap filling methods (e.g., [4]) and mitigation of occlusions [2]; however, no prior work has been identified that differentiates between occlusion and dropout data gaps. Classification of data gaps can enable optimization of DEM data gap filling. For instance, proper classification of data gaps in a DEM can enable one to select an appropriate interpolation method for each gap type, such as a thin plate spline method [4] to fill occlusions and a hydro-flattening type technique [5] to fill water-derived dropouts.
With respect to the classification of water, the literature appears to focus solely on applications relevant to airborne laser scanning (ALS) [5,6,7,8]. Unfortunately, methods for identifying and/or classifying bodies of water in ALS data are not relevant to TLS point cloud data due to differences in airborne and terrestrial points of view relative to horizontally oriented bodies of water. For instance, it is common in ALS data to have some points representative of the water surface with both low and very high laser pulse energy levels (intensity), whereas, because of the commonly oblique incidence angle of TLS observations to the ground surface, it is likely that no water surface points will be captured.
An important quality metric for TLS point cloud data and derived DEMs is completeness. Given that the presence of any data gaps in a point cloud can bring the survey quality into question and lead to increased levels of DEM uncertainty from over-interpolation, a need exists for a methodology to properly classify these data gaps, and quantify how much of the scanned area consists of point returns as well as the two types of data gaps. Knowledge of data gap origin can facilitate the judgement of TLS survey quality and the identification of pooled water in a scanned scene. Having the ability to quantify the presence of occlusions in a DEM provides the opportunity to evaluate the influence of TLS data acquisition and DEM creation parameters on the overall completeness of a given DEM [9]. Examples of these TLS parameters include angular resolution of TLS data, quantity of scans per unit area, DEM resolution, and minimum required points per DEM pixel. The proposed methodology can also communicate important information to those using TLS-derived products for scientific applications. For example, identifying pooled water has implications for habitat modeling and mapping in ecological research focused on species that respond substantially to variation in the submergent–emergent boundaries found in a rocky intertidal ecosystem [10,11,12,13].
As a result, we developed a novel data gap classification methodology that included two major steps. The first of these steps flagged the boundaries of dropout-based gaps in a projected 2D representation of the point cloud data (2D TLS Image), while the second step used the flags to classify the individual data gaps present in a TLS-derived DEM. We then applied this methodology to a field site located in the rocky intertidal ecosystem to assess a real-world application.
3. Rabbit Rock Study Site
Following validation, the proposed data gap classification methodology was applied to TLS-derived DEMs generated for a natural environment known as Rabbit Rock, a rocky intertidal site located on the Oregon Coast. The Rabbit Rock site served as an example of applying the data gap classification to a real-world site in support of ecological research where the presence of pooled water influences habitats. The Rabbit Rock site is a complex, rocky intertidal environment (Figure 10) located along the central Oregon Coast, approximately 3.5 km north of Depoe Bay, OR along Hwy. 101. TLS data were collected at this location on two separate occasions during very low (minus) tides to model and identify the foraging habitat for the black oystercatcher (Haematopus bachmani), a rocky-intertidal obligate shorebird [23]. TLS scan positions 1–14 were acquired on 18 May 2011 and scans 15–21 were acquired on 1 June 2011 (Figure 11). The second set of scans was acquired to fill in areas of the site inaccessible during the May survey because of higher tidal conditions. All TLS scans were acquired at angular resolutions of 0.03 or 0.05 degrees. Registration and geo-referencing of the point cloud data were performed with a constrained cloud-to-cloud registration technique implemented in PointReg v3 [22] based on GNSS coordinates, sensor inclination, and an estimated yaw angle for the TLS instrument at each scan position. Post-processed GNSS coordinates for the individual scan positions were generated using the rapid-static processing available through the National Geodetic Survey’s Online Positioning User Service (OPUS-RS). Two 10-cm TLS-derived DEMs were generated for the Rabbit Rock site, one using only TLS scans 1–14 (DEM RR1) and the second using data from all 21 scan positions (DEM RR2) (Figure 12).
Given the presence of undulating rock and numerous pools of water at the Rabbit Rock site (Figure 10), there were many opportunities for both occlusions and dropouts to exist in the scanned scene. When examining the Rabbit Rock DEMs with unclassified data gaps, two questions arose: How well was the site captured (TLS survey quality) and what regions of the DEM were occupied by pooled water? The presence of pooled water within the Rabbit Rock site was important for identifying and modeling the shorebird foraging habitat. In the unclassified DEM, occlusions and dropouts caused by pooled water were indistinguishable. To assess the survey quality and identify regions of the Rabbit Rock site occupied by pooled water, the proposed data gap classification methodology was utilized. To avoid classifying erratic dropouts associated with the dynamic ocean water surrounding the site, a site boundary was used to focus on the region targeted in the TLS topographic survey.
Results and Discussion
Results of the data gap classification (Figure 12 and Table 2) indicated that for DEM RR1, ~61% of the site was occupied by elevation data (TLS returns), ~36% was occupied by dropouts, and only ~2.6% was attributed to occlusions. For DEM RR2, ~72% of the site was occupied by TLS returns, ~25% was occupied by dropouts, and the relative percentage of occlusions was similar but slightly lower at ~2.4%.
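The percentages above are per-class pixel shares within the site boundary. A minimal sketch of that tabulation is given below, assuming the classification result is available as a 2D grid of integer class codes; the codes, the function name `class_percentages`, and the toy grid are hypothetical and are not taken from the original workflow.

```python
from collections import Counter

# Hypothetical class codes; the source does not specify an encoding.
OUTSIDE, RETURN, DROPOUT, OCCLUSION = 0, 1, 2, 3


def class_percentages(raster):
    """Return the percent of in-site pixels per class.

    Pixels coded OUTSIDE (beyond the site boundary) are excluded from
    both the counts and the denominator.
    """
    counts = Counter(v for row in raster for v in row if v != OUTSIDE)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no pixels inside the site boundary")
    return {c: 100.0 * counts.get(c, 0) / total
            for c in (RETURN, DROPOUT, OCCLUSION)}


# Toy 3 x 4 classification grid (illustrative values only).
toy = [
    [0, 1, 1, 1],
    [1, 1, 2, 2],
    [1, 3, 2, 0],
]
pct = class_percentages(toy)
print(pct)
```

Excluding the outside-boundary pixels from the denominator mirrors the use of a site boundary to keep the surrounding ocean out of the statistics.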
The classification result rasters for DEMs RR1 and RR2 were differenced in ArcGIS Pro v.2.6.1 [24] to identify any changes in pixel classification that occurred. Results of this comparison indicated that ~88% of the pixels experienced no change in classification and ~10% of pixels classified as a data gap in DEM RR1 became a return in DEM RR2, ~93% of which stemmed from a dropout becoming a return. Less than 1% of pixels underwent a change in classification from a return to a data gap or data gap switching (e.g., a dropout becoming an occlusion and vice versa). The reclassification of an RR1 return as a data gap in RR2 should not occur; however, only 0.05% of pixels experienced this change. Further examination of the comparison results indicated that approximately 1727 m² of dropout pixels changed to return pixels in DEM RR2, which represented ~93% of the total dropout area reduction presented in Table 2. Based on the observed decrease in the identified dropouts and the minimal change in percent occlusions from DEM RR1 to RR2, the majority of the ~11% increase in returns was attributed to filling in areas that were obscured by high water conditions in the previous TLS survey. The similar percent occlusions observed for DEMs RR1 and RR2 may be attributable to the combination of consistent TLS surveying techniques/scanner placements and the repeated undulating nature of the Rabbit Rock terrain.
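The raster comparison described above amounts to a per-pixel cross-tabulation of the two classification results. The sketch below illustrates the idea in plain Python rather than ArcGIS Pro; the class codes, function names, and toy grids are hypothetical, and the two rasters are assumed to share the same grid and cell size.

```python
from collections import Counter

# Hypothetical class codes; the source does not specify an encoding.
OUTSIDE, RETURN, DROPOUT, OCCLUSION = 0, 1, 2, 3
GAPS = (DROPOUT, OCCLUSION)


def transition_counts(before, after):
    """Count (class_before, class_after) pairs for in-site pixels."""
    if len(before) != len(after) or any(
            len(b) != len(a) for b, a in zip(before, after)):
        raise ValueError("rasters must share the same grid")
    pairs = Counter()
    for row_b, row_a in zip(before, after):
        for b, a in zip(row_b, row_a):
            if b != OUTSIDE and a != OUTSIDE:
                pairs[(b, a)] += 1
    return pairs


def summarize(pairs, cell_size_m):
    """Summarize the transitions as percentages and one area in m^2."""
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("no overlapping in-site pixels")

    def pct(keep):
        hits = sum(n for (b, a), n in pairs.items() if keep(b, a))
        return 100.0 * hits / total

    return {
        "unchanged_pct": pct(lambda b, a: b == a),
        "gap_to_return_pct": pct(lambda b, a: b in GAPS and a == RETURN),
        "return_to_gap_pct": pct(lambda b, a: b == RETURN and a in GAPS),
        "gap_switch_pct": pct(
            lambda b, a: b in GAPS and a in GAPS and b != a),
        # Area of dropout pixels that became returns (pixels x cell area).
        "dropout_to_return_m2": pairs[(DROPOUT, RETURN)] * cell_size_m ** 2,
    }


# Toy 2 x 4 grids standing in for the RR1 and RR2 results.
rr1 = [[1, 2, 2, 3], [1, 2, 1, 0]]
rr2 = [[1, 1, 2, 3], [1, 1, 1, 0]]
summary = summarize(transition_counts(rr1, rr2), cell_size_m=0.1)
print(summary)
```

Converting a pixel count to an area uses the DEM cell size; with the 10-cm DEMs used here, each pixel covers 0.01 m².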
There were a few locations where dropouts were classified near a TLS scan location. In these regions, the photographs taken from the scan position could be used to perform a qualitative validation of the results. The TLS-based imagery for scan positions SP15 and SP16 is presented in Figure 13. The pools of water visible in the imagery corroborated the classification results of extensive dropouts surrounding the scan positions. Circular occlusions were observed beneath SP15 and SP16 in Figure 13 because no other scan positions were able to fill in these areas. It is important to note that these scanner-based occlusions were adjacent to pooled water, which made it difficult to determine exactly where the pooled water stopped and the occlusion began. In this case, we decided to be conservative with respect to judging survey quality; therefore, we assumed that all data gaps within a certain radius of a given scan position should be classified as occlusions. If we wanted to ensure that we were capturing all the potential water pools, we could have changed the algorithm to classify the entire, merged data gap as a dropout.
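The conservative rule described above can be illustrated with a short sketch. It assumes a classification grid of integer class codes and scan positions expressed as (row, column) pixel indices; the class codes, the function name, and the radius value are hypothetical, since the radius used in the study is not stated here.

```python
import math

# Hypothetical class codes; the source does not specify an encoding.
OUTSIDE, RETURN, DROPOUT, OCCLUSION = 0, 1, 2, 3


def reclassify_near_scanners(raster, scan_positions, radius_m, cell_size_m):
    """Relabel dropout pixels near any scan position as occlusions.

    scan_positions holds (row, col) pixel indices. The input grid is
    not modified; a relabeled copy is returned.
    """
    radius_px = radius_m / cell_size_m
    out = [row[:] for row in raster]
    for i, row in enumerate(raster):
        for j, value in enumerate(row):
            if value != DROPOUT:
                continue
            if any(math.hypot(i - si, j - sj) <= radius_px
                   for si, sj in scan_positions):
                out[i][j] = OCCLUSION
    return out


# Toy 5 x 5 grid: all dropouts except one return in the corner,
# with a single scanner at the center pixel.
grid = [[DROPOUT] * 5 for _ in range(5)]
grid[0][0] = RETURN
result = reclassify_near_scanners(
    grid, scan_positions=[(2, 2)], radius_m=0.15, cell_size_m=0.1)
for row in result:
    print(row)
```

The alternative mentioned in the text, labeling the entire merged gap as a dropout, would instead require grouping connected gap pixels and is not shown here.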
A quantitative validation of the Rabbit Rock classification results was performed using the near-infrared (NIR) channel from National Agriculture Imagery Program (NAIP) 1-m spatial resolution aerial imagery collected in 2011, 2014, and 2016. An unsupervised classification of each image was performed with twenty classes using the Iso Cluster Unsupervised Classification tool in ArcGIS Pro v.2.6.1 [24]. Each image was individually analyzed to identify which of the resulting classes were associated with extensive pools of water. Validated water classes were then used to generate a binary raster identifying “water” and “non-water” pixels for each NAIP image. Next, the three binary rasters were added together using the raster calculator functionality in ArcGIS Pro. The resulting raster had integer pixel values ranging from 0 to 3, where the value of a given pixel represented the number of NAIP images in which that location was classified as water. For example, a pixel value of “2” corresponded to a location that was classified as water in two out of the three NAIP images. To target persistent water pools that existed at the Rabbit Rock site, the validation analysis focused on regions that were classified as water in all three NAIP images (i.e., pixel value of 3). Groups of pixels with a value of “3” were converted to polygon features and then used to generate a set of 100 random points within these locations. The 100 points were then cross-referenced with the DEM RR2 data gap classification raster to identify what percentage of the locations were correctly classified. For the 100 random points sampled from persistent water pools identified using the three NAIP images, ~89% were correctly classified as water in DEM RR2, ~3% were classified as occlusions, and ~8% were identified as TLS return pixels with elevation data. While we could not be absolutely certain that the water pools identified in the NAIP imagery represented the site conditions during the TLS surveys, the consistency observed in this validation indicated that the proposed data gap classification methodology was capable of identifying extensive water pools that were present in aerial imagery collected over a five-year period.
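The validation steps above (summing binary water rasters, isolating persistent water, sampling random locations, and cross-referencing the classification) can be sketched as follows. This is a simplified illustration, not the ArcGIS Pro workflow: it assumes all rasters have already been resampled to a common grid, it samples pixel cells directly instead of generating points inside polygons, and the class codes, function names, and toy data are hypothetical.

```python
import random
from collections import Counter

# Hypothetical class codes; the source does not specify an encoding.
OUTSIDE, RETURN, DROPOUT, OCCLUSION = 0, 1, 2, 3


def water_frequency(binary_rasters):
    """Sum co-registered 0/1 water rasters cell by cell."""
    return [[sum(cells) for cells in zip(*rows)]
            for rows in zip(*binary_rasters)]


def sample_persistent_water(freq, n_images, n_points, seed=0):
    """Randomly sample cells flagged as water in every image.

    Sampling is done with replacement, since several random points may
    fall in the same cell.
    """
    candidates = [(i, j) for i, row in enumerate(freq)
                  for j, v in enumerate(row) if v == n_images]
    if not candidates:
        raise ValueError("no persistent water cells found")
    rng = random.Random(seed)
    return [rng.choice(candidates) for _ in range(n_points)]


def agreement(points, classified):
    """Percent of sampled locations falling in each class."""
    counts = Counter(classified[i][j] for i, j in points)
    return {c: 100.0 * counts.get(c, 0) / len(points)
            for c in (RETURN, DROPOUT, OCCLUSION)}


# Toy 2 x 3 binary water rasters standing in for three image dates.
img_a = [[1, 1, 0], [1, 0, 0]]
img_b = [[1, 1, 0], [0, 0, 1]]
img_c = [[1, 1, 1], [1, 0, 0]]
freq = water_frequency([img_a, img_b, img_c])

# Toy classification grid on the same cells.
classified = [[2, 2, 1], [1, 1, 3]]
points = sample_persistent_water(freq, n_images=3, n_points=100)
result = agreement(points, classified)
print(freq)
print(result)
```

Requiring water in every image targets persistent pools; relaxing the threshold (e.g., water in two of three images) would enlarge the sampled area at the cost of including more transient water.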
4. Conclusions
The proposed data gap classification methodology differentiated between occlusions and dropouts in a TLS-derived DEM using structured TLS point cloud data (PTX) and the associated DEM. The test site results showed a high degree of correct classification of occlusions and dropout-based data gaps and identification of a similar surface area of pooled water present in the scanned scene. The results for the Rabbit Rock site analysis indicated that the identified dropouts correlated well with the presence of water, and that the quality of the Rabbit Rock TLS survey was high given the low percentage of occlusions.
For the test and Rabbit Rock sites, the extensive dropouts could be attributed to pooled water present in the scanned scene. If this classification methodology were applied to a dataset that included other highly reflective objects (e.g., glass windowpanes), dropouts could not be solely attributed to the presence of water; however, they would still be separated from the occlusions, which is an important distinction to make. For assessment of TLS survey quality, data gaps due to dropouts had to be identified and removed before the relative percentage of data gaps due to occlusions was determined. The proposed data gap classification methodology enabled us to make this required distinction.
In a complex environment such as Rabbit Rock, we could assume that the primary source of dropouts was water. Thus, TLS offers tremendous potential for ecological studies in the rocky intertidal ecosystem. Due to the nature of this highly limited (spatially) ecosystem, TLS-derived DEMs may provide the foundation for scale-appropriate habitat models and simulations. Previous work using TLS-derived DEMs for modeling the shorebird foraging habitat demonstrated substantial capability [23]. However, a missing key attribute that influenced the development of habitat models was the accurate identification of submergent areas during low tides (i.e., tidepools). Because the rocky intertidal ecosystem is very spatially limited, submergent areas (ostensibly dropouts) may comprise a considerable proportion of the total area of interest (see Table 2) and may become important for subsequent habitat assessments and modeling. Also of importance is the identification of water area (tidepool) boundaries. This interface is a region of many important interactions between the terrestrial and aquatic components of the intertidal ecosystem. For example, the black oystercatcher foraging habitat model developed by Hollenbeck et al. (2014) lacked the ability to identify local regions of key prey items (limpets) that congregate at tidepool boundaries. Consequently, the ability to differentiate submergent areas from emergent areas that are exposed to the terrestrial component of the intertidal ecosystem is paramount for scale-appropriate habitat analyses, and TLS-derived DEMs processed to differentiate dropouts from occlusions may hold significant promise for intertidal research. Previously available methods for delineating these locales have often involved very intensive field-survey methods [13] or digitizing the DEM or point cloud, which requires substantial effort and is often not feasible in many ecological studies.