1. Introduction
Recently, Korean government agencies have focused on constructing virtual spaces, such as digital twins, for use in urban environment and management policies [1,2]. Implementing three-dimensional (3D) digital transformation and data collection of urban forests and urban spaces is essential for constructing digital twin technology [3,4,5]. Trees are among the key components of urban spaces. In the visualization of digital twins, 3D tree images constructed using LiDAR can be utilized to represent urban spaces effectively [6,7]. Therefore, remote sensing technology can be utilized for 3D data collection for digital twins [8,9,10]. Furthermore, for urban forest management, individual trees must be quantified and measured as 3D structures [11,12]. In this process, the collected 3D data of the urban forest can serve as important input data for various analyses and modeling [13,14,15,16,17,18,19].
Remote sensing technology is a suitable technique for collecting spatial data from urban environments and urban forests [20,21,22,23]. Remote sensing is effective for the spatial analysis of urban forest characteristics relevant to management, such as tree location, height, crown width, and tree shape [24,25,26]. Point cloud data acquired through remote sensing allow the object recognition of each tree and support various analyses [27,28]. Scanning a target object into point cloud data with a light detection and ranging (LiDAR) sensor allows the object to be restored objectively, accurately reflecting its surface shape and providing location information [29].
Therefore, remote sensing techniques have been employed to collect urban environment and urban forest spatial data, and related research has been conducted. Researchers have applied various sensing techniques to verify the accuracy of data collection or to test methods for analyzing the sensor data. Remote sensing can be classified into terrestrial and airborne methods. Terrestrial LiDAR, long used in traditional surveying, is a representative tool for the 3D digitalization of space and can accurately capture the 3D structure of objects. However, terrestrial LiDAR has limited mobility, making it disadvantageous for surveying extensive areas, such as forests [30]. Airborne LiDAR can quickly scan large areas from high altitude and can collect data across large and complex terrains relatively easily. However, airborne LiDAR provides data with a relatively lower resolution than terrestrial LiDAR. Additionally, the LiDAR viewpoint significantly affects which parts of a tree are captured: airborne LiDAR tends to create point clouds concentrated on the tree canopy, whereas terrestrial LiDAR more intensively detects the trunks, branches, and leaves of the trees [31,32].
Previous studies have compared the performance of airborne and terrestrial LiDAR or have focused on enhancing tree detection techniques and algorithms using LiDAR data. In the former case, the more suitable LiDAR technique varied depending on the comparison metrics used. Research based on height measurement concluded that airborne LiDAR is more accurate [31,33]. For applications that prioritize canopy data, such as tree height and crown width, airborne LiDAR has been widely adopted. Lindberg et al. [34] utilized individual tree crown (ITC) methods to extract tree crowns from data collected using airborne laser scanning (ALS). In contrast, terrestrial LiDAR is considered ideal for capturing intricate details of trees, especially when a detailed depiction is essential. It excels in assessing canopy volume and branch morphology and is adept at gathering data from the lower parts of the tree. Given the unique strengths of each technology, the choice between airborne and terrestrial LiDAR should be made judiciously, based on the research objectives and the required level of detail. Recently, researchers have begun to combine these methods, collecting tree data in a synergistic approach, and efforts are underway to construct 3D datasets from these merged inputs [28,35].
In the latter case, research has involved semantically segmenting the collected data to extract tree locations and information in planar space or to fully represent tree objects. Studies have tested commonly used algorithms, advanced existing algorithms, and developed new ones; the results of the algorithms were compared, and their accuracy was tested. The primary algorithms were established through this process. Sun et al. [36] used LiDAR data captured with uncrewed aerial vehicles (UAVs) and assessed six algorithms for identifying individual trees and delineating their crowns: a linear regression model, a linear model with ridge regularization, support vector regression, random forest, an artificial neural network, and k-nearest neighbors (KNN). McRoberts et al. [37] used forest inventory and airborne laser-scanning data in conjunction with the KNN technique to estimate the average aboveground biomass per unit area. Wulder et al. [38] investigated the use of local maximum filtering to detect trees in high-spatial-resolution (1 m) images. In addition, Michałowska et al. [39] examined the capabilities of LiDAR for classifying tree species and discussed the most effective classification algorithms for enhancing classification accuracy. Li et al. [40] proposed a new point cloud-driven method called skeleton refined extraction, designed to enhance the accuracy of recognizing tree trunks from point cloud data. Further, Raumonen et al. [41] introduced a new methodology, the tree quantitative structure model (TreeQSM), to accurately reconstruct 3D tree models from point cloud data collected using terrestrial LiDAR scanning. Tarsha Kurdi et al. [42] aimed to modify existing solutions for a more accurate representation and visualization of tree crowns; their objective was to propose a new method for calculating tree biomass using terrestrial laser scanning (TLS) and airborne laser scanning (ALS) measurement data, in which the volumetric model of tree biomass is represented in mathematical structures.
These preceding studies have focused on the accuracy of LiDAR sensing technology and tree modeling techniques. However, because LiDAR is a detection technology that relies on light, it is inevitably obstructed by obstacles. In tree modeling, the most significant obstacles are the trees themselves. Portions of trees obscured by leaves or branches cannot be detected, resulting in models that fall short of the desired level, especially for trees with dense canopies or trees obscured by other trees. It is therefore necessary to determine the level of green density and structure suitable for data collection using LiDAR. Accordingly, this study aims to compare remote sensing data collected under different green structures and to evaluate which green structures are suitable for constructing three-dimensional tree models.
For this purpose, point cloud data of the target area were collected using terrestrial LiDAR and aerial photogrammetry. The first comparison was based on the data collection methods: data collected from the ground and data collected from the air were compared to understand the characteristics of each dataset. The second comparison focused on the data collection results according to the green structure of the urban forest. The green spaces within the target area were classified into three types of green structure, and sampling was conducted for analysis based on these three types: Simple-Structure, Narrow-Structure, and Congested-Structure. Simple-Structure denotes areas where only canopy trees are present and the distance between trees is wide enough that their canopies do not overlap. Narrow-Structure also contains only canopy trees, but the distance between trees is narrow, causing the canopies to overlap. Congested-Structure includes both canopy trees and shrubs, with a high planting density resulting in overlapping canopies. Based on the results of aerial photogrammetry and handheld LiDAR, tree object classification was conducted for each type, and the success rate of tree classification within each zone was assessed. For the analysis, the point cloud data were preprocessed, including data merging, noise removal, separation of the DTM and DSM, and separation of green spaces and structures.
This study can assist in extracting 3D greenery models of urban forests, including tree height, tree base height, diameter at breast height, canopy volume, and canopy transparency. Additionally, this research is expected to offer foundational information for determining planting standards for future urban forest development.
2. Materials and Methods
2.1. Study Area
The study area, approximately 350 × 245 m, covers the National Debt Repayment Movement Memorial Park in Daegu, Korea. It is bounded by latitudes 35.8684° N and 35.8689° N and longitudes 128.6002° E and 128.6016° E. The area primarily comprises trees such as zelkova, pine, maple, and ginkgo. This region is a restricted flight zone for UAVs because it lies inside an airport control zone; therefore, flight approval was obtained before conducting the study.
Figure 1 displays the research area along with the measured ground control points.
2.2. Equipment and Data Collection
We utilized terrestrial laser scanning and UAV photogrammetry to collect point cloud data for trees (Figure 2).
2.2.1. UAV Photogrammetric Data
The UAV used for the acquisition of 3D point cloud data was the Inspire2 (DJI, Shenzhen, China). For 3D mapping, Pix4D Mapper (Pix4D, Prilly, Switzerland) was utilized, and for GNSS equipment, the Trimble R4s GNSS Receiver was employed. The point cloud data obtained by the UAVs were matched with the color information acquired from the RGB camera attached to the UAVs, allowing for a relatively clear representation of realistic colors. This aids in identifying the characteristics of the target area. Since the data are captured from a high altitude in a downward direction, it is particularly easy to extract point cloud data for flat objects such as rooftops of buildings, the tops of structures, tree canopies, plazas, and parking lots. UAVs are effective for quickly acquiring data over large areas and offer the advantage of being able to analyze various additional information about urban forests, such as orthophotos and 3D simulations, in addition to point cloud data.
Approval for UAV flight and aerial photography was obtained from the relevant authority (11th Fighter Wing, Daegu Base, Air Operations Division, UAV Control Team). The first flight and photography session was conducted on 30 June 2022, and the second on 28 July 2022. The flights were carried out at altitudes of 100 m (three times) and 120 m (once), and a total of 589 images were uploaded for processing after the flights. The flight courses were set to fly over the target area in either an east–west or a north–south direction (Figure 3).
2.2.2. Terrestrial Laser Scanning Data
We employed the MapTorch handheld LiDAR device (A.M.Autonomy, Seoul, Republic of Korea). The LiDAR sensor used in the MapTorch is the Velodyne Puck LITE (Velodyne LiDAR, San Jose, CA, USA); its detailed specifications are listed in Table 1.
The MapTorch device operates using a simultaneous localization and mapping (SLAM) method. This approach is particularly advantageous because it enables self-localization in areas where global navigation satellite system (GNSS) signals may be weak or unavailable. Its compact and lightweight design, distinct from traditional stationary terrestrial LiDAR systems, allows individuals to carry it easily. This portability makes it especially appropriate for scanning expansive areas rather than just individual objects. The Trimble R4s GNSS receiver was used for positioning the ground control points with an accuracy of 1 to 2 cm using the real-time kinematic (RTK) positioning method.
2.3. Preprocessing
The point cloud data, acquired using UAV and handheld LiDAR and subsequently merged, were processed using Cloud Compare v2.12 alpha (64 bit) for classification, segmentation, and individual object recognition. A four-step preprocessing procedure was conducted for the semantic segmentation of trees (Figure 4).
The first step in preprocessing was the merging of the point cloud data. Using Cloud Compare's merge function, we combined the photogrammetry data from the UAV with the handheld LiDAR data (Figure 5). Both data collection methods used the same coordinate system, and positions were corrected with real-time kinematic (RTK) GNSS equipment prior to data collection to minimize errors. During the merging process, duplicate points at the same location were removed to reduce potential issues during the semantic segmentation of individual trees.
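The study performed this step interactively in Cloud Compare. For a scripted workflow, the following minimal sketch shows an equivalent merge and duplicate-removal step using the Point Cloud Library (PCL), which is also used later for segmentation; the file names and the 1 cm voxel size used to collapse coincident points are illustrative assumptions, not values from the study.

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/io/pcd_io.h>
#include <pcl/filters/voxel_grid.h>

int main() {
  using Cloud = pcl::PointCloud<pcl::PointXYZ>;
  Cloud::Ptr uav(new Cloud), tls(new Cloud), merged(new Cloud);

  // Hypothetical input files; both clouds are assumed to share the same
  // projected coordinate system after RTK correction.
  pcl::io::loadPCDFile("uav_photogrammetry.pcd", *uav);
  pcl::io::loadPCDFile("handheld_lidar.pcd", *tls);

  // Merge the two clouds by simple concatenation.
  *merged = *uav;
  *merged += *tls;

  // Collapse near-duplicate points (here, points falling in the same 1 cm voxel)
  // into a single representative point, mimicking duplicate removal.
  pcl::VoxelGrid<pcl::PointXYZ> dedup;
  dedup.setInputCloud(merged);
  dedup.setLeafSize(0.01f, 0.01f, 0.01f);
  Cloud::Ptr cleaned(new Cloud);
  dedup.filter(*cleaned);

  pcl::io::savePCDFile("merged_cleaned.pcd", *cleaned);
  return 0;
}
```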
The second step was the removal of noise points. By eliminating noise generated during the capture of point cloud data, we aimed to improve the accuracy of the final analysis. Dynamic objects such as pedestrians and pets captured during handheld LiDAR scanning were designated as noise. The removal process was conducted manually by opening the collected data in Cloud Compare and removing the points located on walkways.
The third step was to separate points identified as terrain from the point cloud data and classify them into digital terrain models (DTMs) or digital surface models (DSMs). To reference the terrain information of the study area, the digital terrain model was obtained from the National Geographic Information Institute. The DTM was provided in shapefile (.shp) format and converted to raster format (.tif) using ArcGIS Pro v3. The raster resolution was set to 1 m, and each grid contained the terrain’s elevation values. To distinguish between ground and non-ground elements, the elevation difference between the collected point cloud data and the DTM was calculated. Cloud Compare’s Cloud/Mesh Dist. function was used to calculate the distance between the point cloud and the DTM. This distance represents the relative elevation from the ground, allowing us to differentiate ground from non-ground elements (e.g., buildings, vegetation). Points with an elevation difference of 0.3 m or more from the DTM were considered non-ground elements and were filtered out. The point cloud data were then separated into ground (DTM) and non-ground (DSM) elements and saved accordingly.
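The elevation-difference rule described above can also be expressed programmatically. The sketch below assumes the 1 m DTM raster has already been loaded into a lookup that returns the terrain elevation at a given planimetric position; the study itself performed the equivalent computation with Cloud Compare's Cloud/Mesh Dist. function, so the helper `dtmElevation` and the function `splitByDtm` are hypothetical names introduced for illustration.

```cpp
#include <functional>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>

using Cloud = pcl::PointCloud<pcl::PointXYZ>;

// Split a point cloud into ground (DTM) and non-ground (DSM) subsets by
// comparing each point's elevation with the reference terrain elevation.
void splitByDtm(const Cloud &cloud,
                const std::function<float(float, float)> &dtmElevation,
                Cloud &ground, Cloud &nonGround) {
  constexpr float kThreshold = 0.3f;  // elevation difference (m) used in this study
  for (const auto &p : cloud.points) {
    const float heightAboveTerrain = p.z - dtmElevation(p.x, p.y);
    if (heightAboveTerrain >= kThreshold)
      nonGround.push_back(p);   // buildings, vegetation, and other structures
    else
      ground.push_back(p);      // terrain points
  }
}
```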
The fourth step involved using the point cloud data classified as the DSM to distinguish between data representing trees and data representing structures. The data related to structures were removed to focus on the semantic segmentation of trees. This process was performed manually, and structures within the park, such as streetlights and signposts, were removed.
2.4. Semantic Segmentation of Individual Trees
This study analyzed the green structures of the target area and classified them into three types (Figure 6). Evaluation zones were established for each type. The green structures were categorized as follows: areas where only canopy trees are planted and the canopies do not overlap (Simple-Structure), areas where only canopy trees are planted but the canopies overlap (Narrow-Structure), and areas where both canopy trees and understory vegetation are planted, resulting in a congested structure (Congested-Structure) (Table 2).
The Simple-Structure consists only of canopy trees, with a wide spacing between them, ensuring that tree crowns do not overlap. In this structure, it is visually clear that each tree stands individually, and the tree density is low enough for manual tree separation. The average distance between trees is 11 m, and there are 0.9 trees per 100 square meters. Although some trees with low heights are present, they can still be clearly distinguished.
The Narrow-Structure also consists only of canopy trees, but the distance between them is much shorter, causing the tree crowns to overlap. In this structure, the distinction between the crowns is not clear. The average distance between trees is about 5 m, and there are 2.2 trees per 100 square meters.
The Congested-Structure is a type where canopy trees and shrubs coexist, resulting in the highest tree density. Not only are the tree crowns difficult to distinguish, but the trunks are also hard to differentiate due to the dense presence of shrubs. The average distance between trees is approximately 3 m, and there are 3 trees per 100 square meters.
During the phase dedicated to individual tree segmentation, the point cloud library (PCL), an open-source library specifically designed for the processing of point cloud and 3D data, was harnessed. The PCL includes algorithms for filtering, feature estimation, surface reconstruction, model fitting, object recognition, and segmentation. These algorithms are categorized under labels, such as filters, features, key points, registration, kdtree, octree, segmentation, surface, and visualization. The PCL can handle various point cloud data formats, not just the ‘.pcd’ format but also others, such as ‘.xyz’ and ‘.las’.
The Euclidean cluster extraction algorithm was used to segment trees individually. This method is effective in distinguishing individual objects by grouping nearby points. In PCL, a kdtree was used to find the neighboring points of each point, enabling the clustering process. After the neighbor search, the Euclidean cluster extraction algorithm calculated the distances between points and grouped neighboring points into clusters. The maximum allowable distance between points (cluster tolerance) was set to 0.3 m, considering the proximity or overlapping nature of the green structures; this relatively small threshold was chosen to avoid merging adjacent trees into a single cluster. The minimum cluster size, which allows small clusters to be discarded as noise, was set to 100 points, considering the size of the trees and the point density of the point cloud.
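A minimal sketch of this step with PCL is shown below, using the parameter values reported above (0.3 m cluster tolerance, minimum cluster size of 100 points). The input file name and the maximum cluster size are illustrative assumptions.

```cpp
#include <vector>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/io/pcd_io.h>
#include <pcl/search/kdtree.h>
#include <pcl/segmentation/extract_clusters.h>

int main() {
  using Point = pcl::PointXYZ;
  pcl::PointCloud<Point>::Ptr cloud(new pcl::PointCloud<Point>);
  pcl::io::loadPCDFile("dsm_vegetation.pcd", *cloud);  // hypothetical non-ground vegetation cloud

  // kdtree used for the neighbor search underlying the clustering.
  pcl::search::KdTree<Point>::Ptr tree(new pcl::search::KdTree<Point>);
  tree->setInputCloud(cloud);

  std::vector<pcl::PointIndices> cluster_indices;
  pcl::EuclideanClusterExtraction<Point> ec;
  ec.setClusterTolerance(0.30);    // maximum allowable distance between points (m)
  ec.setMinClusterSize(100);       // clusters smaller than this are treated as noise
  ec.setMaxClusterSize(1000000);   // illustrative upper bound
  ec.setSearchMethod(tree);
  ec.setInputCloud(cloud);
  ec.extract(cluster_indices);

  // Each element of cluster_indices lists the points of one candidate tree object.
  return 0;
}
```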
4. Discussion
Wieser et al. [43] compared the accuracy of UAV laser scanning (ULS) and terrestrial laser scanning (TLS). They determined that ULS, being an aerial-based method, is advantageous for capturing the tree canopy, whereas TLS, capturing from the ground, can detect tree stems more accurately but tends to overestimate tree height. Beyer et al. [44] employed ground-based LiDAR for structural tree modeling. They identified that if canopy data are deficient due to the density of tree leaves, tree modeling becomes challenging, and proposed that this limitation can be addressed with additional aerial LiDAR images acquired using drones, enhancing the 3D leaf density and allowing for direct comparison. These findings demonstrate that this study's approach of using both ULS and TLS is suitable for 3D tree scanning. This study compared point cloud data obtained from ULS with that from TLS, discerning the unique characteristics, advantages, and disadvantages of each method. Subsequently, by integrating both types of data, this study addressed the limitations and established foundational data for modeling and measurement.
However, leveraging the combined data presented challenges. Since LiDAR operates on light, it cannot observe areas unreachable by light. Even after expanding the observation scope using both terrestrial and aerial means, parts obscured by trees posed measurement challenges. In densely wooded areas, individual recognition processes faltered, and no data were collected for tree-obscured sectors. Moreover, high leaf density meant that leaves concealed branches, resulting in the separation of the canopy and trunk during the object segmentation phase. To compensate for leaf density, this study proposed gathering additional data during winter, when deciduous trees shed their leaves. Neuville et al. [45] tested this approach in temperate deciduous canopy forests during both the leaf-on and leaf-off seasons. The outcomes revealed that the suggested method could detect up to 82% of tree stems with 98% accuracy, and the challenge of distinguishing between the canopy and tree stems in the leaf-on season was also recognized. Future research is expected to validate the 3D modeling process incorporating seasonal changes and to establish modeling data.
For tree recognition in complex green structures, it is necessary to adjust the object recognition settings of the segmentation algorithm. By reducing the voxel size and object recognition threshold, a higher level of segmentation can be achieved. This approach is expected to lead to improved tree recognition success. However, due to the nature of LiDAR, which uses light, recognizing parts obscured by dense vegetation is not possible in complex green structures. Therefore, it is crucial to determine the appropriate tree density for conducting tree recognition using LiDAR. Although this study did not perform a detailed adjustment of tree density, future research should include specific discussions regarding the tree density suitable for applying remote sensing investigations.
5. Conclusions
The study was conducted in the National Debt Repayment Movement Memorial Park located in Daegu Metropolitan City, using point cloud data extracted from handheld LiDAR and UAV imagery. The aim was to convert the urban forest and surrounding areas into 3D digital data and to recognize, segment, and analyze the green structure of the individual trees. For the primary 3D digital conversion of the target area, point cloud data were acquired using UAVs and handheld LiDAR and merged using Cloud Compare. Furthermore, the point cloud data were separated into ground and non-ground data, and individual objectification was attempted on the trees in the non-ground data.
Through the semantic segmentation process that recognizes each tree as an individual object, it is possible to grasp the detailed characteristics of individual trees and of the urban forest as a collection of individual trees. This approach aids in understanding the specifications of individual trees, such as height, canopy width, and diameter at breast height, quantifying the functions of the urban forest, and predicting its changes. The 3D digital conversion process for urban forest investigation, analysis, and management proceeds in the following order: merging and editing point cloud data, noise removal and point cloud cleanup, separation of ground (DTM) and non-ground (DSM) data, file format conversion, semantic segmentation performed manually or using an algorithm, file storage and organization, analysis of individual trees by density type, and extraction of analysis files. Manual and automated methods should be combined, using algorithms at each stage.
The semantic segmentation of trees was conducted by dividing the park’s green structures into three types of zones, and the work was carried out for the trees within these zones. The green structures consist of Simple-Structure, Narrow-Structure, and Congested-Structure. In the Simple-Structure zone, only canopy trees exist, and the spacing between trees is wide enough that their canopies do not overlap. In the Narrow-Structure zone, only canopy trees are present, but the spacing between trees is narrow, causing their canopies to overlap. The Congested-Structure zone contains both canopy trees and shrubs, and the tree density is high, resulting in overlapping canopies.
As a result of the semantic segmentation performed according to the green structures, the Simple-Structure zone showed a success rate of 50%, the Narrow-Structure zone an average success rate of 21%, and the Congested-Structure zone a success rate of 4%. The denser the trees and the more their canopies overlapped, the lower the success rate of semantic segmentation. This issue arises because the spacing between trees falls below the point recognition distance (cluster tolerance) used in the segmentation algorithm. While reducing the point recognition distance might allow segmentation to be performed to some extent, it could also lead to over-segmentation. Therefore, it is necessary to consider the appropriate point recognition distance and to adjust the algorithm's parameter settings according to the green structure.
Additionally, branches obscured by tree canopies were difficult to capture using LiDAR imaging, which resulted in lower point density for the branches and prevented accurate segmentation. To address this problem, supplementary imaging during the leafless season, when trees have shed their leaves, could be employed to capture additional points.
This study identified the need to understand the appropriate green structures for tree surveys in urban forests using remote sensing, and emphasized the importance of adjusting the variables of the semantic segmentation algorithm according to the green structures and considering imaging conditions. However, the study did not confirm an increase in the success rate of tree objectification based on specific variable adjustments or improvements in imaging techniques. Therefore, further research is needed to develop specific application methods, such as optimal variable settings based on different green structures.
This study utilized remote sensing technology to understand the green structures of urban forests and to collect data on tree objects, ultimately determining whether tree object images applicable to digital twins could be created using this technology. Digital twins help build a virtual representation of the real environment, allowing various simulations to be conducted. To test and verify the multiple functions of urban forests, it is necessary to establish data that accurately replicate real measurements, and remote sensing technology was considered an efficient method for building such data. Therefore, based on the results of this study, we concluded that remote sensing technology should be applied with consideration of the complexity of the green structures. This conclusion can serve as a basis for determining where remote sensing technology should be applied in future tree surveys aimed at constructing digital twins.