1. Introduction
Individual tree species identification is important for precise forest management, and several studies have evaluated identification methods that are based upon discrete small-footprint airborne lidar alone [1,2,3,4,5,6,7,8] or in combination with passive multispectral imagery [9,10,11,12,13]. Airborne multispectral lidar has recently been proposed as an improvement over single-sensor technology for identifying tree species [14,15,16].
Single-tree species identification using lidar data has generally been performed in three steps. First, single-tree crowns are delineated, after which the corresponding lidar point clouds are extracted. Second, for each single-tree point cloud, features are calculated from return heights (3D features), and from return intensities (intensity features). Third, a classifier (e.g., random forest, RF) is trained using reference tree crowns of known species to compute a classification model, which is then applied to target tree crowns to identify the species. One of the underlying assumptions of this approach is that the location of trees relative to the laser scanner has no effect on their point cloud features. However, scan angle, which determines the incidence angle of the laser pulses on the trees, may affect the values of the computed features, consequently influencing the accuracy of species identification. The majority of airborne lidar systems allow large maximum lateral scanning angles (up to ±30°) to help reduce acquisition costs. Yet, in forestry applications lidar service providers frequently propose maxima of ±15° to 20° to avoid bias at larger scan angles. In addition to lateral scan angles, a multispectral lidar, such as the Titan system of Teledyne Optech Inc. (Vaughan, ON, Canada), may have lasers scanning the same tree using different scanning planes (e.g., with forward tilted scanning planes). The combination of across-track and along-track angles increases the resulting scan angles (hereafter referred to as “net” scan angle). Multispectral airborne lidar is usually considered advantageous because it collects intensities at different wavelengths, which should improve tree species separation compared to single-laser systems. 
Furthermore, it not only provides different single-channel intensity classification features, but also increases the possibility of computing ratios and normalized differences, as well as 3D features that are enhanced by the greater point density [14,15,16,17,18,19]. Although it provides richer spectral information, the system has the drawback of producing data with variable acquisition geometries between channels (i.e., larger net scan angles and no vertical view for two of the three channels in the case of the Titan system). This potentially increases scan angle influence on 3D and intensity features compared to mono-spectral lidar (i.e., a scanner using only a vertical scanning plane).
Individual tree crown (ITC)-level 3D and intensity features are highly variable because they are influenced by multiple elements, some of which depend upon tree characteristics (species, height, stress, tree shape, vertical and horizontal distribution of vegetation within the crown, clumping and occlusion, leaf density, leaf angles, phenological state), together with each tree’s environment (tree status, surrounding tree density, topography, understory). Features also depend upon laser properties (pulse power, pulse divergence, wavelength, range, scan angle), and on the configuration of the survey (pulse density, flight altitude, lateral overlap, flight lines configuration, maximum scan angle). Several of these elements influence ITC features in a manner that is not independent of scan angle, possibly producing cumulative effects and increasing feature variability. A priori, it is difficult to isolate the respective influences of all elements.
The difficulty of isolating and understanding the specific scan angle influence from among other influencing factors has been highlighted in studies adopting an Area-Based Approach (ABA). These studies have used either simulated lidar data of modelled trees or real lidar data [20,21,22,23,24,25,26,27]. As the influence of scan angles up to 20° on ABA features (e.g., canopy height) has not been found to be very large, scan angle has been ignored in most studies [20,21,22,28,29]; nevertheless, its potential influence on forest-feature accuracy is frequently mentioned [27].
The sensitivity and variability of ITC features are greater than those of ABA features, given that the ITC volume in which laser pulses can be intercepted is smaller than that of an ABA cell. Thus, we highlight the specific need to study ITC feature value variations with scan angle, even for relatively small scan angles, i.e., up to ±20°. Studies that specifically address the effect of scan angle on tree features at the ITC level are mainly based upon simulation models using geometric representations of trees [30]. Very few studies have employed real lidar data [31]. So far, we could not find any study that specifically assessed the influence of scan angle on multispectral features at the ITC level using real airborne lidar data.
Large scan angles (large deviations from the vertical) produce at least three effects. First (effect #1), large angles cause a decrease in return intensity that is due to the increase in both laser range and footprint size. Consequently, the pulse energy per unit area reaching the top of the canopy decreases [32,33]. This attenuation is more pronounced for beams having a large divergence. Second (effect #2), large scan angles cause a decrease in the number of returns per emitted pulse (i.e., receiving multiple returns from a given pulse becomes less likely), which in turn produces a higher proportion of single returns. Third (effect #3), large scan angles cause a change in the distribution of returns through the forest canopy along the pulse trajectory. A decrease in the peak pulse power concentration at the top of the canopy alters the shape and amplitude of the signal, given that more vegetation material is needed to trigger a return. Thus, pulses penetrate deeper, and the height of returns is shifted downward [33].
The decrease in return intensity at larger scan angles changes the values of intensity percentiles (effect #1). In addition, the change in return proportions (effect #2) affects these features, especially when computed from all returns, because the number of second and third returns (having a reduced intensity) is larger at the nadir than at larger scan angles. Although intensity normalization could compensate for the intensity decrease (effect #1), rigorous radiometric calibration of lidar intensity is very difficult in practice. Intensity normalization for vegetation returns through physical modelling is very complex because many factors have to be accounted for, such as range-related factors (laser spread loss and attenuation by air), or tree-related factors (leaf reflectance, leaf size and orientation). Different physical equations have been proposed to empirically normalize return intensities to compensate for range variations [34,35,36]. These equations theoretically allow the isolation of the scan angle effect as a function of range. More complex calibration methods introducing additional information for the purpose of retrieving radiometric characteristics [37,38,39,40] would make it more difficult to isolate the scan angle effect. The specific effect of intensity normalization as a function of range in reducing feature variability has been tested on the feature values themselves [41,42,43] or through the degree of improvement in tree species classification accuracy [34]. Moreover, even simple range normalization is technically difficult to apply to data that are distributed in the widely used LAS format because of the lack of information regarding pulse range.
No corrections have been proposed for effects #2 and #3 because these effects mainly concern second or third returns that are controlled by complex species-specific interactions between laser pulses and leaves or branches. Effect #2 mainly affects lidar features that are computed from return proportions, e.g., the ratio of single to multiple returns, the ratio of canopy to ground returns or the ratio of returns above a threshold to all returns; it has been highlighted in several studies using the ABA (i.e., at the plot level) for canopy cover [23,24,25,29,44,45] or gap fractions [24,25,26,27,46].
Changes in return height distributions (effect #3) have been shown to influence lidar features, such as ABA height percentiles [20,21,22,24,28,29,32,46,47,48,49]. Some of these studies show that the two opposing effects of scan angle (shifting the height percentiles upward or downward) are not only related to forest density and occlusions, but they are also species-specific. For example, height percentiles varied more for species with deeper tree crowns relative to crown diameter, such as spruce (Picea spp.), compared to species with shorter crowns, such as pines (Pinus spp.), given the probability of intercepting oblique pulses [44]. Simulation studies also have explained the upward shift by the increased distance along which the beams must travel through the canopy, thereby producing an increase in the interception probability of pulses at large incidence angles [30,32,50]. The upward shift is further related to effect #2 by which a decrease in the number of returns per pulse influences the height percentiles that are calculated from all returns [49]. Overall, the three effects, which remain difficult to correct, influence feature values. In turn, feature values would affect species identification accuracy.
The main objective of this study is to quantify the effect of scan angle on individual tree species identification, from feature calculation to classification, using real tree data and to evaluate the improvement brought about by range normalization of intensity. To do this, we addressed four specific objectives: (i) investigation of the effect of scan angle on 3D and intensity features that are used to identify species; (ii) identification of the species for which features are most sensitive to scan angle; (iii) evaluation as to whether scan angle affects the accuracy of species classification; and (iv) evaluation of the effect of intensity normalization per feature type and per mean scan angle on the accuracy of species identification.
2. Materials and Methods
2.1. Study Area and Reference Data on Tree Species
The study area is located in the York Regional Forest (YRF) of southern Ontario, Canada (79°19′W, 44°04′N), in the Great Lakes–St. Lawrence forest region [51]. The topography is mostly flat, with a terrain elevation range of about 43 m (240 m to 303 m). The forest stands in the study area are either naturally growing, mixed species stands or needle-leaf reforested/planted stands. We sampled six needle-leaf species that were found in plantations of different ages: red pine (Pinus resinosa); eastern white pine (Pinus strobus); Scots pine (Pinus sylvestris); tamarack or eastern larch (Larix laricina); Norway spruce (Picea abies); and white spruce (Picea glauca). The trees in this study were sampled mainly from needle-leaf plantations. Trees in plantations typically have similar characteristics, given that they are even-aged and grow under similar conditions. Because of this homogeneity, this dataset facilitates the isolation of the scan angle effect from other tree characteristics. For example, in more complex environments, such as natural stands or mixed stands that are composed of both needle-leaf and broadleaf species, trees grow in a wider variety of shapes and sizes, in dominant, sub-dominant or suppressed positions. These heterogeneous conditions could hinder separating scan angle effects from tree effects. Field identification of tree species was conducted in August 2015 to produce a reference database of individual trees within the study plantations. High-resolution (10 cm) images that were acquired during the Titan survey with the CM-1000 RGB camera (Teledyne Optech Inc., Vaughan, ON, Canada) were used to identify additional reference trees or to verify tree species in plantations using photo-interpretation.
The large number of sample trees that was used in this study makes statistical trends readily identifiable; with a limited number of sample trees, tree characteristics, occlusions and other flight parameters might exert greater effects on lidar features than would the scan angles. Moreover, this study took advantage of multiple overlapping flight lines and the high return density of the Titan dataset, which allowed computing advanced features for multiple single flight line views of each tree.
2.2. Lidar Data and Intensity Correction for Range
The Titan system (Teledyne Optech Inc., Vaughan, ON, Canada) has three integrated lasers (hereafter referred to as channels C1, C2 and C3) with different wavelengths (respectively, 1550, 1064 and 532 nm for C1–C3), and different scanning planes, which are respectively tilted at 3.5°, 0° and 7° relative to the vertical plane. The data were acquired in July 2015 over an area of 2546 ha, at a mean altitude of about 800 m above the ground surface. The mean number of first returns per m², by channel and by individual flight line, was 3.4 for C1 and C2, and 3.3 for C3, which is equivalent to a total of about 10 returns per m² for all channels within a single flight line. At a range of 800 m, the average footprint at nadir was 28 cm for C1 and C2, and 56 cm for C3. Ranges and footprint diameters are summarized in Table 1. A total of 19 flight lines were acquired, with an average lateral overlap of 50%, thereby providing two views of each tree from different angles. Most flight lines were acquired following parallel centrelines, but in one area additional flight lines were acquired perpendicularly. These two orientations (north-south and east-west) locally increased the number of views per tree with different scan angles.
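The footprint values quoted above are consistent with the usual small-angle relation, footprint ≈ range × beam divergence. The sketch below back-computes the divergences implied by the figures in the text (they are derived here, not quoted from a datasheet) and, as an assumption of this example, extends the range to an off-nadir pulse over flat ground using range = altitude / cos(angle).

```python
import math

def footprint_diameter(range_m, divergence_mrad):
    """Small-angle approximation: diameter (m) = range (m) * divergence (rad)."""
    return range_m * divergence_mrad / 1000.0

def slant_range(altitude_m, scan_angle_deg):
    """Range to flat ground for a pulse tilted scan_angle_deg off nadir (assumed geometry)."""
    return altitude_m / math.cos(math.radians(scan_angle_deg))

# Divergences implied by 28 cm and 56 cm footprints at an 800 m range.
div_c1_c2 = 0.28 / 800.0 * 1000.0  # mrad
div_c3 = 0.56 / 800.0 * 1000.0     # mrad

for angle in (0.0, 15.0):
    r = slant_range(800.0, angle)
    print(angle, round(r, 1),
          round(footprint_diameter(r, div_c1_c2), 3),
          round(footprint_diameter(r, div_c3), 3))
```

Under these assumptions, both range and footprint grow by a few percent between nadir and 15°, which is the geometric basis of the intensity attenuation discussed in the Introduction.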
Manual delineation of sampled crowns was performed on the canopy height model (CHM), which was generated using the highest returns of the three channels above the digital terrain model (DTM), within 10 cm pixels. Tree delineation followed the procedure that was proposed by Budei [14]. A colour composite image that was generated by interpolating the first return intensity of each of the three channels was used to verify species and delineation, especially when two trees of different species were difficult to separate from the 3D information alone. The DTM was calculated from all ground returns that were amassed from all channels and all flight lines. This allowed us to produce a single DTM (ground reference) for all three channels and for all flight lines. A similar approach was used for tree height calculations: individual tree heights were calculated using all first returns.
Lidar data that were acquired with the Titan system were obtained both in LAS 1.3 and ASCII formats. The latter contained additional information on laser range and provided more precise scan angle data than did the integer values in degrees that were provided in the LAS files. The laser range information allowed us to normalize intensity following the equation that was proposed by Korpela [34]:

$$I_n = I_{raw} \left( \frac{R}{R_{ref}} \right)^{a} \qquad (1)$$

where $I_n$ is the range-normalized intensity, $I_{raw}$ is the raw intensity, $R$ is the range, and $R_{ref}$ is the reference range. The exponent $a$ was set to 2.0.
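As a minimal sketch, range normalization in the commonly used power-law form, I_n = I_raw (R / R_ref)^a, can be written as follows. The exponent default of 2.0 follows the text; the 800 m reference range and the function name are illustrative assumptions, since the reference range used in the study is not stated here.

```python
def normalize_intensity(raw_intensity, range_m, ref_range_m=800.0, exponent=2.0):
    """Scale a raw return intensity to the value expected at ref_range_m.

    ref_range_m = 800.0 is a placeholder chosen for this example only.
    """
    if range_m <= 0 or ref_range_m <= 0:
        raise ValueError("ranges must be positive")
    return raw_intensity * (range_m / ref_range_m) ** exponent

# A return recorded at a longer range than the reference is scaled up.
print(normalize_intensity(100.0, 828.2))
```

A return measured exactly at the reference range is left unchanged, which makes the choice of reference range a pure scaling convention.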
2.3. Net Pulse Scan Angles
The maximum lateral scan angle (mirror angle) for each of the three channels was about ±15°. The tilt of the non-vertical sensor scanning planes of two of the Titan lasers increased the net scan angle. This net scan angle was calculated from the angle along the scanning plane (mirror angle), and the angle of the sensor-scanning plane (tilt) (Figure 1). The high precision data on the mirror scan angles (in degrees, to 5 decimal places) that were provided in the ASCII files were used for this purpose. However, information on the sensor orientation for each pulse (roll and pitch) or flight trajectories was not available. Therefore, roll and pitch of the aircraft were considered to be equal to zero. We consider these assumptions to be reasonable since the wind speed during the flight was under 11 km/h, and because there are no visual variations in the horizontal return distributions along the flight path. We used the following equation to calculate the net angle (γ) from the mirror scan angle (α) and the scanning plane tilt (β):
Hereafter we use “scan angle” to designate the net angle as calculated with Equation (2). It was calculated for Titan channels 1 and 3, given that channel 2 had zero tilt. The “incidence angle”, which is widely used in studies concerning backscattered signals from a scanned object [32,33,45,52], assumes an accurate measure of the scan angle and of the normal to the local surface (often calculated from the DTM). However, the use of the term “incidence angle” was considered inappropriate in this study, given that the exact aircraft attitude parameters (roll, pitch and yaw) were not known; consequently, the true incidence angle relative to the horizontal plane at tree level could not be precisely calculated. Furthermore, the normal of a tree crown at the position of a lidar return is difficult to define.
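For illustration, one geometrically consistent way to combine a mirror angle α, measured within the scanning plane, with a forward tilt β of that plane is cos γ = cos α · cos β, assuming zero roll and pitch as stated above. This form is an assumption of the sketch and may differ from the exact expression used in the study; the function name is hypothetical.

```python
import math

def net_scan_angle(mirror_deg, tilt_deg):
    """Off-nadir angle (degrees) of a pulse at mirror_deg within a scanning
    plane tilted by tilt_deg, assuming roll = pitch = 0 (assumed form)."""
    a = math.radians(mirror_deg)
    b = math.radians(tilt_deg)
    return math.degrees(math.acos(math.cos(a) * math.cos(b)))

# Tilts of 3.5, 0 and 7 degrees correspond to C1, C2 and C3 in the text.
for channel, tilt in (("C1", 3.5), ("C2", 0.0), ("C3", 7.0)):
    print(channel, round(net_scan_angle(15.0, tilt), 2))
```

With zero tilt the net angle reduces to the absolute mirror angle, and at a zero mirror angle it reduces to the tilt, so a tilted channel never has a true vertical view.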
Across overlapping flight lines, each tree was scanned from multiple angles. For the sake of simplicity, a scan of a single tree from a single flight line (point of view) is named a “tree view”. A given tree view comprises the three point clouds respectively corresponding to each of the three Titan channels. For each tree view, different mean scan angles were calculated (see details in Section 2.4). Figure 2 highlights the difference in point cloud configuration according to mean scan angle in C2 for an individual Pinus resinosa.
The 3D and intensity features were computed for each reference tree. Several versions of each feature were computed from the point cloud that was extracted for each channel of each tree view. A feature version corresponds to one of three return types: all returns (all); first returns (1st); and single returns (si). A variable threshold for each tree was used to remove potential ground and understory points by retaining only returns whose height was greater than 40% of the height of the reference tree.
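The selection of returns by type and by the 40% height threshold can be sketched as follows. The tuple layout (height, return number, number of returns) and the function name are hypothetical choices made for this example.

```python
def crown_returns(returns, tree_height, return_type="all", min_rel_height=0.40):
    """Keep returns of one type that lie above a fraction of the tree height.

    returns: iterable of (height_m, return_number, number_of_returns) tuples,
    a hypothetical layout used only in this sketch.
    """
    if return_type not in ("all", "1st", "si"):
        raise ValueError("return_type must be 'all', '1st' or 'si'")
    selected = []
    for height, ret_num, n_returns in returns:
        if height <= min_rel_height * tree_height:
            continue  # potential ground or understory return
        if return_type == "1st" and ret_num != 1:
            continue
        if return_type == "si" and n_returns != 1:
            continue
        selected.append((height, ret_num, n_returns))
    return selected

example_returns = [(20.0, 1, 1), (18.0, 1, 2), (9.0, 2, 2), (5.0, 2, 3), (0.2, 3, 3)]
for kind in ("all", "1st", "si"):
    print(kind, len(crown_returns(example_returns, tree_height=20.0, return_type=kind)))
```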
The 3D features were calculated using the normalized height above the ground of crown returns. A unique value of ground, the DTM pixel value at the tree crown centroid, was used to calculate return heights for all returns of the tree to avoid altering the crown return height distribution on sloping terrain. Height normalization consisted of dividing the height of each return by that of the highest return of the corresponding crown. The same raster DTM was used for the calculation of all features. The 3D features were calculated for each single channel (C1, C2, C3), as well as for the three channels combined (C321). The features included the mean (mn), the relative height at certain percentiles (PE, 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles: p05, p10, p25, p50, p75, p90, p95), the return height distribution (DI, i.e., standard deviation: sd; coefficient of variation: cv; skewness: skew, and kurtosis: kurt), the ratios of features that were calculated from different return types (RM, e.g., 1st returns mean height divided by tree height), and crown taper (SL, mean of the slope between each return and the highest return within a tree). We also computed the ratios of the number of returns of different types (Return Proportion—RP, e.g., the ratio of the number of first returns over that of all returns, etc.), which we have included with 3D features.
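A few of the height-based features described above can be sketched as follows. The percentile interpolation rule and the population (moment-based) forms of standard deviation, skewness and kurtosis are assumptions of this example; the study's exact estimators are not specified here, and crown taper (SL) is omitted.

```python
import statistics

def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of an ascending list."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def height_features(heights):
    """Relative-height features for one crown point cloud (heights in m)."""
    top = max(heights)
    rel = sorted(h / top for h in heights)  # normalize by the highest return
    mn = statistics.fmean(rel)
    sd = statistics.pstdev(rel)
    feats = {"mn": mn, "sd": sd, "cv": sd / mn}
    for p in (5, 10, 25, 50, 75, 90, 95):
        feats[f"p{p:02d}"] = percentile(rel, p)
    if sd > 0:  # standardized third and fourth moments (assumed estimators)
        feats["skew"] = statistics.fmean([((r - mn) / sd) ** 3 for r in rel])
        feats["kurt"] = statistics.fmean([((r - mn) / sd) ** 4 for r in rel])
    return feats

example_features = height_features([10.0, 12.0, 14.0, 16.0, 20.0])
print(example_features)
```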
For comparison purposes, the intensity features were computed from raw intensities, as well as from the range-normalized intensities. Most intensity features were computed from individual channels (e.g., mean intensity, intensity percentiles). Other intensity features consisted of simple ratios of intensities between two channels (Ratio Channels or RC, two ratios using the green (G) channel: RCG1 from the C2/C3 channel combination, RCG2 from C1/C3, and the third ratio using only infrared (IR) channels: RCIR from C1/C2), or normalized differences (ND, NDG1 as (C2 − C3)/(C2 + C3), NDG2 and NDIR from corresponding channels). For a complete description of features and their acronyms, see Budei [14] and Budei and St-Onge [18]. A description of the features is also provided in Table 2.
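The two-channel features can be sketched as below, using the usual difference-over-sum definition of a normalized difference. The input values are arbitrary placeholders, not measurements. The second call illustrates a general property of such features: a multiplicative factor common to both channels cancels out.

```python
def channel_ratio(intensity_a, intensity_b):
    """Simple ratio of two channel intensities (e.g., RCG1 from C2 / C3)."""
    return intensity_a / intensity_b

def normalized_difference(intensity_a, intensity_b):
    """Difference-over-sum index of two channel intensities."""
    return (intensity_a - intensity_b) / (intensity_a + intensity_b)

c2, c3 = 120.0, 40.0  # placeholder mean intensities
attenuation = 0.9     # hypothetical factor applied equally to both channels

print(channel_ratio(c2, c3), normalized_difference(c2, c3))
print(channel_ratio(attenuation * c2, attenuation * c3),
      normalized_difference(attenuation * c2, attenuation * c3))
```

When attenuation differs between channels, as it can with differing tilts and divergences, the cancellation is only partial.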
2.4. Tree View Selection
Depending upon tree characteristics (tree height), tree environment (neighbour occlusion) or flight configuration (return density or position of the tree near the swath end), the number of returns by channel within individual flight lines was occasionally insufficient for calculating some of the features, leading to “non-applicable” (NA) values. The presence of NA could create bias in comparisons between the correlation coefficients of features that were calculated with different scan angles or between classification accuracies for different groups of tree views having a similar scan angle. Several a priori criteria were used to discard trees that would yield NA values in the features. First, trees that were severely defoliated were not considered. Second, only trees taller than 10 m were selected; below this threshold, the frequency of NA was too high. Third, only trees that were sufficiently well scanned from at least two flight lines in three channels were selected. Finally, tree views that were located at the extremity of the scan swath (i.e., that had an absolute mean scan angle larger than 14° in C2) were discarded. The main reason for this choice was that returns near the end of scan lines of an oscillating mirror that operated with a maximum scan angle of 15° have less precise geolocations and are not evenly distributed. They are closer together at the end of a scan line, where the oscillating mirror slows before reaccelerating in the opposite direction. In addition, the space between scan lines is greater at wider scan angles, leaving bigger gaps. Common practice is, therefore, to eliminate returns at the extremities of scan lines. However, in order to avoid computing features from point clouds that would be cut in the middle of tree views (i.e., in the middle of a tree crown) at the 14° scan angle threshold, we retained all tree views that had a mean scan angle of 14° or less in C2, and within these views we kept any returns having a scan angle above 14°.
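The numeric selection criteria above can be sketched as a simple filter. The dictionary keys, the function name and the order in which the criteria are applied (angle and height first, then the two-flight-line requirement) are assumptions of this example; defoliation screening is not represented.

```python
def select_tree_views(views, min_height_m=10.0, max_mean_angle_c2=14.0, min_lines=2):
    """Filter tree views; each view is a dict with hypothetical keys
    tree_id, line_id, tree_height, mean_angle_c2 and channels_ok."""
    kept = [v for v in views
            if v["tree_height"] > min_height_m
            and abs(v["mean_angle_c2"]) <= max_mean_angle_c2
            and v["channels_ok"]]
    lines_per_tree = {}
    for v in kept:
        lines_per_tree.setdefault(v["tree_id"], set()).add(v["line_id"])
    # Keep only trees still seen from enough distinct flight lines.
    return [v for v in kept if len(lines_per_tree[v["tree_id"]]) >= min_lines]

def _view(tree_id, line_id, height, angle):
    return {"tree_id": tree_id, "line_id": line_id, "tree_height": height,
            "mean_angle_c2": angle, "channels_ok": True}

example_views = [
    _view("A", 1, 15.0, 3.0), _view("A", 2, 15.0, -10.0),  # kept
    _view("B", 1, 18.0, 5.0), _view("B", 2, 18.0, 14.5),   # one view beyond 14 degrees
    _view("C", 1, 8.0, 2.0), _view("C", 2, 8.0, -6.0),     # tree too short
]
selected_views = select_tree_views(example_views)
print([(v["tree_id"], v["line_id"]) for v in selected_views])
```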
From a total of 37,063 candidate trees that had been manually delineated [18], 13,325 trees met all of the above selection criteria. This resulted in 27,922 tree views that were used for computing the correlation between feature values and mean scan angle, and for RF (random forest) classification. The tree views were divided into three scan angle classes (see Figure 3). Because of the position of these trees relative to the flight line, the frequency distributions of the mean scan angle of the tree views per channel differed between species (Figure 4). Therefore, we enforced a minimum of 30 tree views in each mean scan angle class, and we applied a balanced RF algorithm (down sampling) to compensate for the difference in sample sizes between different species.
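The idea of down sampling to equalize group sizes can be illustrated with the simplified sketch below, which draws the same number of tree views from every species and scan angle class group once, before training. This is a simplification: a balanced random forest typically repeats the draw for each tree of the ensemble. Function and variable names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def downsample_balanced(samples, key, n_per_group, seed=0):
    """Draw n_per_group samples without replacement from each group key(sample)."""
    groups = defaultdict(list)
    for s in samples:
        groups[key(s)].append(s)
    rng = random.Random(seed)
    balanced = []
    for name in sorted(groups):
        members = groups[name]
        if len(members) < n_per_group:
            raise ValueError(f"group {name!r} has only {len(members)} samples")
        balanced.extend(rng.sample(members, n_per_group))
    return balanced

# Placeholder groups of unequal size, identified by (species, angle class).
example_samples = [("pine", "small")] * 50 + [("spruce", "small")] * 35
balanced_samples = downsample_balanced(example_samples, key=lambda s: s, n_per_group=30)
print(Counter(balanced_samples))
```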
2.5. Correlation between Feature Values and Mean Scan Angle
Each feature was computed for a specific combination of point clouds corresponding to different view geometries. For this reason, different mean scan angles were considered for each tree view, depending upon the specific combination of point clouds extracted from different channels (C1, C2, C3), or combinations of two or three channels (C1_C2, C2_C3, C1_C3 and C321). Combinations of two or three channels were calculated to understand the effect of scan angle on multi-channel feature values, such as channel ratios or NDVIs and 3D features. For single-channel features, the mean scan angle value of a tree view was computed using only the absolute scan angle value of all returns in that channel. For features that were constructed as channel ratios or NDVIs, the mean scan angle was first computed for each channel’s point cloud, after which the overall mean was calculated using:

$$m\_ang_{l} = \frac{\overline{\left| ang_{C1} \right|} + \overline{\left| ang_{C2} \right|}}{2} \qquad (3)$$

where $m\_ang_{l}$ is the absolute mean angle of points from flight line $l$; $ang_{C1}$ and $ang_{C2}$ are the return scan angles of the point cloud of a tree, respectively for the first and second channels of a ratio feature. The mean scan angles were always calculated from all return types of the channels that were concerned. However, these mean scan angles were also attributed to features that were computed only from first or single returns. These computed mean scan angles were then used (a) to calculate the Pearson product-moment correlation coefficient (r) between scan angle and feature values, (b) to divide tree views into scan angle classes, and (c) to evaluate the variation of species identification accuracy with scan angle class.
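The two computations described above, the mean scan angle of a tree view for a multi-channel feature and its Pearson correlation with feature values, can be sketched as follows. The function names are hypothetical and the inputs are placeholders.

```python
import math

def mean_abs(angles):
    """Mean of the absolute scan angles of one channel's returns."""
    return sum(abs(a) for a in angles) / len(angles)

def mean_scan_angle(*channel_angles):
    """Mean of the per-channel mean absolute scan angles for one tree view."""
    return sum(mean_abs(a) for a in channel_angles) / len(channel_angles)

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Two channels of one tree view: angles are signed, means use absolute values.
print(mean_scan_angle([-4.0, 6.0], [9.0, 11.0]))
```

Averaging the channel means, rather than pooling all returns, keeps a channel with more returns from dominating the tree view's mean angle.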
Descriptive statistics and boxplots of 3D features and intensity features that had been calculated for each channel and species were used to compare variations in feature values with mean scan angle (objective i).
We also evaluated the magnitude of the intensity normalization effect on the correlation between feature values and scan angle, and on classification accuracy (objective iv). The percentage of features that were correlated with mean scan angle with an absolute coefficient greater than 0.2 (|r| > 0.2) was compared between intensity features that were calculated respectively from raw and normalized intensities. To evaluate the effect of intensity normalization on species classification accuracy, results of RF classifications based upon intensity features that were calculated respectively from raw and normalized intensities were compared.
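The comparison of the share of correlated features can be sketched as below. The feature names and coefficient values are invented placeholders that only illustrate the computation; they are not results of the study.

```python
def share_correlated(r_by_feature, threshold=0.2):
    """Percentage of features with |r| above threshold, and their names."""
    flagged = sorted(name for name, r in r_by_feature.items() if abs(r) > threshold)
    return 100.0 * len(flagged) / len(r_by_feature), flagged

# Placeholder coefficients for raw and range-normalized versions of four features.
r_raw = {"feat_a": -0.31, "feat_b": -0.12, "feat_c": 0.25, "feat_d": 0.03}
r_norm = {"feat_a": -0.15, "feat_b": -0.10, "feat_c": 0.08, "feat_d": 0.03}

print(share_correlated(r_raw))
print(share_correlated(r_norm))
```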
2.6. Random Forest Classification
Tree views were first divided into three classes based on their mean scan angle (Table 3). RF classifications of species were performed separately for each angular class. This method makes it possible to determine whether the mean scan angle had affected species classification accuracy (objective iii) by comparing the accuracies between the scan angle classes. The RF models were trained separately using samples that were balanced between species-scan angle class combinations to avoid bias in the RF classification. The same value was used for the nsize parameter of the rf function from the randomForest R library [53]. This value was set to 20 for all classifications, which was somewhat less than the sample size of the smallest tree view group of species-scan angle class. When compared to using the minimum nsize for each specific RF classification subset, this choice lowers the resulting accuracy for all classifications, but ensures that differences between classification accuracies cannot be attributed to differences in the nsize parameter settings. Note that the main objective of these classifications was not to compare performance between different types of features or channels that were used but to test whether results differed between the three scan angle classes while using the same features in the RF classification models.
Optimal delimitations were determined by creating a set comprising a small, medium and large scan angle class per channel or channel combination (Table 3). The choice of class boundaries was guided by criteria, such as ensuring that the number of trees in each group of species-scan angle class was sufficient.
RF classifications were performed for each individual channel and run separately for 3D features and for intensity features (raw and normalized), together with the combined 3D and intensity features. First, for each of these RF classifications, the model was trained with all tree views from the three scan angle classes. The out-of-bag RF accuracy of this pooled classification was reported as the overall accuracy. Second, the accuracy of the RF classification for each scan angle class was calculated separately.
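Given a set of predictions (for example out-of-bag predictions, assumed here to be already available), overall and per-class accuracies can be tallied as in the sketch below. It does not reproduce the random forest itself, and the record layout and species labels are hypothetical.

```python
from collections import defaultdict

def accuracy_by_class(records):
    """records: iterable of (angle_class, true_species, predicted_species).

    Returns the overall accuracy and a dict of accuracy per scan angle class.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for angle_class, truth, predicted in records:
        totals[angle_class] += 1
        hits[angle_class] += int(truth == predicted)
    per_class = {c: hits[c] / totals[c] for c in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_class

example_records = [
    ("small", "pine", "pine"), ("small", "spruce", "spruce"),
    ("large", "pine", "spruce"), ("large", "spruce", "spruce"),
]
overall_acc, class_acc = accuracy_by_class(example_records)
print(overall_acc, class_acc)
```

Comparing the per-class values against each other, with identical features and settings, is what isolates a possible scan angle effect from differences in model configuration.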
4. Discussion
The scan angle of airborne laser scanners is variable and can theoretically affect the 3D distribution and intensity of returns from tree crowns. This could affect our capacity to use lidar point clouds to identify individual tree species, but research on this question was lacking. Consistent with plot-level studies, this study on individual trees showed that for net scan angles between 0° and 20° and over terrain having only small topographic variations, the correlation between individual tree feature values and scan angle was low, i.e., |r| was generally less than 0.2 (objective i). At the same time, variability in individual tree feature values for a given tree species that had been captured from about the same scan angles (see boxplots in Figure 5, Figure 6, Figure 7 and Figure 8) was itself quite large. The inherently large feature variability limits our ability to isolate the relative effect of scan angle. A similar situation was observed for plot-level features. For example, Morsdorf [45] found no significant difference between the predicted percent canopy cover at larger and at smaller scan angles because their inherent standard deviations were larger than the differences. This large variability might be explained by other tree properties; for example, Budei and St-Onge [18] highlighted the influence of tree height on feature values, and the influence of tree density on feature values was highlighted in [44]. Even considering the controlled conditions of our test site, the effect of scan angle on feature values did not predominate over feature variability.
Our analysis highlighted several effects that were attributable to scan angles (objective i). The features that were most affected by scan angle were those related to crown shape, return proportions and single-channel intensity. Effect #2 of scan angle (decrease in the number of returns per emitted pulse) explains the higher correlations of features, such as the return proportion (RP), because the number of second and third returns is reduced as scan angle increases. In addition, features that are related to crown shape (SL and SU) are highly sensitive to return distributions in the crown, being equally influenced by scan angle effects #2 and #3 (change in the distribution of returns) (Table A1). The single-channel intensity features that were computed from first or single returns were related mainly to effect #1 (intensity attenuation), whereas those computed from all returns were influenced equally by effects #1 and #2. Thus, range normalization generally reduced correlations with scan angle for intensity features from first and single returns, and had less importance with respect to features that were computed from all returns (Table 6).
The features that were least sensitive to scan angle are NDVIs and intensity channel ratios. The reason is likely that scan angle effects #1 and #2 more or less offset one another in the case of ratios between channels. Yet, configuration differences between Titan channels must be accounted for when considering the ratios of channels and NDVIs. Overall, our analysis shows that the effect of scan angle on channel ratios and NDVIs can be ignored for scan angles lower than 15°.
Tree species influenced the variation of lidar features with scan angle (objective ii); i.e., the upward or downward shifts in height percentiles depended on tree species. For example, features that were calculated for species with elongated tree crowns (e.g., spruces) were more affected by scan angle than those calculated for species with shorter crown lengths and more compact crown shapes (e.g., pines). This observation was consistent with simulation studies, such as the one reported in [44].
Our assessment of an alternative solution to circumvent normalization showed that classification improvement following intensity normalization depends upon the percentage of features that are affected by intensity normalization. Intensity normalization changed feature values that were calculated from single channels, such as percentiles of intensity. It did not substantially change the feature values that were calculated as ratios of channels, NDVIs or dispersion characteristics. Consequently, using channel ratios, NDVIs or 3D features reduces the necessity of intensity normalization for tree species identification, compared to using intensity features from a single channel. The possibility of reducing the necessity of intensity normalization by using features that are computed from two channels is a major advantage of multispectral lidar over mono-spectral lidar.
At least two limitations remain in our study and need further assessment. The first limitation concerns the exclusion of trees at large scan angles. For single trees, occlusions occurring at large scan angles can make it impossible to calculate all features for a given tree. The uneven point spacing at swath ends or at low return density (in this study, a single flight line) may also make it impossible to calculate all features, a concern that especially applies to smaller trees. Because this study was limited to trees above 10 m in height, the overall ability to identify species of smaller trees at large scan angles is worth considering in further studies, as errors are expected to increase. The second limitation concerns the progressive importance of occlusion effects and interception as scan angles become larger, which in turn influences the variability and the correlation of feature values with scan angle. These effects would also be increased by greater variations in topography [
30]. For large scan angles, occlusions that were incurred by taller trees or through self-occlusion cause lateral asymmetry of return distribution in the crown.
The data that were used in this study were registered with a maximum lateral scan angle of 15°, which resulted in a net scan angle up to 20°. Further studies would be needed to verify the effect of larger scan angles (e.g., up to 30°).
The methods that were used in this study isolated the individual flight lines, an approach that allows clear division of trees into scan angle classes, given that all pulses reaching a given tree had similar scan angles. In most cases, airborne lidar flight lines are laid out such that scanning swaths from at least two flight lines overlap. This reduces occlusions and mitigates scan angle effects. However, survey costs increase with the percentage of swath overlap, so a compromise must be sought between accuracy and costs.
5. Conclusions
The objective of the present study was to evaluate whether scan angle (up to a net scan angle of 20°) influences 3D and intensity feature values and whether this influence affects species classification accuracy. Quantifying the magnitude of this influence is important from an operational perspective, especially in choosing the best scanning parameters when species identification is considered. In the context of our study, we found that:
Scan angle had a small effect on 3D feature values. Yet, this influence differed depending upon tree species, with Picea abies and Larix laricina having the greatest number of features whose correlation with scan angle exceeded 0.2 in absolute value.
Scan angle had a greater effect on raw (non-normalized) intensity feature values (mostly for Picea abies, Picea glauca and Pinus sylvestris) than on normalized ones (for which only one species, Picea abies, was concerned). Yet, this effect almost disappeared when channel ratios or normalized difference vegetation indices (NDVIs) were used, these features being the least correlated with scan angle.
Classification accuracy did not vary with scan angle when 3D features and the best intensity features (the ones that were least affected by scan angle) were used together.
Despite theoretical expectations that scan angle might affect 3D or intensity features and, consequently, species identification accuracy, this study highlights the fact that the magnitude of the effect is small and can be mitigated. Mitigation can be accomplished by combining point clouds from multiple overlapping flight lines, by standardizing the intensity channel if only one exists, or by using ratios or normalized differences of intensity if at least two channels (wavelengths) are available. Furthermore, by selecting the 3D and intensity features that are least sensitive to scan angle, as identified in our study, the scan angle effect can be ignored. This potentially alleviates the necessity of applying complex methods that compensate for scan angle effects. If these precautions are not taken, errors in tree species identification could vary, to a certain extent, as a function of scan angle, thereby generating false spatial patterns of species distributions.
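The selection of features that are least sensitive to scan angle can be sketched as a simple correlation screen. The example below flags features whose absolute correlation with scan angle exceeds 0.2, the threshold mentioned in the conclusions above. The use of Pearson's r, the feature names and the synthetic values are assumptions made for illustration only and do not reproduce the data or the exact procedure of this study.

```python
import numpy as np

def sensitive_features(features, scan_angle, threshold=0.2):
    """Return {name: r} for features with |Pearson r| above the threshold."""
    flagged = {}
    for name, values in features.items():
        r = np.corrcoef(values, scan_angle)[0, 1]
        if abs(r) > threshold:
            flagged[name] = r
    return flagged

# Synthetic per-tree scan angles (degrees) and two hypothetical features.
scan_angle = np.arange(20, dtype=float)
noise = np.tile([1.0, -1.0], 10)  # alternating pattern, nearly unrelated to angle

features = {
    # Single-channel intensity that decreases with scan angle (assumed).
    "single_channel_mean": 100.0 - 2.0 * scan_angle + noise,
    # Channel ratio that does not depend on scan angle (assumed).
    "channel_ratio": 3.0 + 0.1 * noise,
}

flagged = sensitive_features(features, scan_angle)
retained = [name for name in features if name not in flagged]
```

In this synthetic case only the single-channel feature is flagged, and the channel ratio is retained for classification. In practice such a screen would be run per species, since the conclusions above indicate that sensitivity to scan angle differs among species.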