1. Introduction
Urban forests and greenspaces are increasingly considered an important priority for improving the sustainability, resilience, and livability of the urban landscape [1]. Trees in the urban forest provide many benefits such as air pollution reduction [2], storm water runoff attenuation [3], carbon sequestration [4], and building energy conservation [5]. Benefits generally increase as the size of trees increases [6], but as trees mature they are more likely to develop decay, which increases their likelihood of failure [7]. In built environments, tree failures can result in fatalities [8], power outages [9], and catastrophic fires [10], and damage from failures is associated with higher costs [11] and legal liability [12].
Arborists have assessed tree risk for many years. Recent revisions have brought the process into better alignment with risk assessment practices used in other disciplines. The current U.S. standard considers (1) the likelihood of a tree failure, (2) the likelihood of the impact of a tree or tree part on a target, and (3) the severity of the consequence if impact were to occur. Arborists assign one of four ratings regarding the likelihood of failure (improbable, possible, probable, or imminent) that are defined as follows [13]:
Improbable: The tree or tree part is not likely to fail during normal weather conditions and may not fail in extreme weather conditions within the specified time frame.
Possible: Failure may be expected in extreme weather conditions, but it is unlikely during normal weather conditions within the specified time frame.
Probable: Failure may be expected under normal weather conditions within the specified time frame.
Imminent: Failure has started or is most likely to occur in the near future, even if there is no significant wind or increased load. This is a rare occurrence for a risk assessor to encounter and may require immediate action to protect people from harm. The imminent category overrides the stated time frame.
Decay is a common defect that is often associated with tree failure [7, 14, 15]. Decay reduces load-bearing capacity by reducing wood strength and, if wood components are completely digested, by creating voids that reduce the cross-sectional area. Many tools and techniques to detect and assess the extent of decay have been developed. Some are simple (e.g., sounding the stem with a mallet), whereas others are sophisticated (e.g., resistance drills and tomography) [16]. Many studies have investigated how well decay detection tools and techniques work [17, 18, 19, 20, 21, 22, 23, 24].
Despite advancements in decay detection tools and techniques, many aspects of risk assessment remain uncertain because of the lack of knowledge about how trees grow and fail. Uncertainty may also be exacerbated by assessor bias, including an assessor’s personal risk tolerance [25]. Cognitive studies on human risk perception attribute an individual’s attitude towards risk to personal experiences [26, 27], personal fears [28], and biases shared by communities [29]. An assessor’s training also influences ratings: trained professionals tend to return lower likelihood of failure ratings (LoFRs) than those without training [25, 30].
Our objectives for this study were as follows:
To determine whether more detailed information about the extent of trunk decay influences experienced assessors’ LoFRs and, if so,
To identify factors related to assessors and trees that explain the influence.
2. Materials and Methods
The study took place on the campus of the University of Massachusetts in Amherst, Mass., USA (USDA Hardiness Zone 5b). In July 2021, 18 experienced arborists who held the International Society of Arboriculture’s (ISA) Tree Risk Assessment Qualification (TRAQ) (among other credentials) assessed the likelihood of stem failure due to decay of 30 trees using 5 (basic and advanced) assessment techniques.
We selected trees for the field assessment based on practical considerations. The first was the availability of sonic and electrical resistance (ER) tomograms of the trunk taken within 2 m of the ground. These tomograms had been previously obtained using a PiCUS Sonic Tomograph 3, a TreeTronic 3 for ER tomography (ERT), and the Caliper 3 Geometry Measurement System (Argus Electronic GmbH, Rostock, Germany) following the methods of [23]. A second consideration was variation in the compartmentalization response: weak (Pinus) and strong (Quercus). Finally, we selected only (i) larger individuals (>50 cm stem diameter measured 1.4 m above ground (“DBH”)) and (ii) individuals that were close enough to one another that they could be grouped by location. In the latter case, we selected individuals in six discrete clusters around the campus. We selected clusters of individuals for two reasons: (i) they included a variety of landscape settings (open space or near infrastructure such as roads, buildings, and parking lots); and (ii) they limited travel time to maximize the number of individuals that could be assessed in the two days when assessors visited campus. Prior to conducting the study, we pre-tested the methods and determined an efficient route to assess as many trees as possible in two days.
We recruited assessors from our professional networks, inviting only experienced assessors who (i) held the TRAQ credential, (ii) regularly performed risk assessments as part of their professional practice, and (iii) were familiar with advanced decay detection techniques such as resistance drilling and tomography. We offered continuing education units to assessors but offered neither financial compensation nor reimbursement of travel expenses.
Before assessors arrived on campus in July 2021 to participate in the study, we used a Resistograph® F500-S (IML North America, Moultonborough, NH, USA) to determine the thickness of sound wood (t) at between three and six locations spaced at approximately even intervals around the stem circumference and at the same height as the tomogram. For each location, we computed the t/R ratio, where R is the trunk radius [31]. We flagged the stem to indicate the locations of the tomography and Resistograph measurements (Figure 1).
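As a minimal illustration of the t/R calculation, the sketch below divides each measured sound wood thickness by the trunk radius. The function name and the measurement values are hypothetical and are not data from the study; the radius is assumed to be half the stem diameter at the drilling height.

```python
def t_over_r(sound_wood_cm, stem_diameter_cm):
    """Return the t/R ratio for each drilling location.

    sound_wood_cm: sound wood thickness (t) at each drilling location, in cm.
    stem_diameter_cm: stem diameter at the drilling height, in cm.
    """
    if stem_diameter_cm <= 0:
        raise ValueError("stem diameter must be positive")
    radius_cm = stem_diameter_cm / 2.0  # assumption: R is half the diameter
    return [t / radius_cm for t in sound_wood_cm]


# Hypothetical measurements at four drilling locations on an 80 cm stem.
thickness_cm = [12.0, 18.0, 24.0, 30.0]
ratios = t_over_r(thickness_cm, stem_diameter_cm=80.0)
min_ratio = min(ratios)                          # most decayed location
mean_t = sum(thickness_cm) / len(thickness_cm)   # mean sound wood thickness
```

Reporting both the minimum t/R and the mean t mirrors the kind of summary values quoted for individual trees later in the paper.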
We provided each assessor a binder that included a sheet for each tree. The sheet contained the following information: genus and species, the DBH, height, the Resistograph output (Figure 2), and the sonic and ERT tomograms (Figure 3). Output from the Resistograph included a scaled diagram of the cross-section of the stem and lines indicating where the drillings were made, the height and stem diameter where the drillings were made, the mean t value, and a table of all t/R values. The tomograms included the percentage of the cross-sectional area that was sound or decayed. The decayed proportion of the cross-section was computed automatically from the combined areas of blue and purple in the sonic tomogram. Since we used the default settings (SoT1 calculation option and minimum velocity established at 50%), the resulting tomogram depicts the greatest possible area of decay in comparison to those generated using SoT2 and an expanded color space to view the minimum percent velocities. However, the computed proportion of decayed wood indicated at the top of the tomogram that assessors viewed during the study (e.g., Figure 3) did not include areas of intermediate velocities. We explained this to the assessors prior to the field study. After the field study, we computed the loss in section modulus due to decay (Z_LOSS) from each sonic tomogram following the method of [32].
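For intuition only, and not as a restatement of the method of [32], consider the idealized case of a circular stem of diameter D = 2R with a concentric circular hollow of diameter d that leaves a uniform sound shell of thickness t. Under those assumptions (the symbols D and d are introduced here for illustration), the fractional loss in section modulus depends on t/R alone:

```latex
\begin{align*}
Z_{\text{solid}} &= \frac{\pi D^{3}}{32}, \qquad
Z_{\text{hollow}} = \frac{\pi\,(D^{4} - d^{4})}{32\,D}, \qquad d = D - 2t,\\
Z_{\text{LOSS}} &= 1 - \frac{Z_{\text{hollow}}}{Z_{\text{solid}}}
  = \left(\frac{d}{D}\right)^{4}
  = \left(1 - \frac{t}{R}\right)^{4}.
\end{align*}
```

For example, t/R = 0.5 gives a loss of 0.5^4, or about 6%. Real decay columns are rarely concentric or circular, so this idealization indicates only the order of magnitude.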
We instructed assessors to assign a rating of the likelihood of stem failure due to decay (“LoFR”) within 2 m of the ground and reminded them not to assess the likelihood of failure of other parts of the tree. We used LoFRs from [
13] and provided assessors with the definitions (listed in the Introduction). We instructed assessors to assign their LoFR based on a timeframe of three years.
The assessors performed five consecutive assessments of the LoFR. In order, the assessments were as follows:
(a) Performing a visual assessment of the tree and its surroundings;
(b) Sounding the trunk with a plastic mallet;
(c) Viewing the Resistograph output (Figure 2);
(d) Viewing the tomograms (Figure 3);
(e) Consulting with a randomly assigned assessor.
Assessment techniques (a) and (b) are part of the Level 2 (“basic”) risk assessment [
13]. Assessment techniques (c) and (d) are more sophisticated techniques to assess the amount and location (i.e., the “extent”) of decay and are part of the Level 3 (“advanced”) risk assessment [
13]. For odd-numbered trees, assessors viewed the resistance drilling output (c) before viewing the tomogram (d); for even-numbered trees, assessors viewed the tomogram first. Consulting with a peer is not explicitly recommended in common professional guidelines [
13,
33]. Within each cluster of trees, assessors were randomly paired and inspected individual trees at their own pace.
After each of the five assessments ((a)–(e)) on a tree, the assessors completed a survey to indicate their LoFR, describe the factor(s) (e.g., species, decay severity, tree size, exposure, lean, crown, etc.) that most influenced their LoFR, and state whether the additional information gained in the assessment technique changed their LoFR.
Assessors also self-reported the following information on the survey: years of experience performing tree risk assessments; number of trees assessed annually; relevant credentials in addition to the TRAQ; and how frequently they use assessment techniques (b), (c), and (d) as part of their professional practice.
During the field study, not every assessor completed all five assessments of every tree. As a result, approximately 15% of the expected values were missing. We used multivariate imputation by chained equations [34, 35] to impute the most likely value for each missing value and obtain a full dataset prior to the ordinal logistic regression analyses.
The university campus is well maintained, and no assessor assigned an LoFR of four (“imminent”) to any tree. Consequently, we coded the LoFRs ordinally as one (“improbable”), two (“possible”), or three (“probable”) and built ordinal logistic regression (OLR) models to investigate the effect of assessment technique on the LoFR. All analyses were performed using the statistical language R, v4.1.2 [36]. In the OLR models, we included covariates describing trees (genus; DBH; percent of cross-sectional area with decay (from tomograms); average sound wood thickness (t) from the Resistograph output; t/R, where R is the stem radius; Z_LOSS) and participants (years of experience; frequency of using a mallet, resistance drilling, and tomography when conducting risk assessments). We also included tree and assessor identification as random effects in each OLR model. We built models with the “clmm” function from the “ordinal” package by iteratively adding covariates as single effects or interactions with the main effect of the assessment technique [37]. Since the order of assessments differed between even- (viewed tomogram before Resistograph output) and odd-numbered (viewed Resistograph output before tomogram) trees, the variable “assessment technique” contained ten levels that represented an interaction between the five assessment techniques and even- or odd-numbered trees. We then selected the best model as the one with the lowest AICc score.
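Model selection by AICc can be sketched as follows. The correction term is the standard small-sample adjustment to AIC; the candidate model names, log-likelihoods, and parameter counts are hypothetical placeholders rather than results from the study, and the observation count assumes a complete 30 × 18 × 5 dataset.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected Akaike information criterion (AICc)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("need n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


# Assumption: 30 trees x 18 assessors x 5 assessments after imputation.
n_obs = 30 * 18 * 5

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "technique only": (-2100.0, 13),
    "technique + percent decay": (-2050.0, 14),
    "technique x percent decay": (-2040.0, 23),
}

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest AICc is preferred
```

With a few thousand observations the correction term is small, so AICc and AIC would rank these hypothetical candidates the same way; the correction matters more when the number of parameters approaches the number of observations.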
In addition to the OLR analyses, we created a contingency table with four rows (one for each of the assessment techniques that followed the initial visual assessment) and two columns (to indicate whether the additional information gained from the assessment technique changed (“Yes”) or did not change (“No”) assessors’ LoFRs). We used a χ² test to determine whether the proportion of affirmative and negative responses varied among assessment techniques.
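A comparison of this kind can be sketched with a Pearson chi-square statistic computed directly from the table. The sketch assumes a chi-square test of independence, and the counts are hypothetical, not the study's data.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of technique and response.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: mallet, Resistograph, tomogram, consultation; columns: Yes, No.
# Hypothetical counts for illustration only.
counts = [
    [60, 480],
    [150, 390],
    [200, 340],
    [120, 420],
]
stat, dof = chi_square_statistic(counts)
```

The statistic would then be compared against a chi-square distribution with `dof` degrees of freedom (three for a 4 × 2 table) to obtain a p-value.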
Lastly, we investigated the influence of the random variables in the OLR model (assessor and tree) on the LoFR. To investigate the influence of assessors, we evaluated whether the consistency in assessor LoFRs changed among the five assessment techniques or among the four frequency-of-use categories of the tomogram or Resistograph. We quantified LoFR consistency with the “betadisper” function in the “vegan” package, which performed a multivariate test of homogeneity of variances on a Bray–Curtis (rank-based) dissimilarity matrix of the proportional distribution of LoFRs [38]. A multivariate approach was needed to evaluate inconsistencies in LoFRs with a single test.
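To make the dissimilarity measure concrete, the sketch below computes the Bray–Curtis dissimilarity between two proportional distributions of LoFRs. It illustrates only the distance measure, not the betadisper procedure itself, and the proportions are hypothetical.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two non-negative vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(p, q))
    denominator = sum(a + b for a, b in zip(p, q))
    if denominator == 0:
        raise ValueError("at least one entry must be positive")
    return numerator / denominator


# Hypothetical proportions of improbable, possible, and probable LoFRs.
visual = [0.50, 0.40, 0.10]
tomogram = [0.30, 0.40, 0.30]
dissimilarity = bray_curtis(visual, tomogram)
```

The value is 0 for identical distributions and 1 for distributions with no ratings in common.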
To investigate the influence of trees between the initial visual assessment and each subsequent assessment technique, we computed the ratio of the weighted mean change in the LoFR to the proportion of unchanged LoFRs for each tree. The ratio illustrated the frequency, magnitude, and direction of changes in the LoFRs from the initial visual assessment. We computed the ratio ($\rho_{ij}$) as follows:
Compute the difference in the LoFR from the initial visual LoFR:
$$\Delta_{ijk} = \mathrm{LoFR}_{ijk} - \mathrm{LoFR}_{Vjk},$$
where $i$, $j$, and $k$ are indices for the 4 assessment techniques following the initial visual assessment (indicated by the subscript $V$), the 30 trees, and the 18 assessors, respectively.
Compute the proportion of unchanged LoFRs (i.e., $\Delta_{ijk} = 0$) for each tree and assessment technique:
$$p_{ij} = \frac{1}{18} \sum_{k=1}^{18} \mathbf{1}\left[\Delta_{ijk} = 0\right].$$
Compute the weighted mean change in the LoFR:
$$\bar{\Delta}_{ij} = \frac{\sum_{k} w_{ijk}\, \Delta_{ijk}}{\sum_{k} w_{ijk}},$$
where $w_{ijk}$ is a weighting factor of 1 (for LoFRs that changed one level from the initial visual assessment, e.g., from probable to possible or improbable to possible) or 2 (for LoFRs that changed two levels, e.g., from probable to improbable).
For each tree and assessment technique,
$$\rho_{ij} = \frac{\bar{\Delta}_{ij}}{p_{ij}}.$$
We thus computed 30 values of $\rho_{ij}$ for each of the 4 assessment techniques that followed the initial visual assessment. From the resulting distribution of 120 values of $\rho_{ij}$, we considered only values in the upper and lower quartiles as having an increased and decreased LoFR, respectively. We considered values of $\rho_{ij}$ within the interquartile range (IQR) as having the same LoFR as the initial visual assessment. In the rest of the paper, we refer to “increased”, “decreased”, or “unchanged” LoFRs rather than values of $\rho_{ij}$ in the upper quartile, lower quartile, and IQR, respectively.
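The ratio for a single tree and assessment technique can be sketched as below. Two points are assumptions rather than details given in the text: the weighted mean is normalized by the sum of the weights over the changed ratings, and the ratio is treated as undefined (returned as `None`) when no rating is unchanged. The function name and the example ratings are hypothetical.

```python
def lofr_change_ratio(visual, later):
    """Ratio of the weighted mean LoFR change to the proportion unchanged.

    visual, later: LoFRs (coded 1-3) from the same assessors, in the same
    order, for one tree and one assessment technique.
    """
    if len(visual) != len(later) or not visual:
        raise ValueError("need two equal-length, non-empty rating lists")
    deltas = [b - a for a, b in zip(visual, later)]
    prop_unchanged = sum(1 for d in deltas if d == 0) / len(deltas)
    changed = [d for d in deltas if d != 0]
    # Weight is 1 for a one-level change and 2 for a two-level change.
    weights = [abs(d) for d in changed]
    # ASSUMPTION: the weighted mean is normalized by the sum of the weights.
    if changed:
        weighted_mean = sum(w * d for w, d in zip(weights, changed)) / sum(weights)
    else:
        weighted_mean = 0.0
    if prop_unchanged == 0:
        return None  # ASSUMPTION: undefined when every rating changed
    return weighted_mean / prop_unchanged


# Hypothetical ratings from six assessors for one tree.
ratio_up = lofr_change_ratio([2, 2, 2, 3, 1, 2], [3, 2, 2, 3, 2, 3])
# Hypothetical ratings from four assessors, including a two-level decrease.
ratio_down = lofr_change_ratio([3, 3, 2, 2], [1, 3, 2, 1])
```

Positive values indicate that LoFRs tended to increase relative to the visual assessment and negative values that they tended to decrease; the magnitude grows as fewer assessors keep their initial rating.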
We described the basic assessment techniques as “consistent” if the LoFR assigned in the mallet assessment was unchanged from the initial visual assessment, and “inconsistent” if the LoFR assigned in the mallet assessment was greater or less than in the initial visual assessment. We described the advanced assessment techniques as consistent if the change in the LoFR from the initial visual assessment was the same for both advanced assessment techniques. We described the advanced assessment techniques as inconsistent if the change in the LoFR from the initial visual assessment was not the same for both advanced assessment techniques. With respect to changes in LoFRs from the initial visual assessment, we described the effect of the consultation assessment as “confirming” (or not) the basic and advanced assessments. If the LoFR assigned in the mallet and consultation assessments was unchanged from the initial visual assessment, the consultation assessment confirmed the basic assessment techniques. Similarly, if the LoFR was greater than or less than the initial visual assessment for both advanced assessment techniques and the consultation assessment, the consultation confirmed the advanced assessment techniques.
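These definitions can be expressed as a small classification rule. In the sketch below, each argument is the direction of change from the initial visual assessment for one tree, coded as -1 (decreased), 0 (unchanged), or +1 (increased); the function name, the coding, and the dictionary keys are hypothetical conveniences, not part of the study's analysis.

```python
def classify_tree(mallet, resistograph, tomogram, consultation):
    """Apply the consistency and confirmation definitions to one tree.

    Each argument is -1, 0, or +1: the direction of the LoFR change from
    the initial visual assessment after the named assessment.
    """
    return {
        # Basic techniques agree when sounding did not change the LoFR.
        "basic_consistent": mallet == 0,
        # Advanced techniques agree when both changed the LoFR the same way.
        "advanced_consistent": resistograph == tomogram,
        # Consultation confirms the basic assessment when both are unchanged.
        "consultation_confirms_basic": mallet == 0 and consultation == 0,
        # Consultation confirms the advanced assessment when all three moved
        # in the same non-zero direction.
        "consultation_confirms_advanced": (
            resistograph == tomogram == consultation and consultation != 0
        ),
    }


# Hypothetical tree: unchanged after sounding, increased after both advanced
# techniques and after consultation.
example = classify_tree(mallet=0, resistograph=1, tomogram=1, consultation=1)
```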
4. Discussion
Our results demonstrate that detailed information about the extent of trunk decay influenced experienced TRAQ-credentialed assessors’ LoFRs, but neither consistently nor in a straightforward way. The effect was most noticeable in greater LoFRs assigned following the tomogram assessment. However, covariates related to trees (percent of decay and t) and assessors (frequency of using resistance drilling tools for risk assessments) led to significant interactions with the assessment technique, indicating the need for a more nuanced interpretation. A larger sample of assessors may have improved our understanding of their effect on LoFRs. It is also important to note that we could not confirm any of the LoFRs as “correct” because none of the trees failed in the interval between when the assessors assigned LoFRs (July 2021) and the publication of this manuscript (May 2023).
Because the Resistograph output and tomogram helped assessors visualize the extent of decay, we expected that the advanced assessment techniques would influence LoFRs—particularly for assessors who use advanced techniques less frequently. The influence was obvious in the changing proportions of LoFRs as the percent of decay changed, but only after assessors viewed the Resistograph output and tomogram. The pattern persisted following the consultation assessment, further supporting the idea that visualizing decay affected assessors’ LoFRs. However, the overall trend did not apply to every tree. Our observation that the consultation assessment confirmed the basic assessment nearly as often as the advanced assessment was the result of greater LoFRs assigned following the tomogram assessment.
We speculate that the significant increase in the LoFR following the tomogram assessment was due, in part, to the visual presentation of tomograms themselves. Our choice of the default (and more liberal) SoT1 calculation with a minimum velocity set at 50% created tomograms with the largest area of decay. Assessors who often use tomography for risk assessments would more likely have understood that the tomograms may have overestimated the extent of decay using the default calculation, whereas assessors who only rarely use tomography may have been more inclined to increase the LoFR they assigned, as our findings suggest. Especially on stems of a larger diameter and less regular shape, it is imperative that assessors are familiar with the uncertainty associated with interpreting tomograms [
39].
It is also plausible that the complete and in-color view of decayed areas in tomograms may have been perceived as a more definitive depiction of decay, especially for assessors who use tomography less frequently. For instance, the number of holes drilled for the Resistograph may not have been adequate to precisely define the extent of decay, which could, in turn, result in assessors experiencing greater uncertainty in how to interpret the Resistograph outputs. The Resistograph outputs also were truncated and did not traverse the entire diameter. In contrast, the tomograms presumably presented more visually compelling cross-sectional images than the black and white line drawings of the Resistograph output. For example, the extent of decay presented in the Resistograph outputs and tomograms was similar for trees 22 and 27 (
Figure 8), but the change in the LoFR from the visual assessment was greater only after viewing the tomograms. Without comparing the tomograms and the outputs from the Resistograph to pictures of the cross-sections themselves, it was not possible to know which portrayal of internal decay was more accurate. Many studies have demonstrated the accuracy and limitations of each technique [18, 19, 21, 22, 23, 24], which is why using both techniques to investigate the extent of decay is helpful [40].
For assessments that followed the initial visual assessment, the decreasing proportion of probable LoFRs assigned by assessors who more frequently use a resistance drilling tool in practice was intuitive. With visual assessments, however, the trend was inverted: the proportion of probable LoFRs increased with assessors who more frequently use a resistance drilling tool. It was not clear why this occurred. It may reflect assessors being accustomed to using simple and advanced tools to detect decay rather than focusing on a tree’s outward visual appearance. However, previous studies have found for several species that the visual assessment of a tree’s appearance often aligns with the extent of internal decay [
14,
15,
17].
Statistically significant differences, however, do not imply that the trends applied to all trees, assessors, and techniques. Trees 3 and 4 (Figure 9) highlighted both the advantage of using more than one technique to assess the likelihood of failure due to stem decay and the challenges of individual assessment techniques. Both trees were Q. bicolor with nearly identical DBHs; they were in the same location and presumably exposed to the same wind loads. Both trees also showed signs of past lightning strikes, with wound wood formed around the lightning damage, and superficial trunk decay. Their tomograms showed nearly identical percentages of sound wood (86% and 87%), but with areas of green indicating intermediate velocities and the possibility of decay. The Resistograph output for tree 3 (t/R ≥ 0.59, average t = 30 cm) aligned neatly with the tomogram, confirming (at least for an assessor who appreciates the nuanced interpretation of green areas using the SoT1 setting) that the extent and severity of decay were minimal. However, the Resistograph output for tree 4 (minimum t/R = 0.22, average t = 18 cm) contradicted the tomogram: the extent and severity of decay presented more of a concern. The detailed description of each tree was reflected in the changes in LoFRs: LoFRs assigned following the Resistograph and tomogram assessments decreased compared to the initial visual assessment of tree 3, but the LoFRs of tree 4 decreased compared to the initial visual assessment only following the tomogram assessment.
Individual trees also illustrated the limitations of using simple tools and techniques. Trees 14 and 26 (both P. strobus) thwarted assessors’ attempts to assess the extent of decay by sounding the trunk with a mallet, even though all but one assessor “often” sound trunks in practice. Following the mallet assessment, the LoFR of each tree increased from the visual assessment. Assessors described the trunk as sounding hollow, but the advanced techniques revealed little decay. There were only five P. strobus in the study; that two were problematic suggests that sounding with a mallet may not be reliable for some species. Future studies should investigate this technique’s reliability.
Previous studies have shown that risk assessments are prone to bias related to an assessor’s training, experience, and perceptions of risk [
25,
41,
42,
43]. To manage subjectivity, clear definitions of categories in a risk matrix (e.g., the four LoFRs in [
13]) [
44] and sufficient training to calibrate assessors [
45] are imperative. Yet, despite assessors (i) holding the TRAQ credential (which requires continual training to obtain and maintain), and (ii) receiving more information about the extent of decay through five successive assessments of stem decay, some variation among their LoFRs persisted. For most trees and all assessment techniques, assessors assigned two or three LoFRs, and the non-significant beta dispersion test demonstrated that obtaining more information about the extent of decay did not reduce assessors’ variation, aligning with the findings of [
42]. None of the covariates that described assessors’ experience adequately explained this finding. We speculate that this reflects the innate imprecision of assessing the likelihood of failure. The persistent variation in LoFRs in our study and [
42] may not be as problematic as one might suppose because studies have shown that assigned LoFRs were broadly consistent with the measured likelihood of failure following storms [
46,
47].
Another advanced technique to assess the likelihood of failure is the static pulling test [
48]. Unfortunately, we were not able to include the pulling test in the experiment because of travel restrictions imposed by the COVID-19 pandemic.