3. Results
3.1. Sociodemographic Characteristics of the Participants
Table 1 presents information on the family caregivers. For the total sample, the average age of the family caregivers was 32 years and they had 1–3 children. Most were women (82.5%), had a primary/secondary education (63%) and self-identified as Catholics (80.9%).
All of the patients were being treated in the hematology-oncology department, and acute lymphoblastic leukemia was the most frequent type of cancer. For the total sample, the average age of the children was 5.97 years (SD = 5.07), ranging from one to seven years. On average, they had been hospitalized for 1.71 months (SD = 1.23), and 3.52 years (SD = 2) had elapsed since their diagnosis. The sex ratio was equal (exact probability of the two-tailed binomial test: p = 0.394).
3.2. Testing the Correlated-Factor Models Originally Proposed for the BAI in the Total Sample
Table 2 shows the GFIs for the two-, three- and four-factor models originally proposed for the BAI as well as the statistics of convergent validity for each factor (AVE and ω) and discriminant validity between factors (r²). All three models showed a poor fit to the data and clear problems in discriminant validity between factors (r²_F1,F2 > AVE_F1 and AVE_F2). The data seemed to suggest a single-factor model but the single-factor model with 21 indicators also showed a poor fit. Consequently, an exploratory methodology was adopted. As a first step, the properties of the items were analyzed using the Rasch model.
3.3. Analysis of the Discriminability, Reliability and Normality of the Items in the Total Sample
A frequency analysis revealed that approximately 80% of the questionnaire responders selected the first two answers (1 = little or none; 2 = more or less) for all of the items, so their distributions showed positive asymmetry. The null hypothesis of univariate normality was rejected in all cases according to the Kolmogorov-Smirnov test with the Lilliefors correction.
Next, the multicollinearity of the data set was examined through the tolerance limits of the 21 items when predicting a random variable. The results indicated that item 12 (“hands trembling”) and item 13 (“shaky/unsteady”) presented excessive multicollinearity (tolerance <0.25), so they were removed from the subsequent analyses.
Table 3 presents the results of the Rasch analysis. Items 20 (“face flushed”), 16 (“fear of dying”), 1 (“numbness or tingling”), 18 (“indigestion”) and 4 (“unable to relax”) had item discrimination indexes below or close to 0.80 (<0.90 indicates weak discriminability). Except for item 4 (infit = 0.94 and outfit = 1.03), their infit and outfit indexes were also close to or above the poor fit value (1.3). These five items also had the lowest point-biserial correlation values. Due to their poor discriminability and reliability, they were removed from subsequent analyses.
Once the two items with tolerance values <0.25 and the five items that the Rasch analysis indicated had discriminability and reliability problems within the scale were eliminated, atypical multivariate cases were identified using the Mahalanobis distance measure. A total of 31 of the 445 cases had Mahalanobis d² distances greater than 36.1 (corresponding to the 0.999 quantile in a chi-square distribution with 14 degrees of freedom) and were withdrawn. The normality of the items was re-evaluated without these atypical cases but the Kolmogorov-Smirnov test with the Lilliefors correction again indicated that the item scores did not follow a normal distribution. Because removal of the multivariate atypical cases did not restore univariate normality, these cases were ultimately retained. Following the recommendations of Tabachnick and Fidell [59] for data with strong positive asymmetry, an inverse transformation was applied to items in the total sample of 445 participants: Y = 1/(1 + X). The result revealed a monotone-increasing transformation; therefore, a higher score in the item continued to indicate a higher level of anxiety. This procedure enabled the distributions to become more symmetrical bell curves, so subsequent procedures were carried out with these transformed scores.
3.4. Random Division of the Total Sample into Two Subsamples
The total sample of 445 participants was randomly divided into two independent subsamples: 224 participants to explore the factorial structure and 221 participants to test the models derived from the exploratory analysis. In this way, the recommendation for the sequence of exploration and confirmation of the factor model was followed. No significant differences in the sociodemographic characteristics of the family caregivers were found between the two subsamples (Table 1). There were also no statistically significant differences in the two sociodemographic variables (t(443) = −0.457, p = 0.648 for age and χ²(1) = 1.20, p = 0.272 for sex) or the two clinical variables (t(443) = 0.496, p = 0.620 for time of hospitalization and t(443) = −0.675, p = 0.500 for time since diagnosis) of the children. Therefore, the random division of the total sample into two subsamples did not generate biases in sociodemographic or clinical characteristics.
3.5. EFA of Subsample 1
To define the maximum number of factors, a parallel analysis was conducted. Two observed eigenvalues were greater than the random eigenvalues, so two factors were to be extracted according to this analysis. Extracting the factors by principal axis factoring and rotating the factor loadings matrix using the promax method revealed that items 14 (“fear of losing control”) and 19 (“faint/lightheaded”) had factor loadings <0.40, so they were removed from subsequent analyses. Horn’s parallel analysis was repeated, again resulting in two empirical eigenvalues greater than the random eigenvalues. Therefore, the number of factors to be extracted with 12 items was also two.
When two factors were extracted again, 49.9% of the total variance was explained. After the oblique rotation, all items had factor loadings greater than 0.50 in the pattern matrix and greater than 0.60 in the structure matrix. The first factor was composed of seven physiological symptoms that involve the cardiovascular system (e.g., difficulty breathing, feeling of choking, heart pounding/racing) and the circulatory system (e.g., dizzy or lightheaded, wobbliness in legs). The second factor consisted of five symptoms of emotional problems, such as negative expectations regarding events (e.g., fear of the worst happening, unsteady), as well as intense apprehension (e.g., scared, nervous, terrified or afraid). The first factor was called “physiological symptoms,” and the second was called “emotional symptoms” (Table 4).
Both factors showed good internal consistency (α > 0.80), as did the total scale (α = 0.893). The shared variance between the two factors was 41.7%, which was lower than the AVE of the first factor (47.9%) and the second factor (49.5%); as such, the factors showed discriminant validity. Considering that the AVE values for both factors were close to 0.50 and that the composite reliability values were good (ω > 0.80), both factors were considered to have convergent validity.
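The convergent and discriminant validity checks described above can be sketched as follows. This is a minimal illustration that assumes the conventional definitions (AVE as the mean squared standardized loading, ω as composite reliability for a factor with uncorrelated residuals, and the Fornell-Larcker comparison of shared variance against each AVE); the function names and the example loadings are hypothetical and are not taken from Table 4.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability (omega), assuming uncorrelated residuals."""
    total = sum(loadings) ** 2
    residual = sum(1 - l * l for l in loadings)
    return total / (total + residual)


def has_discriminant_validity(shared_variance, ave_1, ave_2):
    """Fornell-Larcker check: shared variance (r squared) below both AVEs."""
    return shared_variance < ave_1 and shared_variance < ave_2


# Hypothetical standardized loadings for a five-item factor (illustration only).
example_loadings = [0.70, 0.70, 0.70, 0.70, 0.70]
example_ave = ave(example_loadings)
example_omega = composite_reliability(example_loadings)

# Values reported in the text for the EFA solution:
# shared variance 41.7%, AVE 47.9% and 49.5%.
efa_discriminant = has_discriminant_validity(0.417, 0.479, 0.495)
```

With the reported EFA values the check passes, because 0.417 is lower than both 0.479 and 0.495.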
3.6. CFA of Subsample 2
As a first step, the absence of multicollinearity was tested for the 221 participants in subsample 2. The results indicated no multicollinearity problems because all items had tolerance values ≥0.55. Two atypical cases with multivariate discrepancies were identified by measuring the Mahalanobis distance (d² > 32.91, the 0.999 quantile of a chi-square distribution with 12 degrees of freedom) and were therefore withdrawn. Once these two cases were eliminated, the standardized value of Mardia’s multivariate kurtosis was 4.564, which indicates a slight deviation from multivariate normality, being less than 10 and close to 2.
Based on the EFA results, a correlated two-factor model was specified. Estimating the model parameters revealed that the residual for item 2 (“feeling hot”) had high correlations with the residuals of two items of the same factor (item 21 [“hot/cold sweats”] and item 3 [“dizzy or lightheaded”]) but item 2 had the lowest factor loading (β = 0.46), whereas the measurement weights were greater than 0.50 for the other items of the physiological symptoms factor. It was decided that item 2 would be eliminated after verifying that its removal improved the data fit and did not alter the factor’s conceptual content or measurement characteristics, such as internal consistency, which remained good (α = 0.851). Furthermore, the measurement residuals for items 11 (“feeling of choking”) and 15 (“difficulty breathing”) were correlated. Due to the content of both items and their high factor loadings, they were retained in the model and it was decided that a parameter would be freed by specifying the covariance between their residuals.
Figure 1 presents the final model. The solution was admissible and all parameters were significant. Its goodness of fit was good (χ²(42) = 51.818, p = 0.143, χ²/df = 1.234, GFI = 0.959, CFI = 0.987, NFI = 0.936, RMSEA = 0.033, 90% CI (0, 0.059) and SRMR = 0.038), its parsimony was high (PR = 0.764) and its parsimony fit indexes were acceptable (PCFI = 0.754, PNFI = 0.715 and PGFI = 0.610). The emotional symptoms factor showed convergent validity (AVE = 0.508 and ω = 0.840). The physiological symptoms factor showed poor convergent validity (AVE = 0.332 and ω = 0.746). In turn, the shared variance between the two factors was very high (65%) and was higher than the AVE of each factor (33.2% for physiological symptoms and 50.8% for emotional symptoms); as such, the factors lacked discriminant validity.
The lack of discriminant validity, and of convergent validity in the physiological symptoms factor, was also present when item 2 was included (r² = 0.607 > AVE = 0.328 and ω = 0.769 for physiological symptoms; r² = 0.607 > AVE = 0.508 and ω = 0.837 for emotional symptoms); moreover, that model achieved only acceptable goodness of fit: χ²(53) = 101.396, p < 0.001, χ²/df = 1.913, GFI = 0.926, CFI = 0.941, NFI = 0.885, RMSEA = 0.065, 90% CI (0.045, 0.084) and SRMR = 0.052.
Due to these problems in the model with two correlated factors, two other alternatives were pursued. The first alternative was a single-factor model with a covariance between the measurement residuals of items 11 and 15. In this model, all of the parameters were significant. Their measurement weights ranged from 0.40 to 0.74 (Figure 2). The AVE was 40.5% and the composite reliability was good (ω = 0.870). Its parsimony was high (PR = 0.782) and its parsimony fit indexes were acceptable: PNFI = 0.703, PCFI = 0.742 and PGFI = 0.607. Although goodness of fit was rejected by the chi-square test (χ²(43) = 81.313, p < 0.001), two indexes demonstrated a good fit (χ²/df = 1.891 and RMSEA = 0.064, 90% CI (0.042, 0.085), p-close = 0.135 for H₀: RMSEA = 0.05) and the four remaining indexes showed acceptable goodness-of-fit values (GFI = 0.932, NFI = 0.900, CFI = 0.949 and SRMR = 0.052).
The second alternative was a bifactor model (Figure 3). In this model, all the residuals were independent, as were the two SFs. The solution was admissible. The goodness of fit was good (χ²(33) = 42.136, p = 0.132, χ²/df = 1.277, GFI = 0.966, CFI = 0.988, NFI = 0.948, RMSEA = 0.036, 90% CI (0, 0.065) and SRMR = 0.031) and the model was equivalent in fit to the model with the two correlated factors: Δχ²(Δdf = 9) = 9.682, p = 0.377, Δχ²/Δdf = 1.076 < 2, ΔGFI = 0.007, ΔCFI = 0.001 < 0.01 and ΔNFI = 0.012 ≈ 0.01. The model’s parsimony was moderate (PR = 0.600), which resulted in poor parsimony fit indexes (PNFI = 0.569 and PCFI = 0.593 < 0.60 and PGFI = 0.483 < 0.50).
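The parsimony figures reported for the two-factor and bifactor models can be checked with a short calculation. The sketch below assumes the conventional definitions (PR as the model degrees of freedom over those of the independence model, PCFI = PR × CFI, PNFI = PR × NFI, and PGFI as GFI scaled by the model degrees of freedom over the number of distinct sample moments); the function names are illustrative, and the source does not state which software or formulas produced its values.

```python
def parsimony_indexes(df_model, n_items, cfi, nfi, gfi):
    """Return PR, PCFI, PNFI and PGFI under the conventional definitions."""
    df_null = n_items * (n_items - 1) // 2    # df of the independence model
    n_moments = n_items * (n_items + 1) // 2  # distinct variances and covariances
    pr = df_model / df_null
    return {
        "PR": pr,
        "PCFI": pr * cfi,
        "PNFI": pr * nfi,
        "PGFI": (df_model / n_moments) * gfi,
    }


# Fit values reported in the text for the 11-item models.
two_factor = parsimony_indexes(42, 11, cfi=0.987, nfi=0.936, gfi=0.959)
bifactor = parsimony_indexes(33, 11, cfi=0.988, nfi=0.948, gfi=0.966)

# Chi-square difference between the nested models.
delta_chi2 = 51.818 - 42.136
delta_df = 42 - 33
ratio = delta_chi2 / delta_df

for name, result in (("two-factor", two_factor), ("bifactor", bifactor)):
    print(name, {k: round(v, 3) for k, v in result.items()})
print("delta chi2 =", round(delta_chi2, 3), "delta df =", delta_df)
```

Under these assumptions the calculation agrees with the reported values to three decimals, including a PCFI of 0.593 for the bifactor model.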
The bifactor model showed convergent validity for items 5, 8, 9, 10 and 17 (AVE = 0.516 > 0.50 and ω = 0.840 > 0.70). The SF of emotional symptoms (significant measurement weights of 0.339 to 0.543, AVE_FE = 0.204, close to 0.25, and ωh_FE = 0.328 and ECV_FE = 0.394, which are between 0.30 and 0.70) and the GF of anxiety (significant weights of 0.485 to 0.612, AVE_FG = 0.313 > 0.25 and ωh_FG = 0.512 and ECV_FG = 0.606, which are between 0.30 and 0.70) both showed substantive contributions, although the GF explained more variance in these five items than did the SF. The GF of anxiety demonstrated construct reliability (H_FG = 0.699 ≈ 0.70) but this indicator was well below 0.70 for the SF (H_FE = 0.568).
The AVE was slightly higher than 0.40 for items 3, 6, 7, 11, 15 and 21 (AVE = 0.407) and its composite reliability was merely acceptable (ω = 0.794 > 0.70); as such, its convergent validity was poor. Almost all the variance in these six items was explained by the GF of anxiety (significant measurement weights of 0.466 to 0.709, AVE_FG = 0.353 > 0.25 and ECV_FG = 0.868 and ωh_FG = 0.723 > 0.70); this factor showed construct reliability (H_FG = 0.777). The SF of physiological symptoms did not have a substantive contribution (nonsignificant measurement weights for five of the six items, AVE_FE = 0.054 < 0.25, ECV_FE = 0.132 and ωh_FE = 0.071 < 0.30); it also lacked construct reliability (H_FE = 0.267).
For the set of 11 items, the bifactor model demonstrated convergent validity by the composite reliability coefficient (ω = 0.895 > 0.70) and the AVE was greater than 0.45 (AVE = 0.457). The GF showed construct reliability (H_FG = 0.853 > 0.70) and the percentage of uncontaminated correlations was less than 0.60 (PUC = 0.545). However, the contribution of the GF of anxiety was excessive (AVE_FG = 0.335 > 0.25, ωh_FG = 0.700 and ECV_FG = 0.733 ≥ 0.70). Therefore, the two SFs ultimately did not make substantive contributions (AVE_FEs = 0.122 < 0.25, ωh_FEs = 0.195 and ECV_FEs = 0.267 < 0.30) and lacked construct reliability (H_FEs = 0.626 < 0.70). Consequently, CFA showed that the single-factor model was the most suitable for modeling the interrelations of the 11 selected items.
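The PUC value reported above follows directly from the item allocation. As a minimal sketch, assuming the usual definition (the share of item correlations that involve items from different specific factors) and a hypothetical helper name, the five emotional and six physiological items give 30 of 55 correlations:

```python
from math import comb


def puc(group_sizes):
    """Proportion of item pairs whose items load on different specific factors."""
    n_items = sum(group_sizes)
    total_pairs = comb(n_items, 2)
    within_pairs = sum(comb(k, 2) for k in group_sizes)
    return (total_pairs - within_pairs) / total_pairs


# Five emotional-symptom items and six physiological-symptom items.
puc_bai11 = puc([5, 6])
print(round(puc_bai11, 3))
```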
Finally, the measurement invariance of the single-factor model with 11 indicators and independent residuals was tested across the two subsamples. This multigroup analysis was performed to verify the replicability of the model without including any corrections. There were no significant differences in measurement weights between the two subsamples within the unconstrained model based on the Z-test (p > 0.05). There was also no difference in factor variance within the unconstrained model and the models with constraints on measurement weights and intercepts. Each of the five nested models showed acceptable goodness-of-fit values (χ²/df < 3, NNFI and CFI >0.90 and RMSEA and SRMR <0.08), except for the chi-square test (p < 0.01). The unconstrained model showed better goodness of fit than the other nested models based on the chi-square difference test (p < 0.01); the model with constraints on measurement weights based on Δχ²/df > 3; and the models with constraints on measurement intercepts, structural covariances and measurement residuals based on |ΔCFI| > 0.01. Nevertheless, the fit indexes were at least acceptable in all other comparisons: p > 0.01, Δχ²/df < 3 and |ΔNNFI| and |ΔCFI| ≤ 0.01 (Table 5). Therefore, the measurement invariance was not strict but the model showed acceptable invariance properties across the two subsamples.
3.7. Description of Distribution of BAI-11 Index Scores and Levels of Anxiety in the Sample
Scores for the GF of anxiety (BAI-11) can be obtained through an index that varies from 0 to 100. First, the transformed scores of the items (1/(1 + X), range from 0.25 to 1) are summed. Second, the minimum possible value is subtracted from the sum. Third, this difference is divided by the difference between the maximum and minimum possible values of the sum. Finally, this quotient is multiplied by 100: Index = 100 [(sum of items − minimum possible value of the sum)/(maximum possible value of the sum − minimum possible value of the sum)]. The minimum possible value of the sum is the product of the number of items added and the minimum value of the item, and its maximum possible value is the product of the number of items added and the maximum value of the item.
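The four steps above can be sketched as a small function. This is an illustration only: the function name and defaults are hypothetical, and the item scores passed in are assumed to be already transformed and to lie between 0.25 and 1, as stated in the text.

```python
def index_score(item_scores, item_min=0.25, item_max=1.0):
    """Rescale a sum of item scores to a 0-100 index."""
    n_items = len(item_scores)
    min_sum = n_items * item_min  # minimum possible value of the sum
    max_sum = n_items * item_max  # maximum possible value of the sum
    return 100 * (sum(item_scores) - min_sum) / (max_sum - min_sum)


# With 11 items the constants are 11 * 0.25 = 2.75 and 11 * 1 - 2.75 = 8.25.
lowest = index_score([0.25] * 11)
highest = index_score([1.0] * 11)
middle = index_score([0.5] * 11)
print(lowest, highest, round(middle, 2))
```

For 11 items the subtracted constant is 2.75 and the divisor is 8.25, so the endpoints of the index are 0 and 100.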
The overall index of anxiety symptoms with 11 items (BAI-11) was obtained with the following formula: BAI-11 = 100 [(I3 + I5 + I6 + I7 + I8 + I9 + I10 + I11 + I15 + I17 + I21 − 2.75)/8.25]. The distribution of the BAI-11 index scores showed negative skewness (standardized value of the moment coefficient of skewness: Z_Sk = −4.871 < −1.96; interquartile coefficient of skewness: IQCS = −0.100) and slight deviation from mesokurtosis (standardized value of kurtosis excess: Z_K = −2.736 < −1.96; Kelley’s percentile coefficient of kurtosis centered at 0: PCK = 0.036, 95% CI: 0.010, 0.061). Thus, the scores did not follow a normal distribution according to the Kolmogorov-Smirnov test with the Lilliefors correction (D = 0.117, Z_D = 3.734, p < 0.001). The profile did not form a bell-shaped curve but rather corresponded to a skewed distribution with a long tail to the left. The tail of the distribution was shortened to the right but lengthened to the left, hence the contradiction between the two kurtosis coefficients. The coefficient based on moments was negative and showed platykurtosis or a shortened tail, whereas the coefficient based on percentiles was positive and showed leptokurtosis or an elongated tail (Table 6 and Figure 4).
Following common practice when index scores are used [57], a score between 0 and 29.9 was considered to indicate a low anxiety level; between 30 and 69.9, medium; and from 70 to 100, high. Based on these thresholds, 220 out of 445 family caregivers (49.4%) had high levels of total symptoms of anxiety. The central tendency (mean = 65.598, 95% bias-corrected and accelerated bootstrap (BCa) confidence interval (CI): 63.101, 68.126; median = 69.697, 95% BCa CI: 67.677, 74.748) corresponded to a medium level of anxiety (Table 6). With the stipulated cutoff point of 30, 87.6% of the participants were classified as cases of anxiety.
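The thresholds above amount to a simple classification rule, sketched here with a hypothetical function name. Scores below 30 are treated as low, scores from 30 up to but not including 70 as medium, and scores of 70 or more as high.

```python
def anxiety_level(index):
    """Classify a 0-100 index score using the thresholds stated in the text."""
    if not 0 <= index <= 100:
        raise ValueError("index must lie between 0 and 100")
    if index < 30:
        return "low"
    if index < 70:
        return "medium"
    return "high"


# Reported mean and median of the BAI-11 index.
mean_level = anxiety_level(65.598)
median_level = anxiety_level(69.697)

# Reported share of caregivers at the high level: 220 of 445.
high_share = round(100 * 220 / 445, 1)
print(mean_level, median_level, high_share)
```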
3.8. Concurrent Validity
The overall reliability of the BDI-II was good, as was that of its somatic-affective symptom factor, whereas that of its cognitive symptom factor was acceptable. The strength of the association between the BAI-11 and the BDI-II total score was high and that between the BAI-11 and the two factors of the BDI-II was medium. These three correlations were significant and positive (Table 7).
The overall reliability of the RESI-M was excellent and that of its five factors ranged from excellent to acceptable. The BAI-11 showed significant and positive correlations with the RESI-M total score and its five factors. The strength of the association of the BAI-11 with these variables was small, except for the trivial association with social support (Table 7).
4. Discussion
The purpose of this research was to analyze the psychometric properties of the BAI in a population of Mexican family caregivers of children with cancer. First, we identified the factor model underlying the interrelationships of the items. The internal consistency reliability of the scale was subsequently evaluated. Then, an index was obtained to assess the level of anxiety and finally, the construct validity in relation to depression and resilience was verified.
We started by testing the correlated-factor models originally proposed for the BAI [21]. These models showed a poor fit to the data and a severe problem in discriminability between factors with two [18], three [20] or four factors [21]. Very high correlations between factors in the three models suggested a single-factor model. However, the one-factor model with 21 indicators also showed a poor fit. Therefore, we decided to explore new models.
The first exploratory step was to study the properties of the items in the total sample using the Rasch model. We chose this model because it favors unidimensionality. After eliminating two items due to excessive multicollinearity and five items due to poor discriminability and reliability properties, the sample was randomly divided. In one subsample, the factorial structure was explored and in the other subsample, the models derived from the exploratory analysis were tested.
The number of factors was determined using Horn’s parallel analysis. Two factors were found in the EFA subsample. After eliminating two more items due to low factor loadings, Horn’s analysis again indicated two factors for the remaining 12 items. The first factor, composed of physiological symptoms, consisted of seven items related to cardiovascular and circulatory changes. The second factor, composed of emotional symptoms, consisted of five items related to anxiety, nervousness and negative perceptions of surroundings. This solution coincides with that reported by Sanz et al. [25,30] and Magan et al. [27] in Spanish samples, with the exception that our solution has 12 items and the other two have 21 items. These Spanish authors identified two factors: a somatic factor, similar to this study’s physiological symptoms and an affective-cognitive factor, similar to this study’s emotional symptoms. Our solution is also very similar to the two-factor model reported by Beck et al. [18]. Nevertheless, item 8 (“unsteady”) of the somatic symptoms was interpreted as “emotional insecurity” both in this sample of Mexican family caregivers and the samples of Spanish university students [25], the general population [27] and patients with psychological disorders [30], that is, an indicator of cognitive symptoms.
Regarding the four-factor model reported by Beck et al. [20] for outpatients with anxiety disorders, the factors of neurophysiological (items 3 and 6), autonomic (items 4 and 21) and panic symptoms (items 7, 11 and 15) contain the items that make up the physiological symptom factor in the present study. The subjective anxiety factor (items 5, 9, 10 and 17) contains four out of five items that make up the emotional symptom factor in this study. However, the solutions differ in terms of the number of factors. The solution found here has two factors. This structure is more parsimonious than a four-factor model [21]. This greater parsimony is a desirable property in structural models [40,48] and it was achieved by preserving the meaning of the underlying factors. The reduction from 21 to 12 items mostly affected items pertaining to somatic symptoms, which represent 66.7% of the BAI with 21 items and 58.3% of the BAI with the 12 selected items. The removed somatic items work well in patients with panic disorders [23] but they have low discriminability and reliability in family caregivers of children with cancer.
The structure of this study also replicates the configuration found in Mexico by Galindo et al. [31] in terms of general content but with fewer elements and factors. The three factors of neurophysiological, autonomic and vasomotor symptoms in Mexican cancer patients were reduced to a physiological symptom factor in this study and the factor labeled “subjective” symptoms by Galindo et al. [31] is similar to the factor consisting of emotional symptoms in this study.
The number of BAI factors varies from one study to another [18,20,21,38,39]. However, one finds a component related to physiological changes, evaluated by two or three factors [26,29,31] and another component related to tendencies and thoughts of anxiety associated with emotional distress, the latter of which seems to be much more consistent throughout all previous studies, including Beck’s original and subsequent studies [21]. This factor is what this study calls “emotional symptoms” due to the emotional content of the items, which are related to the cognitive processes and changes associated with negative perceptions of events by individuals.
Testing the model with two correlated factors through CFA led to the elimination of item 2 (flushing). This item was eliminated because its measurement residual had high covariances with those of two items of the same factor and the physiological symptoms factor had a medium effect size on this item compared to the large effect sizes it had on the other items. The goodness of fit significantly improved for the remaining 11 items and after releasing the covariance between the measurement residuals of items 11 and 15, a good fit was achieved, even according to the chi-square test. The parsimony was high, the parsimony fit indexes were acceptable and both factors demonstrated composite reliability. However, a significant lack of discriminant validity between the factors was found. The high correlation between the two factors suggests that they are actually one factor. In addition, the physiological symptoms factor demonstrated poor convergent validity. Both problems were also found in the model that included item 2, which had acceptable rather than good values for the GFIs.
Given this lack of discrimination, there are two options: specify a single-factor model or specify a bifactor model [49]. Testing of the single-factor model with the correlation between the measurement residuals of items 11 and 15 revealed that its goodness of fit was acceptable, its parsimony was high, its parsimony fit indexes were acceptable and it demonstrated construct reliability and acceptable convergent validity. The AVE was less than 0.50 but greater than 0.40, which was compensated for by a nearly excellent composite reliability [43].
It should be noted that the correlation between the residuals can be attributed to their similarity (feeling of choking and difficulty breathing), which causes them to share a specific variance outside the model that measures the physiological symptoms of anxiety (two-factor model) or the model that measures general anxiety symptoms (single-factor model).
The inclusion of corrections in the model by releasing covariance parameters between the residuals is often questioned as idiosyncratic to the sample and not clearly replicable [40]. Additional data in favor of the replicability of the single-factor model with 11 indicators (with all its independent residuals) were its adequate measurement invariance properties in the two subsamples. Consequently, this model can be validated without the need to include this correction.
Considering the existence of a GF and the possibility of improving the fit, the second alternative was to specify a bifactor model [49]. This model showed a good fit but showed worse parsimony than the previous models, resulting in poor parsimony fit indexes. Its greatest problem was an excessive contribution by the GF of anxiety to the detriment of the two SFs, mainly affecting the physiological symptoms factor, which did not have a significant contribution. While the physiological symptoms factor lacked convergent validity in the two correlated factors model, this problem arose again in the bifactor model in relation to the six items that define this factor. Therefore, the bifactor model is not a good representation of the underlying structure and highlights the true substantiality of the GF [49].
It should be noted that although two eigenvalues of the correlation matrix for the 11 items retained in the model were greater than one, upon repeating Horn’s analysis, only one empirical eigenvalue was higher than the random eigenvalues. Weighing all these data, the single-factor model was ultimately considered to be the best dimensional representation of the 11 items kept in the model, even when item 2 was included. In family caregivers of children with cancer, the physiological aspects of anxiety are intimately intertwined with the emotional and cognitive aspects. These aspects are not distinguishable as they are in patients with panic disorders [21,30]. This proposal is not novel. A single-factor model for the BAI has previously been proposed in the Australian general population [60] as well as in psychiatric inpatients and high school adolescents in the US [45].
In the studies in which a correlated-factor model is tested, the correlation between the factors is very high. In some cases, attention is not paid to these data [61,62] and in other cases, a hierarchical model is specified to maintain two, three or four factors hierarchized to a GF that have a minimal direct effect on the items [30,42,63,64]. However, when only an EFA is performed using oblique rotation, this problem of discriminant validity between factors is slightly attenuated [18] and it has no possibility of arising when an orthogonal rotation is used [31,65]. In the present study, it was decided to demonstrate the lack of discriminant validity between the factors to arrive at a single-factor solution, which was facilitated by the use of Rasch analysis applied to the items [54].
Regarding the study’s second objective, the reliability results in terms of internal consistency (estimated by Cronbach’s alpha coefficient) were satisfactory since values greater than 0.80 were obtained for the scale (with either 12 or 11 items) and its two factors. These empirical findings align with those of previous studies that reported high internal consistency indexes for the scale, ranging from α = 0.83 to α = 0.92 [18,28,30], including studies conducted in Mexico [26,29] that reported values ranging from 0.83 to 0.86.
The third objective of the study was twofold. On the one hand, it was intended to describe the distribution of BAI scores and, on the other hand, to estimate the anxiety levels in the sample. For this purpose, an overall index of anxiety symptoms was created in accordance with the single-factor model, which best represents the interrelations of the items. According to expectations [31], the distribution was not a Gaussian bell-shaped distribution. The majority of caregivers reported experiencing anxiety; therefore, the distribution was skewed. The scores were concentrated toward the right pole (high anxiety level), with a few cases scattered far apart toward the left pole (low anxiety level). Consequently, population-related norms for interpreting raw scores or within-group norms should be set based on percentile scores (scaling) and not based on standardized scores (standardizing, e.g., T-scores = 10 ([raw score − mean]/standard deviation) + 50) [36,42]. It should be noted that this type of distribution is expected when measuring a trait that is overexpressed due to some contextual factor with great influence on a population. In contrast, when the trait is determined by multiple causes with more or less equivalent effect sizes, the distribution follows a normal probability model, such as that of an expressive attitude in a free society [66,67].
Because the sampling in this study was not probabilistic and the study did not have a case-control design to set a cutoff point or classificatory thresholds, we decided to interpret the scores on the BAI-11 based on index scores. These index scores showed that approximately half of family caregivers suffered a high level of anxiety, as expected [7,8,9,11,35]. The median BAI-11 score corresponded to a medium level of anxiety, with a value close to the high anxiety threshold (≥70). Taking 30 as the cutoff point, which is a customary cutoff point when using indexes [57], approximately seven out of eight family caregivers (87.6%) may be classified as cases of anxiety. This value corresponds to a midpoint (21 on BAI-21 equals 33.3 on BAI-11 index score) between the thresholds of Beck et al. [42] for a medium level (16 on BAI-21 equals 25.40 on BAI-11 index score) and a high level of anxiety (26 on BAI-21 equals 41.27 on BAI-11 index score) and to the cutoff point suggested by Sanz [36] in the Spanish adult population (19 on BAI-21 equals 30.16 on BAI-11 index score).
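The equivalences between BAI-21 raw scores and index scores quoted above can be reproduced with a short calculation. The sketch assumes that each of the 21 items is scored from 0 to 3, so that the raw total ranges from 0 to 63; this assumption is not stated in the passage but is consistent with the four equivalences it reports. The function name is hypothetical.

```python
def raw_to_index(raw, n_items=21, item_min=0, item_max=3):
    """Rescale a raw total to a 0-100 index (assumes items scored 0 to 3)."""
    min_sum = n_items * item_min
    max_sum = n_items * item_max
    return 100 * (raw - min_sum) / (max_sum - min_sum)


# Raw BAI-21 thresholds mentioned in the text.
equivalences = {raw: round(raw_to_index(raw), 2) for raw in (16, 19, 21, 26)}
print(equivalences)
```

Under this assumption the raw scores 16, 19, 21 and 26 map to approximately 25.40, 30.16, 33.33 and 41.27, in line with the values quoted in the text.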
The fourth objective was to verify construct validity in relation to depression and resilience. The hypothesis of the direct relationship between anxiety and depression with a moderate to high strength of association was supported [7,11]. Furthermore, the expectation of an inverse relationship between anxiety and resilience was met, with a weak strength of association [37,38,39]. Consequently, this study provides construct validity evidence for this short, one-dimensional version of the BAI.
One limitation of this study is the use of nonprobability sampling; therefore, caution should be used when interpreting the results. A second limitation is the cross-sectional design, which made it impossible to estimate the temporal reliability or to test the temporal stability of the factor model. A third limitation is that the ratio of the sexes was very unequal (78:367, approximately one man for every five women), which is characteristic of the population, and the number of men was less than 100. These two conditions prevented us from testing the measurement invariance of the single-factor model across women and men. A fourth limitation is the hospital setting in which the data were collected; thus, the generalizability of the results to other clinical settings should be approached with care.