3. Results
3.1. Testing Simple Factor Structure and Internal Consistency Reliability of the Bullying Scale
Prior to testing measurement invariance, it was important to verify the simple structure of the TIMSS bullying scale in the fourth-grade population, particularly in light of evidence from the eighth-grade data that cyberbullying comprised a second dimension. We therefore tested the unidimensional model, which fit the data well. Specifically, descriptive fit indices were acceptable (CFI = 0.973, TLI = 0.979), and residual-based misfit was below the 0.08 threshold (i.e., RMSEA = 0.058, 95% C.I. = 0.057–0.058). The omnibus chi-square test statistic was significant, but this was expected given the large sample size of more than 44k participants. We further tested for the presence of additional dimensions using an Exploratory Factor Analysis (EFA) model with Maximum Likelihood Estimation and quartimax rotation with Kaiser’s normalization, as facets of bullying would likely be significantly correlated. This model suggested that a second factor comprising the three cyberbullying items was plausible; however, all three items loaded lower on it than on the first factor (i.e., 0.494/0.615, 0.593/0.612, and 0.500/0.578), favoring the one-dimension solution. Given the inconclusive results from the EFA model, we proceeded with testing a bifactor model in which bullying, cyberbullying, and global-domain bullying dimensions were operative. The bifactor model showed that loadings on the global bullying factor ranged between 0.570 and 0.860, whereas loadings on the two specific bullying domains ranged between 0.157 and 0.453, suggesting that the global factor was dominant. Consequently, the single-domain bullying structure was assumed to be the optimal factor structure for these data from the fourth-grade students in the TIMSS.
Furthermore, the internal consistency of the instrument, estimated with the Omega coefficient, was 0.882, which is excellent and in line with the unidimensional structure of the scale.
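For reference, the Omega coefficient under a unidimensional model is commonly written as below. This is a sketch under assumptions not stated in the source: that the coefficient reported is McDonald's omega total, that the factor variance is fixed to 1, and that the symbols $\lambda_i$ (standardized loading of item $i$), $\theta_i$ (unique variance of item $i$), and $k$ (number of items, here 11) are notation introduced only for illustration.

```latex
\omega = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}
              {\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \theta_i}
```

The numerator is the variance attributable to the common factor and the denominator is the total variance of the summed score, so values near 1 indicate that most of the composite's variance is due to the single factor.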
3.2. Tests of Strong Measurement Invariance: Traditional Approach
The next step involved testing the equality of slope terms across countries (metric invariance), followed by testing the equality of intercepts (scalar invariance). Only when both prerequisite assumptions are satisfied can latent means be compared. Tests of exact fit showed a failure of the metric model. Specifically, the global chi-square test showed significant misfit from constraining the slopes of the 11 items across all six countries [Difference χ2(55) = 240.383, p < 0.001]. Furthermore, the difference between the metric and scalar models was again significant, pointing to significant misfit from constraining item intercepts across countries [Difference χ2(50) = 668.959, p < 0.001]. These results rendered the comparison of latent means meaningless unless some form of measurement invariance, partial or otherwise, was achieved. For this purpose, the alignment procedure was utilized as shown below.
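The two difference tests above can be checked roughly with a short standard-library sketch. It does not reproduce the authors' software output: the upper-tail probability uses the Wilson-Hilferty normal approximation rather than an exact chi-square routine, and the function name `chi2_sf_approx` is introduced here only for illustration.

```python
import math


def chi2_sf_approx(x: float, df: int) -> float:
    """Approximate upper-tail p-value of a chi-square statistic.

    Uses the Wilson-Hilferty cube-root normal approximation, so the
    result is approximate rather than exact.
    """
    if x <= 0:
        return 1.0
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    z = ((x / df) ** (1.0 / 3.0) - mean) / sd
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Difference tests as reported in the text: (constraint, chi-square, df).
tests = [
    ("slopes constrained (metric)", 240.383, 55),
    ("intercepts constrained (scalar)", 668.959, 50),
]

for label, stat, df in tests:
    p = chi2_sf_approx(stat, df)
    print(f"{label}: chi2({df}) = {stat}, p < 0.001: {p < 0.001}")
```

Both statistics are far larger than their degrees of freedom, so even an approximate tail probability falls well below 0.001, which agrees with the reported significance levels.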
3.3. Tests of Measurement Invariance: Fixed- and Free-Alignment Methodologies
As described above, the alignment procedure was utilized to test for the equivalence of simple structures. The procedure comes in two variants: a free one, in which the factor means are freely estimated, and one, termed “fixed”, in which a reference group is specified to have a mean of zero. The authors of [22] recommend always starting with the fixed alignment method, which will most likely result in decreased standard errors compared to the free method.
Table 1 displays the results from applying the fixed alignment procedure. As shown in Table 1, 42 of the 198 estimated parameters were non-invariant, i.e., 21.2% of the tested parameters. Ref. [22] stated: “A rule of thumb is that as long as the number of non-invariant parameters is less than 20%, we can expect the alignment method to work correctly” (p. 6). The fixed alignment results exceeded this rule of thumb; thus, we proceeded with the free alignment option, which works best when there is substantial non-invariance. These results are shown in Table 2. As shown in the table, there were 36/198 non-invariant parameters, amounting to 18.9% of the total number of tests. Further evidence was provided by the R-square statistic, which indexes the degree of invariance. The average R-square value across estimated parameters was 0.615, which is quite high, although rules of thumb are not currently available and small R-square values do not necessarily imply non-invariance (such as when the levels and variability of an estimate are relatively low). The mean number of invariant groups was 5.02, suggesting that, on average, five of the six countries were invariant across all tests of intercepts and slopes. Last, to further verify that the free alignment procedure was successful, we explored measurement invariance using the MIE package in R by examining the presence of clusters of groups.
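The rule-of-thumb comparison described above amounts to simple arithmetic, sketched below. The counts (42 and 36 out of 198) and the 20% threshold come from the text; the variable and function names are illustrative only, and the shares are recomputed from the counts, so they may differ slightly from the rounded percentages reported.

```python
TOTAL_PARAMETERS = 198   # estimated parameters, as reported in the text
RULE_OF_THUMB = 0.20     # threshold quoted from Ref. [22]

# Non-invariant parameter counts reported for each alignment variant.
non_invariant = {"fixed": 42, "free": 36}


def non_invariance_share(count: int, total: int = TOTAL_PARAMETERS) -> float:
    """Proportion of estimated parameters flagged as non-invariant."""
    return count / total


for method, count in non_invariant.items():
    share = non_invariance_share(count)
    verdict = "below" if share < RULE_OF_THUMB else "at or above"
    print(f"{method} alignment: {count}/{TOTAL_PARAMETERS} = {share:.1%} "
          f"({verdict} the 20% rule of thumb)")
```

Under these counts the fixed variant lands just above the threshold and the free variant just below it, which is the basis for preferring the free alignment results.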
After transforming the factor loadings and intercept parameters across countries to distances using a multidimensional scaling framework, we used the visual analysis provided by [23]. The distances were then evaluated against a 0.01-unit range. The method accommodates both subgroups and outlying groups (here, countries). Distances between countries calculated from CFI (upper panel) and RMSEA (lower panel) values for a comparison of configural and metric models are shown in Figure 1. Distances between countries on RMSEA (upper panel) and CFI (lower panel) based on fitted models when contrasting metric and scalar models are shown in Figure 2. The distances between countries on each estimate are minimal and fall within a hypothetical elliptical shape, indicating invariance. Estimates of the CFI and RMSEA for comparing configural–metric–scalar models across countries are shown in Table 3, Table 4, Table 5 and Table 6. Difference values were consistently less than 0.01, as shown in the right-hand columns of the tables, indicating minimal non-invariance.
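The comparison of fit indices against the 0.01 criterion can be sketched as follows. The fit values in `example` are HYPOTHETICAL placeholders chosen only to make the sketch runnable; they are not the values in Tables 3 to 6, and the function names are introduced here for illustration.

```python
CUTOFF = 0.01  # difference threshold mentioned in the text


def fit_difference(less_constrained: float, more_constrained: float) -> float:
    """Absolute change in a fit index between two nested models."""
    return abs(less_constrained - more_constrained)


def is_negligible(less_constrained: float, more_constrained: float,
                  cutoff: float = CUTOFF) -> bool:
    """True when the change in the fit index stays under the cutoff."""
    return fit_difference(less_constrained, more_constrained) < cutoff


# HYPOTHETICAL fit values for one pair of countries (illustration only).
example = {
    "CFI": {"configural": 0.975, "metric": 0.972, "scalar": 0.966},
    "RMSEA": {"configural": 0.055, "metric": 0.057, "scalar": 0.061},
}

for index, fits in example.items():
    for a, b in [("configural", "metric"), ("metric", "scalar")]:
        diff = fit_difference(fits[a], fits[b])
        print(f"{index} {a} vs. {b}: diff = {diff:.3f}, "
              f"negligible: {is_negligible(fits[a], fits[b])}")
```

The same check would be repeated for every pair of countries and both model contrasts; differences that stay under the cutoff throughout are what the text interprets as minimal non-invariance.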
3.4. Latent Mean Differences across Gulf Countries
After establishing measurement invariance with the free alignment method, the results indicated statistically significant differences in bullying levels between Gulf countries (see Table 7). Saudi Arabia, in particular, had the lowest bullying rates of any country surveyed. The United Arab Emirates ranked second lowest, with rates significantly lower than those of any country except Saudi Arabia. Finally, bullying rates in Qatar were noticeably higher than those in Oman, Kuwait, the UAE, and Saudi Arabia. Thus, ordered from lowest to highest bullying rates, the countries were Saudi Arabia, the UAE, Kuwait, Oman, Bahrain, and Qatar. According to effect size recommendations [24], the significant differences were relatively small, with the largest difference (between Saudi Arabia and Qatar) reflecting a small-to-medium effect (i.e., between 0.2 and 0.5).
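The effect size interpretation above can be expressed as a small helper. This sketch assumes the conventional 0.2/0.5/0.8 benchmarks for standardized mean differences; the text itself only states the 0.2 to 0.5 band, and both the function name and the labels outside that band are illustrative.

```python
def effect_size_label(d: float) -> str:
    """Label a standardized mean difference using conventional benchmarks.

    The 0.2, 0.5, and 0.8 cut points are an assumption of this sketch;
    only the 0.2-0.5 band is mentioned in the text.
    """
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small-to-medium"
    if magnitude < 0.8:
        return "medium-to-large"
    return "large"


# HYPOTHETICAL standardized differences, for illustration only.
for d in (0.10, -0.35, 0.60):
    print(f"d = {d:+.2f}: {effect_size_label(d)}")
```

The sign is ignored because the label describes the size of a difference, not which country scored higher.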
4. Discussion
The purpose of the present study was to compare and contrast levels of bullying in the six Gulf countries, namely Saudi Arabia, Bahrain, Kuwait, Oman, Qatar, and the United Arab Emirates. First, the psychometric properties of the bullying scale were investigated, followed by tests of latent means. Several important findings emerged. In terms of the psychometric analyses, the results indicated very good model fit for the unidimensional structure of the eleven bullying behaviors using data from all six Gulf countries. Internal consistency estimates were also acceptable. Thus, the 11-item measure administered to the fourth-grade cohort in TIMSS 2019 possesses desirable psychometric properties for validly assessing bullying behaviors as a single latent factor.
When compared to the other Gulf states, bullying rates were significantly lower in Saudi Arabia, the study’s focus country. Although there were notable differences in levels, the effect sizes observed were small. Several observations may help explain the differences in bullying favoring the Kingdom of Saudi Arabia over the rest of the Gulf countries. First, the culture in Saudi Arabia emphasizes respect and obedience towards authority, as well as interpersonal behaviors and interactions that conform to cultural and religious values [25,26]. Saudi Arabian culture also places a heavy emphasis on collectivism, family, and community values, reflecting the idea of “wasta”, which refers to the use of social networks and personal relationships to gain an advantage in personal and professional contexts [27]. Because forming relationships with others and maintaining peace within groups is a main goal of individuals, bullying may be less prevalent in Saudi Arabia, as individuals avoid acting in ways that would harm or damage their social relationships [28]. Kindness, respect, empathetic understanding, and compassion toward others are fostered by the cultural and religious values of the Kingdom, which may in turn discourage bullying behaviors [29,30,31]. These values are cultivated through parental involvement in children’s education, as parents aim to create a sense of accountability and responsibility in their children so that they behave respectfully and appropriately. Furthermore, there is evidence that the Saudi government implements programs to increase awareness among students, staff, teachers, and parents of how to identify and address bullying instances [32,33]. A further potential explanation lies in the imposition of severe and long-lasting consequences for bullies, which can have deleterious effects on their development and future prospects [33]. For instance, the Saudi Arabian Ministry of Education published a new anti-bullying policy in 2014 that mandates that schools take disciplinary action against bullies, such as suspension, expulsion, or even reporting to the authorities [34]. Finally, gender segregation may lessen occurrences of bullying between genders, since there may not be any amicable or amorous connections that might cause rivalry and conflict.
There are several caveats to the present investigation as well. To begin, several of the inferential statistical tests had extremely high power, so that findings surpassing conventional levels of significance may not be substantively meaningful in and of themselves. Second, there was a lack of explanatory variables that could account for the observed bullying rates. While several academic variables are accessible in the TIMSS data, the same cannot be said for variables pertaining to students’ social, emotional, or physical well-being. Despite these limitations, the current study sheds light on the prevalence of bullying in Gulf countries and highlights the need for ongoing school-based initiatives to prevent and address bullying behavior. By identifying factors that contribute to low levels of bullying, such as cultural values, strict disciplinary procedures, and parental involvement, teachers and administrators can develop effective strategies to provide secure and supportive learning environments for all students.
4.1. Implications of the Present Findings for Practice
Given the serious consequences of cyberbullying, it is important for parents, educators, and policy-makers to work together to prevent and address this behavior. This can involve educating children about appropriate online behavior, providing support and resources for those who have been affected by cyberbullying, and establishing clear guidelines and consequences for cyberbullying in schools and other settings. Furthermore, it is important to recognize that cyberbullying is not a problem that can be solved by any one group alone but requires multidisciplinary teams for evaluation and prevention.
4.2. Conclusions and Future Directions
We conclude that levels of bullying were significantly lower in Saudi Arabia than in all other GCC countries, although the differences reflected small effect sizes.
There are various directions that research can go in the future. Studies that examine predictors of bullying behaviors and their differentiation across age and gender groups will likely inform treatment and interventions. Enriching the current instrument with more behaviors (as in the eighth-grade cohort in the TIMSS) will also address issues of content validity and dimensionality. Last, the impact of other factors, such as mental health support and social–emotional learning, on bullying prevention and student well-being should be studied.