1. Introduction
It is well known that individual differences in human cognitive functioning are associated with variation in educational outcomes (
Gottfredson 1997;
Jensen 1998;
Kaufman et al. 2012;
Rosén et al. 2017). These associations, often called cognitive–achievement relations in the literature, have been found across different cognitive ability domains and reading, writing, and mathematics achievement (
Caemmerer et al. 2018;
Hajovsky et al. 2018;
Niileksela et al. 2016). Standardized test scores from cognitive ability and academic achievement measures are frequently used within high-stakes decision making, especially specific learning disability (SLD) evaluations (
Maki et al. 2015). Understanding variation in the associations between global and broad cognitive abilities and different domains of academic achievement is paramount to understanding learning difficulties and informing assessment practices.
A notable gap in the literature is whether cognitive–achievement relations observed in previous research generalize to different cognitive ability levels (e.g., higher/lower IQ). On average, students with learning difficulties often have slightly lower levels of cognitive abilities (
Johnson et al. 2010), possibly due to specific cognitive deficits that are related to their learning challenges (
Grigorenko et al. 2020). Much of the research on cognitive–achievement relations uses standardization samples from norm-referenced test batteries (
Caemmerer et al. 2018;
Hajovsky et al. 2018;
Niileksela et al. 2016). The findings from this research represent relations across the ability spectrum, but an assumption of those findings is that the magnitude of relations generalizes across the ability spectrum. If relations among cognitive abilities and academic skills differ by ability level, and those with SLD tend to have slightly lower cognitive abilities, then the research on cognitive–achievement relations may not generalize to students with SLD. The purpose of this study is to examine cognitive–achievement relations across different general ability levels in school-age children to determine if these generalize across the IQ distribution.
2. Spearman’s Law of Diminishing Returns (SLODR)
One reason cognitive–achievement relations may differ across ability levels can be understood through the lens of Spearman’s Law of Diminishing Returns (SLODR;
Spearman 1927). Correlations among tests are typically higher for individuals with lower levels of general intelligence (
g) compared to those with higher levels of
g, suggesting greater differentiation among broad abilities (e.g., visual processing, working memory) for those with higher IQ scores (
Abad et al. 2003;
Detterman and Daniel 1989). According to the theory of cognitive ability differentiation (CAD;
Jensen 2003),
g contributes less to broad abilities when
g is higher, resulting in more differentiation among broad abilities (weaker correlations). Thus, CAD posits that the variance in broad abilities not explained by
g (residual variance or group factors) will be larger in the higher
g group. Conversely,
g contributes more to broad abilities when
g is lower, resulting in less differentiation among broad abilities (stronger correlations), which reflects more underlying variance shared by the broad abilities that can be attributed to
g (
Reynolds et al. 2011).
A potential implication underlying this theoretical proposition is that the relations between
g (or global IQ) and academic achievement will be stronger at lower levels of
g, and the relations between broad cognitive abilities (e.g., working memory) and academic achievement will be stronger at higher levels of
g (
McLarnon et al. 2018;
Murray et al. 2013). It is unknown to what extent cognitive–achievement relations are differentiated by IQ level. Thus, we aim to test whether cognitive–achievement relations vary by level of IQ (a strong proxy of
g;
Reynolds et al. 2013) using two different standardized cognitive and achievement batteries co-normed within nationally representative samples.
3. Cognitive–Achievement Relations
Most of the prior cognitive–achievement relations research has been completed with large samples representing average relations across the ability spectrum. Research supports the moderation of cognitive–achievement relations by development (e.g.,
Caemmerer et al. 2018;
Floyd et al. 2012;
Hajovsky et al. 2014;
Niileksela et al. 2016) and gender (e.g.,
Hajovsky et al. 2018), with mixed results for race/ethnicity (e.g.,
Hajovsky and Chesnut 2022;
Keith 1999;
Weiss and Prifitera 1995). Moreover, the theory of mutualism expands on the cognitive–academic bidirectional relationship (
Peng and Kievit 2020). In cognitive–academic mutualism, exposure to broader educational resources increases both academic achievement and cognitive performance for students with high abilities, leading to stronger relationships between cognitive ability and academic achievement. As students progress through each successive grade level, the mutualistic effects become more pronounced with high academic and cognitive ability levels (
Zhang and Peng 2023). In other words, the theory of mutualism suggests that the development of cognitive ability and academic achievement is bidirectional and that they have influences on each other, in contrast to a unidirectional relationship where cognitive ability only influences academic achievement, but not vice versa. Furthermore, this theory hypothesizes that the relation between relevant, specific cognitive abilities (e.g., reasoning, working memory) and reading/mathematics achievement should increase with age as people develop their skills in these areas (
Peng and Kievit 2020). While evidence for mutualism between verbal working memory and academic skills is mixed, one reason for the inconclusive findings may be a lack of analyses that account for moderating variables (e.g., ability level).
Peng and Kievit (
2020) have hypothesized that individuals with high abilities create more mutualistic skills (i.e., cognition and academic skills).
Although cognitive–achievement relations have been well documented, there is a recent push for integrated models in the literature (e.g.,
Hajovsky et al. 2014;
Niileksela et al. 2016;
Feraco et al. 2022). Integrated models of cognitive–achievement relations suggest that cognitive abilities influence more advanced academic skills via basic academic skills. For example, general and broad cognitive abilities influence reading comprehension via basic reading skills (e.g.,
Floyd et al. 2012;
Hajovsky et al. 2014) and math problem solving via math computation skills (e.g.,
Decker and Roberts 2015). However, this research base focuses exclusively on the average relations between cognitive and achievement scores without consideration of whether associations vary at different ability or achievement levels. Research employing quantile regression has examined cognitive–achievement relations as a function of achievement level (e.g.,
Language and Reading Research Consortium and Logan 2017). These findings suggest cognitive–achievement relations vary by academic skill level (as a function of the outcome variable or achievement), but this work does not address whether associations vary at different ability levels (as a function of the predictor variable or IQ).
There has been limited empirical work exploring differentiation of cognitive–achievement relations by
g. In one notable study,
McLarnon et al. (
2018) examined how global and narrow cognitive measures derived from the Medical College Admission Test (MCAT) predicted GPA in high-g versus low-g individuals. They found g was a stronger predictor of GPA in the low-g versus the high-g group, providing support for differentiation of cognitive ability across the ability spectrum at the global cognitive level. Although it was predicted that narrow cognitive measures derived from the MCAT would be a stronger predictor of GPA in the higher-g group (due to larger residual variances), it was found that the low-g group showed stronger relations between narrow cognitive measures and GPA (
McLarnon et al. 2018). In a more recent study using the standardization data for an IQ test in a German sample, researchers found that broad cognitive abilities had minimal incremental prediction on school grades in the low-IQ and mid-IQ groups but had a significantly stronger effect on school grades in the high-IQ group after accounting for a
g-factor score (
Breit and Preckel 2020). These mixed findings suggest some evidence of differentiation of cognitive–achievement relations by
g, but this phenomenon has not been examined within U.S. nationally representative school-age samples utilizing standardized measures of cognitive ability and academic achievement.
4. Current Study
Cognitive–achievement relations research has been instrumental in understanding learning difficulties and in the development of neurocognitive models of assessment (
Alfonso and Flanagan 2018;
Schneider and Kaufman 2017). Most of this research is completed with large samples representing average relations across the ability spectrum, but it is possible that relations found in previous research do not generalize to individuals with higher or lower general intelligence (IQ). Nonetheless, the cognitive–achievement relations studies used to inform diagnostic models have rarely quantified cognitive–achievement relations by IQ level (e.g., low, average, or high). This is a non-trivial consideration as the interpretive weight attributed to basic psychological processes (e.g., phonological processing) for understanding academic functioning may shift as a function of IQ level. If the relations among global and broad cognitive abilities and academic skills differ by ability level, then the research on cognitive–achievement relations may not apply to students with suspected SLD or lower cognitive functioning.
To address this gap in the literature, we use multi-group structural equation modeling to examine whether integrated models of cognitive–achievement relations differ across IQ levels for school-age children and adolescents. A benefit of this study is the use of two large, nationally representative samples to examine the differentiation of cognitive–achievement relations by IQ level. One concern when comparing IQ groups formed by applying cut-points to a variable that is also included in the model (i.e., a general ability composite) is the restriction of range that results from categorizing a continuous variable (
Reynolds et al. 2010). To mitigate this issue, we developed an alternative general ability composite using test scores that are not included in any of the general or broad ability composites used in the models we tested for differentiation. This alternative general ability composite was used to create groups. We use composite scores in the analyses as these scores are utilized in diagnostic assessment decision making and create Low (<25th percentile), Average (25th–75th percentile), and High (>75th percentile) ability groups. We then examine integrated models of cognitive–achievement relations in each of these groups to determine if they are similar or different, with a focus on both global IQ-achievement relations and broad cognitive ability–achievement relations for basic reading skills and reading comprehension.
This study aims to test two general hypotheses: (a) general ability (IQ) and basic reading skills and reading comprehension relations will be stronger in the Low group relative to the Average or High groups; and (b) broad cognitive ability and basic reading skills and reading comprehension relations will be stronger in the High group relative to the Low and Average groups.
5. Method
5.1. Participants
The normative samples for the Woodcock–Johnson Third Edition (WJ III) (
Woodcock et al. 2001, 2007) and Woodcock–Johnson Fourth Edition (WJ IV) Tests of Cognitive Abilities and Achievement were used for this study (
Schrank et al. 2014a,
2014b). Both samples were used to replicate the findings across two different samples and test batteries. The WJ III and WJ IV standardization samples are nationally representative samples of children, adolescents, and adults ages 2 to 90+ years. The WJ III normative sample had 8782 individuals, with 4470 individuals in kindergarten through 12th grade. The WJ IV normative sample had 7416 individuals, with 3891 individuals in kindergarten through 12th grade. Participants in both samples were stratified based on age by the following demographic variables: race/ethnicity, sex, country of birth, community type, U.S. census region, parent education, school type, college type, occupational level, and employment status (
McGrew et al. 2014,
2007). For this study, the WJ III and WJ IV samples were split into elementary (1st through 5th grade) and secondary (6th through 12th grade) samples. The samples were split into elementary and secondary samples because previous research has suggested that age moderates the relations between cognitive abilities and academic skills (e.g.,
Floyd et al. 2012;
Niileksela et al. 2016). Average ages for grade levels across the elementary sample ranged from 6.5 years in 1st grade to 10.5 years in 5th grade. Average ages for grade levels across the secondary sample ranged from 11.5 years to 17.5 years.
5.2. Measures
The WJ III and WJ IV provide several composites for measuring intellectual and achievement abilities according to the Cattell–Horn–Carroll theory (CHC;
Schneider and McGrew 2018). From both batteries, corresponding composites for general intelligence, the General Intellectual Ability (GIA), the seven broad CHC abilities (Gc, Gf, Gsm, Gs, Glr, Gv, Ga), and two reading composites including Basic Reading Skills and Reading Comprehension were used in this study. The WJ III Technical Manual (
McGrew and Woodcock 2001) contains extensive validity information of the measures guided by CHC theory (
Schneider and McGrew 2018). The WJ IV Technical Manual provides extensive concurrent, criterion, and developmental validity evidence that includes data on patterns of intercorrelations among tests and clusters and a three-stage structural validity analysis using factor analysis, cluster analysis, and multidimensional scaling (
McGrew et al. 2014).
5.3. Basic Reading Skills
The Basic Reading Skills (BRS) composite provides a measure of an individual’s reading ability in English word identification and phonetic abilities. On both the WJ III and WJ IV, the composite includes the Letter–Word Identification and Word Attack subtests, where examinees read single real words and nonsense words, respectively. The WJ III Basic Reading Skills cluster reliability coefficients ranged from 0.90 to 0.98 across ages 5–19 (
McGrew and Woodcock 2001;
McGrew et al. 2007). The WJ IV Basic Reading Skills cluster reliability coefficients ranged from 0.93 to 0.98 across ages 5–19 (
McGrew et al. 2014).
5.4. Reading Comprehension
The Reading Comprehension (RC) composite measures an individual’s understanding of what they have read. On the WJ III, the composite includes Passage Comprehension, where examinees supply words to fill in missing blanks in a sentence or paragraph, and Reading Vocabulary, where examinees supply synonyms and antonyms of words they read. The subtests included in this composite differ slightly on the WJ IV. It still includes Passage Comprehension, but instead of Reading Vocabulary, it includes Reading Recall, where examinees read a story silently and retell the story from memory. The WJ III Reading Comprehension cluster reliability coefficients ranged from 0.88 to 0.97 across ages 5–19 (
McGrew and Woodcock 2001;
McGrew et al. 2007). The WJ IV Reading Comprehension cluster reliability coefficients ranged from 0.91 to 0.99 across ages 5–19 (
McGrew et al. 2014).
5.5. General Intellectual Ability
The General Intellectual Ability (GIA) composite provides a snapshot of an individual’s current intellectual functioning and is representative of
g in CHC theory. The global intelligence measure includes one subtest representing each of the seven CHC broad abilities measured by both WJ batteries. The subtests differ on the WJ III and WJ IV. The WJ III uses scores from Verbal Comprehension, Concept Formation, Sound Blending, Spatial Relations, Visual–Auditory Learning, Visual Matching, and Numbers Reversed. The WJ IV uses scores from Oral Comprehension, Number Series, Verbal Attention, Letter–Pattern Matching, Phonological Processing, Story Recall, and Visualization. The WJ III GIA cluster reliability coefficients ranged from 0.96 to 0.97 across ages 5–19 (
McGrew and Woodcock 2001;
McGrew et al. 2007). The WJ IV GIA cluster reliability coefficients ranged from 0.95 to 0.97 across ages 5–19 (
McGrew et al. 2014).
5.6. Broad CHC Cognitive Abilities
Composite scores for the seven broad CHC abilities were used from the WJ III and WJ IV. These broad abilities are measured by two subtests, each of which measures a different narrow ability that is subsumed under the broad ability. The tests used on the broad CHC composites differ slightly across the WJ III and WJ IV but reflect the same underlying construct across measures. We provide a definition from the WJ IV, which includes:
Comprehension–Knowledge (Gc). This measures the depth and breadth of declarative and procedural knowledge and skills valued by one’s culture. It is measured by Verbal Comprehension and General Information on the WJ III, and Oral Comprehension and General Information on the WJ IV.
Fluid reasoning (Gf). This measures the deliberate and controlled focused attention to solve novel problems that cannot be solved using prior knowledge. It is measured by Concept Formation and Analysis–Synthesis on the WJ III, and Number Series and Concept Formation on the WJ IV.
Visual processing (Gv). This measures the ability to use mental imagery, store images in primary memory, or perform visual–spatial analysis or mental transformation of images. It is measured by Spatial Relations and Picture Recognition on the WJ III, and Visualization and Picture Recognition on the WJ IV.
Short-term working memory (Gwm; abbreviated Gsm elsewhere in this article). This measures the ability to encode, maintain, and/or manipulate auditory or visual information in primary memory to solve multiple-step problems. It is measured by Numbers Reversed and Memory for Words on the WJ III, and Verbal Attention and Numbers Reversed on the WJ IV.
Auditory processing (Ga). This measures the ability to perceive, discriminate, and manipulate sound information, including processing of auditory information in primary memory and activation, restructuring, or retrieval of information from semantic–lexical memory. It is measured by Sound Blending and Auditory Attention on the WJ III, and Phonological Processing and Nonword Repetition on the WJ IV.
Cognitive processing speed (Gs). This measures the ability to control attention to automatically and fluently perform relatively simple repetitive cognitive tasks. It is measured by Visual Matching and Decision Speed on the WJ III, and Letter–Pattern Matching and Pair Cancellation on the WJ IV.
Long-term retrieval (Glr). This measures the ability to store information and fluently retrieve it later (
Schneider and McGrew 2018). It is measured by Visual–Auditory Learning and Retrieval Fluency on the WJ III, and Story Recall and Visual–Auditory Learning on the WJ IV.
The WJ III broad ability cluster reliability coefficients varied from 0.86 to 0.96 across ages 5–19, except Gv, which ranged from 0.70 to 0.84, and Gsm, which ranged from 0.83 to 0.91 (
McGrew and Woodcock 2001;
McGrew et al. 2007). The WJ IV broad ability cluster reliability coefficients varied from 0.88 to 0.98 across ages 5–19, except Gv, which ranged from 0.80 to 0.89 (
McGrew et al. 2014).
6. Data Analytic Plan
Observed scores were used in all analyses. These were chosen for two reasons. First, in most cases, there would only be two tests available for each broad CHC ability factor because several of the extra tests that could be used as indicators in a latent variable model were used to create the alternative GIA composite used to select ability groups. Second, previous research with the WJ III and WJ IV has often had difficulties appropriately estimating latent variable models, such as having second-order factor loadings that are equal to or greater than one (e.g.,
Floyd et al. 2012;
Niileksela et al. 2016).
6.1. Identifying Ability Groups
Groups representing Low (<25th percentile), Average (25th–75th percentile), and High (>75th percentile) ability were selected from the WJ III and WJ IV normative samples. These percentiles were used to define groups for two primary reasons. First, these values were used to ensure adequate power and sample sizes. Setting the cut-points at more extreme percentiles (e.g., ±1 standard deviation, or the 16th and 84th percentiles) would have resulted in substantial differences in sample sizes across ability groups and would have reduced power. Even when set at the 25th and 75th percentiles, the sample size of the Average ability group was twice as large as each of the Low and High ability groups. Second, the 25th percentile has been suggested as a point at which some cognitive or academic skills may be considered as requiring further attention in evaluations (e.g.,
Fletcher et al. 2019), so there is also a practical precedent for using these values to select groups, especially the Low group.
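The grouping rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scores are simulated standard-normal values standing in for a grouping composite, and the function name `assign_ability_groups` is hypothetical rather than anything taken from the original analysis.

```python
import random
import statistics

def assign_ability_groups(scores):
    """Label each score Low (<25th percentile), High (>75th percentile), or Average."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    labels = []
    for score in scores:
        if score < q1:
            labels.append("Low")
        elif score > q3:
            labels.append("High")
        else:
            labels.append("Average")
    return labels

# Hypothetical scores standing in for the composite used to form groups.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(4000)]
labels = assign_ability_groups(scores)
counts = {group: labels.count(group) for group in ("Low", "Average", "High")}
print(counts)
```

With quartile cut-points, the Average group necessarily holds about half of the sample and each tail group about a quarter, which is the sample-size imbalance noted in the text.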
It is problematic to select groups based on the variables that will be used in the analysis because this attenuates the distribution of scores and results in a restriction of range. To avoid this issue, ability groups were selected using an alternative estimate of general intellectual ability (altGIA). This altGIA was estimated using subtests from the WJ III and WJ IV that were not included in the GIA or the broad CHC composites. None of the subtests used to identify the groups were included in composite scores used in any subsequent analyses. This approach accounts for the statistical issues that arise when performing analyses with the variables that were used to select groups.
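A small simulation may help make the restriction-of-range concern concrete. It is a toy sketch, not a reanalysis of the WJ data: three indicators are generated from one hypothetical common factor with equal loadings and unit error variance, and all names (`ability`, `reading`, `alt`) are illustrative assumptions. The sketch compares the ability–reading correlation in the full sample, in the lowest quartile selected on the analysis variable itself, and in the lowest quartile selected on a separate composite.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def lowest_quartile(rows, key_index):
    """Return the quarter of rows with the lowest values on one column."""
    ordered = sorted(rows, key=lambda row: row[key_index])
    return ordered[: len(ordered) // 4]

def ability_reading_r(rows):
    return pearson([row[0] for row in rows], [row[1] for row in rows])

rng = random.Random(1)
rows = []
for _ in range(20000):
    g = rng.gauss(0.0, 1.0)             # hypothetical common factor
    ability = g + rng.gauss(0.0, 1.0)   # composite used in the model
    reading = g + rng.gauss(0.0, 1.0)   # outcome
    alt = g + rng.gauss(0.0, 1.0)       # separate composite used only for selection
    rows.append((ability, reading, alt))

r_full = ability_reading_r(rows)
r_direct = ability_reading_r(lowest_quartile(rows, 0))  # select on the analysis variable
r_alt = ability_reading_r(lowest_quartile(rows, 2))     # select on the separate composite
print(round(r_full, 2), round(r_direct, 2), round(r_alt, 2))
```

In this toy setup the within-group correlation is attenuated in both cases, but noticeably less when selection uses the separate composite, because the analysis variable is no longer truncated directly.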
Seven tests that represent the seven broad CHC abilities on the WJ III were selected to estimate the altGIA, including Picture Vocabulary (Gc), Memory for Names (Glr), Block Rotation (Gv), Incomplete Words (Ga), Number Series (Gf), Cross Out (Gs), and Memory for Sentences (Gsm). A one-factor model was created using these tests, and factor scores were estimated for each individual in the sample. Those scores were then used to select the Low (<25th percentile), Average (25th–75th percentile), and High (>75th percentile) ability groups. The one-factor model and factor scores were estimated separately for the elementary and secondary samples. The validity of the altGIA was established by correlating the latent factor of the altGIA with the latent factor of the tests included in the GIA on the WJ III. The correlation between the latent g for the WJ III GIA and altGIA was .99 for both the elementary and secondary samples, suggesting they were essentially equivalent at the latent level. The coefficient omega for the altGIA was .69 for the elementary sample and .74 for the secondary sample. For comparison, the coefficient omega for the WJ III GIA was .77 for the elementary sample and .81 for the secondary sample. Although omega values for the altGIA were slightly lower than the GIA, these values still suggest adequate reliability of the altGIA on the WJ III.
Seven tests that represent the seven broad CHC abilities on the WJ IV were selected to estimate the altGIA: Picture Vocabulary (Gc), Analysis–Synthesis (Gf), Number Pattern Matching (Gs), Memory for Words (Gsm), Sound Blending (Ga), Memory for Names (Glr), and Visual Closure (Gv). Similar to the WJ III procedure, a one-factor model was created using these tests, and factor scores were estimated for each individual in the sample; those scores were used to select the three ability groups separately for the elementary and secondary samples. The validity of the altGIA was established by correlating the latent factor of the altGIA with the latent factor of the tests included in the GIA on the WJ IV. The correlation between the latent g for the WJ IV GIA and altGIA was 1.00 for both the elementary and secondary samples, suggesting that they were essentially equivalent at the latent level. The coefficient omega for the altGIA was .69 for the elementary sample and .70 for the secondary sample. For comparison, the coefficient omega for the WJ IV GIA was .82 for the elementary sample and .81 for the secondary sample. Although omega values for the altGIA were slightly lower than the GIA, these values still suggest adequate reliability of the altGIA on the WJ IV.
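For readers unfamiliar with coefficient omega, the sketch below shows the usual computation for a one-factor model, assuming standardized loadings and uncorrelated residuals: the squared sum of loadings divided by that quantity plus the summed residual variances. The loadings are hypothetical and were chosen only to produce a value in the range reported above; they are not the estimated WJ loadings.

```python
def coefficient_omega(loadings):
    """Omega for a one-factor model with standardized loadings and uncorrelated residuals."""
    common = sum(loadings) ** 2                               # variance due to the factor
    unique = sum(1.0 - loading ** 2 for loading in loadings)  # summed residual variances
    return common / (common + unique)

# Hypothetical standardized loadings for seven indicators (not the WJ estimates).
hypothetical_loadings = [0.5] * 7
omega = coefficient_omega(hypothetical_loadings)
print(round(omega, 2))  # prints 0.7
```

Seven indicators that each load 0.5 on the factor already give an omega of .70, which illustrates how a composite of modestly loading tests can still reach the reliability levels reported for the altGIA.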
6.2. Integrated Cognitive–Achievement Models
All models used methods of multi-group path analysis and structural equation modeling (MG-SEM). In this approach, a single model is estimated simultaneously across groups; in this case, the Low, Average, and High ability groups. Cross-group equality constraints are then added to the model to determine if there are statistically significant differences across groups on specific model parameters. In this study, the equality of regression paths between cognitive abilities and reading skills was of primary interest. The likelihood ratio test was used to test nested models (i.e., models with cross-group equality constraints were compared to models without cross-group equality constraints). A statistically significant degradation in model fit would suggest that the paths are not statistically equal across groups, and a non-statistically significant degradation in model fit would suggest that paths are statistically equal across groups.
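The nested-model comparison described above reduces to a chi-square difference test. The sketch below is a minimal pure-Python version under stated assumptions: `chi2_sf` is a hand-written upper-tail chi-square probability (a series expansion of the incomplete gamma function), not the routine used by Mplus, and the example treats the unconstrained model as just-identified (χ2 = 0, df = 0) purely for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi_square_difference_test(chisq_constrained, df_constrained, chisq_free, df_free):
    """Likelihood ratio test comparing a constrained model with a less constrained one."""
    delta_chisq = chisq_constrained - chisq_free
    delta_df = df_constrained - df_free
    return delta_chisq, delta_df, chi2_sf(delta_chisq, delta_df)

# Illustration: a difference of 12.89 on 6 df, assuming a just-identified baseline.
delta, delta_df, p_value = chi_square_difference_test(12.89, 6, 0.0, 0)
print(delta, delta_df, round(p_value, 3))
```

A small p-value indicates that forcing the paths to be equal across the Low, Average, and High groups significantly worsens fit, so at least one path differs across groups.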
Three sets of models were planned for this study. Each set used an integrated model of cognitive–achievement relations in which cognitive abilities predicted both BRS and RC, and BRS predicted RC (i.e., cognitive abilities had only direct effects on BRS, but had both direct effects on RC and indirect effects on RC through BRS).
First, a model was estimated where the GIA predicted both BRS and RC, and BRS predicted RC. This was a simple mediation model that assumes the GIA is a predictor of both BRS and RC, and then BRS also predicts RC, where the effects of the GIA on RC may be partially mediated through BRS. This model only considers the effects of general intelligence on reading skills. The model is depicted in
Figure 1.
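The decomposition implied by this mediation model can be illustrated with ordinary least squares on simulated data. This is a sketch under assumptions, not the multi-group estimation used in the study: the data are simulated, the generating coefficients (0.6, 0.3, 0.5) are hypothetical, and `mediation_paths` is an illustrative helper. It shows that the total effect of GIA on RC equals the direct effect plus the indirect effect through BRS.

```python
import random

def cross_product(u, v):
    """Sum of centered cross-products of two equal-length sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

def mediation_paths(x, m, y):
    """OLS paths for x -> m (a), m -> y (b), and x -> y controlling for m (direct)."""
    sxx, smm, sxm = cross_product(x, x), cross_product(m, m), cross_product(x, m)
    sxy, smy = cross_product(x, y), cross_product(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                              # slope from regressing m on x
    b = (sxx * smy - sxm * sxy) / det          # slope for m in y ~ x + m
    direct = (smm * sxy - sxm * smy) / det     # slope for x in y ~ x + m
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": sxy / sxx}

# Simulated scores with hypothetical generating coefficients.
rng = random.Random(2)
gia = [rng.gauss(100.0, 15.0) for _ in range(5000)]
brs = [40.0 + 0.6 * g + rng.gauss(0.0, 10.0) for g in gia]
rc = [20.0 + 0.3 * g + 0.5 * b + rng.gauss(0.0, 10.0) for g, b in zip(gia, brs)]

paths = mediation_paths(gia, brs, rc)
print({name: round(value, 3) for name, value in paths.items()})
```

In the study itself these paths are estimated simultaneously in each ability group, and the cross-group equality constraints are placed on the unstandardized path coefficients.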
Second, a model where the seven broad CHC abilities predicted reading was estimated. In this model, the seven broad CHC abilities predicted both BRS and RC, and then BRS predicted RC. Like the first model, this model assumes that the broad CHC abilities predict both BRS and RC, and BRS also predicts RC, where the effects of the broad CHC abilities on RC may be partially mediated through BRS. The model is depicted in
Figure 2. This model examines the effects of the broad CHC abilities on reading skills but does not necessarily partial out variance that can be attributed to general intelligence and to the specific broad CHC abilities.
Third, a model similar to the previous model was estimated, but all of the broad CHC abilities loaded on a latent
g factor. Here, the common variance among the broad CHC abilities is partialed out, and the independent effects of
g and broad CHC abilities on BRS and RC can be estimated. In this model, there is no direct effect of
g on BRS and RC. Previous research with the WJ III and WJ IV suggests that the direct effects of
g on reading skills tend to be negative, and the effects of
g on reading skills are indirect (
Floyd et al. 2012;
Niileksela et al. 2016). These negative effects between
g and the academic skills were found when models included both direct paths from the broad abilities and
g to the academic skill simultaneously. However, other researchers have found large and positive direct effects of
g on reading skills (e.g.,
Beaujean et al. 2014;
Caemmerer et al. 2018), suggesting this finding may be specific to the test battery used. The model is depicted in
Figure 3.
All models were estimated using Mplus 7.4 (
Muthén and Muthén 1998–2012). Maximum likelihood estimation (MLE) was used for all models to account for missing data under the assumption that data were missing at random (i.e., scores were not missing due to the individual’s level of ability on the variable with a missing score;
Enders 2022).
8. Model Tests
8.1. General Intelligence Predicting Reading
The first model included the GIA as a predictor of BRS and RC and BRS as a predictor of RC. When equality constraints were added to the paths, there was a statistically significant degradation in model fit for the WJ III Elementary sample, χ2 (6) = 26.23, p < .001, WJ III Secondary sample, χ2 (6) = 50.64, p < .001, WJ IV Elementary sample, χ2 (6) = 28.88, p < .001, and WJ IV Secondary sample, χ2 (6) = 12.89, p = .045, suggesting statistically significant differences in the size of the path coefficients across the different ability groups across both the elementary and secondary samples on the WJ III and WJ IV.
Table 3 includes the unstandardized and standardized values for all paths in the model, as well as the
R2 for BRS and RC in each ability group for the WJ III and WJ IV Elementary and Secondary samples. Pairwise comparisons across the Low, Average, and High groups on all paths were examined to determine where there were differences in path coefficients. For the WJ III Elementary sample, the primary difference across groups was the path from the GIA to BRS, where the value for the Low group was larger than the Average and High groups, and the value for the Average group was larger than the High group. In addition, the
R2 for RC was larger for the Low group compared to the Average and High groups.
For the WJ III Secondary sample, there were more differences across ability groups. The paths from GIA to RC and from GIA to BRS were larger for the Low group compared to the Average and High groups. The R2 for BRS in the Low group was larger than the Average group, and the R2 for RC in the Low group was larger than the Average or High groups.
For the WJ IV Elementary sample, the path from BRS to RC was larger for the Low and Average groups compared to the High group, the path from GIA to RC was larger for the High group compared to the Low and Average groups, and the path from GIA to BRS was larger for the Average group compared to the High group. The R2 for RC in the Low group was larger than the Average or High groups. Finally, for the WJ IV Secondary sample, the path from BRS to RC was larger for the High group compared to the Average group. The path from GIA to RC was larger for the Low and Average groups compared to the High group. The R2 for RC and BRS was larger for the Low group compared to the Average group.
8.2. Broad CHC Abilities Predicting Reading
The second model included broad CHC abilities as predictors of BRS and RC, and BRS as a predictor of RC. When cross-group equality constraints were added to the paths from cognitive abilities to reading, there was not a statistically significant degradation in model fit for the WJ III Elementary sample, χ2 (30) = 30.00, p = .466, or the WJ IV Secondary sample, χ2 (30) = 42.86, p = .060. There was a statistically significant degradation in model fit for the WJ III Secondary sample, χ2 (30) = 69.38, p < .001, and the WJ IV Elementary sample, χ2 (30) = 72.24, p < .001, suggesting differences in the size of the path coefficients across the different ability groups for those samples.
Table 4 shows the results for the WJ III samples, and
Table 5 shows the results for the WJ IV samples. Not surprisingly, because the χ2 difference test was not statistically significant for the WJ III Elementary sample, there were few differences between groups in that sample. However, the
R2 was larger for the Low group compared to the Average and High groups for both BRS and RC. In the WJ III Secondary sample, the path from G
s to RC was larger for the Low group compared to the Average and High group, the path from G
lr to BRS was larger for the Average group compared to the High group, and the path from G
f to BRS was larger for the High group compared to the Average and Low groups. For BRS, the
R2 values for the Low group and the High group were both larger than the Average group, and for RC, the
R2 for the Low group was larger than the Average and High groups.
The WJ IV Elementary sample had several paths that were different across ability groups. The path from BRS to RC was larger for the Low and Average groups compared to the High group. The path from Gf to RC was larger for the Low group compared to the Average and High groups. The path from Gs was larger for the High group compared to the Low and Average groups. The path from Gf to BRS was larger for the Low group compared to the Average and High groups. The R2 for BRS was similar across all groups, but the R2 for RC was larger for the Low group compared to the Average and High groups.
Results from the WJ IV Secondary sample had few differences in the size of path coefficients across ability groups. The path from Gc to BRS was larger for the Low group compared to the Average group, the path from Gc to BRS was larger for the High group compared to the Average group, the path from Ga to BRS was larger for the Low group compared to the High group, the path from Gf to BRS was larger for the Low group compared to the Average group, and the path from Gsm to BRS was larger for the Average group compared to the Low group. The R2 for BRS was larger for the Low group compared to the Average and High groups, and the R2 for RC was larger for the Low group compared to the Average group.
8.3. Separating Effects of g and Broad Abilities Predicting Reading
Finally, the last model was the same as the previous model, except all broad CHC abilities loaded on a single g factor to separate variance that can be accounted for by g and the broad CHC abilities in reading. In this model, all the broad CHC abilities loaded on the g factor, and then the broad CHC abilities predicted both BRS and RC and BRS predicted RC. When cross-group equality constraints were added to the paths from the broad CHC abilities to reading, there was not a statistically significant degradation in model fit for the WJ III Elementary sample, χ2 (30) = 30.81, p = .425, or the WJ IV Secondary sample, χ2 (30) = 42.86, p = .061. There was a statistically significant degradation in model fit for the WJ III Secondary sample, χ2 (30) = 69.87, p < .001, and the WJ IV Elementary sample, χ2 (30) = 72.22, p < .001, suggesting differences in the size of the path coefficients across the different ability groups for those samples.
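The p values attached to the cross-group constraint tests in Sections 8.2 and 8.3 can be checked directly from the reported χ2 statistics. The sketch below is illustrative only and is not the software used in the study: the function name chi2_sf_even_df is hypothetical, and it relies on the closed-form upper-tail probability of a chi-square distribution, which is available only when the degrees of freedom are even (as they are here, df = 30).

```python
import math


def chi2_sf_even_df(statistic, df):
    """Upper-tail p value of a chi-square statistic, closed form for even df.

    For df = 2m, P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = statistic / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Chi-square difference statistics as reported in the text (df = 30 for all).
reported = {
    "8.2 WJ III Elementary": 30.00,
    "8.2 WJ IV Secondary": 42.86,
    "8.2 WJ III Secondary": 69.38,
    "8.2 WJ IV Elementary": 72.24,
    "8.3 WJ III Elementary": 30.81,
    "8.3 WJ IV Secondary": 42.86,
    "8.3 WJ III Secondary": 69.87,
    "8.3 WJ IV Elementary": 72.22,
}

for label, stat in reported.items():
    p = chi2_sf_even_df(stat, 30)
    print(f"{label}: chi2(30) = {stat:.2f}, p = {p:.3f}")
```

For odd degrees of freedom this shortcut does not apply, and a statistics library would be the more natural choice.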
Overall, the results for path coefficients and R2 were similar to the previous model because the paths from broad CHC abilities included the indirect effects of g and direct effects of broad CHC abilities. All the results are in Table 6 and Table 7. To separate variance in reading accounted for by g and the broad CHC abilities in BRS and RC, the total indirect effect of g on BRS and RC was squared and then subtracted from the R2. This value represented the remaining variance accounted for by the broad CHC abilities. The square root of that value represented the total effects of the broad CHC abilities on BRS and RC. These values are included in Table 8.
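The partition described above can be written compactly. The notation is introduced here for illustration and does not appear in the original analysis: for a reading outcome y (BRS or RC), let R2_y be the variance explained in y, let b_g denote the total indirect effect of g on y, and let b_broad denote the residualized total effect of the broad CHC abilities on y.

```latex
\begin{align}
R^2_{y,\text{broad}} &= R^2_{y} - b_{g}^{2} \\
b_{\text{broad}} &= \sqrt{R^2_{y,\text{broad}}}
\end{align}
```

Under this notation, the relative proportions of explained variance attributable to g and to the broad abilities are b_g^2 / R2_y and R2_y,broad / R2_y, respectively.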
For the WJ III Elementary sample, the total indirect effect of g on BRS was .35 for the Low group and .11 and .13 for the Average and High groups, respectively. This was larger for the Low group compared to the Average and High groups. The residualized total effects of the broad CHC abilities on BRS were .34, .31, and .29 for the Low, Average, and High groups, respectively. Importantly, the effect of g on BRS for the Low group was higher than for the other groups, but the total effect of broad CHC abilities after removing g was similar across groups. Because the R2 differed across groups, the relative proportion of variance in BRS accounted for by g and by the broad CHC abilities was calculated. Here, the relative variance accounted for by g and the broad CHC abilities was similar for the Low group, but the relative variance accounted for by the broad CHC abilities on BRS in the Average and High groups was much different, with g accounting for much less variance than the broad CHC abilities. This same pattern was present for RC, and, in general, this pattern was apparent through all samples. In other words, the variance accounted for in BRS and RC by g was consistently larger in the Low group than the Average and High groups, and the variance accounted for in BRS and RC by the broad CHC abilities was consistently larger in the Average and High groups compared to the Low group.
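The relative proportions discussed above can be reproduced, approximately, from the effects reported in this paragraph for BRS in the WJ III Elementary sample. The following is a minimal sketch rather than the procedure actually used: the function names are hypothetical, and the R2 for each group is reconstructed as the sum of the two squared effects (which follows from the partition described in the text) rather than taken from Table 8, so small rounding differences from the published values are possible.

```python
import math


def residual_broad_effect(r2, g_indirect):
    """Total effect of the broad abilities after removing g: sqrt(R2 - g^2)."""
    remaining = r2 - g_indirect ** 2
    if remaining < 0:
        raise ValueError("squared g effect exceeds R2")
    return math.sqrt(remaining)


def variance_shares(g_indirect, broad_residual):
    """Split explained variance into the shares attributable to g and broad abilities.

    R2 is reconstructed here as g^2 + broad^2 (an assumption of this sketch).
    """
    g_var = g_indirect ** 2
    broad_var = broad_residual ** 2
    r2 = g_var + broad_var
    return {"r2": r2, "g_share": g_var / r2, "broad_share": broad_var / r2}


# (indirect effect of g, residualized broad effect) on BRS, as reported in the text.
reported_brs = {"Low": (0.35, 0.34), "Average": (0.11, 0.31), "High": (0.13, 0.29)}

for group, (g_eff, broad_eff) in reported_brs.items():
    shares = variance_shares(g_eff, broad_eff)
    print(
        f"{group}: reconstructed R2 = {shares['r2']:.3f}, "
        f"g share = {shares['g_share']:.2f}, "
        f"broad share = {shares['broad_share']:.2f}"
    )
```

With these inputs, g and the broad abilities account for roughly equal shares of the explained variance in the Low group, whereas the broad abilities account for most of it in the Average and High groups, in line with the pattern described in the text.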
9. Discussion
The study of individual differences in human intelligence and its relationship with academic achievement remains an important area of inquiry. The field has recently called for more emphasis on integrated models of intelligence and achievement (e.g.,
Feraco et al. 2022;
Hajovsky et al. 2014;
Niileksela et al. 2016). However, research to date has not considered how ability level (IQ) may moderate cognitive–achievement relations. The current study examined cognitive–achievement relations using both global IQ and broad CHC cognitive abilities as predictors of basic reading skills and reading comprehension across elementary and secondary students. This study examined how these relationships differ by IQ level (low, average, and high) using the WJ III and WJ IV standardization samples.
Overall, the findings were generally consistent across WJ III and WJ IV samples and elementary and secondary school-age cohorts. General ability tended to explain more variance in basic reading skills and reading comprehension in the Low group compared to the Average and High groups. In other words, general cognitive ability (IQ) accounted for more of the achievement score variance for those groups demonstrating lower cognitive ability. These findings are consistent with SLODR, as it was hypothesized that
g would account for more variance in achievement outcomes for those with lower cognitive ability (and less differentiated specific cognitive abilities). Although Zaboski et al. (2018) did not examine cognitive–achievement relations as moderated by IQ level, their meta-analytic work suggests that general cognitive ability has the largest direct effects on achievement.
When the broad CHC abilities were used to predict reading outcomes, the broad abilities predicted basic reading skills and reading comprehension, and basic reading skills predicted reading comprehension. There were some differences among the ability groups in which broad CHC abilities predicted reading, with some of the most consistent differences across samples being the relation from Gs to reading comprehension, Gsm to basic reading skills, and Gf to basic reading skills.
Some of our findings are consistent with hypotheses proposed by the theory of mutualism. Specifically, working memory and basic reading skills relations were stronger in the WJ III secondary sample compared to the WJ III elementary sample. It has been suggested that certain cognitive abilities, like working memory and reading achievement relations, become stronger as age increases (Peng and Kievit 2020). However, SLODR would predict that these working memory and basic reading skills relations would be stronger for higher ability groups, which was not the case in all instances. We did find support for SLODR with working memory and basic reading skills in the WJ IV secondary sample, where relations were stronger for the average- to high-ability groups when compared to the low group. Prior research suggests that Gs is both directly and indirectly related to reading comprehension, although not consistently across ages, and that it may vary based on which edition of the WJ is used (e.g., Floyd et al. 2012; Niileksela et al. 2016). Similarly, extant findings suggest that Gsm is related to basic reading skills (Caemmerer et al. 2018; Evans et al. 2002; Floyd et al. 2007; Hajovsky et al. 2014; Niileksela et al. 2016), with Gf showing strong associations with basic reading skills in the WJ IV (Cormier et al. 2017).
Additionally, a consistent finding in these analyses was that the variance explained in basic reading skills and reading comprehension was greater for the Low group in the WJ III elementary and secondary samples. In order to examine the independent effects of
g and broad CHC abilities on reading, we residualized the broad CHC abilities by separating variance in basic reading skills and reading comprehension attributed to
g and broad CHC abilities (see
Caemmerer et al. 2018;
Hajovsky and Chesnut 2022 for other examples). By partitioning out variance attributable to
g from the broad CHC abilities, we found evidence consistent with SLODR. Specifically, when examining the relative proportion of variance explained in reading achievement outcomes (rather than the magnitude of the path coefficients),
g explained relatively more variance than the broad CHC abilities in both basic reading skills and reading comprehension for the Low group. These findings are consistent with studies showing that
g tends to explain the most variance in achievement outcomes in cognitive–achievement relations research (e.g.,
Niileksela et al. 2016;
Zaboski et al. 2018). Conversely, the broad CHC abilities explained relatively more variance than
g in both basic reading skills and reading comprehension in the Average and High groups. These findings align with both SLODR predictions of (a)
g explaining more variance in reading outcomes for the Low group, and (b) broad CHC abilities explaining more variance in reading outcomes for the Average and High groups. These findings are consistent with theoretical postulates according to cognitive ability differentiation (
Jensen 2003). In other words, because it is theoretically posited that
g would contribute less to the CHC broad abilities when
g (or IQ) is higher, there is more residual variance in the higher IQ groups. This phenomenon may explain why the relative effects of the CHC broad abilities on reading achievement were generally larger in the average to higher IQ groups compared to the lower IQ groups, and why the relative effects of
g (or IQ) on reading achievement were generally larger in the lower IQ groups compared to the average or higher IQ groups. As the proportion of variance in reading explained by the CHC broad abilities was generally larger in the average to higher IQ groups, this may also be explained according to the theory of mutualism (
Zhang and Peng 2023). Mutualism theory suggests that cognitive–reading achievement relations occur in students with higher abilities and thus may show stronger mutualistic effects, especially across grade levels. Our findings corroborate some of these theoretical suppositions, as some of the cognitive–reading achievement relations were stronger in the secondary grades WJ sample. This is consistent with hypotheses noted by
Peng and Kievit (
2020) that suggest students with higher abilities may generate more mutualism among skills. However, we did not examine the potential for bidirectional relations in this study, which may shed light on mutualistic effects between cognitive abilities and academic achievement across ability levels. As an example,
Zhang and Peng (
2023) have shown evidence of mutualistic effects between verbal working memory and reading in high-math students for children in elementary school. Mutualistic effects may be most clearly seen in longitudinal studies, where relationships between growth in cognitive abilities and academic achievement can be specifically modeled and evaluated. A longitudinal study that examines the mutualistic effects of a wide range of cognitive abilities and academic achievement across time would help clarify these relationships.
Although the findings are preliminary, the results from this study suggest that a differential interpretation of intelligence tests contingent on general ability of the tested individual (i.e., examinee) may be warranted. Identification of certain exceptionalities, such as SLD, intellectual disability, or gifted and talented considerations, may be impacted by IQ level and must be considered by researchers and practitioners. Where an individual falls on the normative IQ distribution impacts the degree to which outcomes can be explained or the strength of correlations between intelligence and achievement or progress monitoring performance over time. In other words, the relationship between two or more variables is stronger or weaker depending on where an individual falls on the IQ distribution. Examining these correlational patterns of strengths and weaknesses while considering the level of general ability may impact decisions regarding special education or disability service eligibility.
10. Implications of the Findings
The implications for students who are referred for psychoeducational evaluations for special education services under the Individuals with Disabilities Education Improvement Act (
IDEIA 2004) or disability resources in post-secondary education are important. Qualified evaluators, such as psychologists, medical providers, or trained diagnostic educators, provide comprehensive sources of evaluation documentation that often include cognitive and academic assessment data. These assessment data are then used to determine special education eligibility into one of the thirteen categories of support in U.S. schools (
IDEIA 2004). One of these categories is SLD, and for many states or evaluators who operate on the SLD discrepancy model or pattern of strengths and weaknesses (PSW), a distinction of SLD is given to students who have an unexpected discrepancy between cognitive ability and academic performance (e.g.,
Maki et al. 2015). The results of this study suggest that a child’s IQ ability level has a significant impact on this relationship. This calls into question our identification methods: are we measuring a disorder, or a difference in how a child uses specific cognitive abilities?
Given the findings of this study, practitioners need to consider IQ level in relation to student age and grade level, specifically when examining assessment data for elementary versus secondary students. This is essential when examining cognitive–achievement relations research and how it relates to SLD diagnostic accuracy. More recently, there has been a shift toward using the pattern of strengths and weaknesses within evaluation. However, the impact of IQ level has not been thoroughly researched, and the findings of this study suggest a pre-existing relationship between cognitive ability and academic achievement for children with lower achievement scores. The implications of this study impact the interpretation and diagnostic considerations of these cognitive and academic scores.
11. Limitations
A limitation of this research is the demographic diversity of the normative sample. The normative sample of the Woodcock–Johnson is selected to be representative of the U.S. population, but the results cannot be generalized to English language learners (ELL), immigrants, or refugees without adequate English level proficiency and U.S. cultural exposure. Given the lack of representation of diverse language and cultural backgrounds, the results may not generalize to these minoritized U.S. groups. Future research should examine heterogeneity within the groups, such as race, ethnicity, and language moderation, with SLODR. If IQ matters in terms of prediction, then it is possible that demographic status is a moderating variable. Conducting these analyses with other cognitive and achievement assessment batteries for culturally and linguistically diverse populations is essential to promote equity for underserved U.S. populations, including how these results impact the findings of this current study.
Further, researchers have suggested that SLODR may be a statistical artifact arising from the influence of disturbance factors (
Sorjonen and Melin 2020). For example, external factors, such as linguistic differences/confusion, illness, or individual motivation that varies in magnitude among individuals, may influence test scores and thus impact the validity of SLODR findings. In this manner, construct irrelevant variance (i.e., systematic error) related to studies of SLODR may introduce internal validity threats that are not easily controlled and thus impact the validity of inferences drawn from study results. Future studies should seek to control these possible confounding factors to more accurately assess SLODR. An additional validity concern is related to whether the constructs are being measured the same way across the different ability groups. We used observed variables in this study; therefore, we could not test the extent to which the constructs demonstrate measurement invariance across ability groups prior to testing the strength of the predictive paths. Future studies may overcome this concern by utilizing latent variable models.
Another area for future studies is looking at cognitive–achievement relationships as they vary by achievement level using quantile regression. The model used in this study might appear more reliable for children with lower cognitive ability. Thus, future researchers should include the effects of reading, writing, or mathematical computation skills at different levels to better understand the multifaceted decisions for distinguishing between disabilities and the surrounding considerations. Finally, the integrated model used here assumed that the effects of cognitive abilities on reading comprehension are partially mediated through basic reading skills. In cross-sectional data, this does not account for the passage of time and assumes that mediation effects occur instantaneously (
Cole and Maxwell 2003;
Preacher 2015). Future studies can overcome this limitation by addressing these integrated models of cognitive–achievement relations through the use of longitudinal data.