Linking Error Estimation in Stocking–Lord Linking
Abstract
1. Introduction
2. Assessing the Standard Error, Linking Error, and Total Error in Stocking–Lord Linking
2.1. Standard Error Estimation
2.2. Linking Error Estimation Based on Taylor Approximation
2.3. Jackknife Linking Error
2.4. Approximate Jackknife Linking Error
2.5. Bias-Corrected Approximate Jackknife Linking Error
3. Simulation Study 1: Infinite Sample Size
3.1. Method
3.2. Results
4. Simulation Study 2: Finite Sample Sizes
4.1. Method
4.2. Results
5. Discussion
6. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| 2PL | two-parameter logistic |
| AJK | approximate jackknife |
| DGM | data-generating model |
| IRF | item response function |
| IRT | item response theory |
| JK | jackknife |
| MML | marginal maximum likelihood |
| LE | linking error |
| SD | standard deviation |
| SE | standard error |
| SL | Stocking–Lord |
| TCF | test-characteristic function |
| TE | total error |

| Par |  | JK | AJK | TAY | JK | AJK | TAY | JK | AJK | TAY |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 10 | 93.1 | 93.1 | 90.1 | 94.1 | 94.2 | 91.5 | 93.5 | 93.5 | 91.0 |
|  | 20 | 94.6 | 94.6 | 93.1 | 94.2 | 94.2 | 92.9 | 94.4 | 94.4 | 93.1 |
|  | 40 | 94.6 | 94.6 | 93.8 | 94.9 | 94.9 | 94.3 | 94.8 | 94.8 | 94.0 |
|  | 80 | 94.8 | 94.8 | 94.4 | 95.4 | 95.4 | 94.9 | 94.7 | 94.7 | 94.5 |
|  | 10 | 95.6 | 95.5 | 91.6 | 95.0 | 94.8 | 91.2 | 94.6 | 94.6 | 89.9 |
|  | 20 | 95.1 | 95.0 | 92.9 | 95.1 | 95.0 | 93.1 | 95.1 | 95.0 | 92.9 |
|  | 40 | 94.6 | 94.5 | 93.3 | 94.9 | 94.9 | 94.0 | 95.0 | 95.0 | 93.9 |
|  | 80 | 95.5 | 95.5 | 94.9 | 95.0 | 95.0 | 94.2 | 94.5 | 94.5 | 94.0 |

|  |  |  | LE |  | LE |  | LE |  | LE |  | LE |  | LE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 500 | 0.053 | 0.000 | 0.117 | 0.102 | 0.215 | 0.207 | 0.085 | 0.000 | 0.101 | 0.044 | 0.133 | 0.095 |
| 10 | 1000 | 0.038 | 0.000 | 0.109 | 0.102 | 0.211 | 0.207 | 0.060 | 0.000 | 0.079 | 0.048 | 0.117 | 0.098 |
| 10 | 2000 | 0.027 | 0.000 | 0.110 | 0.107 | 0.213 | 0.211 | 0.043 | 0.000 | 0.068 | 0.051 | 0.111 | 0.101 |
| 10 | 4000 | 0.019 | 0.000 | 0.106 | 0.104 | 0.208 | 0.207 | 0.030 | 0.000 | 0.059 | 0.050 | 0.104 | 0.099 |
| 20 | 500 | 0.035 | 0.000 | 0.080 | 0.071 | 0.146 | 0.141 | 0.051 | 0.000 | 0.062 | 0.032 | 0.086 | 0.067 |
| 20 | 1000 | 0.025 | 0.000 | 0.076 | 0.071 | 0.145 | 0.143 | 0.036 | 0.000 | 0.050 | 0.034 | 0.077 | 0.067 |
| 20 | 2000 | 0.018 | 0.000 | 0.073 | 0.071 | 0.143 | 0.141 | 0.025 | 0.000 | 0.043 | 0.034 | 0.071 | 0.066 |
| 20 | 4000 | 0.013 | 0.000 | 0.072 | 0.071 | 0.143 | 0.142 | 0.018 | 0.000 | 0.039 | 0.034 | 0.070 | 0.067 |
| 40 | 500 | 0.024 | 0.000 | 0.055 | 0.049 | 0.102 | 0.099 | 0.033 | 0.000 | 0.041 | 0.023 | 0.057 | 0.046 |
| 40 | 1000 | 0.017 | 0.000 | 0.053 | 0.050 | 0.100 | 0.099 | 0.023 | 0.000 | 0.033 | 0.024 | 0.052 | 0.047 |
| 40 | 2000 | 0.012 | 0.000 | 0.051 | 0.050 | 0.099 | 0.099 | 0.016 | 0.000 | 0.029 | 0.023 | 0.049 | 0.046 |
| 40 | 4000 | 0.009 | 0.000 | 0.050 | 0.049 | 0.099 | 0.099 | 0.012 | 0.000 | 0.026 | 0.024 | 0.048 | 0.046 |

| Par |  |  | SE | TE |  | SE | TE |  | SE | TE |  |
|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 10 | 500 | 95.4 | 98.1 | 96.1 | 80.8 | 97.3 | 95.5 | 57.1 | 94.8 | 93.8 |
|  | 10 | 1000 | 95.0 | 97.8 | 95.5 | 70.4 | 96.3 | 95.0 | 44.0 | 94.8 | 94.3 |
|  | 10 | 2000 | 91.6 | 97.7 | 95.6 | 55.5 | 94.9 | 94.3 | 32.2 | 94.5 | 94.1 |
|  | 10 | 4000 | 94.9 | 98.1 | 95.6 | 44.0 | 94.6 | 94.0 | 23.8 | 94.1 | 94.0 |
|  | 20 | 500 | 95.4 | 97.0 | 95.6 | 85.1 | 96.1 | 95.1 | 67.9 | 96.0 | 95.2 |
|  | 20 | 1000 | 95.3 | 97.1 | 95.6 | 78.1 | 95.8 | 95.1 | 53.7 | 95.2 | 94.8 |
|  | 20 | 2000 | 95.7 | 97.3 | 96.0 | 66.3 | 95.8 | 95.2 | 41.3 | 94.8 | 94.6 |
|  | 20 | 4000 | 95.2 | 97.0 | 95.4 | 51.3 | 94.8 | 94.4 | 29.9 | 94.8 | 94.7 |
|  | 40 | 500 | 93.6 | 94.8 | 93.6 | 88.4 | 94.7 | 93.8 | 73.7 | 94.1 | 93.8 |
|  | 40 | 1000 | 94.7 | 95.6 | 94.9 | 84.2 | 95.2 | 94.6 | 64.6 | 95.3 | 95.0 |
|  | 40 | 2000 | 94.0 | 95.3 | 94.2 | 76.7 | 95.2 | 94.9 | 50.4 | 95.0 | 94.8 |
|  | 40 | 4000 | 92.9 | 94.6 | 93.1 | 61.8 | 94.7 | 94.4 | 38.4 | 95.2 | 95.1 |
|  | 10 | 500 | 94.9 | 99.0 | 95.9 | 92.0 | 98.8 | 95.1 | 81.8 | 97.7 | 92.8 |
|  | 10 | 1000 | 94.9 | 99.1 | 96.2 | 88.1 | 98.0 | 93.9 | 73.7 | 97.0 | 94.1 |
|  | 10 | 2000 | 92.1 | 99.1 | 96.2 | 77.8 | 97.3 | 93.4 | 57.8 | 95.4 | 93.5 |
|  | 10 | 4000 | 95.1 | 99.3 | 96.3 | 71.4 | 97.2 | 93.9 | 48.6 | 96.3 | 95.0 |
|  | 20 | 500 | 95.0 | 98.0 | 95.5 | 93.3 | 98.0 | 95.4 | 86.5 | 97.7 | 95.2 |
|  | 20 | 1000 | 95.1 | 98.5 | 95.9 | 89.9 | 97.9 | 95.5 | 77.7 | 97.1 | 95.3 |
|  | 20 | 2000 | 95.6 | 98.4 | 96.3 | 84.8 | 97.5 | 94.9 | 65.5 | 96.2 | 95.0 |
|  | 20 | 4000 | 94.9 | 98.2 | 95.6 | 77.3 | 96.8 | 95.4 | 53.2 | 95.4 | 94.8 |
|  | 40 | 500 | 94.8 | 97.3 | 95.0 | 93.4 | 97.3 | 95.0 | 89.0 | 96.6 | 94.9 |
|  | 40 | 1000 | 95.0 | 97.1 | 95.2 | 91.9 | 96.9 | 95.2 | 83.4 | 96.6 | 95.3 |
|  | 40 | 2000 | 94.4 | 97.0 | 94.7 | 88.1 | 96.2 | 94.6 | 75.0 | 96.1 | 95.5 |
|  | 40 | 4000 | 94.8 | 97.2 | 95.1 | 82.5 | 96.2 | 94.9 | 60.6 | 95.6 | 95.2 |

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Robitzsch, A. Linking Error Estimation in Stocking–Lord Linking. Foundations 2025, 5, 2. https://doi.org/10.3390/foundations5010002