Multicollinearity and Linear Predictor Link Function Problems in Regression Modelling of Longitudinal Data
Abstract
1. Introduction
2. Model and Estimation Procedure
2.1. GPLMs for Longitudinal Data
2.2. Ridge Generalized Estimating Equation (RGEE)
Algorithm 1: Monte Carlo Newton–Raphson (MCNR) algorithm
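As a rough illustration of what a ridge-penalized estimating equation does, the sketch below solves a simplified version: an identity link, an exchangeable working correlation with a fixed value `rho`, and a fixed ridge parameter `lam`. It is not the MCNR procedure of Algorithm 1 and omits the nonparametric component of a GPLM. The names `ridge_gee` and `simulate`, and all of their parameters, are hypothetical and chosen only for this example.

```python
import numpy as np


def ridge_gee(X_list, y_list, lam, rho=0.0, max_iter=25, tol=1e-10):
    """Newton-type solver for a ridge-penalized GEE (identity link only).

    Solves sum_i X_i' R^{-1} (y_i - X_i beta) - lam * beta = 0, where R is an
    exchangeable working correlation matrix with a fixed parameter rho.
    """
    p = X_list[0].shape[1]
    beta = np.zeros(p)
    for _ in range(max_iter):
        H = lam * np.eye(p)   # penalized sensitivity matrix
        U = -lam * beta       # penalized estimating function
        for X, y in zip(X_list, y_list):
            m = X.shape[0]
            R = (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))
            # Solve R^{-1} [X, residual] in one call instead of inverting R.
            W = np.linalg.solve(R, np.column_stack([X, y - X @ beta]))
            H += X.T @ W[:, :p]
            U += X.T @ W[:, p]
        step = np.linalg.solve(H, U)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def simulate(n_subjects=50, m=4, seed=0):
    """Simulated longitudinal data with strongly collinear covariates (assumed design)."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, 1.0, -1.0])
    X_list, y_list = [], []
    for _ in range(n_subjects):
        z = rng.normal(size=(m, 1))
        X = z + 0.05 * rng.normal(size=(m, 3))   # columns share the factor z
        b = rng.normal()                         # subject-level random intercept
        y = X @ beta_true + b + rng.normal(size=m)
        X_list.append(X)
        y_list.append(y)
    return X_list, y_list


X_list, y_list = simulate()
beta_gee = ridge_gee(X_list, y_list, lam=0.0, rho=0.3)    # unpenalized GEE
beta_rgee = ridge_gee(X_list, y_list, lam=1.0, rho=0.3)   # ridge-penalized GEE
print("GEE :", np.round(beta_gee, 3))
print("RGEE:", np.round(beta_rgee, 3))
```

With an identity link the iteration converges after a single step, and with `rho=0` the solution coincides with ordinary ridge regression on the stacked data. A non-identity link would require replacing `X` in the update with the derivative of the mean function and adding the marginal variance weights.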
2.3. Asymptotics
3. Numerical Analyses
3.1. Simulations
3.2. AIDS Data Analysis
4. Concluding Remarks and Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
- (A.1) Number of observations over time () is a bounded sequence of positive integers, and the distinct values of form a quasi-uniform sequence that grows dense on , and the kth derivative of is bounded for some ;
- (A.2) The covariates , , are uniformly bounded;
- (A.3) The unknown parameter belongs to a compact subset , the true parameter value lies in the interior of ;
- (A.4) There exist two positive constants, and , such that
Methods | Parameters | | | | |
---|---|---|---|---|---|
RGEE-C | | 0.028(0.032) | 0.033(0.038) | 0.046(0.052) | 0.141(0.142) |
RGEE-C | | 0.024(0.027) | 0.029(0.032) | 0.039(0.044) | 0.122(0.126) |
RGEE-C | | 0.056(0.027) | 0.067(0.032) | 0.092(0.044) | 0.284(0.127) |
RGEE-C | | 0.055(0.029) | 0.066(0.035) | 0.090(0.048) | 0.279(0.137) |
RGEE-C | | 0.080(0.035) | 0.097(0.045) | 0.137(0.068) | 0.446(0.209) |
RGEE-C | MSE | 0.243 | 0.291 | 0.404 | 1.271 |
RGEE-I | | 0.041(0.033) | 0.049(0.039) | 0.068(0.053) | 0.209(0.150) |
RGEE-I | | 0.057(0.029) | 0.068(0.034) | 0.093(0.047) | 0.289(0.137) |
RGEE-I | | 0.055(0.029) | 0.065(0.034) | 0.090(0.047) | 0.278(0.137) |
RGEE-I | | 0.061(0.031) | 0.073(0.037) | 0.101(0.051) | 0.311(0.146) |
RGEE-I | | 0.070(0.037) | 0.076(0.048) | 0.092(0.073) | 0.249(0.236) |
RGEE-I | MSE | 0.284 | 0.331 | 0.444 | 1.335 |
GEE-C | | 0.027(0.055) | 0.033(0.066) | 0.045(0.090) | 0.129(0.279) |
GEE-C | | 0.024(0.044) | 0.028(0.052) | 0.039(0.072) | 0.113(0.223) |
GEE-C | | 0.058(0.051) | 0.069(0.061) | 0.096(0.084) | 0.308(0.259) |
GEE-C | | 0.055(0.050) | 0.065(0.060) | 0.089(0.082) | 0.244(0.254) |
GEE-C | | 0.082(0.053) | 0.100(0.068) | 0.143(0.103) | 0.449(0.391) |
GEE-C | MSE | 0.246 | 0.295 | 0.411 | 1.243 |
GEE-I | | 0.040(0.057) | 0.048(0.068) | 0.065(0.094) | 0.184(0.289) |
GEE-I | | 0.055(0.048) | 0.065(0.057) | 0.088(0.078) | 0.237(0.242) |
GEE-I | | 0.057(0.052) | 0.069(0.062) | 0.096(0.085) | 0.320(0.262) |
GEE-I | | 0.061(0.056) | 0.072(0.067) | 0.099(0.092) | 0.271(0.284) |
GEE-I | | 0.072(0.055) | 0.079(0.071) | 0.099(0.110) | 0.262(0.423) |
GEE-I | MSE | 0.286 | 0.333 | 0.447 | 1.275 |

Coefficients | RGEE | GEE | Coefficients | RGEE | GEE |
---|---|---|---|---|---|
AGE | 3.987 (0.006) | 4.298 (0.009) | AGE*CESD | −0.268 (0.008) | −0.262 (0.001) |
SMOKE | 32.780 (0.053) | 32.916 (0.062) | SMOKE*DRUG | −16.204 (0.046) | −16.221 (0.055) |
DRUG | 17.949 (0.066) | 18.254 (0.075) | SMOKE*SEXP | 4.051 (0.002) | 4.057 (0.005) |
SEXP | 2.801 (0.009) | 2.797 (0.013) | SMOKE*CESD | −0.268 (0.003) | −0.251 (0.002) |
CESD | −3.077 (0.002) | −3.077 (0.005) | DRUG*SEXP | −1.205 (0.005) | −1.292 (0.013) |
AGE*SMOKE | 0.039 (0.002) | −0.007 (0.003) | DRUG*CESD | 0.274 (0.003) | 0.273 (0.005) |
AGE*DRUG | −1.006 (0.003) | −1.017 (0.009) | SEXP*CESD | 0.033 (0.008) | 0.026 (0.001) |
AGE*SEXP | −0.565 (0.003) | −0.596 (0.001) | | | |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Taavoni, M.; Arashi, M.; Manda, S. Multicollinearity and Linear Predictor Link Function Problems in Regression Modelling of Longitudinal Data. Mathematics 2023, 11, 530. https://doi.org/10.3390/math11030530