Exploring Complex Survival Data through Frailty Modeling and Regularization
Abstract
1. Introduction
2. Data and Model Formulation
3. Likelihood Function and the Proposed Methodologies
4. The Algorithms
5. Theoretical Properties
- I. The parameter set is open.
- II. The objective function is continuously differentiable.
- III. The set defined as is compact in .
- IV. The surrogate function is continuously differentiable in and continuous in .
- V. The stationary points of the objective function are isolated.
- VI. The surrogate function has a unique global maximum.
- (i) is continuous at if VI is satisfied.
- (ii) If I–VI are satisfied, for any initial value , when , tends to stationary point . Moreover, = , and the likelihood sequence strictly increases to if for all k.
- (a) is continuously differentiable.
- (b) γ and are compact, given .
- (c) Stationary points for are isolated.
Algorithm 1. Estimating procedure.
- S1. Provide initial values for the parameters γ, , and .
- S2. Update the estimate of γ using (8).
- S3. Update the estimates of the covariate parameters , using (17) for .
- S4. Compute the estimate of , using (10) with the previous estimated from S3.
- S5. Repeat S2 to S4 until convergence.
Algorithm 2. An alternative method.
- S1. Provide initial values for the parameters γ, , and .
- S2. Update the estimate of γ using (8).
- S3. Under the profile MM method, is updated by maximizing the equation
- S4. Compute the estimate of , using (10) with estimated in S3.
- S5. Repeat S2 to S4 until convergence.
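The steps shared by Algorithms 1 and 2 amount to cycling through block updates until the estimates stabilize. The following Python sketch illustrates only that control flow. The update rules (8), (17) and (10) are not reproduced here, so the three update functions are hypothetical stand-ins derived from a toy concave objective, and the names `iterate_blocks`, `gamma`, `beta` and `baseline` are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
Update = Callable[[Params], float]


def iterate_blocks(
    init: Params,
    updates: List[Tuple[str, Update]],
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Tuple[Params, int]:
    """Apply the block updates in order (S2-S4) and repeat until the
    largest absolute parameter change in a sweep is below tol (S5)."""
    params = dict(init)  # S1: start from the supplied initial values
    for sweep in range(1, max_iter + 1):
        previous = dict(params)
        for name, update in updates:
            # Each update sees the most recent values of the other blocks.
            params[name] = update(params)
        change = max(abs(params[k] - previous[k]) for k in params)
        if change < tol:
            return params, sweep
    raise RuntimeError("no convergence within max_iter sweeps")


# Hypothetical stand-ins for (8), (17) and (10): exact coordinate-wise
# maximizers of the toy objective -(g - 1)^2 - (b - 2)^2 - g * b,
# plus a closed-form third block that depends on the second one.
toy_updates: List[Tuple[str, Update]] = [
    ("gamma", lambda p: 1.0 - p["beta"] / 2.0),
    ("beta", lambda p: 2.0 - p["gamma"] / 2.0),
    ("baseline", lambda p: 3.0 + p["beta"]),
]

estimates, n_sweeps = iterate_blocks(
    {"gamma": 0.5, "beta": 0.5, "baseline": 0.5}, toy_updates
)
print(estimates, n_sweeps)
```

In this toy problem the sweeps contract toward the stationary point (0, 2, 5). The stopping rule on the largest parameter change is one common choice and is assumed here; a rule based on the change in the objective would fit the same loop.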
6. Simulation Study
7. Real Data Application
8. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| | Log-Normal Frailty | | | Inverse Gaussian Frailty | | | Gamma Frailty | | |
|---|---|---|---|---|---|---|---|---|---|
| K | 31.2780 | | | 162.1980 | | | 64.7648 | | |
| T | 4.7591 | | | 12.6620 | | | 5.1203 | | |
| L | | | | | | | | | |
| | MLE | Bias | SD | MLE | Bias | SD | MLE | Bias | SD |
| | 0.2729 | | 0.4236 | 0.8502 | | 0.3856 | 2.1017 | 0.1017 | 0.4097 |
| | | | 0.3814 | | | 0.4252 | | | 0.3844 |
| | | | 0.4076 | | | 0.4224 | | | 0.3818 |
| | | | 0.3822 | | | 0.3893 | | | 0.4010 |
| | 1.0606 | 0.0606 | 0.3928 | 1.0692 | 0.0692 | 0.3973 | 1.0348 | 0.0348 | 0.4015 |
| | 2.0530 | 0.0530 | 0.3851 | 2.1129 | 0.1129 | 0.4162 | 2.0891 | 0.0891 | 0.3830 |
| | 3.1770 | 0.1770 | 0.4159 | 3.1770 | 0.1770 | 0.4159 | 3.1120 | 0.1120 | 0.3890 |
| | 3.1071 | 0.1071 | 0.3849 | 3.2102 | 0.2102 | 0.4277 | 3.1410 | 0.1410 | 0.4070 |
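The MLE, Bias and SD columns of the table above are the kind of summaries usually computed across simulation replications. The Python sketch below shows one way to obtain them, assuming that MLE denotes the average estimate over replications, that bias is the average estimate minus the true value, and that SD is the sample standard deviation of the estimates. These definitions, the function name `summarize_replications` and the input values are assumptions for illustration and are not taken from the paper.

```python
import statistics
from typing import Dict, Sequence


def summarize_replications(
    estimates: Sequence[float], true_value: float
) -> Dict[str, float]:
    """Summarize the estimates of one parameter across replications."""
    mean_estimate = statistics.fmean(estimates)
    return {
        "mle": mean_estimate,                   # average estimate
        "bias": mean_estimate - true_value,     # assumed sign convention
        "sd": statistics.stdev(estimates),      # sample SD (n - 1 divisor)
    }


# Hypothetical estimates of a parameter whose true value is 1.
summary = summarize_replications([0.9, 1.1, 1.3], true_value=1.0)
print(summary)
```

If the opposite sign convention is intended (true value minus average estimate), only the `bias` line changes.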
Frailty | Sparsity Penalty | P (Selecting the True Model) | Zeros (Correct) | Zeros (Incorrect) |
---|---|---|---|---|
Gamma | MCP () | 1 | 27 | 0 |
Gamma | SCAD () | 1 | 27 | 0 |
Log-Normal | MCP () | 1 | 27 | 0 |
Log-Normal | SCAD () | 1 | 27 | 0 |
Inverse Gaussian | MCP () | 1 | 27 | 0 |
Inverse Gaussian | SCAD () | 1 | 27 | 0 |
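For reference, the two sparsity penalties compared in the table above have standard closed forms. The Python sketch below evaluates them for a single coefficient. The function names and the default shape parameters (`a = 3.7` for SCAD, `gam = 3.0` for MCP) are illustrative assumptions; the tuning values actually used in the simulations are not shown in the table.

```python
def scad_penalty(t: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty evaluated at |t| with tuning parameter lam >= 0 and a > 2."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(t)
    if t <= lam:
        # Linear (lasso-like) part near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalization rate.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant part: large coefficients receive no additional penalty.
    return lam * lam * (a + 1) / 2


def mcp_penalty(t: float, lam: float, gam: float = 3.0) -> float:
    """MCP penalty evaluated at |t| with tuning parameter lam >= 0 and gam > 1."""
    if lam < 0 or gam <= 1:
        raise ValueError("require lam >= 0 and gam > 1")
    t = abs(t)
    if t <= gam * lam:
        # Penalization rate decreases linearly from lam to zero.
        return lam * t - t * t / (2 * gam)
    # Constant part beyond gam * lam.
    return gam * lam * lam / 2


for value in (0.5, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0), mcp_penalty(value, lam=1.0))
```

Both penalties level off for large coefficients, which is what distinguishes them from the lasso penalty. The MCP shape parameter is named `gam` here to avoid confusion with the parameter γ used in the algorithms.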
Frailty | Par. | MCP MLE | MCP Bias | MCP SD | SCAD MLE | SCAD Bias | SCAD SD |
---|---|---|---|---|---|---|---|
Gamma | | 2.1012 | | 0.3106 | 2.0678 | | 0.3223 |
Gamma | | 2.0891 | | 0.2438 | 1.9984 | 0.0016 | 0.2081 |
Gamma | | 3.1120 | | 0.3517 | 2.9893 | 0.0107 | 0.2729 |
Gamma | | 4.1410 | | 0.3459 | 4.0175 | | 0.2920 |
Log-Normal | | 0.4641 | 0.0359 | 0.1182 | 0.4696 | 0.0304 | 0.1101 |
Log-Normal | | 2.0563 | | 0.1831 | 2.0282 | | 0.1566 |
Log-Normal | | 2.9962 | 0.0038 | 0.3565 | 2.9705 | 0.0295 | 0.3080 |
Log-Normal | | 3.9756 | 0.0244 | 0.2931 | 4.0010 | | 0.2230 |
Inverse Gaussian | | 0.8999 | 0.1001 | 0.2024 | 1.0193 | | 0.1717 |
Inverse Gaussian | | 2.0562 | | 0.2240 | 2.0305 | | 0.1978 |
Inverse Gaussian | | 3.0888 | | 0.2591 | 3.0052 | | 0.2548 |
Inverse Gaussian | | 4.0915 | | 0.1816 | 4.0055 | | 0.1794 |
Frailty | Penalty | BIC | No. of Significant Variables |
---|---|---|---|
Log-Normal | MCP | 160,361 | 11 |
Log-Normal | SCAD | 160,361 | 11 |
Inverse Gaussian | MCP | 175,223 | 11 |
Inverse Gaussian | SCAD | 175,127 | 11 |
Gamma | MCP | 160,684 | 31 |
Gamma | SCAD | 160,690 | 31 |
The Inverse Gaussian Frailty Model | |
---|---|
Clinical | |
Fraction of inspired oxygen | 0.221 |
Glucose | |
Heart Rate | −0.069 |
Height | −0.173 |
Oxygen saturation | 0.109 |
Respiratory rate | −0.147 |
Elective | 0.486 |
Emergency room admit (Admission location) | −0.152 |
Phys referral/normal deli (Admission location) | 0.220 |
Personal Information | |
Engl (Language) | 0.156 |
Married (Marriage status) | 0.079 |
The Log-Normal Frailty Model | |
---|---|
Clinical | |
Fraction of inspired oxygen | 0.208 |
Glucose | −0.120 |
Heart Rate | −0.065 |
Height | −0.165 |
Oxygen saturation | 0.098 |
Respiratory rate | −0.139 |
Elective | 0.433 |
Emergency room admit (Admission location) | −0.138 |
Phys referral/normal deli (Admission location) | 0.229 |
Personal Information | |
Engl (Language) | 0.134 |
Married (Marriage status) | 0.070 |
The Gamma Frailty Model | |
---|---|
Clinical | |
Fraction of inspired oxygen | 0.246 |
Glucose | −0.048 |
Height | −0.162 |
Respiratory rate | −0.141 |
Elective | 0.533 |
Phys referral/normal deli (Admission location) | 0.365 |
TRANSFER FROM OTHER HEALT | 0.833 |
Personal Information | |
Engl (Language) | 0.220 |
Separated (Marriage status) | 0.355 |
Unitarian universalist | 0.338 |
Ethnicity | |
Asian | −0.317 |
Hispanic/Latino–Puerto Rican | 0.373 |
Multi-race ethnicity | 0.262 |
Asian–other | −0.265 |
Hispanic/Latino–Colombian | 1.790 |
Hispanic/Latino–Dominican | 0.465 |
Middle eastern | 0.306 |
Hispanic/Latino–Cuban | 0.664 |
Asian–Asian Indian | 0.664 |
White–eastern European | 0.362 |
White–Brazilian | 0.888 |
Portuguese | −0.315 |
Hispanic/Latino–Mexican | −0.345 |
Asian–Japanese | −1.110 |
Hispanic/Latino–Salvadoran | 0.808 |
American Indian/Alaska native federally recognized tribe | −0.841 |
Asian–Filipino | −0.465 |
Asian–Korean | 0.263 |
Hispanic/Latino–Guatemalan | −0.221 |
American Indian/Alaska native | −0.426 |
Asian–Cambodian | 0.240 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, X.; Xu, J.; Zhou, Y. Exploring Complex Survival Data through Frailty Modeling and Regularization. Mathematics 2023, 11, 4440. https://doi.org/10.3390/math11214440