Article

Nonparametric Sieve Maximum Likelihood Estimation of Semi-Competing Risks Data

School of Mathematics, Yunnan Normal University, Kunming 650092, China
Mathematics 2022, 10(13), 2248; https://doi.org/10.3390/math10132248
Submission received: 23 May 2022 / Revised: 13 June 2022 / Accepted: 22 June 2022 / Published: 27 June 2022
(This article belongs to the Special Issue Recent Advances in Computational Statistics)

Abstract

In biomedical studies involving time-to-event data, a subject may experience distinct types of events. We consider the problem of estimating the transition functions for a semi-competing risks model under the illness-death model framework. We propose to estimate the intensity functions by maximizing a B-spline-based sieve likelihood. The method yields smooth estimates without parametric assumptions. The proposed approach facilitates easy computation of the covariance of the model parameters and yields direct interpretation. Compared with existing approaches, the proposed method requires neither the subjective specification of a frailty distribution nor the Markov or semi-Markov assumption, either of which may be unmet in real applications. We establish the consistency, the convergence rate, and the asymptotic normality of the proposed estimators under some regularity conditions. We also provide simulation studies to assess the finite-sample performance of the proposed modeling and estimation strategy. A real data application is further used to illustrate the proposed methodology.

1. Introduction

In survival analysis, a subject may experience several distinct types of failures. If, apart from censoring, the follow-up period ends upon the occurrence of the first event, such data are often referred to as competing risks data. This framework consists of survival data where failure may be due to one of a number of competing causes. In some applications, with additional information, this notion can be extended to that of semi-competing risks ([1,2]), where one type of event (the terminal event, e.g., death) may censor the other events (the non-terminal event, e.g., relapse of the disease), but not vice versa. The framework of semi-competing risks data has been previously discussed in [1,3]. Furthermore, competing risks data can also be regarded as a special type of multitask prediction problem, which simultaneously predicts multiple outcomes from the same set of predictors. A stacking algorithm that borrows information among multiple prediction tasks to improve multivariate prediction performance (MTPS) was recently proposed in [4]. The MTPS is shown to outperform existing multivariate prediction methods.
Recently, [5] suggested that semi-competing risks data can also be analyzed using the conventional illness–death compartment model, by subjectively specifying a frailty distribution and postulating the Markov or semi-Markov assumption for the conditional transition functions given the covariates and the frailty ([6,7]). However, the subjective specification of the frailty distribution or the Markov or semi-Markov assumption may be unmet in some practical applications, leading to inconsistent estimators. In such cases, alternative (non-Markov) estimators are needed. Furthermore, their nonparametric maximum likelihood estimation approach may be computationally demanding when the sample size is large.
To address the theoretical and numerical challenges in the semiparametric estimation of the semi-competing risks model, we employ a B-spline-based sieve maximum likelihood approach to simultaneously estimate the regression parameters and the transition functions. Covariates are incorporated naturally via proportional hazards assumptions. This approach facilitates easy calculation of the covariance of the model parameters. The proposed spline estimation algorithm requires much less computation than the isotonic-type algorithm used in [5], since the size of the step function is much larger than the number of parameters in our B-spline-based approach. Under certain regularity conditions, we prove that the estimators of the regression parameters are root-n consistent, asymptotically normal, and semiparametrically efficient.
The rest of the paper is organized as follows. In Section 2, we introduce the proposed model and estimation approach. In Section 3, we study the asymptotic properties of the proposed estimators. In Section 4, we provide simulation results. An application to colon cancer data is given in Section 5. We conclude with some discussion in Section 6. All proofs are relegated to Appendix A.

2. Methodology

2.1. Model and Likelihood Function

For the $i$th subject, let $C_i$, $X_i$, $T_{i1}$, and $T_{i2}$ denote the censoring time, covariate vector, non-terminal event time, and terminal event time, respectively. Define $Y_{i2} = T_{i2} \wedge C_i$, $\delta_{i2} = I(T_{i2} \le C_i)$, $Y_{i1} = T_{i1} \wedge Y_{i2}$, and $\delta_{i1} = I(T_{i1} \le Y_{i2})$. We observe $(Y_{i1}, Y_{i2}, \delta_{i2}, \delta_{i1}, X_i)$, $i = 1, \ldots, n$. The hazard functions are defined as below.
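$$\lambda_1(t_1) = \lim_{\Delta \to 0} P\big[T_1 \in [t_1, t_1 + \Delta) \mid T_1 \ge t_1, T_2 \ge t_1\big] / \Delta, \tag{1}$$
$$\lambda_2(t_2) = \lim_{\Delta \to 0} P\big[T_2 \in [t_2, t_2 + \Delta) \mid T_1 \ge t_2, T_2 \ge t_2\big] / \Delta, \tag{2}$$
$$\lambda_{12}(t_2 \mid t_1) = \lim_{\Delta \to 0} P\big[T_2 \in [t_2, t_2 + \Delta) \mid T_1 = t_1, T_2 \ge t_2\big] / \Delta, \tag{3}$$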
where $0 < t_1 < t_2$. In general, $\lambda_{12}(t_2 \mid t_1)$ can depend on both $t_1$ and $t_2$ (see Remark 1 for a more detailed discussion). Let $\Lambda_1(t) = \int_0^t \lambda_1(x)\,dx$ and $\Lambda_2(s) = \int_0^s \lambda_2(x)\,dx$. In the unconditional case, the probability measure $P$ refers to the joint distribution of $(T_1, T_2, C)$. In the conditional case, $P$ refers to the joint distribution of $(T_1, T_2, C)$ given $X$. For the unconditional case, the likelihood function $L(\theta)$ then takes the form
$$\prod_{i=1}^{n} \lambda_1(Y_{i1})^{\delta_{i1}}\, \lambda_2(Y_{i2})^{(1-\delta_{i1})\delta_{i2}}\, \lambda_{12}(Y_{i2} \mid Y_{i1})^{\delta_{i1}\delta_{i2}} \exp\Big( -\Lambda_1(Y_{i1}) - \Lambda_2(Y_{i1}) - \int_{Y_{i1}}^{Y_{i2}} \lambda_{12}(s \mid Y_{i1})\,ds \Big), \tag{4}$$
where $\theta = (\beta_1, \beta_2, \beta_3, \lambda_{10}, \lambda_{20}, \lambda_{30})$ will be specified as follows.
For the case with a $q$-dimensional covariate vector $X$, the conditional transition rate functions are defined as follows:
$$\lambda_1(t_1 \mid X = x) = \lambda_{10}(t_1) \exp(\beta_1^T x), \tag{5}$$
$$\lambda_2(t_2 \mid X = x) = \lambda_{20}(t_2) \exp(\beta_2^T x), \tag{6}$$
$$\lambda_{12}(t_2 \mid t_1, X = x) = \lambda_{12,0}(t_2 \mid t_1) \exp(\beta_3^T x). \tag{7}$$
Note that both $x$ and $X$ refer to the covariates: $X$ denotes the random vector and $x$ its observed value. Equations (5)–(7) are the conditional transition functions of $T_1$ and $T_2$ (given $X = x$), while Equations (1)–(3) are the unconditional transition functions of $T_1$ and $T_2$.
To simplify the notation, denote $\lambda_3(t, s) = \lambda_{12}(s \mid t)$, $\lambda_{30}(t, s) = \lambda_{12,0}(s \mid t)$, $\beta = (\beta_1^T, \beta_2^T, \beta_3^T)^T$, and $\beta_0 = (\beta_{10}^T, \beta_{20}^T, \beta_{30}^T)^T$. Note that in our modeling approach, $\lambda_{30}$ depends on two arguments, $t$ and $s$.
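As an illustration of the definitions in this subsection, the following minimal Python sketch maps latent times to the observed data and evaluates the logarithm of one subject's factor in the likelihood (4) under the proportional hazards forms (5)–(7). The function names `observe` and `loglik_one`, the use of callables for the baseline hazards, and the numerical integration via `scipy.integrate.quad` are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np
from scipy.integrate import quad


def observe(t1, t2, c):
    """Map latent times (T1, T2, C) to the observed (Y1, Y2, delta1, delta2)."""
    y2 = min(t2, c)          # Y2 = T2 ^ C
    d2 = int(t2 <= c)        # delta2 = I(T2 <= C)
    y1 = min(t1, y2)         # Y1 = T1 ^ Y2
    d1 = int(t1 <= y2)       # delta1 = I(T1 <= Y2)
    return y1, y2, d1, d2


def loglik_one(y1, y2, d1, d2, x, beta, lam10, lam20, lam30):
    """Log of one subject's factor in likelihood (4) under models (5)-(7).

    lam10(t), lam20(t) and lam30(t, s) are baseline hazards (hypothetical
    callables); beta = (beta1, beta2, beta3) match the covariate vector x.
    """
    e1, e2, e3 = (np.exp(np.dot(b, x)) for b in beta)
    # Cumulative hazards of leaving the initial state, both evaluated at Y1
    cum1 = e1 * quad(lam10, 0.0, y1)[0]
    cum2 = e2 * quad(lam20, 0.0, y1)[0]
    # Cumulative hazard of the terminal event after the non-terminal event
    cum3 = e3 * quad(lambda s: lam30(y1, s), y1, y2)[0] if d1 else 0.0
    ll = -(cum1 + cum2 + cum3)
    if d1:
        ll += np.log(lam10(y1) * e1)
    if (1 - d1) * d2:
        ll += np.log(lam20(y2) * e2)
    if d1 * d2:
        ll += np.log(lam30(y1, y2) * e3)
    return ll
```

Summing `loglik_one` over subjects gives the log of (4); a subject with no non-terminal event can be encoded with `t1 = float("inf")`.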

2.2. Sieve Space $\Theta_n$ for the Parameters $(\beta_1, \beta_2, \beta_3, \lambda_{10}, \lambda_{20}, \lambda_{30})$

We propose a sieve space consisting of B-splines for $\lambda_{j0}$ ($j = 1, 2, 3$) in maximizing (4). We suppose that $Y_1$ and $Y_2$ have compact supports (say $[0, 1]$) and that $\|\beta\| \le M$ for a known constant $M$. Rewrite $\lambda_{10}(t) = \exp(g_{10}(t))$, $\lambda_{20}(s) = \exp(g_{20}(s))$, and $\lambda_{30}(t, s) = \exp(g_{30}(t, s))$. Let $\psi = (g_1, g_2, g_3)$ and $\psi_0 = (g_{10}, g_{20}, g_{30})$. A sieve space consisting of B-splines is defined for these new parameters as follows. First, we obtain an extended partition with equal length $1/K_n$ for the interval $[0, 1]$:
$$\Delta = \{ s_{-m} = \cdots = s_{-1} = 0 = s_0 < s_1 < \cdots < s_{K_n} = 1 = \cdots = s_{K_n + m} \},$$
where $m$ (independent of the sample size $n$) and $K_n = O(n^{\nu})$ ($0 < \nu < 1/2$) are two integers to be chosen later. Note that $m$ and $K_n$ are two parameters often used in B-spline modeling, where $m$ indicates the smoothness of the basis functions. Let $N_n = K_n + m$ and let $\{N_j^m(s)\}_{j=1}^{N_n}$ be a normalized B-spline basis associated with $\Delta$ (see [8]). Then the sieve space for the parameters $\theta = (\beta, \psi(t, s))$ is defined as
$$\Theta_n = \Big\{ \theta_n = (\beta, \psi_n(s, t)) : \psi_n(s, t) = (g_{1n}(t), g_{2n}(s), g_{3n}(s, t)),\ \|\beta\| \le M,\ g_{1n}(t) = \sum_{i=1}^{m+K_n} \alpha_i N_i^m(t),\ g_{2n}(s) = \sum_{i=1}^{m+K_n} \eta_i N_i^m(s),\ g_{3n}(s, t) = \sum_{i_1, i_2 = 1}^{m+K_n} \gamma_{i_1, i_2} N_{i_1}^m(s) N_{i_2}^m(t),\ \max_{1 \le i \le m+K_n} |\alpha_i| \le M_n,\ \max_{1 \le i \le m+K_n} |\eta_i| \le M_n,\ \max_{1 \le i_1, i_2 \le m+K_n} |\gamma_{i_1, i_2}| \le M_n \Big\},$$
where M n ( 2 m 1 ) / ( 2 m ( 2 m + 1 ) ) with a constant m arbitrarily close to m.
For any $\theta_i = (\beta_i, \psi_i) \in \Theta$ ($i = 1, 2$), we define a distance $d(\theta_1, \theta_2) = \|\beta_1 - \beta_2\| + \|\psi_1 - \psi_2\|_2$.
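The spline expansions in the sieve space can be sketched numerically as follows. This is a minimal illustration, assuming the extended partition is read as a clamped knot vector with $m + 1$ copies of each endpoint and equally spaced interior knots, and that $m$ is used as the spline degree, so that the basis has $K_n + m$ functions. The helper names `bspline_basis`, `g_univariate`, and `g_bivariate` are hypothetical; only `scipy.interpolate.BSpline` is an existing library class.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(x, K_n, m=3):
    """Evaluate the N_n = K_n + m normalized B-splines at points x in [0, 1].

    Assumption of this sketch: degree m, with knots 0 (m + 1 times),
    1/K_n, ..., (K_n - 1)/K_n, and 1 (m + 1 times).
    """
    interior = np.linspace(0.0, 1.0, K_n + 1)
    knots = np.concatenate([np.zeros(m), interior, np.ones(m)])
    n_basis = K_n + m
    # Identity coefficients: column j of the result is the j-th basis function
    return BSpline(knots, np.eye(n_basis), m)(np.asarray(x, dtype=float))


def g_univariate(x, coef, K_n, m=3):
    """g(x) = sum_i coef_i N_i(x), as for g_1n and g_2n."""
    return bspline_basis(x, K_n, m) @ coef


def g_bivariate(s, t, gamma, K_n, m=3):
    """g(s, t) = sum_{i1, i2} gamma[i1, i2] N_i1(s) N_i2(t), as for g_3n."""
    Bs = bspline_basis(s, K_n, m)
    Bt = bspline_basis(t, K_n, m)
    return np.einsum("ni,ij,nj->n", Bs, gamma, Bt)
```

Because the basis is normalized, the basis functions are nonnegative and sum to one at every point, so constant coefficients reproduce constant functions; this gives a quick sanity check of the construction.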
Remark 1.
Here we assume that the transition intensity $\lambda_{30}(\cdot)$ depends on both $t_1$ and $t_2$. A semi-Markov process specifies that $\lambda_{30}(t_1, t_2) = h_2(t_2 - t_1)$. It is important to note that under either the Markov or the semi-Markov approach, $\lambda_{30}$ depends on only one argument; both are special cases of our modeling approach, in which $\lambda_{30}$ can depend flexibly on two arguments.

2.3. Maximization

Let $P_n$ and $P$ denote the empirical measure and the true probability measure of $(\delta_1, \delta_2, Y_1, Y_2, X)$, respectively. We maximize the function
$$l_n(\beta, \psi) = P_n l(\theta; W_i) = P_n l(\beta, \psi; W_i) = P_n \Big\{ \delta_{1i} \big[ X_i^T \beta_1 + g_1(Y_{1i}) \big] + (1 - \delta_{1i}) \delta_{2i} \big[ X_i^T \beta_2 + g_2(Y_{2i}) \big] + \delta_{1i} \delta_{2i} \big[ X_i^T \beta_3 + g_3(Y_{1i}, Y_{2i}) \big] - \Lambda_1(Y_{1i}) - \Lambda_2(Y_{2i}) - \int_{Y_{1i}}^{Y_{2i}} \exp(g_3(Y_{1i}, s))\,ds \Big\}$$
over the sieve space $\Theta_n$.
For the knot selection, we let $m = 3$ and use the Bayesian information criterion
$$\mathrm{BIC}(N_n) = -l_n(\hat\beta, \hat\psi) + \frac{\log n}{n} (3 N_n + 3q)$$
to choose the $K_n$ that minimizes the criterion function.
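The knot selection step can be sketched as a small search over candidate values of $K_n$. The sketch below assumes that the maximized averaged log-likelihood $l_n(\hat\beta, \hat\psi)$ has already been computed for each candidate; the function names `bic` and `select_K` and the numbers in the usage note are hypothetical and serve only to illustrate the criterion.

```python
import numpy as np


def bic(avg_loglik, n, N_n, q):
    """BIC(N_n) = -l_n + (log n / n) * (3 N_n + 3 q), l_n being the averaged log-likelihood."""
    return -avg_loglik + np.log(n) / n * (3 * N_n + 3 * q)


def select_K(avg_loglik_by_K, n, q, m=3):
    """Return the K_n minimizing BIC and all scores.

    avg_loglik_by_K maps each candidate K_n to the maximized averaged
    log-likelihood obtained with N_n = K_n + m basis functions.
    """
    scores = {K: bic(ll, n, K + m, q) for K, ll in avg_loglik_by_K.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

For example, with made-up fits `{2: -1.30, 4: -1.20, 8: -1.19}`, `n = 200`, and `q = 1`, the penalty outweighs the small gains in fit and `select_K` returns `K_n = 2`.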

3. Theoretical Properties

In this section, we establish the theoretical properties of our spline-based modeling strategy under the following regularity conditions.

Assumptions

  • (A1) $Y_1$ and $Y_2$ have compact supports (say $[0, 1]$) and $X$ has bounded support in $R^q$, where $q$ is the dimension of $X$. Moreover, if there exist a constant $c_0$ and a constant vector $\tilde\gamma$ such that $\tilde\gamma^T X = c_0$ almost surely, then $c_0 = 0$ and $\tilde\gamma = 0$.
  • (A2) $\beta_0 \in B$, where $B$ is a compact set of $R^{3q}$ with nonempty interior. $\lambda_{10}$ and $\lambda_{20} \in H^r$, and $\lambda_{30} \in C^r$.
  • (A3) $K_n = O(n^{\nu})$, where $\nu$ satisfies the restrictions $0.25/r < \nu < 0.5$.
  • (A4) $r \ge 2$, where $r$ is the measure of smoothness of $\lambda_j$ in the definitions of $H^r$ and $C^r$.
We first establish the strong consistency for the estimated model parameters.
Theorem 1.
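Under Assumptions A1–A3, $\hat\beta$ is a strongly consistent estimator of the true coefficient vector $\beta_0$, and $\|\hat\lambda_1 - \lambda_{10}\|_2 \to 0$, $\|\hat\lambda_2 - \lambda_{20}\|_2 \to 0$, $\|\hat\lambda_3 - \lambda_{30}\|_2 \to 0$ almost surely.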
Next, we obtain the convergence rates for the proposed estimators.
Theorem 2.
Under Assumptions A1–A3, it holds that
$$\|\hat\lambda_1 - \lambda_{10}\|_2 + \|\hat\lambda_2 - \lambda_{20}\|_2 + \|\hat\lambda_3 - \lambda_{30}\|_2 = O_p\big( n^{-r\nu} + n^{-(1/2 - \nu)} \big).$$
This theorem implies that if $\nu = 1/(2 + 2r)$, then $\|\hat\lambda_3 - \lambda_{30}\|_2 = O_p(n^{-r/(2r+2)})$, which is the optimal convergence rate for bivariate function estimation in the nonparametric regression setting [9].
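The stated choice of $\nu$ can be checked by balancing the two terms of the rate in Theorem 2. The short derivation below uses only quantities defined above; the numerical case $r = 2$ is an illustrative assumption.

```latex
r\nu = \tfrac{1}{2} - \nu
\;\Longleftrightarrow\;
\nu = \frac{1}{2(r+1)},
\qquad
n^{-r\nu} = n^{-(1/2-\nu)} = n^{-r/(2r+2)} .
% Illustration (assumed value r = 2): \nu = 1/6 and the rate is n^{-1/3};
% this \nu satisfies Assumption A3 because 0.25/r = 0.125 < 1/6 < 0.5.
```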
To derive the limiting distribution of the proposed estimators and establish their asymptotic normality, we calculate the directional derivatives of the log-likelihood in the associated function spaces as follows.
Denote by $V$ the linear span of $\Theta_0 - \theta_0$, where $\theta_0$ denotes the true value of $\theta = (\beta, \psi)$ and $\Theta_0$ denotes the true parameter space. Let $l(\theta; W)$ be the log-likelihood for a sample of size one and $\delta_n = n^{-r\nu} + n^{-(1/2 - \nu)}$. For any $\theta \in \{\theta \in \Theta_0 : \|\theta - \theta_0\| = O(\delta_n)\}$, define the first-order directional derivative of $l(\theta; W)$ in the direction $v \in V$ as
$$\dot l(\theta; W)[v] = \frac{d\, l(\theta + s v; W)}{ds} \Big|_{s = 0},$$
and the second-order directional derivative as
$$\ddot l(\theta; W)[v, \tilde v] = \frac{d^2\, l(\theta + s v + \tilde s \tilde v; W)}{d\tilde s\, ds} \Big|_{s = 0} \Big|_{\tilde s = 0} = \frac{d\, \dot l(\theta + \tilde s \tilde v; W)[v]}{d\tilde s} \Big|_{\tilde s = 0}.$$
Define the Fisher inner product on the space $V$ as
$$\langle v, \tilde v \rangle = P\big\{ \dot l(\theta; W)[v]\, \dot l(\theta; W)[\tilde v] \big\}$$
and the Fisher norm for $v \in V$ as $\|v\| = \langle v, v \rangle^{1/2}$. Let $\bar V$ be the closed linear span of $V$ under the Fisher norm. Then $(\bar V, \|\cdot\|)$ is a Hilbert space.
Define the smooth functional of θ as
$$\gamma(\theta) = b^T \beta + \int_0^1 \phi_1(t) \lambda_1(t)\,dt + \int_0^1 \phi_2(s) \lambda_2(s)\,ds + \int_0^1 \int_0^1 \phi_3(t, s) \lambda_3(t, s)\,dt\,ds,$$
where $b$ is any vector of dimension $3q$ with $\|b\| \le 1$, $\phi_i \in H^r[0, 1]$ ($i = 1, 2$), and $\phi_3 \in C^r[0, 1]^2$. For any $v \in V$, we denote
$$\dot\gamma(\theta_0)[v] = \frac{d\, \gamma(\theta_0 + s v)}{ds} \Big|_{s = 0}$$
whenever the right-hand-side limit is well defined, and assume:
  • (A5) for any $v \in \bar V$, $\gamma(\theta_0 + s v)$ is continuously differentiable in $s \in [0, 1]$ near $s = 0$, and
    $$\|\dot\gamma(\theta_0)\| = \sup_{v \in \bar V : \|v\| > 0} \frac{|\dot\gamma(\theta_0)[v]|}{\|v\|} < \infty.$$
Note that $\gamma(\theta) - \gamma(\theta_0) = \dot\gamma(\theta_0)[\theta - \theta_0]$. Under Assumption A5, by the Riesz representation theorem, there exists $v^* \in \bar V$ such that $\dot\gamma(\theta_0)[v] = \langle v^*, v \rangle$ for all $v \in \bar V$ and $\|v^*\|^2 = \|\dot\gamma(\theta_0)\|^2$.
Theorem 3.
Suppose that $r > 2$ and Assumptions A1–A3 and A5 hold. Then $n^{1/2} (\gamma(\hat\theta) - \gamma(\theta_0)) \to N(0, \|\dot\gamma(\theta_0)\|^2)$ in distribution, and $\gamma(\hat\theta)$ is semiparametrically efficient.
Remark 2.
Inference about $\hat\beta$. Theorem 3 makes inference straightforward, especially for the regression parameter $\beta$. Set $\phi_j(\cdot) = 0$ ($j = 1, 2, 3$); then Theorem 3 yields that $n^{1/2} b^T (\hat\beta - \beta_0) \to N(0, b^T \Sigma_{\beta\beta} b)$, and thus, by the Cramér–Wold device,
$$n^{1/2} (\hat\beta - \beta_0) \to N(0, \Sigma_{\beta\beta}),$$
where $\Sigma_{\beta\beta}$ can be consistently estimated using the inverse of the Hessian matrix. In the same way, one can establish the semiparametric efficiency of $\hat\beta$.
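The practical use of this remark can be sketched as a Wald-type summary. The sketch assumes that `neg_hessian` is the Hessian of the negative (unaveraged) log-likelihood at the maximizer over all parameters, with the regression coefficients listed first; this ordering and the function name `wald_summary` are assumptions of the example, not details given in the paper.

```python
import numpy as np
from scipy.stats import norm


def wald_summary(estimates, neg_hessian, n_beta, level=0.95):
    """Standard errors, confidence limits, and two-sided p-values for the
    first n_beta parameters, using the inverse Hessian as the covariance."""
    cov = np.linalg.inv(np.asarray(neg_hessian, dtype=float))
    se = np.sqrt(np.diag(cov))[:n_beta]
    est = np.asarray(estimates, dtype=float)[:n_beta]
    z = norm.ppf(0.5 + level / 2.0)          # e.g. about 1.96 for 95 percent
    p_values = 2.0 * norm.sf(np.abs(est / se))
    return se, est - z * se, est + z * se, p_values
```

Inverting the full Hessian before extracting the block for $\beta$ accounts for the uncertainty in the spline coefficients, which inverting only the $\beta$ block would ignore.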
Remark 3.
Inference about $\lambda_j(\cdot)$ ($j = 1, 2, 3$). For $\lambda_j(\cdot)$ ($j = 1, 2$), let $b = 0$ and $\phi_k = 0$ ($k \ne j$); then Theorem 3 yields that
$$n^{1/2} \int_0^1 \phi_j(w) \big( \hat\lambda_j(w) - \lambda_{j0}(w) \big)\,dw \to N(0, \sigma_{\lambda_j}^2),$$
where $\sigma_{\lambda_j}^2$ ($j = 1, 2$) can be consistently estimated by the delta method or by resampling methods. Inference for $\lambda_3(t, s)$ proceeds similarly: let $b = 0$, $\phi_1(\cdot) = 0$, and $\phi_2(\cdot) = 0$; then Theorem 3 yields that
$$n^{1/2} \int_0^1 \int_0^1 \phi_3(t, s) \big( \hat\lambda_3(t, s) - \lambda_{30}(t, s) \big)\,dt\,ds \to N(0, \sigma_{\lambda_3}^2),$$
where $\sigma_{\lambda_3}^2$ can be consistently estimated by the delta method or by resampling methods. The above results can be used to check the linear (quadratic) effect of $t_j$ ($j = 1, 2$), or to check whether $\lambda_3(t_1, t_2)$ has an additive form in $t_1$ and $t_2$.

4. Simulation Study

We conducted simulations to investigate finite sample performance of the proposed estimator. In the simulation, we let
$$\lambda_{10}(t_1) = \frac{1}{1 + 2 t_1},$$
$$\lambda_{20}(t_2) = \frac{1}{1 + 2 t_2},$$
$$\lambda_{30}(t_2 \mid t_1) = \frac{2}{1 + t_1 + t_2}.$$
A direct calculation shows that the stipulated transition functions cannot be generated by the models involving a frailty distribution and the Markov or semi-Markov assumption ([1,5]). It is therefore of interest to examine whether the proposed spline-based estimation procedure still yields reliable and accurate estimates in this scenario, which cannot be tackled by existing approaches. We report results with one covariate, $X$, uniformly distributed between 0 and 0.5. We consider $\beta_j = 1$, $-1$, and $0.5$ ($j = 1, 2, 3$), and $n = 200$ and $400$. The censoring time was simulated from a uniform distribution on $(0, \tau)$ with $\tau = 50$. We compute the spline-based semiparametric maximum likelihood estimate using cubic B-splines and estimate the standard error of the estimated regression parameter using the inverse of the Hessian matrix. For the B-splines, the number of knots $K_n$, or equivalently $N_n = K_n + m$, is chosen using the BIC defined in Section 2.3. Table 1, Table 2 and Table 3 present the estimation bias (BIAS), the standard deviation (STD), the mean of the estimated standard error of the estimated regression parameter (ESE), and the coverage proportion of the 95 percent confidence intervals (CP), based on 500 replicates.
From Table 1, Table 2 and Table 3, we can see that (a) the proposed estimates have very small biases; (b) the standard deviations of the estimates shrink at approximately the $\sqrt{n}$ rate; (c) the estimated standard errors are very close to the standard deviations of the estimates; and (d) the 95 percent confidence intervals provide adequate coverage probabilities. The proposed modeling strategy and estimation procedure thus yield reliable and accurate estimates that are directly interpretable in practice.
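One way to generate data from the simulation design above is by inverting the relevant survival functions in closed form. The paper does not describe its data-generating algorithm, so the following is only a sketch under that assumption; the function name `simulate` and its arguments are hypothetical. Since $\lambda_{10}$ and $\lambda_{20}$ coincide, the time of leaving the initial state has survival function $(1 + 2t)^{-(e_1 + e_2)/2}$ with $e_j = \exp(\beta_j x)$, and the exit is a non-terminal event with probability $e_1 / (e_1 + e_2)$.

```python
import numpy as np


def simulate(n, beta=(1.0, 1.0, 1.0), tau=50.0, seed=0):
    """Draw (X, Y1, Y2, delta1, delta2) from the illness-death design
    with baseline hazards 1/(1+2t), 1/(1+2t) and 2/(1+t1+t2)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 0.5, n)
    e1, e2, e3 = (np.exp(b * x) for b in beta)

    # Exit time from the initial state: solve (1 + 2t)^{-(e1+e2)/2} = U
    u = 1.0 - rng.uniform(size=n)            # uniform on (0, 1]
    t_exit = (u ** (-2.0 / (e1 + e2)) - 1.0) / 2.0
    # Type of exit: non-terminal event with probability e1 / (e1 + e2)
    ill = rng.uniform(size=n) < e1 / (e1 + e2)

    # Terminal time after a non-terminal event at t1: solve
    # ((1 + t1 + t) / (1 + 2 t1))^{-2 e3} = V for t > t1
    v = 1.0 - rng.uniform(size=n)
    t2_after = (1.0 + 2.0 * t_exit) * v ** (-1.0 / (2.0 * e3)) - 1.0 - t_exit

    t1 = np.where(ill, t_exit, np.inf)       # no non-terminal event: T1 = inf
    t2 = np.where(ill, t2_after, t_exit)
    c = rng.uniform(0.0, tau, n)             # censoring time on (0, tau)

    y2 = np.minimum(t2, c)
    d2 = (t2 <= c).astype(int)
    y1 = np.minimum(t1, y2)
    d1 = (t1 <= y2).astype(int)
    return x, y1, y2, d1, d2
```

Calling `simulate(200)` or `simulate(400)` with `beta` set to `(1, 1, 1)`, `(-1, -1, -1)`, or `(0.5, 0.5, 0.5)` mirrors the sample sizes and coefficient values of the three tables.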

5. A Real Data Example

Because our proposed B-spline-based modeling strategy involves neither the subjective specification of a frailty distribution nor the Markov or semi-Markov assumption, either of which may be unmet in real applications, it is more flexible in practice than existing approaches. To illustrate this point, we now apply the illness-death model presented in Section 2 to the colon cancer data. It is of interest to examine whether the time spent in state 1 (the past) is related to the transition function from state 2 into state 3. To answer this question, we consider the working model $\lambda_3(t, s) = \exp(\xi t) \lambda(s)$. The question then amounts to testing $H_0: \xi = 0$, which can be done using the usual likelihood ratio statistic. The results obtained for the colon cancer study show that the effect of the time spent in state 1 is significant (p-value $< 0.05$). This allows us to conclude that the Markov assumption may be unsatisfactory for the colon cancer data set. It further demonstrates that the stringent assumptions required by existing approaches may be unmet in practice, which motivates our proposed methodology.
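The likelihood ratio test of $H_0: \xi = 0$ can be sketched as follows, assuming the maximized total (not averaged) log-likelihoods under the null and the working model are available and that the usual chi-square reference distribution with one degree of freedom applies. The function name `lrt_pvalue` and the numbers in the usage note are hypothetical and are not results from the colon cancer data.

```python
from scipy.stats import chi2


def lrt_pvalue(loglik_null, loglik_alt, df=1):
    """Likelihood ratio statistic 2 * (l_alt - l_null) and its chi-square p-value."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, chi2.sf(stat, df)
```

For instance, made-up log-likelihoods of -1204.3 under $H_0$ and -1201.1 under the working model give a statistic of 6.4 and a p-value below 0.05.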
For illustrative purposes, we consider only one covariate: Lev+5-FU treatment. Our interest centers on understanding the effect of Lev+5-FU treatment and on nonparametrically modeling the transition functions between the different states. Table 4 reports the estimates of the regression coefficients along with their standard errors and p-values. From Figure 1 and Figure 2, we can see that our proposed model and estimation procedure yield estimated transition functions that are directly interpretable. They describe quantitatively how the hazard functions of the time to the terminal event and the time to the non-terminal event evolve over time, and shed light on disease progression and death risks for colon cancer patients with and without relapse of the cancer. We plot the estimated transition functions in Figure 2.
Furthermore, the real data application illustrates the computational advantage of our proposed approach: the existing frailty-model approach requires $3 + 413 + 1 = 417$ parameters, whereas our proposed B-spline approach requires only $(m + K_n) \times 3 + 3 = (4 + 8) \times 3 + 3 = 39$ parameters. Hence, the computational cost is substantially reduced, while our approach is more flexible than existing approaches because it requires neither the subjective specification of the frailty distribution nor the Markov or semi-Markov assumption.

6. Concluding Remarks

In this paper, we proposed a spline-based sieve semiparametric maximum likelihood method for semi-competing risks data. This method reduces the dimensionality of the estimation problem using splines and therefore relieves the numerical burden of the computation. The approach allows easy inference for both the regression parameters and the transition functions. It should be straightforward to apply the method presented here to allow for non-linear relationships between continuous predictors and survival in the multi-state framework ([6,10] and others). Simulations showed that the new estimator performs well in finite samples. For illustration purposes, we used a real dataset from a clinical trial for colon cancer. Competing risks data can also be regarded as a special type of multitask prediction problem. In that field, the state-of-the-art method is MTPS [4], which currently does not support predicting survival outcomes. Following that approach, it would be worthwhile to study stacking algorithms for prediction with multivariate survival outcomes, including competing risks and semi-competing risks data.

Author Contributions

Conceptualization, J.X.; methodology, X.H. and J.X.; software, X.H.; formal analysis, X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs of Theorem 1, Theorem 2, and Theorem 3

This section contains the proofs of Theorems 1–3. Some empirical process theorems developed in [11] will be used repeatedly. Throughout the following proofs, we denote $P f = \int f(x)\,dP(x)$ and $P_n f = n^{-1} \sum_{i=1}^{n} f(X_i)$, the empirical process indexed by the function $f(X)$.

Appendix A.1. Proof of Theorem 1

By applying the inequality (31) in [12] (p. 31), we have
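$$\sup_{\theta \in \Theta_n} | P_n l(\theta; W) - P l(\theta; W) | \to 0, \quad a.s. \tag{A1}$$
Let
$$\zeta_{1n} = \sup_{\theta \in \Theta_n} | P_n l(\theta; W) - P l(\theta; W) |, \tag{A2}$$
$$\zeta_{2n} = P_n l(\theta_0; W) - P l(\theta_0; W). \tag{A3}$$
Denote $K_\epsilon = \{ \theta : d(\theta, \theta_0) \ge \epsilon, \theta \in \Theta_n \}$. Then
$$\inf_{K_\epsilon} P l(\theta; W) = \inf_{K_\epsilon} \{ P l(\theta; W) - P_n l(\theta; W) + P_n l(\theta; W) \} \le \zeta_{1n} + \inf_{K_\epsilon} P_n l(\theta; W). \tag{A4}$$
If $\hat\theta_n \in K_\epsilon$, we have
$$\inf_{K_\epsilon} P_n l(\theta; W) = P_n l(\hat\theta; W) \le P_n l(\theta_0; W) = P_n l(\theta_0; W) - P l(\theta_0; W) + P l(\theta_0; W) = \zeta_{2n} + P l(\theta_0; W). \tag{A5}$$
By condition A3, we obtain that $\inf_{K_\epsilon} P l(\theta; W) - P l(\theta_0; W) = \delta_\epsilon > 0$. This completes the proof.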

Appendix A.2. Proof of Theorem 2

Note that
$$E_P \| n^{1/2} (P_n - P) \|_{\mathcal{F}_\eta} \le C J_\eta(\varepsilon, \mathcal{F}_\eta, \|\cdot\|_2) \Big\{ 1 + \frac{J_\eta(\varepsilon, \mathcal{F}_\eta, \|\cdot\|_2)}{\eta^2 n^{1/2}} \Big\}, \tag{A6}$$
where $J_\eta(\varepsilon, \mathcal{F}_\eta, \|\cdot\|_2) = \int_0^\eta \{ 1 + \log N_{[\,]}(\varepsilon, \mathcal{F}_\eta, \|\cdot\|_2) \}^{1/2}\,d\varepsilon \le C N^{1/2} \eta$. The right-hand side of (A6) yields $\phi_n(\eta) = C (N^{1/2} \eta + N / n^{1/2})$. It is easy to see that $\phi_n(\eta)/\eta$ is decreasing in $\eta$, and $r_n^2 \phi_n(1/r_n) = r_n N^{1/2} + r_n^2 N / n^{1/2} \le 2 n^{1/2}$, where $r_n = N^{-1/2} n^{1/2} = n^{-\nu + 1/2}$, $0 < \nu < 1/2$. Hence $n^{-\nu + 1/2} d(\hat\theta, \theta_{n0}) = O_P(1)$ by Theorem 3.2.5 of [11]. This, together with $d(\theta_{n0}, \theta_0) = O_p(n^{-r\nu})$ (see Theorem 12.7 in [8]), yields that $d(\hat\theta, \theta_0) = O_p(n^{-(1/2 - \nu)} + n^{-r\nu})$. This completes the proof.

Appendix A.3. Proof of Theorem 3

Let $\varepsilon_n$ be any positive sequence satisfying $\varepsilon_n = o(n^{-1/2})$. For any $v^* \in \Theta_0$, by [8], Theorem 12.7, there exists $\Pi_n v^* \in \Theta_n$ such that $\|\Pi_n v^* - v^*\| = o(1)$ and $\delta_n \|\Pi_n v^* - v^*\| = o(n^{-1/2})$. Also define $r[\theta - \theta_0; W] = l(\theta; W) - l(\theta_0; W) - \dot l(\theta; W)[\theta - \theta_0]$. Then by definition of $\hat\theta$, we have
By (A1), the Chebyshev inequality, the independent and identically distributed data, and $\|\Pi_n v^* - v^*\| = o(1)$, we have $I_1 = o_p(n^{-1/2})$.
For $I_2$, we have
$$I_2 = (P_n - P) \big\{ l(\hat\theta; W) - l(\hat\theta \pm \varepsilon_n \Pi_n v^*; W) \pm \varepsilon_n \dot l(\theta_0; W)[\Pi_n v^*] \big\} = \mp \varepsilon_n (P_n - P) \big\{ \dot l(\tilde\theta; W) - \dot l(\theta_0; W) \big\} [\Pi_n v^*],$$
where $\tilde\theta$ lies between $\hat\theta$ and $\hat\theta \pm \varepsilon_n \Pi_n v^*$. It follows that $\{ \dot l(\theta; W)[\Pi_n v^*] : \|\theta - \theta_0\| = O(\delta_n) \}$ is a Donsker class. Therefore, by Theorem 2.11.23 of [11], we have $I_2 = \varepsilon_n \times o_p(n^{-1/2})$.
It follows that $\delta_n \|\Pi_n v^* - v^*\| = o(n^{-1/2})$ and $\|\Pi_n v^*\|^2 \to \|v^*\|^2$. Combining the above facts, together with $P \dot l(\theta_0; W)[v^*] = 0$, we can establish that
$$0 \le P_n \{ l(\hat\theta; W) - l(\hat\theta \pm \varepsilon_n \Pi_n v^*; W) \} = \mp \varepsilon_n P_n \dot l(\theta_0; W)[v^*] \pm \varepsilon_n \langle \hat\theta - \theta_0, v^* \rangle + \varepsilon_n \times o_p(n^{-1/2}) = \mp \varepsilon_n (P_n - P) \{ \dot l(\theta_0; W)[v^*] \} \pm \varepsilon_n \langle \hat\theta - \theta_0, v^* \rangle + \varepsilon_n \times o_p(n^{-1/2}).$$
Therefore, we obtain $\sqrt{n} \langle \hat\theta - \theta_0, v^* \rangle = \sqrt{n} (P_n - P) \{ \dot l(\theta_0; W)[v^*] \} + o_p(1) \to N(0, \|v^*\|^2)$, where the asymptotic normality is guaranteed by the central limit theorem, with asymptotic variance equal to $\|v^*\|^2 = \|\dot l(\theta_0; W)[v^*]\|^2$. This, together with A5, implies that $n^{1/2} (\gamma(\hat\theta) - \gamma(\theta_0)) = n^{1/2} \langle \hat\theta - \theta_0, v^* \rangle + o_p(1) \to N(0, \|v^*\|^2)$ in distribution. The semiparametric efficiency can be established by applying the result of [13].

References

  1. Fine, J.P.; Jiang, H.; Chappell, R. On semi-competing risks data. Biometrika 2001, 88, 907–919.
  2. Wang, W. Estimating the association parameter for copula models under dependent censoring. J. R. Stat. Soc. Ser. Stat. Methodol. 2003, 65, 257–273.
  3. Day, R.; Bryant, J.; Lefkopoulou, M. Adaptation of bivariate frailty models for prediction, with application to biological markers as prognostic indicators. Biometrika 1997, 84, 45–56.
  4. Xing, L.; Lesperance, M.L.; Zhang, X. Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics 2020, 36, 65–72.
  5. Xu, J.; Kalbfleisch, J.D.; Tai, B. Statistical analysis of illness–death processes and semicompeting risks data. Biometrics 2010, 66, 716–725.
  6. Andersen, P.K.; Borgan, O.; Gill, R.D.; Keiding, N. Statistical Models Based on Counting Processes; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  7. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2011.
  8. Schumaker, L. Spline Functions: Basic Theory; Cambridge University Press: Cambridge, UK, 2007.
  9. Stone, C.J. Optimal global rates of convergence for nonparametric regression. Ann. Stat. 1982, 10, 1040–1053.
  10. Meira-Machado, L.; de Uña-Álvarez, J.; Cadarso-Suárez, C.; Andersen, P.K. Multi-state models for the analysis of time-to-event data. Stat. Methods Med. Res. 2009, 18, 195–222.
  11. Wellner, J. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
  12. Pollard, D. Convergence of Stochastic Processes; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  13. Bickel, P.J.; Kwon, J. Inference for semiparametric models: Some questions and an answer. Stat. Sin. 2001, 11, 863–886.
Figure 1. Compartment model for semicompeting risks data.
Figure 2. Estimated transition functions for the colon cancer data.
Table 1. Simulation results for $(\beta_{10}, \beta_{20}, \beta_{30}) = (1, 1, 1)$.
n     Parameter        BIAS      STD      ESE      CP
200   $\beta_1 = 1$     0.021    0.233    0.219    0.953
      $\beta_2 = 1$    −0.016    0.230    0.263    0.954
      $\beta_3 = 1$     0.026    0.281    0.219    0.986
400   $\beta_1 = 1$     0.017    0.166    0.159    0.963
      $\beta_2 = 1$    −0.013    0.167    0.164    0.960
      $\beta_3 = 1$     0.018    0.122    0.141    0.965
Table 2. Simulation results for $(\beta_{10}, \beta_{20}, \beta_{30}) = (-1, -1, -1)$.
n     Parameter         BIAS      STD      ESE      CP
200   $\beta_1 = -1$   −0.015    0.244    0.225    0.956
      $\beta_2 = -1$    0.019    0.232    0.239    0.962
      $\beta_3 = -1$   −0.014    0.269    0.284    0.961
400   $\beta_1 = -1$   −0.013    0.144    0.165    0.961
      $\beta_2 = -1$    0.014    0.158    0.164    0.945
      $\beta_3 = -1$   −0.013    0.197    0.185    0.980
Table 3. Simulation results for $(\beta_{10}, \beta_{20}, \beta_{30}) = (0.5, 0.5, 0.5)$.
n     Parameter          BIAS      STD      ESE      CP
200   $\beta_1 = 0.5$     0.017    0.230    0.205    0.966
      $\beta_2 = 0.5$    −0.013    0.221    0.219    0.965
      $\beta_3 = 0.5$     0.016    0.182    0.218    0.945
400   $\beta_1 = 0.5$     0.008    0.172    0.155    0.941
      $\beta_2 = 0.5$    −0.011    0.132    0.152    0.954
      $\beta_3 = 0.5$     0.012    0.125    0.157    0.938
Table 4. Estimated regression coefficients and their standard errors for the colon data.
Transition    Parameter    Estimate    Standard Error    p-Value
$1 \to 2$     $\beta_1$    −0.513      0.119             $1.6 \times 10^{-5}$
$1 \to 3$     $\beta_1$    −0.028      0.379             0.469
$2 \to 3$     $\beta_1$     0.738      0.130             $7.0 \times 10^{-9}$

Share and Cite

MDPI and ACS Style

Huang, X.; Xu, J. Nonparametric Sieve Maximum Likelihood Estimation of Semi-Competing Risks Data. Mathematics 2022, 10, 2248. https://doi.org/10.3390/math10132248
