Article

Treatment Benefit and Treatment Harm Rates with Nonignorable Missing Covariate, Endpoint, or Treatment

School of Mathematics and Statistics, Shenzhen University, Shenzhen 518061, China
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(21), 4459; https://doi.org/10.3390/math11214459
Submission received: 3 September 2023 / Revised: 8 October 2023 / Accepted: 22 October 2023 / Published: 27 October 2023
(This article belongs to the Special Issue Nonparametric Statistical Methods and Their Applications)

Abstract

The average treatment effect is an important concept in causal inference. However, it fails to capture variation in response to treatment due to heterogeneity at many levels among patients in the target population. To study heterogeneity in the treatment effect, researchers have proposed the concepts of treatment benefit rate (TBR) and treatment harm rate (THR). However, in practice, missing data often occur in the treatment, endpoint, or covariates. In these cases, the conditions given in the existing literature are not sufficient to identify the TBR and THR. In this article, we address the problem of identifying the treatment benefit rate and treatment harm rate when the treatment, endpoint, or covariate has missing data. Different types of missing data mechanisms are assumed, including several situations of nonignorable missingness. We prove that the treatment benefit rate and treatment harm rate are identifiable under very mild conditions, and then construct estimators based on the EM algorithm. The performance of the proposed inference procedure is evaluated via simulation studies. Lastly, we illustrate our method with real data sets.

1. Introduction

The average treatment effect is widely used as a measure of causal effect [1]; however, it is not the only measure. For example, in a typical randomized Phase III clinical trial, there are patients who benefit in a negative trial and patients who do not benefit in a positive trial. In this situation, the average treatment effect cannot explain the causal effect completely, since it ignores the heterogeneous responses to the treatment in the target population. Some researchers have made additional assumptions to address this heterogeneity. One of the main assumptions is “Monotony” [2], which assumes that the outcome under treatment for each individual is no worse than the outcome under control. There are many scientific and empirical reasons to doubt this assumption. Ref. [3] proposed several explanations for the fact that some people respond to an inactive control but do not respond to an experimental treatment, and noted that for some people, a placebo has been shown to be superior to active treatment.
For this reason, we focus on the measurement of treatment benefit rate (TBR) and treatment harm rate (THR) in our paper. Ref. [4] tried to identify TBR and THR by making the additional assumption that the two potential outcomes were independent, conditional on observed covariates. Ref. [5] estimated the TBR and THR assuming the existence of at least three covariates, which are mutually independent. Ref. [6] proposed a Bayesian-tree-based latent variable model to seek subpopulations with distinct TBR. Under the assumption that the potential outcomes are independent conditional on the observed covariates and an unmeasured latent variable, ref. [7] showed the identification of the TBR and THR in non-separable (generalized) linear mixed models for both continuous and binary outcomes. In our article, we follow the assumption in [4] to make the TBR and THR identifiable.
However, although our analysis assumes a randomized experiment, we still face a problem in identifying the TBR and THR: there may be missing data in the pretreatment covariate, potential endpoint, or treatment assignment. When one of the variables has missing data, the TBR and THR generally cannot be identified; we can only obtain upper and lower bounds for the parameters of interest, rather than point estimates, and these bounds are too wide to be useful. When the missing data mechanism is ignorable, we can ignore observations with missing data and identify the TBR and THR directly. In many cases, however, the missing data are nonignorable, that is, the missingness depends on some possibly missing variables. In this case, the TBR and THR cannot be identified without other assumptions. For example, in a randomized clinical trial [8], the covariate is obtained from electrophysiological stimulation (EPS) testing. Because the EPS testing is invasive and not a prerequisite for enrollment in the study, 79.3% of patients in the implantable cardiac defibrillator arm have EPS records, whereas only 2.4% of patients in the control arm have EPS records. Therefore, the missing data problem for the covariate is very severe and nonignorable. Another example is the Awakening and Breathing Controlled trial [9]. In this trial, because of the possibility of the patients’ death, there are nonignorable missing data in the cognitive score at 3 months and 12 months. Among a total of 187 patients, there were 111 missing values for the cognitive score at 3 months and 136 missing values for the cognitive score at 12 months. Because of the missing data, we cannot identify the TBR and THR directly.
There are many examples in the literature where the problem of missing data in causal inference has been studied. Refs. [8,10] used sensitivity analysis in the nonignorable missing covariates problem. Refs. [11,12,13] studied the identification problem when the missingness of the outcomes was nonignorable. Refs. [14,15,16] discussed the identifiability of causal effects when a key covariate is missing due to death. Refs. [17,18] also discussed nonignorable missing covariates problems in survival analysis and regression models. In our paper, we deal with the case in which one of the treatment, the covariate, and the endpoint has missing data. We give some basic assumptions and special conditions for the pretreatment covariate, potential endpoints, and treatment, under which we can identify the TBR and THR. These assumptions and conditions are broadly applicable.
The rest of this article is organized as follows. In Section 2, we introduce the notation and assumptions used throughout this article. In Section 3, we introduce several missing mechanisms for the covariate, endpoint, and treatment. In Section 4, we discuss the identifiability of the TBR and THR under these missing mechanisms. In Section 5, we estimate the TBR and THR using the EM algorithm in simulation studies when they can be identified. In Section 6, we analyze datasets from clinical trials with our methods. Lastly, we give the proofs of the theorems in Appendix A.

2. Notation and Assumption

Let $Z$ denote the treatment assignment, with $Z = 1$ for treatment and $Z = 0$ for control. We assume that there is only one covariate, and let $X$ denote the pretreatment covariate with $K$ categories ($K \geq 2$). Suppose the $K$ levels of $X$ are $x_1, x_2, \ldots, x_K$. Let $Y$ denote the endpoint, and suppose $Y$ is binary: $Y = 1$ means that the treatment or control works, and $Y = 0$ means that it does not. Let $Y(0)$ denote the potential endpoint under control and $Y(1)$ the potential endpoint under intervention. Then, the observed endpoint can be written as $Y = Z Y(1) + (1 - Z) Y(0)$. In our article, we assume that one of $X$, $Y$, $Z$ has missing data. Let $R_X$ denote the missing data indicator for $X$ and $R_X(z)$ the potential missing data indicator for $X$; let $R_Y$ denote the missing data indicator for $Y$ and $R_Y(z)$ the potential missing data indicator for $Y$; and let $R_Z$ denote the missing data indicator for $Z$. Because $Z$ is the treatment, we do not define a potential version of $R_Z$. We can only observe one of the triples $\{Y(1), R_X(1), R_Y(1)\}$ and $\{Y(0), R_X(0), R_Y(0)\}$. $R_X = 1$ means $X$ is missing, and $R_X = 0$ means $X$ is observed. $R_Y = 1$ means $Y$ is missing, and $R_Y = 0$ means $Y$ is observed. $R_Z = 1$ means $Z$ is missing, and $R_Z = 0$ means $Z$ is observed. We assume that only one of $X$, $Y$, $Z$ is subject to missingness, which means that one of $R_X$, $R_Y$, and $R_Z$ may equal 1, while the other two are identically 0.
Ref. [4] defines TBR (treatment benefit rate) as the proportion of the relevant population that benefits from the intervention as compared with the control for a given endpoint. THR (treatment harm rate) is defined as the proportion that is harmed by the intervention as compared with the control based on the same endpoint. Thus, we can use the following equations to describe TBR and THR:
$$TBR = P(Y(0) = 0, Y(1) = 1), \qquad THR = P(Y(0) = 1, Y(1) = 0).$$
When TBR is much larger than THR, we can say that this treatment is beneficial. On the contrary, when THR is much larger than TBR, we can say that this treatment is harmful.
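As a brief aside, the difference between the two rates equals the average treatment effect for a binary endpoint, which makes clear what the TBR and THR add beyond it. The derivation below is a standard identity that uses only the definitions above and the law of total probability; it does not rely on any of the assumptions introduced later.

```latex
\begin{aligned}
TBR - THR
  &= P(Y(0)=0,\, Y(1)=1) - P(Y(0)=1,\, Y(1)=0) \\
  &= \bigl[P(Y(1)=1) - P(Y(0)=1,\, Y(1)=1)\bigr]
   - \bigl[P(Y(0)=1) - P(Y(0)=1,\, Y(1)=1)\bigr] \\
  &= P(Y(1)=1) - P(Y(0)=1).
\end{aligned}
```

Two treatments with the same average effect can therefore have very different pairs (TBR, THR), which is the heterogeneity the two rates are meant to expose.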
Let $A \perp B \mid C$ denote that variables $A$ and $B$ are conditionally independent, given variable $C$. To identify the TBR and THR, we need the following assumptions.
Assumption 1
(Complete randomization).  $Z \perp \{Y(0), Y(1), R_X(1), R_X(0), R_Y(1), R_Y(0), R_Z, X\}$.
A completely randomized experiment satisfies this assumption: the treatment assignment $Z$ is independent of $Y(0)$, $Y(1)$, $R_X(1)$, $R_X(0)$, $R_Y(1)$, $R_Y(0)$, $R_Z$, and $X$. This assumption is very strong, and all the theory and methods discussed in this paper rely on it. Under this assumption, we obtain the following equation: $P(Y(z) = 1) = P(Y = 1 \mid Z = z)$.
Assumption 2.
$Y(0) \perp Y(1) \mid X$.
This assumption means that when the pretreatment covariate $X$ is given, the potential endpoints are independent of each other, that is, given the covariate $X$, $Y(1)$ cannot predict $Y(0)$ and $Y(0)$ cannot predict $Y(1)$.
We aim to identify the TBR and THR from the observed data. If there are no missing data, under Assumptions 1 and 2, the TBR and THR can each be written as the expectation of a product of two conditional probabilities based on the observed data. Let $TBR_x$ and $THR_x$ denote the treatment benefit rate and treatment harm rate, given $X$. Then, we have:
$$\begin{aligned}
TBR &= E(TBR_x) = E[P(Y(0) = 0, Y(1) = 1 \mid X)] = E[P(Y(0) = 0 \mid X)\, P(Y(1) = 1 \mid X)] \\
&= E[P(Y = 0 \mid X, Z = 0)\, P(Y = 1 \mid X, Z = 1)], \\
THR &= E(THR_x) = E[P(Y(0) = 1, Y(1) = 0 \mid X)] = E[P(Y(0) = 1 \mid X)\, P(Y(1) = 0 \mid X)] \\
&= E[P(Y = 1 \mid X, Z = 0)\, P(Y = 0 \mid X, Z = 1)].
\end{aligned}$$
The above equations show that the TBR and THR can be identified under Assumptions 1 and 2 when there are no missing data. However, when there are missing data in the covariate, endpoint, or treatment, Assumptions 1 and 2 are not enough to ensure the identification of the TBR and THR, and the above formula no longer applies directly. In this paper, we give sufficient conditions to identify the TBR and THR when one of $X$, $Y$, $Z$ has missing data.
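The complete-data formula can be turned into a plug-in estimate directly from a contingency table. The following is a minimal sketch, assuming complete data stored as an array `counts[x, y, z]`; the function name `tbr_thr_complete` and the counts are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def tbr_thr_complete(counts):
    """Plug-in TBR and THR from complete-data counts indexed [x, y, z]."""
    counts = np.asarray(counts, dtype=float)
    p_x = counts.sum(axis=(1, 2)) / counts.sum()          # P(X = x)
    p_y_xz = counts / counts.sum(axis=1, keepdims=True)   # P(Y = y | X = x, Z = z)
    # TBR = sum_x P(x) P(Y=0 | x, Z=0) P(Y=1 | x, Z=1)
    tbr = np.sum(p_x * p_y_xz[:, 0, 0] * p_y_xz[:, 1, 1])
    # THR = sum_x P(x) P(Y=1 | x, Z=0) P(Y=0 | x, Z=1)
    thr = np.sum(p_x * p_y_xz[:, 1, 0] * p_y_xz[:, 0, 1])
    return tbr, thr

# Hypothetical counts (not from any trial), binary X: counts[x][y][z].
counts = [[[30, 10], [10, 30]],
          [[20, 20], [20, 20]]]
tbr, thr = tbr_thr_complete(counts)
print(tbr, thr)  # 0.40625 0.15625
```

In this toy table the stratum $X = 0$ responds strongly to treatment while the stratum $X = 1$ does not respond at all, and the two strata are averaged with weights $P(X = x)$.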
Lastly, we introduce the following assumption.
Assumption 3.
When $X$ has missing data, $P(R_X = 1 \mid X, Y, Z) < 1$. When $Y$ has missing data, $P(R_Y = 1 \mid X, Y, Z) < 1$. When $Z$ has missing data, $P(R_Z = 1 \mid X, Y, Z) < 1$.
We need this assumption to ensure that the missing variable is only partially missing.

3. Missing Data Mechanisms

In our article, we study the TBR and THR when one of $X$, $Y$, $Z$ has missing data. Before introducing the specific missing mechanisms, we first review the definitions of missing at random (MAR) and missing not at random (MNAR) (Little and Rubin, 2002 [19]). Let $D_0$ denote the complete data, $D_1$ the observed data, and $D_2$ the missing data, so that $D_0 = (D_1, D_2)$. Next, we introduce the two missing mechanisms mentioned above.
Definition 1.
The missing data mechanism is called missing at random (MAR) if $R_X$, $R_Y$, or $R_Z$ depends only on the observed data, that is, if the relevant one of the following three relations holds: $R_X \perp D_2 \mid D_1$, $R_Y \perp D_2 \mid D_1$, or $R_Z \perp D_2 \mid D_1$. Otherwise, if $R_X$, $R_Y$, or $R_Z$ depends on $D_2$, the missing data mechanism is called missing not at random (MNAR).
Under MAR, the missing indicators depend only on the observed data $D_1$; because the missing mechanism does not depend on the missing data, inference for the parameters can be based on the observed data alone. When the data are missing not at random, the mechanism is nonignorable, and we cannot ignore the missing data.
In this article, we study the TBR and THR when one of $X$, $Y$, $Z$ is MNAR. The missing mechanisms of $X$, $Y$, or $Z$ are important because they influence the identifiability and estimation of the TBR and THR. For each of $X$, $Y$, $Z$, we propose three missing mechanisms.
First, we introduce three missing mechanisms of $X$.
  • ($R_X1$) $R_X$ depends on $X$, and $R_X$ is independent of $(Y, Z)$, given $X$, which means:
    $P(R_X = 1 \mid Z = z, X = x, Y = y) = P(R_X = 1 \mid X = x)$,
  • ($R_X2$) $R_X$ depends on $(X, Z)$, and $R_X$ is independent of $Y$, given $(X, Z)$, which means:
    $P(R_X = 1 \mid Z = z, X = x, Y = y) = P(R_X = 1 \mid Z = z, X = x)$, or
    $P(R_X(z) = 1 \mid X = x, Y(z) = y) = P(R_X(z) = 1 \mid Z = z, X = x)$,
  • ($R_X3$) $R_X$ depends on $(X, Y)$, and $R_X$ is independent of $Z$, given $(X, Y)$, which means:
    $P(R_X = 1 \mid Z = z, X = x, Y = y) = P(R_X = 1 \mid X = x, Y = y)$.
For the first missing mechanism of $X$, we assume that the missingness of $X$ depends only on $X$. For the second, we assume that the missingness of $X$ depends on $(X, Z)$. For the third, we assume that the missingness of $X$ depends on $(X, Y)$. All three missing mechanisms are nonignorable, and none of them can be deduced from another.
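To make the three mechanisms for $X$ concrete, the sketch below draws the missing data indicator under each of them for simulated binary data. All probabilities, the seed, and the variable names are hypothetical choices for illustration, not values from the article; under every mechanism the missingness probability depends on the value of $X$ itself, which is what makes it nonignorable.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
n = 10_000
z = rng.binomial(1, 0.5, n)      # randomized treatment
x = rng.binomial(1, 0.5, n)      # binary covariate
p_y1 = np.array([[0.4, 0.5],     # hypothetical P(Y = 1 | X = x, Z = z), indexed [x, z]
                 [0.8, 0.4]])
y = rng.binomial(1, p_y1[x, z])

# Hypothetical missingness probabilities P(R_X = 1 | ...).
p_rx1 = np.array([0.3, 0.6])                  # R_X1: depends on x only
p_rx2 = np.array([[0.4, 0.6], [0.5, 0.3]])    # R_X2: depends on (x, z)
p_rx3 = np.array([[0.2, 0.5], [0.6, 0.3]])    # R_X3: depends on (x, y)

r_x1 = rng.binomial(1, p_rx1[x])
r_x2 = rng.binomial(1, p_rx2[x, z])
r_x3 = rng.binomial(1, p_rx3[x, y])

# What the analyst sees under R_X2: X is blanked out wherever R_X = 1.
x_obs = np.where(r_x2 == 1, np.nan, x)
print(r_x1.mean(), r_x2.mean(), r_x3.mean())
```

The analogous mechanisms for $Y$ and $Z$ can be simulated the same way by changing which variables index the missingness probabilities.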
Similarly, we introduce the following three missing mechanisms of $Y$.
  • ($R_Y1$) $R_Y$ depends on $Y$, and $R_Y$ is independent of $(X, Z)$, given $Y$, which means:
    $P(R_Y = 1 \mid Z = z, X = x, Y = y) = P(R_Y = 1 \mid Y = y)$,
  • ($R_Y2$) $R_Y$ depends on $(Y, Z)$, and $R_Y$ is independent of $X$, given $(Y, Z)$, which means:
    $P(R_Y = 1 \mid Z = z, X = x, Y = y) = P(R_Y = 1 \mid Z = z, Y = y)$, or
    $P(R_Y(z) = 1 \mid X = x, Y(z) = y) = P(R_Y(z) = 1 \mid Z = z, Y = y)$,
  • ($R_Y3$) $R_Y$ depends on $(X, Y)$, and $R_Y$ is independent of $Z$, given $(X, Y)$, which means:
    $P(R_Y = 1 \mid Z = z, X = x, Y = y) = P(R_Y = 1 \mid X = x, Y = y)$.
Analogously, for the first missing mechanism of $Y$, we assume that the missingness of $Y$ depends only on $Y$. For the second, we assume that the missingness of $Y$ depends on $(Y, Z)$. For the third, we assume that the missingness of $Y$ depends on $(X, Y)$. All three missing mechanisms are also nonignorable.
Lastly, we introduce three missing mechanisms of $Z$.
  • ($R_Z1$) $R_Z$ depends on $Z$, and $R_Z$ is independent of $(X, Y)$, given $Z$, which means:
    $P(R_Z = 1 \mid Z = z, X = x, Y = y) = P(R_Z = 1 \mid Z = z)$,
  • ($R_Z2$) $R_Z$ depends on $(Y, Z)$, and $R_Z$ is independent of $X$, given $(Y, Z)$, which means:
    $P(R_Z = 1 \mid Z = z, X = x, Y = y) = P(R_Z = 1 \mid Z = z, Y = y)$,
  • ($R_Z3$) $R_Z$ depends on $(X, Z)$, and $R_Z$ is independent of $Y$, given $(X, Z)$, which means:
    $P(R_Z = 1 \mid Z = z, X = x, Y = y) = P(R_Z = 1 \mid X = x, Z = z)$.
For the first missing mechanism of $Z$, we assume that the missingness of $Z$ depends only on $Z$. For the second, we assume that the missingness of $Z$ depends on $(Y, Z)$. For the third, we assume that the missingness of $Z$ depends on $(X, Z)$.
The missing mechanisms mentioned above are all MNAR. Each of them assumes that the missing data indicator satisfies a conditional independence relationship. In the next section, we consider whether the TBR and THR can be identified under these missing mechanisms.

4. Identifiability of TBR and THR

In this section, we discuss the identifiability of the TBR and THR when one of $X$, $Y$, and $Z$ has missing data. Under some mechanisms, we have to identify the joint distribution $P(X, Y, Z, R_X)$, $P(X, Y, Z, R_Y)$, or $P(X, Y, Z, R_Z)$ to ensure the identifiability of the TBR and THR. All of the following theorems are stated under Assumptions 1 and 2. Recall that $X$ has $K$ levels, and $Y$ and $Z$ are both binary.
Firstly, we give sufficient conditions under which we can identify the TBR and THR when covariate X has missing data.
Theorem 1.
When $X$ has missing data:
(1) 
Under the missing mechanism $R_X1$, the TBR and THR are identifiable when $r(A_1) = r(\bar{A}_1) = K$, where $A_1$ and $\bar{A}_1$ are two matrices defined below, and $r(\cdot)$ is the rank function.
(2) 
Under the missing mechanism $R_X2$, the TBR and THR are identifiable when $K = 2$ and $X \not\perp Y \mid (R_X = 0, Z)$.
(3) 
Under the missing mechanism $R_X3$, the TBR and THR are identifiable when $K = 2$ and $X \not\perp Z \mid (R_X = 0, Y)$.
When $X$ has missing data, the identification conditions differ across missing mechanisms. Under the first missing mechanism, to identify the TBR and THR, we need to assume that the rank of both $A_1$ and $\bar{A}_1$ is $K$. $A_1$ and $\bar{A}_1$ are defined as follows.
$$A_1 = \begin{pmatrix}
p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} \\
p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} \\
p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} \\
p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11}
\end{pmatrix}
\quad \text{and} \quad
\bar{A}_1 = \begin{pmatrix}
p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} & 1 \\
p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} & 1 \\
p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} & 1 \\
p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} & 1
\end{pmatrix},$$
where $p_{x_i 0 \mid yz} = P(X = x_i, R_X = 0 \mid Y = y, Z = z)$. Note that $A_1$ is a matrix with 4 rows and $K$ columns, and $\bar{A}_1$ is a matrix with 4 rows and $K + 1$ columns. Since the rank of a matrix with 4 rows is at most 4, requiring the rank of $A_1$ and $\bar{A}_1$ to equal $K$ implies that $K$ must be less than or equal to 4. Under the second missing mechanism, if the covariate and endpoint are not conditionally independent, given $R_X = 0$ and $Z$, and the covariate has only two levels, we can identify the TBR and THR. Under the third missing mechanism, if the covariate and treatment are not conditionally independent, given $R_X = 0$ and $Y$, and the covariate has only two levels, we can also identify the TBR and THR.
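The rank condition for $R_X1$ can be checked numerically on a population model. The sketch below is our reading of why the condition is natural, not a quotation of the appendix proof: under $R_X1$, $P(X = x, R_X = 0 \mid y, z) = P(R_X = 0 \mid X = x)\, P(X = x \mid y, z)$, so the vector $\theta$ with $\theta_x = 1 / P(R_X = 0 \mid X = x)$ solves $A_1 \theta = \mathbf{1}$, and equal ranks $K$ for $A_1$ and $\bar{A}_1$ mean this linear system has a unique solution. All parameter values are hypothetical.

```python
import numpy as np

# Hypothetical population model with binary X (K = 2).
p_x = np.array([0.5, 0.5])                 # P(X = x)
p_z = np.array([0.5, 0.5])                 # P(Z = z), randomized
p_y1 = np.array([[0.4, 0.5], [0.8, 0.4]])  # P(Y = 1 | X = x, Z = z), indexed [x, z]
p_obs = np.array([0.6, 0.7])               # P(R_X = 0 | X = x), mechanism R_X1

p_y = np.stack([1 - p_y1, p_y1], axis=1)   # P(Y = y | x, z), indexed [x, y, z]
p_xyz = p_x[:, None, None] * p_z[None, None, :] * p_y
p_x_given_yz = p_xyz / p_xyz.sum(axis=0, keepdims=True)
p_x0_yz = p_obs[:, None, None] * p_x_given_yz   # P(X = x, R_X = 0 | y, z)

# Rows of A_1 follow the order (y, z) = 00, 10, 01, 11; columns follow x.
rows = [(0, 0), (1, 0), (0, 1), (1, 1)]
A1 = np.array([[p_x0_yz[x, y, z] for x in range(2)] for (y, z) in rows])
A1_bar = np.hstack([A1, np.ones((4, 1))])

rank_A1 = np.linalg.matrix_rank(A1)
rank_A1_bar = np.linalg.matrix_rank(A1_bar)

# Solve A_1 theta = 1 and invert to recover P(R_X = 0 | X = x).
theta, *_ = np.linalg.lstsq(A1, np.ones(4), rcond=None)
recovered = 1.0 / theta
print(rank_A1, rank_A1_bar, recovered)
```

With estimated rather than population probabilities, sampling noise will typically make $\bar{A}_1$ numerically full rank, so a check like this is informative about the model rather than a test to run on raw sample frequencies.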
Next, we give sufficient conditions under which we can identify the TBR and THR when endpoints Y have missing data.
Theorem 2.
When $Y$ has missing data:
(1) 
Under the missing mechanism $R_Y1$, the TBR and THR are identifiable under the condition $r(B_1) = r(\bar{B}_1) = 2$, where $B_1$ and $\bar{B}_1$ are two matrices defined below, and $r(\cdot)$ is the rank function.
(2) 
Under the missing mechanism $R_Y2$, the TBR and THR are identifiable under the condition $r(B_{21}) = r(\bar{B}_{21}) = r(B_{22}) = r(\bar{B}_{22}) = 2$, where $B_{21}$, $\bar{B}_{21}$, $B_{22}$, and $\bar{B}_{22}$ are matrices defined below, and $r(\cdot)$ is the rank function.
(3) 
Under the missing mechanism $R_Y3$, the TBR and THR are identifiable under the condition $Y \not\perp Z \mid (R_Y = 0, X = x_i)$ $(i = 1, 2, \ldots, K)$.
When $Y$ has missing data, we cannot obtain a uniform identifiability condition; different missing mechanisms require different conditions to ensure the identification of the TBR and THR. Under the first missing mechanism, to identify the TBR and THR, we need to assume that the rank of both $B_1$ and $\bar{B}_1$ is 2. $B_1$ and $\bar{B}_1$ are defined as follows.
$$B_1 = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\
\vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1}
\end{pmatrix}
\quad \text{and} \quad
\bar{B}_1 = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1
\end{pmatrix},$$
where $p_{y0 \mid x_i z} = P(Y = y, R_Y = 0 \mid X = x_i, Z = z)$. Under the second missing mechanism, to identify the TBR and THR, we need to assume that the rank of each of $B_{21}$, $\bar{B}_{21}$, $B_{22}$, and $\bar{B}_{22}$ is 2. These matrices are defined as follows.
$$B_{21} = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\
\vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0}
\end{pmatrix}, \qquad
\bar{B}_{21} = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1
\end{pmatrix},$$
$$B_{22} = \begin{pmatrix}
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\
\vdots & \vdots \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1}
\end{pmatrix}, \qquad
\bar{B}_{22} = \begin{pmatrix}
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1
\end{pmatrix},$$
where $p_{y0 \mid x_i z} = P(Y = y, R_Y = 0 \mid X = x_i, Z = z)$. Under the last missing mechanism, we can identify the TBR and THR if $Y$ and $Z$ are not conditionally independent, given $R_Y = 0$ and $X = x_i$.
Lastly, we give sufficient conditions under which we can identify the TBR and THR when treatment Z has missing data.
Theorem 3.
When $Z$ has missing data:
(1) 
Under the missing mechanism $R_Z1$, the TBR and THR are identifiable under the condition $r(C_1) = r(\bar{C}_1) = 2$, where $C_1$ and $\bar{C}_1$ are two matrices defined below, and $r(\cdot)$ is the rank function.
(2) 
Under the missing mechanism $R_Z2$, the TBR and THR are identifiable under the condition $r(C_{21}) = r(\bar{C}_{21}) = r(C_{22}) = r(\bar{C}_{22}) = 2$, where $C_{21}$, $\bar{C}_{21}$, $C_{22}$, and $\bar{C}_{22}$ are matrices defined below, and $r(\cdot)$ is the rank function.
(3) 
Under the missing mechanism $R_Z3$, the TBR and THR are identifiable under the condition $Z \not\perp Y \mid X = x_i$ $(i = 1, 2, \ldots, K)$.
When $Z$ has missing data, we also cannot obtain a uniform identifiability condition; different missing mechanisms require different conditions to ensure the identification of the TBR and THR. Under the first missing mechanism, to identify the TBR and THR, we need to assume that the rank of both $C_1$ and $\bar{C}_1$ is 2. $C_1$ and $\bar{C}_1$ are defined as follows.
$$C_1 = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\
\vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1}
\end{pmatrix}
\quad \text{and} \quad
\bar{C}_1 = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1
\end{pmatrix},$$
where $p_{z0 \mid x_i y} = P(Z = z, R_Z = 0 \mid X = x_i, Y = y)$. Under the second missing mechanism, to identify the TBR and THR, we need to assume that the rank of each of $C_{21}$, $\bar{C}_{21}$, $C_{22}$, and $\bar{C}_{22}$ is 2. These matrices are defined as follows.
$$C_{21} = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\
\vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0}
\end{pmatrix}, \qquad
\bar{C}_{21} = \begin{pmatrix}
p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1
\end{pmatrix},$$
$$C_{22} = \begin{pmatrix}
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\
\vdots & \vdots \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1}
\end{pmatrix}, \qquad
\bar{C}_{22} = \begin{pmatrix}
p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\
\vdots & \vdots & \vdots \\
p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1
\end{pmatrix},$$
where $p_{z0 \mid x_i y} = P(Z = z, R_Z = 0 \mid X = x_i, Y = y)$. Under the last missing mechanism, we can identify the TBR and THR if $Y$ and $Z$ are not conditionally independent, given $X = x_i$, as stated in part (3) of Theorem 3.
The above three theorems give sufficient conditions under which the TBR and THR can be identified. In the next two sections, we illustrate our conclusions through simulation and real data.

5. Computational Details and Simulation Study

In this section, we first describe how to use the EM algorithm to estimate the TBR and THR when the covariate $X$ has missing data and satisfies missing mechanism $R_X2$. When $X$ satisfies other missing mechanisms or there are missing data in the other two variables, the estimation is similar. Next, we generate simulated data and apply our method to them to show that our estimation works. We use the statistical software R to implement our numerical simulation.

5.1. Expectation Maximization Algorithms

We define $P_{xyzr} = P(X = x, Y = y, Z = z, R_X = r)$ and $P_{+yzr} = P(Y = y, Z = z, R_X = r)$, where “+” denotes marginalization over the corresponding variable. Similarly, let $N_{xyzr}$ denote the observed frequency in the cell $(x, y, z, r)$ of the contingency table, and $N_{+yzr}$ denote the marginal frequency of the contingency table summed over $X$. When “+” is at another position, its meaning is analogous.
In practice, we can use the expectation maximization (EM) algorithm to find the MLEs. In this subsection, we only describe the computational details for missing mechanism $R_X2$. For simplicity, we only describe the algorithm for binary $X$. The algorithm for multi-categorical $X$ can be written similarly. Under the missing mechanism $R_X2$, we have $R_X \perp Y \mid (X, Z)$. Thus, the joint distribution of $(X, Y, Z, R_X)$ can be written as $P^{(j)}(X = x, Y = y, Z = z, R_X = r) = P^{(j)}(X = x)\, P^{(j)}(Z = z \mid X = x)\, P^{(j)}(Y = y \mid X = x, Z = z)\, P^{(j)}(R_X = r \mid X = x, Z = z)$, where the superscript $j$ indicates the $j$-th iteration. Define:
$$p^{(j)}_{x \mid yz1} = P^{(j)}(X = x \mid Y = y, Z = z, R_X = 1) = \frac{P^{(j)}(X = x, Y = y, Z = z, R_X = 1)}{\sum_{x' = 0, 1} P^{(j)}(X = x', Y = y, Z = z, R_X = 1)}.$$
The EM algorithm iterates between the following E-step and M-step:
(a)
E-step: The sufficient statistics are imputed as $N^{(j)}_{xyz0} = N_{xyz0}$ and $N^{(j)}_{xyz1} = N_{+yz1}\, p^{(j)}_{x \mid yz1}$;
(b)
M-step: The joint distribution is updated by $$P^{(j+1)}(X = x, Y = y, Z = z, R_X = r) = \frac{N^{(j)}_{x+++}}{N^{(j)}_{++++}} \cdot \frac{N^{(j)}_{x+z+}}{N^{(j)}_{x+++}} \cdot \frac{N^{(j)}_{xyz+}}{N^{(j)}_{x+z+}} \cdot \frac{N^{(j)}_{x+zr}}{N^{(j)}_{x+z+}}.$$
After the algorithm converges, we denote the converged probability estimate by $\hat{P}$. According to the formula in Section 2, we can estimate the TBR and THR as follows.
$$\begin{aligned}
\widehat{TBR} &= \hat{P}(Y = 0 \mid X = 1, Z = 0)\, \hat{P}(Y = 1 \mid X = 1, Z = 1)\, \hat{P}(X = 1) + \hat{P}(Y = 0 \mid X = 0, Z = 0)\, \hat{P}(Y = 1 \mid X = 0, Z = 1)\, \hat{P}(X = 0), \\
\widehat{THR} &= \hat{P}(Y = 1 \mid X = 1, Z = 0)\, \hat{P}(Y = 0 \mid X = 1, Z = 1)\, \hat{P}(X = 1) + \hat{P}(Y = 1 \mid X = 0, Z = 0)\, \hat{P}(Y = 0 \mid X = 0, Z = 1)\, \hat{P}(X = 0).
\end{aligned}$$
Lastly, we calculate the standard errors of the above estimators by repeating the process 1000 times.
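The E-step and M-step for binary $X$ under $R_X2$ can be sketched in a few lines. The code below is an illustrative Python sketch rather than the authors' R implementation: the function names `em_rx2` and `tbr_thr`, the uniform starting value, and the fixed number of iterations are our assumptions. The counts are those reported for the defibrillator trial in Section 6.1, used here only as example input; we do not claim that this sketch reproduces the estimates reported there.

```python
import numpy as np

def em_rx2(n_obs, n_mis, n_iter=500):
    """EM sketch for binary X missing under R_X2.

    n_obs[x, y, z]: counts with X observed (R_X = 0).
    n_mis[y, z]:    counts with X missing  (R_X = 1).
    Returns the fitted joint distribution indexed [x, y, z, r].
    """
    n_obs = np.asarray(n_obs, dtype=float)
    n_mis = np.asarray(n_mis, dtype=float)
    joint = np.full((2, 2, 2, 2), 1.0 / 16.0)   # uniform start (arbitrary choice)
    for _ in range(n_iter):
        # E-step: split each missing-X cell over x using P(X = x | y, z, R_X = 1).
        post = joint[:, :, :, 1] / joint[:, :, :, 1].sum(axis=0, keepdims=True)
        n_full = np.stack([n_obs, n_mis[None, :, :] * post], axis=-1)
        # M-step: P(x) P(z | x) P(y | x, z) P(r | x, z) from the completed table.
        n_x = n_full.sum(axis=(1, 2, 3))    # N_{x+++}
        n_xz = n_full.sum(axis=(1, 3))      # N_{x+z+}
        n_xyz = n_full.sum(axis=3)          # N_{xyz+}
        n_xzr = n_full.sum(axis=1)          # N_{x+zr}
        p_x = n_x / n_full.sum()
        p_z_x = n_xz / n_x[:, None]
        p_y_xz = n_xyz / n_xz[:, None, :]
        p_r_xz = n_xzr / n_xz[:, :, None]
        joint = (p_x[:, None, None, None] * p_z_x[:, None, :, None]
                 * p_y_xz[:, :, :, None] * p_r_xz[:, None, :, :])
    return joint

def tbr_thr(joint):
    """TBR and THR from a fitted joint distribution indexed [x, y, z, r]."""
    p_xyz = joint.sum(axis=3)
    p_y_xz = p_xyz / p_xyz.sum(axis=1, keepdims=True)   # P(Y = y | x, z)
    p_x = joint.sum(axis=(1, 2, 3))
    tbr = np.sum(p_x * p_y_xz[:, 0, 0] * p_y_xz[:, 1, 1])
    thr = np.sum(p_x * p_y_xz[:, 1, 0] * p_y_xz[:, 0, 1])
    return tbr, thr

# Counts from Section 6.1: n_obs[x][y][z] = N_{xyz0}, n_mis[y][z] = N_{+yz1}.
n_obs = [[[4, 311], [0, 62]],
         [[6, 190], [2, 20]]]
n_mis = [[382, 136],
         [95, 23]]
joint = em_rx2(n_obs, n_mis)
tbr, thr = tbr_thr(joint)
print(tbr, thr)
```

A fixed iteration count keeps the sketch short; a practical implementation would stop when the change in the joint distribution or in the observed-data log-likelihood falls below a tolerance, and would try several starting values.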

5.2. Simulation Study

In this section, we evaluate the finite sample performance of the likelihood-based estimator for the missing mechanisms $R_X2$ and $R_Y3$ via simulation studies. In order to mimic the real data analyzed in the next section, we assume that $Z$ is completely randomized and $Z \perp X$. We generate $Z \sim \text{Bernoulli}(0.5)$ and $X \sim \text{Bernoulli}(0.5)$. We define $P_{y \mid xz} = P(Y = y \mid Z = z, X = x)$, and $Y$ is generated according to the conditional distribution $(P_{1 \mid 00}, P_{1 \mid 10}, P_{1 \mid 01}, P_{1 \mid 11})$. We set the parameters of the two missing mechanisms as follows.
  • ($R_X2$):
    $(P_{1 \mid 00}, P_{1 \mid 01}, P_{1 \mid 10}, P_{1 \mid 11}) = (0.4, 0.5, 0.8, 0.4)$, $P(R_X = 0 \mid X = 1, Z = 1) = 0.7$, $P(R_X = 0 \mid X = 0, Z = 1) = 0.4$, $P(R_X = 0 \mid X = 1, Z = 0) = 0.5$, $P(R_X = 0 \mid X = 0, Z = 0) = 0.6$.
  • ($R_Y3$):
    $(P_{1 \mid 00}, P_{1 \mid 01}, P_{1 \mid 10}, P_{1 \mid 11}) = (0.4, 0.5, 0.5, 0.6)$, $P(R_Y = 0 \mid X = 1, Y = 1) = 0.4$, $P(R_Y = 0 \mid X = 0, Y = 1) = 0.3$, $P(R_Y = 0 \mid X = 1, Y = 1) = 0.4$, $P(R_Y = 0 \mid X = 0, Y = 1) = 0.3$.
We use the EM algorithm to find the MLEs of the parameters and calculate the corresponding TBR and THR. The sample sizes of the simulation study are 500, 1000, and 1500, and we repeat the simulation 1000 times for each. The means and the standard errors of the estimates of the TBR and THR are given in Table 1 and Table 2.
We can see from the simulation results that the TBR and THR are estimated accurately, which is consistent with their identifiability. As the sample size increases, the standard deviation of the estimates decreases gradually.
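For reference when reading the simulation tables, the target values of the TBR and THR implied by the two parameter settings can be computed from the complete-data formula of Section 2. This is a small sketch under one assumption about notation: we read $P_{1 \mid xz}$ with the first conditioning index as $x$ and the second as $z$, as the definition $P_{y \mid xz}$ suggests. The helper name `true_tbr_thr` is ours.

```python
import numpy as np

def true_tbr_thr(p_y1, p_x):
    """Population TBR and THR from P(Y = 1 | X = x, Z = z), indexed [x, z]."""
    p_y1 = np.asarray(p_y1, dtype=float)
    p_x = np.asarray(p_x, dtype=float)
    tbr = np.sum(p_x * (1 - p_y1[:, 0]) * p_y1[:, 1])   # fail under control, respond under treatment
    thr = np.sum(p_x * p_y1[:, 0] * (1 - p_y1[:, 1]))   # respond under control, fail under treatment
    return tbr, thr

p_x = [0.5, 0.5]                                   # X ~ Bernoulli(0.5)
tbr_rx2, thr_rx2 = true_tbr_thr([[0.4, 0.5], [0.8, 0.4]], p_x)
tbr_ry3, thr_ry3 = true_tbr_thr([[0.4, 0.5], [0.5, 0.6]], p_x)
print(tbr_rx2, thr_rx2)   # approximately 0.19 and 0.34
print(tbr_ry3, thr_ry3)   # approximately 0.30 and 0.20
```

Under this reading, an estimator that behaves well should produce simulation means close to these targets as the sample size grows.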

6. Application

In this section, we illustrate our method with three real data examples.

6.1. Application to the Second Multicenter Automatic Defibrillator Intervention Trial

In this section, we re-analyze a randomized clinical trial using the newly proposed methods under the missing mechanism $R_X2$. We first briefly review the background of the illustrative clinical trial; more details of the data can be found in the previous paper ([8]). In this example, $Z$ is the treatment assignment variable, with $Z = 1$ denoting the treatment (implantable cardiac defibrillator) and $Z = 0$ denoting the control. The endpoint $Y$ is the death indicator, with $Y = 1$ denoting dead and $Y = 0$ denoting alive. Let $X$ denote the inducible indicator, with $X = 1$ denoting inducible and $X = 0$ denoting noninducible. The covariate $X$ is obtained from the electrophysiological stimulation (EPS) testing. Because the EPS testing is invasive and not a prerequisite for enrollment in the study, 79.3% of patients in the implantable cardiac defibrillator arm have EPS records, whereas only 2.4% of patients in the control arm have EPS records. Therefore, the problem of missing data for the covariate $X$ is very severe. The observed data can be summarized as the following counts ($N_{xyzr_X}$): $N_{0000} = 4$, $N_{0010} = 311$, $N_{1000} = 6$, $N_{1010} = 190$, $N_{0100} = 0$, $N_{0110} = 62$, $N_{1100} = 2$, $N_{1110} = 20$, $N_{+001} = 382$, $N_{+101} = 95$, $N_{+011} = 136$, and $N_{+111} = 23$. We assume that the missing mechanism of $X$ is $R_X2$. First, we use the EM algorithm to calculate the maximum likelihood estimates of the parameters and then calculate the TBR and THR. Then, the sampling is repeated 1500 times to calculate the standard deviation of the TBR and THR. The estimated TBR and THR are 0.1121 (0.0121) and 0.1697 (0.0175). The numbers in brackets indicate the standard deviation.

6.2. Application to the Mechanical Treatment Trial for Crisis Patients

In this section, we re-analyze a randomized clinical trial using the newly proposed methods under missing mechanism $R_Y3$. We first briefly review the background of the trial ([9]). In this example, critically ill patients receiving mechanical ventilation were randomized 1:1 within each study site to management with either a paired sedation plus ventilator weaning protocol involving the daily interruption of sedatives through spontaneous awakening trials (SATs) and spontaneous breathing trials (SBTs), or sedation per usual care (UC) and SBTs. $Z$ is the treatment assignment variable, with $Z = 1$ denoting the treatment (SAT and SBT) and $Z = 0$ denoting the control (UC and SBT). The endpoint $Y$ is the cognitive score, with $Y = 1$ denoting “higher cognitive ability” and $Y = 0$ denoting “lower cognitive ability”. Let $X$ denote age, with $X = 1$ denoting “the people older than 33 years old” and $X = 0$ denoting “the people younger than 33 years old”. In randomized studies involving severely ill patients, functional endpoints are often unobserved due to missed clinic visits, premature withdrawal, or death. The observed data can be summarized as the following counts ($N_{xyzr_Y}$): $N_{0000} = 9$, $N_{0100} = 9$, $N_{1000} = 7$, $N_{1100} = 0$, $N_{0010} = 12$, $N_{0110} = 16$, $N_{1010} = 10$, $N_{1110} = 6$, $N_{0+01} = 24$, $N_{1+01} = 45$, $N_{0+11} = 23$, and $N_{1+11} = 26$. We assume that the missing mechanism of $Y$ is $R_Y3$. Similarly, we use the EM algorithm to calculate the maximum likelihood estimates of the parameters and then calculate the TBR and THR. Then, we use the bootstrap method with 1500 resamples to calculate the standard deviation of the TBR and THR. The estimated TBR and THR are 0.270 (0.043) and 0.230 (0.039). The numbers in brackets indicate the standard deviation.

6.3. Application to the Job Search Intervention Study

In this section, we analyze a randomized trial using the proposed methods under the missing mechanism $R_Z^1$. We first introduce the background of the data. The Job Search Intervention Study (JOBS II) was a randomized field experiment that investigated the efficacy of a job training intervention for unemployed workers [20]. There are 899 unemployed workers in the "jobs" dataset. All workers were randomly assigned to one of two groups: the control group, which received a booklet describing the job search process, or the treatment group, which participated in job skills workshops. The binary endpoint represents whether the respondent had become employed. $Z$ is the treatment assignment variable, with $Z = 1$ denoting the treatment (job skills workshops) and $Z = 0$ denoting the control (booklet). $Y$ denotes the endpoint, with $Y = 1$ denoting that the worker eventually became employed (the favorable outcome) and $Y = 0$ denoting that the worker was still unemployed. Additionally, $X$ denotes sex, with $X = 0$ for female and $X = 1$ for male. The complete data can be summarized as the following counts ($N_{xyz}$): $N_{111} = 211$, $N_{011} = 182$, $N_{101} = 99$, $N_{100} = 38$, $N_{110} = 134$, $N_{010} = 79$, $N_{000} = 48$, and $N_{001} = 108$. Based on these data, we set $P(R_Z = 1 \mid Z = 1) = 0.3$ and $P(R_Z = 1 \mid Z = 0) = 0.2$ and artificially generate missing values of $Z$. The generated data can be summarized as the following counts ($N_{xyzr_z}$, where a "+" in place of $z$ marks workers whose $Z$ is unobserved): $N_{00+1} = 47$, $N_{01+1} = 63$, $N_{10+1} = 41$, $N_{11+1} = 95$, $N_{0000} = 35$, $N_{0100} = 68$, $N_{1000} = 24$, $N_{1100} = 102$, $N_{0010} = 74$, $N_{0110} = 130$, $N_{1010} = 72$, and $N_{1110} = 148$. As in the previous applications, we use the EM algorithm to compute the maximum likelihood estimates of the parameters and then compute the TBR and THR. We use the bootstrap with 1500 resamples to calculate the standard deviations of the estimated TBR and THR. The estimated TBR and THR are 0.2090 (0.0160) and 0.1870 (0.0163), where the numbers in parentheses are the standard deviations.
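The artificial deletion of treatment indicators described above can be sketched as follows. This is a minimal sketch, assuming that each worker's $Z$ is deleted independently with the stated probabilities; the function name `mask_treatment` and the seed are hypothetical, so the generated table will generally differ from the one reported in the text.

```python
import numpy as np

# Complete counts N_xyz from the text, stored as full_counts[x, y, z].
cells = {(1, 1, 1): 211, (0, 1, 1): 182, (1, 0, 1): 99, (1, 0, 0): 38,
         (1, 1, 0): 134, (0, 1, 0): 79, (0, 0, 0): 48, (0, 0, 1): 108}
full_counts = np.zeros((2, 2, 2), dtype=int)
for (x, y, z), n in cells.items():
    full_counts[x, y, z] = n

def mask_treatment(full, p_missing, seed=0):
    """Delete Z at random: p_missing[z] = P(R_Z = 1 | Z = z)."""
    rng = np.random.default_rng(seed)
    # p_missing is broadcast along the last axis, which indexes z.
    n_missing = rng.binomial(full, p_missing)
    observed = full - n_missing          # counts N_{xyz0}
    missing = n_missing.sum(axis=2)      # counts N_{xy+1}
    return observed, missing

observed, missing = mask_treatment(full_counts, np.array([0.2, 0.3]))
```

Because the deletion probability depends only on $Z$, the generated data follow the mechanism in which $R_Z$ is independent of $(X, Y)$ given $Z$.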

7. Discussion

In the field of causal inference, the average causal effect is an important measure, but it has a flaw: it ignores heterogeneous responses to treatment in the target population. Therefore, in this article, we study the TBR and THR proposed by [4]. In addition, missing data are common in randomized experiments [21], so we assume that one of the covariate, endpoint, or treatment is subject to missingness. We give sufficient conditions under which the TBR and THR are identifiable in the presence of missing data. We illustrate our method with simulated data and then apply it to several real data sets.
There are several issues beyond the scope of this paper. First, in Assumption 2, we assume that, given the covariate $X$, the two potential outcomes $Y_0$ and $Y_1$ are conditionally independent. This assumption also appeared in [4]. However, only one of the two potential outcomes can be observed for each individual, which means that Assumption 2 cannot be verified from the data. Thus, it would be better to find a more appropriate assumption under which the TBR and THR can be identified.
Second, in our article, we assume that the covariate $X$ in Assumption 1 is a binary one-dimensional variable. However, in practice, $X$ may be continuous or high-dimensional, and some components of $X$ may be unobservable. In this case, even if there are no missing data, it is very difficult to identify the TBR and THR because the observations in each subgroup may be very sparse in a finite sample. If, in addition, there are missing data, new conditions are needed for the TBR and THR to be identified.
Third, we discussed the situation where only one of the covariate, endpoint, and treatment variables may be missing not at random (MNAR). In many applications, both the covariate and the endpoint may be MNAR at the same time. In this case, the identification and estimation of the TBR and THR will be more complicated.
Although the problems mentioned above are beyond the scope of this article, we will continue our research in this area.

Author Contributions

Conceptualization, P.L.; methodology, Y.H.; software, L.Z.; formal analysis, L.Z.; investigation, Y.H.; writing—original draft preparation, Y.H.; writing—review and editing, P.L.; supervision, P.L.; project administration, P.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (12001380).

Data Availability Statement

The data was found in the R software package.

Acknowledgments

We would like to thank the Editor, Associate Editor, and three reviewers for their very valuable comments and suggestions, which led to a significant improvement of our article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs of Theorems

Note that:
$$\mathrm{TBR} = E_X\left[\Pr(Y(0) = 0, Y(1) = 1 \mid X)\right] = E_X\left[\Pr(Y = 0 \mid X, Z = 0)\,\Pr(Y = 1 \mid X, Z = 1)\right],$$
$$\mathrm{THR} = E_X\left[\Pr(Y(0) = 1, Y(1) = 0 \mid X)\right] = E_X\left[\Pr(Y = 1 \mid X, Z = 0)\,\Pr(Y = 0 \mid X, Z = 1)\right].$$
From the above formulas, we can see that as long as we identify $P(Y \mid X, Z)$ and $P(X)$, we can identify the TBR and THR. As such, in the following proofs, the primary task is to prove that $P(Y \mid X, Z)$ and $P(X)$ are identifiable. In our proofs, uppercase letters denote random variables and lowercase letters denote their values.
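The two formulas above translate directly into a plug-in computation once $P(X)$ and $P(Y \mid X, Z)$ are available. This is a minimal sketch for a discrete covariate; the function name `tbr_thr` and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def tbr_thr(p_x, p_y1):
    """Plug-in TBR and THR.

    p_x[x]     = P(X = x)
    p_y1[x, z] = P(Y = 1 | X = x, Z = z)
    """
    p_x = np.asarray(p_x, dtype=float)
    p_y1 = np.asarray(p_y1, dtype=float)
    # TBR: Y = 0 under control (z = 0) and Y = 1 under treatment (z = 1).
    tbr = float(np.sum(p_x * (1.0 - p_y1[:, 0]) * p_y1[:, 1]))
    # THR: Y = 1 under control and Y = 0 under treatment.
    thr = float(np.sum(p_x * p_y1[:, 0] * (1.0 - p_y1[:, 1])))
    return tbr, thr

# Hypothetical inputs for a binary covariate.
p_x = [0.4, 0.6]
p_y1 = [[0.3, 0.6],
        [0.5, 0.7]]
tbr, thr = tbr_thr(p_x, p_y1)
```

With these inputs, the difference `tbr - thr` equals the average effect $\sum_x P(X = x)\,[P(Y = 1 \mid x, Z = 1) - P(Y = 1 \mid x, Z = 0)]$, which is a convenient sanity check on any implementation.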
Proof of Theorem 1.
We will use the following notation in the proof of Theorem 1:
$$p_{x r_x \mid yz} = P(X = x, R_X = r_x \mid Y = y, Z = z), \quad p_{r_x \mid x} = P(R_X = r_x \mid X = x), \quad p_{r_x \mid xz} = P(R_X = r_x \mid X = x, Z = z), \quad p_{r_x \mid xy} = P(R_X = r_x \mid X = x, Y = y).$$
(1): Because $(Y, Z) \perp\!\!\!\perp R_X \mid X$, we have $P(X, Y, Z, R_X) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X, Y, Z) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X)$. Dividing both sides by $P(Y, Z)\,P(R_X \mid X)$ and setting $R_X = 0$, we get:
$$\frac{P(X, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X)} = P(X \mid Y, Z),$$
and
$$\sum_{i=1}^{K} \frac{P(X = x_i, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X = x_i)} = \sum_{i=1}^{K} P(X = x_i \mid Y, Z) = 1.$$
To identify $P(X)$ and $P(Y \mid X, Z)$, we first have to identify $P(X, Y, Z, R_X)$ and then calculate the marginal distribution of $X$ and the conditional distribution of $Y$ given $X, Z$. In the equation above, $P(Y, Z)$ is identifiable, so we have to prove that $P(X \mid Y, Z)$ and $P(R_X = 0 \mid X)$ are also identifiable. Writing the above formula in matrix form, we get the following equations:
$$\begin{pmatrix} p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} \\ p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} \\ p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} \\ p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1} \\ 1/p_{0 \mid x_2} \\ \vdots \\ 1/p_{0 \mid x_K} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$
To identify $P(X \mid Y, Z)$, the above equation should have only one solution. Notice that the equation is linear and has $K$ unknowns, namely $1/p_{0 \mid x_1}, 1/p_{0 \mid x_2}, \ldots, 1/p_{0 \mid x_K}$. Let the coefficient matrix and the augmented matrix of this equation be $A_1$ and $\bar{A}_1$, respectively, that is:
$$A_1 = \begin{pmatrix} p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} \\ p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} \\ p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} \\ p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} \end{pmatrix} \quad \text{and} \quad \bar{A}_1 = \begin{pmatrix} p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} & 1 \\ p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} & 1 \\ p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} & 1 \\ p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} & 1 \end{pmatrix}.$$
According to the theory of linear equations, when the rank of the coefficient matrix, the rank of the augmented matrix, and the number of unknowns are the same ($r(A_1) = r(\bar{A}_1) = K$), the equation has a unique solution. After identifying $P(R_X \mid X)$, according to the above formula, $P(X \mid Y, Z)$ can also be identified, and then the entire joint distribution can be identified. As such, the TBR and THR can also be identified.
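The rank condition in part (1) can be checked numerically. This is a minimal sketch with a hypothetical function name `solve_observation_probs` and hypothetical probabilities: the coefficient matrix is built from made-up values of $P(X \mid Y, Z)$ and $P(R_X = 0 \mid X)$, and the solver then recovers the latter from the observable quantities alone.

```python
import numpy as np

def solve_observation_probs(A):
    """Solve A u = 1 with u_i = 1 / P(R_X = 0 | X = x_i), if the solution is unique.

    A[row, i] = P(X = x_i, R_X = 0 | Y = y, Z = z), one row per (y, z) pair.
    """
    A = np.asarray(A, dtype=float)
    n_rows, K = A.shape
    ones = np.ones(n_rows)
    rank_coef = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.column_stack([A, ones]))
    if not (rank_coef == rank_aug == K):
        raise ValueError("rank condition fails: no unique solution")
    u, *_ = np.linalg.lstsq(A, ones, rcond=None)
    return 1.0 / u

# Hypothetical truth for K = 2: rows are (y, z) = (0,0), (1,0), (0,1), (1,1).
p_x_given_yz = np.array([[0.3, 0.7],
                         [0.6, 0.4],
                         [0.5, 0.5],
                         [0.2, 0.8]])
p_observed = np.array([0.8, 0.5])   # P(R_X = 0 | X = x_i)
A1 = p_x_given_yz * p_observed      # what the observed data identify
recovered = solve_observation_probs(A1)
```

With estimated rather than exact probabilities, the augmented matrix will typically have rank $K + 1$ because of sampling noise, so in practice the check would be applied with a tolerance or replaced by a likelihood-based fit.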
(2): Because $Y \perp\!\!\!\perp R_X \mid (X, Z)$, we have $P(Y \mid X, Z) = P(Y \mid X, Z, R_X = 0)$. Notice that $P(Y \mid X, Z, R_X = 0)$ is identifiable; thus, $P(Y \mid X, Z)$ is identifiable. In addition, for $P(X)$, notice that $P(X, Y, Z, R_X) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X, Y, Z) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X, Z)$. Dividing both sides by $P(Y, Z)\,P(R_X \mid X, Y, Z)$ and using the missing mechanism, we get:
$$\frac{P(X, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X, Z)} = P(X \mid Y, Z),$$
and
$$\sum_{i=1}^{K} \frac{P(X = x_i, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X = x_i, Z)} = \sum_{i=1}^{K} P(X = x_i \mid Y, Z) = 1.$$
To identify $P(X)$, we have to identify $P(X, Y, Z, R_X)$ and calculate its marginal distribution with respect to $X$. In the equation above, $P(Y, Z)$ is identifiable, so we have to prove that $P(X \mid Y, Z)$ and $P(R_X = 0 \mid X, Z)$ are identifiable. According to the above equations, we get:
$$\begin{pmatrix} p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} \\ p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 0} \\ 1/p_{0 \mid x_2 0} \\ \vdots \\ 1/p_{0 \mid x_K 0} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$$\begin{pmatrix} p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} \\ p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 1} \\ 1/p_{0 \mid x_2 1} \\ \vdots \\ 1/p_{0 \mid x_K 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
To identify $P(X \mid Y, Z)$, the above equations should have only one solution. Notice that each of these systems has $K$ unknowns and the rank of each coefficient matrix is at most 2, so a unique solution requires $K \le 2$. In addition, we have mentioned before that $K \ge 2$, so a unique solution is possible only if $K = 2$.
When $K = 2$, the solution to the equations is unique $\Leftrightarrow$ the coefficient matrices are invertible $\Leftrightarrow$ $p_{x_1 0 \mid 00}\,p_{x_2 0 \mid 10} \neq p_{x_2 0 \mid 00}\,p_{x_1 0 \mid 10}$ and $p_{x_1 0 \mid 01}\,p_{x_2 0 \mid 11} \neq p_{x_1 0 \mid 11}\,p_{x_2 0 \mid 01}$ $\Leftrightarrow$ $X \not\perp\!\!\!\perp Y \mid (R_X = 0, Z)$.
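For a fixed $z$, the $K = 2$ system in part (2) can be solved in closed form by Cramer's rule. The following worked step uses the notation of this proof; the shorthand $D_z$ for the determinant is introduced here only for illustration.

$$D_z = p_{x_1 0 \mid 0z}\,p_{x_2 0 \mid 1z} - p_{x_2 0 \mid 0z}\,p_{x_1 0 \mid 1z},$$
$$\frac{1}{p_{0 \mid x_1 z}} = \frac{p_{x_2 0 \mid 1z} - p_{x_2 0 \mid 0z}}{D_z}, \qquad \frac{1}{p_{0 \mid x_2 z}} = \frac{p_{x_1 0 \mid 0z} - p_{x_1 0 \mid 1z}}{D_z}.$$

Both expressions are well defined exactly when $D_z \neq 0$, which is the invertibility condition in the proof. The conditional distribution of the covariate then follows from $P(X = x_i \mid Y = y, Z = z) = p_{x_i 0 \mid yz} / p_{0 \mid x_i z}$.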
(3): Because of the missing mechanism $R_X \perp\!\!\!\perp Z \mid (X, Y)$, we can split the joint distribution: $P(X, Y, Z, R_X) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X, Y, Z) = P(Y, Z)\,P(X \mid Y, Z)\,P(R_X \mid X, Y)$. Dividing both sides by $P(Y, Z)\,P(R_X \mid X, Y)$ and setting $R_X = 0$, we get:
$$\frac{P(X, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X, Y)} = P(X \mid Y, Z),$$
and
$$\sum_{i=1}^{K} \frac{P(X = x_i, R_X = 0 \mid Y, Z)}{P(R_X = 0 \mid X = x_i, Y)} = \sum_{i=1}^{K} P(X = x_i \mid Y, Z) = 1.$$
To identify $P(Y \mid X, Z)$ and $P(X)$, we should identify $P(X, Y, Z, R_X)$ first. Similarly, writing the above equation in matrix form, we get the following equations:
$$\begin{pmatrix} p_{x_1 0 \mid 00} & p_{x_2 0 \mid 00} & \cdots & p_{x_K 0 \mid 00} \\ p_{x_1 0 \mid 01} & p_{x_2 0 \mid 01} & \cdots & p_{x_K 0 \mid 01} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 0} \\ 1/p_{0 \mid x_2 0} \\ \vdots \\ 1/p_{0 \mid x_K 0} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$$\begin{pmatrix} p_{x_1 0 \mid 10} & p_{x_2 0 \mid 10} & \cdots & p_{x_K 0 \mid 10} \\ p_{x_1 0 \mid 11} & p_{x_2 0 \mid 11} & \cdots & p_{x_K 0 \mid 11} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 1} \\ 1/p_{0 \mid x_2 1} \\ \vdots \\ 1/p_{0 \mid x_K 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
To identify $P(X, Y, Z, R_X)$, the above equations should have only one solution. Notice that each of these systems has $K$ unknowns and the rank of each coefficient matrix is at most 2. A unique solution therefore requires $K \le 2$. We have mentioned that $K \ge 2$, so a unique solution is possible only if $K = 2$.
When $K = 2$, the solution to the equations is unique $\Leftrightarrow$ the coefficient matrices are invertible $\Leftrightarrow$ $p_{x_1 0 \mid 00}\,p_{x_2 0 \mid 01} \neq p_{x_2 0 \mid 00}\,p_{x_1 0 \mid 01}$ and $p_{x_1 0 \mid 10}\,p_{x_2 0 \mid 11} \neq p_{x_1 0 \mid 11}\,p_{x_2 0 \mid 10}$ $\Leftrightarrow$ $X \not\perp\!\!\!\perp Z \mid (R_X = 0, Y)$. □
Proof of Theorem 2.
We will use the following notation in the proof of Theorem 2:
$$p_{y r_y \mid xz} = P(Y = y, R_Y = r_y \mid X = x, Z = z), \quad p_{r_y \mid y} = P(R_Y = r_y \mid Y = y), \quad p_{r_y \mid yz} = P(R_Y = r_y \mid Y = y, Z = z), \quad p_{r_y \mid xy} = P(R_Y = r_y \mid X = x, Y = y).$$
(1) We can split the joint distribution: $P(X, Y, Z, R_Y) = P(X, Z)\,P(Y \mid X, Z)\,P(R_Y \mid X, Y, Z)$. Dividing both sides by $P(X, Z)\,P(R_Y \mid X, Y, Z)$ and using the missing mechanism $R_Y \perp\!\!\!\perp (X, Z) \mid Y$, we get:
$$\frac{P(Y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid Y)} = P(Y \mid X, Z),$$
and
$$\sum_{y = 0, 1} \frac{P(Y = y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid Y = y)} = \sum_{y = 0, 1} P(Y = y \mid X, Z) = 1.$$
$P(X, Z)$ is identifiable, and to identify $P(X, Y, Z, R_Y)$, we need to prove that $P(Y \mid X, Z)$ and $P(R_Y = 0 \mid Y)$ are identifiable. Writing the above formula in matrix form, we get:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 0} \\ 1/p_{0 \mid 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, the equations should have only one solution. Thus, the ranks of the coefficient matrix and the augmented matrix must both be 2. Let the coefficient matrix and the augmented matrix of this equation be $B_1$ and $\bar{B}_1$, respectively, that is:
$$B_1 = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \quad \text{and} \quad \bar{B}_1 = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1 \end{pmatrix}.$$
According to the theory of linear equations, when the rank of the coefficient matrix, the rank of the augmented matrix, and the number of unknowns are the same ($r(B_1) = r(\bar{B}_1) = 2$), the equation has a unique solution.
(2) We can split the joint distribution: $P(X, Y, Z, R_Y) = P(X, Z)\,P(Y \mid X, Z)\,P(R_Y \mid X, Y, Z)$. Dividing both sides by $P(X, Z)\,P(R_Y \mid X, Y, Z)$ and using the missing mechanism $R_Y \perp\!\!\!\perp X \mid (Y, Z)$, we get:
$$\frac{P(Y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid Y, Z)} = P(Y \mid X, Z),$$
and
$$\sum_{y = 0, 1} \frac{P(Y = y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid Y = y, Z)} = \sum_{y = 0, 1} P(Y = y \mid X, Z) = 1.$$
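Because $P(X, Z)$ is identifiable, to identify $P(X, Y, Z, R_Y)$, we need to prove that $P(Y \mid X, Z)$ and $P(R_Y = 0 \mid Y, Z)$ are identifiable. Writing the above equations in matrix form, we get:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 00} \\ 1/p_{0 \mid 10} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix},$$
$$\begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 01} \\ 1/p_{0 \mid 11} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, the equations should have only one solution. Thus, the ranks of the coefficient matrices and the augmented matrices must all be 2. Let the coefficient matrices and the augmented matrices of these equations be $B_{21}$, $B_{22}$ and $\bar{B}_{21}$, $\bar{B}_{22}$, respectively, that is:
$$B_{21} = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \end{pmatrix}, \quad \bar{B}_{21} = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \end{pmatrix},$$
$$B_{22} = \begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix}, \quad \bar{B}_{22} = \begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1 \end{pmatrix}.$$
According to the theory of linear equations, when the ranks of the coefficient matrices, the ranks of the augmented matrices, and the number of unknowns are the same ($r(B_{21}) = r(\bar{B}_{21}) = r(B_{22}) = r(\bar{B}_{22}) = 2$), the equations have a unique solution.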
(3) We can split the joint distribution: $P(X, Y, Z, R_Y) = P(X, Z)\,P(Y \mid X, Z)\,P(R_Y \mid X, Y)$. Dividing both sides by $P(X, Z)\,P(R_Y \mid X, Y)$ and using the missing mechanism $R_Y \perp\!\!\!\perp Z \mid (X, Y)$, we get:
$$\frac{P(Y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid X, Y)} = P(Y \mid X, Z),$$
and
$$\sum_{y = 0, 1} \frac{P(Y = y, R_Y = 0 \mid X, Z)}{P(R_Y = 0 \mid X, Y = y)} = \sum_{y = 0, 1} P(Y = y \mid X, Z) = 1.$$
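$P(X, Z)$ is identifiable, and to identify $P(X, Y, Z, R_Y)$, we need to prove that $P(R_Y = 0 \mid X, Y)$ and $P(Y \mid X, Z)$ are identifiable. Similarly, we get the following equations:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 0} \\ 1/p_{0 \mid x_1 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$$\vdots$$
$$\begin{pmatrix} p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_K 0} \\ 1/p_{0 \mid x_K 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, each of these equations should have only one solution, which is equivalent to:
$$\begin{vmatrix} p_{00 \mid x_i 0} & p_{10 \mid x_i 0} \\ p_{00 \mid x_i 1} & p_{10 \mid x_i 1} \end{vmatrix} \neq 0 \;\; (i = 1, 2, \ldots, K) \;\Leftrightarrow\; p_{00 \mid x_i 0}\,p_{10 \mid x_i 1} \neq p_{10 \mid x_i 0}\,p_{00 \mid x_i 1} \;\Leftrightarrow\; Y \not\perp\!\!\!\perp Z \mid (R_Y = 0, X = x_i) \;\; (i = 1, 2, \ldots, K).$$
In sum, if the condition $Y \not\perp\!\!\!\perp Z \mid (R_Y = 0, X = x_i)$ $(i = 1, 2, \ldots, K)$ is satisfied, we can identify the joint distribution $P(X, Y, Z, R_Y)$, and then the TBR and THR are identified as well. □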
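The argument in part (3) is constructive: within each covariate level, a 2 × 2 linear system yields the observation probabilities and hence $P(Y \mid X, Z)$. This is a minimal sketch with a hypothetical function name `recover_outcome_model`; the "true" probabilities below are made up and serve only to show that the observable quantities suffice to recover them.

```python
import numpy as np

def recover_outcome_model(p_obs, tol=1e-12):
    """Recover P(Y = 1 | X, Z) and P(R_Y = 0 | X, Y) from observables.

    p_obs[x, z, y] = P(Y = y, R_Y = 0 | X = x, Z = z).
    """
    p_obs = np.asarray(p_obs, dtype=float)
    n_x = p_obs.shape[0]
    theta = np.empty((n_x, 2))   # theta[x, z] = P(Y = 1 | X = x, Z = z)
    q = np.empty((n_x, 2))       # q[x, y]     = P(R_Y = 0 | X = x, Y = y)
    for x in range(n_x):
        M = p_obs[x]             # rows: z = 0, 1; columns: y = 0, 1
        if abs(np.linalg.det(M)) < tol:
            raise ValueError(f"Y and Z independent given R_Y = 0 at X index {x}")
        u = np.linalg.solve(M, np.ones(2))   # u[y] = 1 / P(R_Y = 0 | x, y)
        q[x] = 1.0 / u
        theta[x] = M[:, 1] * u[1]
    return theta, q

# Hypothetical truth used to build the observable probabilities.
true_theta = np.array([[0.3, 0.6], [0.5, 0.7]])   # indexed [x, z]
true_q = np.array([[0.9, 0.6], [0.8, 0.7]])       # indexed [x, y]
p_y = np.stack([1.0 - true_theta, true_theta], axis=2)   # indexed [x, z, y]
p_obs = p_y * true_q[:, None, :]
theta_rec, q_rec = recover_outcome_model(p_obs)
```

The determinant check mirrors the dependence condition in the proof: if $Y$ and $Z$ were independent given $R_Y = 0$ and $X = x_i$, the two rows of the matrix would be proportional and the system could not be solved uniquely.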
Proof of Theorem 3.
We will use the following notation in the proof of Theorem 3:
$$p_{z r_z \mid xy} = P(Z = z, R_Z = r_z \mid X = x, Y = y), \quad p_{r_z \mid z} = P(R_Z = r_z \mid Z = z), \quad p_{r_z \mid xz} = P(R_Z = r_z \mid X = x, Z = z), \quad p_{r_z \mid yz} = P(R_Z = r_z \mid Y = y, Z = z).$$
(1): The missing mechanism is $(X, Y) \perp\!\!\!\perp R_Z \mid Z$; thus,
$$P(X, Y, Z, R_Z) = P(X, Y)\,P(Z \mid X, Y)\,P(R_Z \mid X, Y, Z) = P(X, Y)\,P(Z \mid X, Y)\,P(R_Z \mid Z).$$
Dividing both sides by $P(X, Y)\,P(R_Z \mid Z)$ and setting $R_Z = 0$, we get:
$$\frac{P(Z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid Z)} = P(Z \mid X, Y),$$
and
$$\sum_{z = 0, 1} \frac{P(Z = z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid Z = z)} = \sum_{z = 0, 1} P(Z = z \mid X, Y) = 1.$$
To identify $P(X)$ and $P(Y \mid X, Z)$, we first have to identify $P(X, Y, Z, R_Z)$ and then calculate the marginal distribution of $X$ and the conditional distribution of $Y$ given $X, Z$. In the equation above, $P(X, Y)$ is identifiable, so we have to prove that $P(Z \mid X, Y)$ and $P(R_Z = 0 \mid Z)$ are identifiable. Similarly, we get:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 0} \\ 1/p_{0 \mid 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, the equations should have only one solution. Thus, the ranks of the coefficient matrix and the augmented matrix must both be 2. Let the coefficient matrix and the augmented matrix of this equation be $C_1$ and $\bar{C}_1$, respectively, that is:
$$C_1 = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \quad \text{and} \quad \bar{C}_1 = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1 \end{pmatrix}.$$
According to the theory of linear equations, when the rank of the coefficient matrix, the rank of the augmented matrix, and the number of unknowns are the same ($r(C_1) = r(\bar{C}_1) = 2$), the equation has a unique solution.
(2): Because $R_Z \perp\!\!\!\perp X \mid (Y, Z)$, we can split the joint distribution into $P(X, Y, Z, R_Z) = P(X, Y)\,P(Z \mid X, Y)\,P(R_Z \mid Y, Z)$. Dividing both sides by $P(X, Y)\,P(R_Z \mid X, Y, Z)$ and using the missing mechanism, we get:
$$\frac{P(Z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid Y, Z)} = P(Z \mid X, Y),$$
and
$$\sum_{z = 0, 1} \frac{P(Z = z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid Y, Z = z)} = \sum_{z = 0, 1} P(Z = z \mid X, Y) = 1.$$
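$P(Z, R_Z = 0 \mid X, Y)$ is identifiable, and we need to prove that $P(Z \mid X, Y)$ and $P(R_Z = 0 \mid Y, Z)$ are also identifiable. Similarly, we get the following equations:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 00} \\ 1/p_{0 \mid 01} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix},$$
$$\begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid 10} \\ 1/p_{0 \mid 11} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, the equations should have only one solution. Thus, the ranks of the coefficient matrices and the augmented matrices must all be 2. Let the coefficient matrices and the augmented matrices of these equations be $C_{21}$, $C_{22}$ and $\bar{C}_{21}$, $\bar{C}_{22}$, respectively, that is:
$$C_{21} = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \end{pmatrix}, \quad \bar{C}_{21} = \begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 0} & p_{10 \mid x_K 0} & 1 \end{pmatrix},$$
$$C_{22} = \begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \\ \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix}, \quad \bar{C}_{22} = \begin{pmatrix} p_{00 \mid x_1 1} & p_{10 \mid x_1 1} & 1 \\ \vdots & \vdots & \vdots \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} & 1 \end{pmatrix}.$$
According to the theory of linear equations, when the ranks of the coefficient matrices, the ranks of the augmented matrices, and the number of unknowns are the same ($r(C_{21}) = r(\bar{C}_{21}) = r(C_{22}) = r(\bar{C}_{22}) = 2$), the equations have a unique solution.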
(3): Because $Y \perp\!\!\!\perp R_Z \mid (X, Z)$, we have $P(Y \mid X, Z) = P(Y \mid X, Z, R_Z = 0)$. Since $P(Y \mid X, Z, R_Z = 0)$ is identifiable, $P(Y \mid X, Z)$ is identifiable. In addition, for $P(X)$, similarly, $P(X, Y, Z, R_Z) = P(X, Y)\,P(Z \mid X, Y)\,P(R_Z \mid X, Y, Z) = P(X, Y)\,P(Z \mid X, Y)\,P(R_Z \mid X, Z)$. Dividing both sides by $P(X, Y)\,P(R_Z \mid X, Z)$ and setting $R_Z = 0$, we get:
$$\frac{P(Z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid X, Z)} = P(Z \mid X, Y),$$
and
$$\sum_{z = 0, 1} \frac{P(Z = z, R_Z = 0 \mid X, Y)}{P(R_Z = 0 \mid X, Z = z)} = \sum_{z = 0, 1} P(Z = z \mid X, Y) = 1.$$
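To identify $P(X)$ and $P(Y \mid X, Z)$, we have to identify $P(X, Y, Z, R_Z)$ first. In the equation above, $P(X, Y)$ is identifiable, so we have to prove that $P(Z \mid X, Y)$ and $P(R_Z = 0 \mid X, Z)$ are identifiable. Similarly, we get:
$$\begin{pmatrix} p_{00 \mid x_1 0} & p_{10 \mid x_1 0} \\ p_{00 \mid x_1 1} & p_{10 \mid x_1 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_1 0} \\ 1/p_{0 \mid x_1 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$$\vdots$$
$$\begin{pmatrix} p_{00 \mid x_K 0} & p_{10 \mid x_K 0} \\ p_{00 \mid x_K 1} & p_{10 \mid x_K 1} \end{pmatrix} \begin{pmatrix} 1/p_{0 \mid x_K 0} \\ 1/p_{0 \mid x_K 1} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
To identify the unknown parameters, each of these equations should have only one solution, which is equivalent to:
$$\begin{vmatrix} p_{00 \mid x_i 0} & p_{10 \mid x_i 0} \\ p_{00 \mid x_i 1} & p_{10 \mid x_i 1} \end{vmatrix} \neq 0 \;\; (i = 1, 2, \ldots, K) \;\Leftrightarrow\; p_{00 \mid x_i 0}\,p_{10 \mid x_i 1} \neq p_{10 \mid x_i 0}\,p_{00 \mid x_i 1} \;\Leftrightarrow\; Z \not\perp\!\!\!\perp Y \mid (R_Z = 0, X = x_i) \;\; (i = 1, 2, \ldots, K).$$
In sum, if the condition $Z \not\perp\!\!\!\perp Y \mid (R_Z = 0, X = x_i)$ $(i = 1, 2, \ldots, K)$ is satisfied, we can identify the joint distribution $P(X, Y, Z, R_Z)$, and then the TBR and THR are identified as well. □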

References

  1. Imbens, G.; Rubin, D. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction; Cambridge University Press: Cambridge, UK, 2015.
  2. Goetghebeur, E.; Molenberghs, G. Causal inference in a placebo-controlled clinical trial with binary outcome and ordered compliance. J. Am. Stat. Assoc. 1996, 91, 928–934.
  3. Berger, V.; Rezvani, A.; Makarewicz, V. Direct effect on validity of response run-in selection in clinical trials. Control. Clin. Trials 2003, 24, 156–166.
  4. Shen, C.; Jeong, J.; Li, X.; Chen, P.S.; Buxton, A. Treatment benefit and treatment harm rate to characterize heterogeneity in treatment effect. Biometrics 2013, 69, 724–731.
  5. Yin, Y.; Zhou, X.H.; Geng, Z.; Lu, F. Assessing the heterogeneity of treatment effects by identifying the treatment benefit rate and treatment harm rate. Stat. Sin. 2018, 28, 137–156.
  6. Shen, C.; Hu, Y.; Li, X.; Wang, Y.; Chen, P.S.; Buxton, A.E. Identification of subpopulations with distinct treatment benefit rate using the Bayesian tree. Biom. J. 2016, 58, 1357–1375.
  7. Yin, Y.; Liu, L.; Geng, Z. Assessing the treatment effect heterogeneity with a latent variable. Stat. Sin. 2018, 28, 115–135.
  8. Scharfstein, D.; Onicescu, G.; Goodman, S.; Whitaker, R. Analysis of subgroup effects in randomized trials when subgroup membership is missing: Application to the second multicenter automatic defibrillator intervention trial. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2011, 60, 607–617.
  9. Girard, T.D.; Kress, J.P.; Fuchs, B.D.; Thomason, J.W.; Schweickert, W.D.; Pun, B.T.; Taichman, D.B.; Dunn, J.G.; Pohlman, A.S.; Kinniry, P.A.; et al. Efficacy and safety of a paired sedation and ventilator weaning protocol for mechanically ventilated patients in intensive care (Awakening and Breathing Controlled trial): A randomised controlled trial. Lancet 2008, 371, 126–134.
  10. Egleston, B.; Wong, Y. Sensitivity analysis to investigate the impact of a missing covariate on survival analyses using cancer registry data. Stat. Med. 2009, 28, 1498–1511.
  11. Chen, H.; Geng, Z.; Zhou, X.H. Identifiability and estimation of causal effects in randomized trials with noncompliance and completely nonignorable missing data (with discussion). Biometrics 2009, 65, 675–682.
  12. Ma, W.; Geng, Z.; Hu, Y. Identification of graphical models for nonignorable nonresponse of binary outcomes in longitudinal studies. J. Multivar. Anal. 2003, 87, 24–45.
  13. Frangakis, C.E.; Rubin, D.B. Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika 1999, 86, 365–379.
  14. Egleston, B.L.; Scharfstein, D.O.; MacKenzie, E. On estimation of the survivor average causal effect in observational studies when important confounders are missing due to death. Biometrics 2009, 65, 497–504.
  15. Frangakis, C.E.; Rubin, D.B.; An, M.W.; MacKenzie, E. Principal stratification designs to estimate input data missing due to death (with discussion). Biometrics 2007, 63, 641–649.
  16. Yan, W.; Hu, Y.; Geng, Z. Identifiability of causal effects for binary variables with baseline data missing due to death. Biometrics 2012, 68, 121–128.
  17. Rathouz, P.J. Identifiability assumptions for missing covariate data in failure time regression models. Biostatistics 2007, 8, 345–356.
  18. Little, R.J.; Zhang, N. Subsample ignorable likelihood for regression analysis with missing data. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2011, 60, 591–605.
  19. Little, R.J.A.; Rubin, D.B. Statistical Analysis with Missing Data; John Wiley & Sons: Hoboken, NJ, USA, 2019.
  20. Tingley, D.; Yamamoto, T.; Keele, L.; Imai, K. Mediation: R Package for Causal Mediation Analysis. R Package Version 4.2. 2012. Available online: http://CRAN.R-project.org/package=mediation (accessed on 8 October 2023).
  21. Ding, P.; Geng, Z. Identifiability of subgroup causal effects in randomized experiments with nonignorable missing covariates. Stat. Med. 2014, 33, 1121–1133.
Table 1. TBR and THR mean (sd) over 10,000 repetitions (under the missing mechanism $R_X^2$).

| True value | n = 500 | n = 1000 | n = 1500 |
|---|---|---|---|
| TBR = 0.41 | 0.4071 (0.0282) | 0.4082 (0.0204) | 0.4086 (0.0156) |
| THR = 0.11 | 0.1138 (0.0146) | 0.1135 (0.0104) | 0.1134 (0.0081) |

Table 2. TBR and THR mean (sd) over 10,000 repetitions (under the missing mechanism $R_Y^3$).

| True value | n = 500 | n = 1000 | n = 1500 |
|---|---|---|---|
| TBR = 0.30 | 0.2900 (0.0228) | 0.2900 (0.0155) | 0.2905 (0.0135) |
| THR = 0.21 | 0.2132 (0.1976) | 0.2130 (0.01333) | 0.2125 (0.0115) |

Share and Cite

MDPI and ACS Style

He, Y.; Zheng, L.; Luo, P. Treatment Benefit and Treatment Harm Rates with Nonignorable Missing Covariate, Endpoint, or Treatment. Mathematics 2023, 11, 4459. https://doi.org/10.3390/math11214459
