Article

Computation of the Mann–Whitney Effect under Parametric Survival Copula Models

1 Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics, Tokyo 190-8562, Japan
2 Department of Industrial Engineering and Economics, Tokyo Institute of Technology, Tokyo 152-8552, Japan
3 Department of Information Management, Chang Gung University, Taoyuan 33302, Taiwan
4 Biostatistics Center, Kurume University, Kurume 830-0011, Japan
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(10), 1453; https://doi.org/10.3390/math12101453
Submission received: 15 March 2024 / Revised: 1 May 2024 / Accepted: 6 May 2024 / Published: 8 May 2024
(This article belongs to the Special Issue Statistical Analysis and Data Science for Complex Data)

Abstract

The Mann–Whitney effect is a measure for comparing survival distributions between two groups. The Mann–Whitney effect is interpreted as the probability that a randomly selected subject in a group survives longer than a randomly selected subject in the other group. Under the independence assumption of two groups, the Mann–Whitney effect can be expressed as the traditional integral formula of survival functions. However, when the survival times in two groups are not independent of each other, the traditional formula of the Mann–Whitney effect has to be modified. In this article, we propose a copula-based approach to compute the Mann–Whitney effect with parametric survival models under dependence of two groups, which may arise in the potential outcome framework. In addition, we develop a Shiny web app that can implement the proposed method via simple commands. Through a simulation study, we show the correctness of the proposed calculator. We apply the proposed methods to two real datasets.

1. Introduction

When comparing the survival times of two independent groups, the Mann–Whitney parameter plays an important role in the two-sample problem [1]. The Mann–Whitney parameter, say $p$, is defined as the probability that a random subject from one group (with survival time $T_1$ in group 1) survives longer than an independent random subject from the other group (with survival time $T_2$ in group 2), plus one-half the probability that the two subjects have the same survival time (tie):
$$p = P(T_1 > T_2) + \frac{1}{2} P(T_1 = T_2). \tag{1}$$
When researchers verify the superiority of one treatment over another, they examine whether the effect exceeds the null value, $p = 0.5$. For instance, $p = 0.7$ corresponds to a Cohen’s $d$ of about 0.8 according to well-known benchmark values [2], and can therefore be interpreted as a large effect. The Mann–Whitney effect relates to important statistical ideas, such as the Mann–Whitney test [3], hazard ratios, and the win ratio [4]. The Mann–Whitney test examines the null hypothesis $H_0: p = 1/2$ vs. $H_1: p \neq 1/2$. The hazard ratio is the main effect measure of the Cox proportional hazards model, which is a typical statistical model in survival analysis. The win ratio $w$ is given by the odds of $p$; that is, $w = p/(1 - p)$. Hence, $p > 1/2$, or equivalently $w > 1$, implies a protective survival effect for group 1.
The problem of estimating the parameter $p$ plays an important part in survival and reliability analysis. The basic idea was first studied by Birnbaum [5], who illustrated an attractive relationship between the Mann–Whitney statistic and the stress–strength model. Efron [1] first proposed a nonparametric estimator (Efron’s estimator) for $p$ under independent censoring. Since then, this topic has been investigated by several researchers. In the following, we refer to some recent studies in the field of survival and reliability analyses. Dobler & Pauly [6] modified Efron’s estimator for $p$ under small sample sizes. Emura & Hsu [7] proposed a copula-graphic estimator for $p$ and suggested the Mann–Whitney test to compare two survival distributions in the presence of dependent censoring. Biswas et al. [8] introduced the Bayesian estimation of $p$ for the log-Lindley distribution. Rubarth et al. [9] proposed estimating the Mann–Whitney effects in factorial clustered data. Hu et al. [10] developed methodologies for constructing fixed-accuracy confidence intervals of $p$ when $T_1$ and $T_2$ follow geometric and exponential distributions, respectively. de la Cruz et al. [11] studied an estimation procedure of the stress–strength model for two independent unit-half-normal distributions with different shape parameters. Patil et al. [12] investigated the effect of dependence of the variables on $p$ in the stress–strength model with exponential margins. Nowak et al. [13] proposed a group sequential method for estimating the Mann–Whitney parameter. Singh et al. [14] studied point, interval, and Bayesian estimation of $p$ when the stress variables follow geometric and Lindley distributions. Most of these methods assume that $T_1$ and $T_2$ are independent.
When $T_1$ and $T_2$ are independent of each other and continuous, one can estimate $p$ with the marginal distributions based on the following integral:
$$p = -\int_0^\infty S_1(t)\, dS_2(t).$$
That is, one can estimate $p$ by estimating the two marginal survival functions, $S_1$ and $S_2$. However, this is not the case when $T_1$ and $T_2$ are dependent; the phenomenon is sometimes called “Hand’s paradox” [15]. The paradox arises when $T_1$ and $T_2$ are regarded as potential outcomes in the causal inference framework. In that case, the value of $p$ given by the integral cannot be interpreted as the true treatment effect. Moreover, the dependence of outcomes from observation to observation is well known in factorial, paired, and cross-over designs [16,17].
Since p is not identifiable solely from independently sampled data, Fan & Park [18] suggested a bound for p under all possible dependence structures for T 1 and T 2 . Alternatively, Fay et al. [19] reformulated p such that it can be identified from randomized treatment assignments. Computing p under various dependence structures gives the researchers additional information for their decision making about the behavior of p when the outcomes are not independent. However, to estimate the true p, we must model the bivariate survival function of T 1 and T 2 . A copula is often used to model joint distributions of dependent survival times [20,21]. Here, the major challenge is to derive the formula of p under copula models and its extension to allow for a limited follow-up length for survival times.
In this article, we propose a model for the bivariate survival function by using parametric copulas and parametric marginal distributions. Our focus is to study $p$ when there is dependence between $T_1$ and $T_2$, that is, to assess how $p$ varies under the various dependence structures modeled by copulas. We then derive a new formula for computing $p$ under a parametric copula. We also propose a new formula for $p$ under restricted follow-up. To make the proposed computation of $p$ easy for users to perform, we develop a Shiny-based web app. Furthermore, we validate the accuracy of the proposed computation method and the Shiny web app by simulations. We finally analyze real data, where the proposed copula-based estimators are compared with Efron’s estimator (benchmark).
The rest of the paper is organized as follows. In Section 2, we define copula-based models and introduce several well-known copula families. In this section, we show that one can compute p by a theorem given by Emura & Pan [22], and we extend the theorem to compute p when the follow-up time is restricted up to time τ . In Section 3, we introduce a Shiny web app in the R software version 4.2.3 that can compute p via simple commands. In Section 4, we describe a simulation study to show the correctness of the proposed calculator for p. In Section 5, we illustrate a meaningful application of our proposed method using survival data. The computer code for the simulations and data analyses is available in Supplementary Materials.

2. Proposed Method

In this section, we first introduce copula-based bivariate survival models for T 1 and T 2 . We then propose our method for computing the Mann–Whitney effect p in Equation (1) under the copula models.

2.1. Survival Copula Models for Dependent Survival Time

According to Sklar [23], any bivariate distribution function for $(T_1, T_2)$ can be formulated by using a copula. A bivariate copula is a bivariate distribution function for two uniformly distributed variables on $[0, 1]$ [24,25]. In many applications of copulas in survival analysis, bivariate survival functions are modeled via survival copulas [20,24,26,27]. Below, we introduce a survival copula model for a bivariate survival function $S(t_1, t_2) = P(T_1 > t_1, T_2 > t_2)$.
Let $T_1$ and $T_2$ be continuous survival times with marginal survival functions $S_1$ and $S_2$, respectively. We model the bivariate survival function of $T_1$ and $T_2$ using a survival copula $C$:
$$P(T_1 > t_1, T_2 > t_2) = C(S_1(t_1), S_2(t_2)). \tag{2}$$
This representation is useful since the two marginal survival functions $S_1$ and $S_2$ are separated from the dependence structure $C$. Under the survival copula model (2), the copula $C$ is a bivariate distribution function for $U = S_1(T_1)$ and $V = S_2(T_2)$, namely $C(u, v) = P(U \le u, V \le v)$. Note that the survival copula model (2) gives a different model from the copula model $P(T_1 \le t_1, T_2 \le t_2) = C(1 - S_1(t_1), 1 - S_2(t_2))$ unless the copula is radially symmetric, i.e., $C(u, v) = u + v - 1 + C(1 - u, 1 - v)$.
We will consider the following well-known families of bivariate copulas.
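1. The independence copula:
$$C(u, v) = uv.$$
2. The Clayton copula [28]:
$$C_\theta(u, v) = \max\left\{ \left( u^{-\theta} + v^{-\theta} - 1 \right)^{-1/\theta}, 0 \right\}, \quad \theta \in [-1, \infty) \setminus \{0\}.$$
3. The Gumbel copula [29]:
$$C_\theta(u, v) = \exp\left\{ -\left[ (-\log u)^{\theta + 1} + (-\log v)^{\theta + 1} \right]^{\frac{1}{\theta + 1}} \right\}, \quad \theta \in [0, \infty).$$
4. The Frank copula [30]:
$$C_\theta(u, v) = -\frac{1}{\theta} \log\left\{ 1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1} \right\}, \quad \theta \in (-\infty, \infty) \setminus \{0\}.$$
5. The Farlie–Gumbel–Morgenstern (FGM) copula [31]:
$$C_\theta(u, v) = uv + \theta u v (1 - u)(1 - v), \quad \theta \in [-1, 1].$$
6. The Gumbel–Barnett (GB) copula [24,32,33]:
$$C_\theta(u, v) = uv \exp(-\theta \log u \log v), \quad \theta \in [0, 1].$$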
In Figure 1, we present scatter plots for $(U, V)$ generated from various copulas with selected values of $\theta$. The Clayton copula (Figure 1b,c) shows lower tail dependence, the Gumbel copula (Figure 1d) shows upper tail dependence, the Frank copula (Figure 1e,f) shows symmetric dependence around the median, and the FGM copula (Figure 1g,h) is similar to the Frank copula; both copulas are radially symmetric. Unlike the other copulas, the GB copula exhibits negative dependence only (Figure 1i).
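The six copula families listed above can be written down directly as functions of $(u, v)$ and $\theta$. The following is a minimal Python sketch for illustration only (the article's own computations use R); the function names are hypothetical, and inputs are assumed to lie in $(0, 1]$ so that the logarithms and negative powers are defined.

```python
import math

def independence(u, v):
    return u * v

def clayton(u, v, theta):
    # theta in [-1, inf) excluding 0; the max{., 0} truncation matters only for theta < 0
    base = u ** (-theta) + v ** (-theta) - 1.0
    return base ** (-1.0 / theta) if base > 0.0 else 0.0

def gumbel(u, v, theta):
    # theta in [0, inf); theta = 0 reduces to the independence copula
    a = theta + 1.0
    s = abs(math.log(u)) ** a + abs(math.log(v)) ** a  # abs(log x) = -log x on (0, 1]
    return math.exp(-(s ** (1.0 / a)))

def frank(u, v, theta):
    # theta is any nonzero real number
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    den = math.exp(-theta) - 1.0
    return -math.log(1.0 + num / den) / theta

def fgm(u, v, theta):
    # theta in [-1, 1]
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v)

def gumbel_barnett(u, v, theta):
    # theta in [0, 1]
    return u * v * math.exp(-theta * math.log(u) * math.log(v))
```

A quick sanity check is that every copula has uniform margins, so $C(u, 1) = u$ and $C(1, v) = v$ should hold for any admissible $\theta$.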
These copulas have been applied to survival data and other data analyses. The Clayton copula has been applied to survival data with dependent censoring. For instance, Schneider et al. [34] modeled dependence between survival and dependent censoring times in survival data on tuberculosis cure. The Clayton, Gumbel, and Frank copulas have also been applied to dependently censored data in clinical trials or observational studies [35,36,37]. The Gumbel, Frank, and FGM copulas are often used in competing risks models in survival data analysis [26,38,39,40]. Copulas have also been applied to multivariate meta-analysis; Shih et al. [41,42] proposed bivariate Clayton, FGM, and Gumbel models for bivariate meta-analysis. The Gumbel–Barnett copula has a simple form and is suitable for modeling negative dependence [24,32]. Therefore, it is important to consider a variety of copulas for dependent survival times.
To see the strength of dependence in a copula, $\theta$ can be transformed to Kendall’s $\tau$. Kendall’s $\tau$ is a well-known measure to assess the dependence between two variables [20,24]. Kendall’s $\tau$ does not depend on the marginals and is solely determined by the copula (by $\theta$). Therefore, Kendall’s $\tau$ is advantageous over the Pearson correlation for $T_1$ and $T_2$. Appendix A provides the formulas of Kendall’s $\tau$ for the Clayton, Gumbel, Frank, FGM, and GB copulas.

2.2. Proposed Method for Computing p

In this section, we propose a new formula for computing $p$ under the survival copula model (2). Let $U = S_1(T_1)$ and $V = S_2(T_2)$. For computing $p$, we will use the conditional distribution function for $U$ given $V = v$, which is the partial derivative of $C$ with respect to $v$:
$$C^{[0,1]}(u, v) = P(U \le u \mid V = v) = \frac{\partial C(u, v)}{\partial v}.$$
Then, by slightly modifying the theorem given by Emura & Pan [22], $p$ can be expressed as the univariate integral on $[0, 1]$:
$$\begin{aligned} p &= P(T_1 > T_2) + \frac{1}{2} P(T_1 = T_2) = P(S_1(T_1) < S_1(T_2)) = P(U < S_1(S_2^{-1}(V))) \\ &= E\left[ P(U < S_1(S_2^{-1}(V)) \mid V) \right] = \int_0^1 C^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv. \end{aligned} \tag{3}$$
We note that the theorem of Emura & Pan [22] is not directly applicable to the survival copula model (2) since that theorem is designed for the copula model $P(T_1 \le t_1, T_2 \le t_2) = C(1 - S_1(t_1), 1 - S_2(t_2))$.
In order to compute p by the above formulas, we need to specify S 1 , S 2 , and θ . One can specify S 1 and S 2 by continuous parametric models that will be discussed in Section 2.4. One can try different values for θ in a sensitivity analysis. Note that the above calculations are not applicable for discrete parametric models for S 1 and S 2 .
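The univariate integral for $p$ can be evaluated with any one-dimensional quadrature rule once the conditional distribution function $\partial C / \partial v$ and the composite $S_1(S_2^{-1}(v))$ are supplied. The sketch below is illustrative Python rather than the authors' R code: it uses a plain midpoint rule, exponential margins, and the independence and FGM copulas, whose partial derivatives are $u$ and $u + \theta u (1 - u)(1 - 2v)$. All function names are hypothetical.

```python
def midpoint_integral(f, a, b, n=20000):
    # Composite midpoint rule; it never evaluates f at the endpoints a and b.
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def mann_whitney_p(cond_cdf, composite):
    # p = int_0^1 dC/dv evaluated at (S1(S2^{-1}(v)), v) dv
    return midpoint_integral(lambda v: cond_cdf(composite(v), v), 0.0, 1.0)

def cond_independence(u, v):
    # d/dv of C(u, v) = u v
    return u

def cond_fgm(theta):
    # d/dv of C(u, v) = u v + theta u v (1 - u)(1 - v)
    return lambda u, v: u + theta * u * (1.0 - u) * (1.0 - 2.0 * v)

def exponential_composite(lam1, lam2):
    # S1(S2^{-1}(v)) = v^(lam1 / lam2) for exponential margins
    return lambda v: v ** (lam1 / lam2)

composite = exponential_composite(1.0, 2.0)
p_independent = mann_whitney_p(cond_independence, composite)
p_fgm = mann_whitney_p(cond_fgm(1.0), composite)
```

With $\lambda_1 = 1$ and $\lambda_2 = 2$, the independence case reduces to $\int_0^1 v^{1/2}\, dv = 2/3$, and for the FGM copula with $\theta = 1$ the integral can be worked out by hand as $2/3 + 1/30 = 0.7$, which gives two simple checks on the quadrature.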

2.3. Computing p with Follow-Up Time

In this section, we assume that every subject has a common follow-up time $\tau > 0$. For survival data, the follow-up period is often limited. When a subject survives longer than the follow-up period, one may treat the survival time of the subject as equal to the follow-up period [6,43,44]. This means that we define the Mann–Whitney effect for $\min(T_1, \tau)$ and $\min(T_2, \tau)$. We now obtain $p$ with the follow-up time $\tau$ from the following theorem, a straightforward extension of the theorem of Emura & Pan [22].
Theorem 1.
The Mann–Whitney effect $p$ with a follow-up time $\tau$ is written as the univariate integral:
$$\begin{aligned} p_\tau &= P(\min(T_1, \tau) > \min(T_2, \tau)) + \frac{1}{2} P(\min(T_1, \tau) = \min(T_2, \tau)) \\ &= P(T_1 > T_2, T_2 < \tau) + \frac{1}{2} P(T_1 > \tau, T_2 > \tau) \\ &= P(U < S_1(S_2^{-1}(V)), V > S_2(\tau)) + \frac{1}{2} C(S_1(\tau), S_2(\tau)) \\ &= \int_{S_2(\tau)}^1 C^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv + \frac{1}{2} C(S_1(\tau), S_2(\tau)). \end{aligned}$$
Furthermore, $p_\tau$ tends to $p$ as $\tau \to \infty$. That is,
$$\lim_{\tau \to \infty} p_\tau = \int_0^1 C^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv + \frac{1}{2} C(0, 0) = p.$$
Appendix B provides the formulas of p and p τ for the Clayton, Gumbel, Frank, FGM, and GB copulas.
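The second and third equalities in Theorem 1 can be unpacked as follows. This is a sketch of the reasoning, not the authors' proof; it assumes that $S_1$ and $S_2$ are continuous and strictly decreasing and that the copula puts no probability mass on $\{T_1 = T_2\}$.

```latex
\begin{aligned}
\{\min(T_1,\tau) > \min(T_2,\tau)\} &= \{T_1 > T_2,\; T_2 < \tau\}, \\
\{\min(T_1,\tau) = \min(T_2,\tau)\} &= \{T_1 \ge \tau,\; T_2 \ge \tau\} \cup \{T_1 = T_2 < \tau\}, \\
P(T_1 = T_2 < \tau) &= 0 \quad \text{(assumed: no mass on the diagonal)}, \\
P(T_1 > \tau,\; T_2 > \tau) &= C(S_1(\tau), S_2(\tau)) \quad \text{by model (2)}, \\
\{T_1 > T_2,\; T_2 < \tau\} &= \{U < S_1(S_2^{-1}(V)),\; V > S_2(\tau)\}.
\end{aligned}
```

Conditioning on $V = v$ and integrating over $v \in (S_2(\tau), 1)$ then gives the integral term, in the same way as for the unrestricted $p$.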

2.4. Marginal Survival Distributions

To compute $p$ and $p_\tau$, we considered the following parametric marginal distributions for group $j \in \{1, 2\}$:
1. The exponential distribution:
$$S_j(t) = \exp(-\lambda_j t), \quad \lambda_j > 0, \qquad S_i(S_j^{-1}(v)) = v^{\lambda_i / \lambda_j},$$
where $\lambda_j$ is a rate parameter.
2. The Weibull distribution:
$$S_j(t) = \exp(-\lambda_j t^{k_j}), \quad \lambda_j > 0, \; k_j > 0, \qquad S_i(S_j^{-1}(v)) = \exp\left\{ -\lambda_i \left( -\frac{\log v}{\lambda_j} \right)^{k_i / k_j} \right\},$$
where $\lambda_j$ is a scale parameter and $k_j$ is a shape parameter.
3. The gamma distribution:
$$S_j(t) = 1 - \frac{\gamma(k_j, \lambda_j t)}{\Gamma(k_j)}, \quad \lambda_j > 0, \; k_j > 0,$$
where $\lambda_j$ is a rate parameter, $k_j$ is a shape parameter, $\Gamma(k)$ is the gamma function, and $\gamma(k, \lambda t) = \int_0^{\lambda t} x^{k - 1} e^{-x}\, dx$ is the lower incomplete gamma function. The gamma distribution has no simple closed-form expression for the inverse survival function, so the inverse must be computed numerically. In this article, we use the R software version 4.2.3 function “qgamma” to calculate $S_j^{-1}(v)$.
4. The log-normal distribution:
$$S_j(t) = \frac{1}{\sqrt{2 \pi \sigma_j^2}} \int_t^\infty \frac{1}{y} \exp\left\{ -\frac{1}{2 \sigma_j^2} (\log y - \mu_j)^2 \right\} dy,$$
where $\mu_j$ is a location parameter and $\sigma_j$ is a scale parameter. Again, $S_j(t)$ has no simple closed-form expression for the inverse survival function, so the inverse must be computed numerically. In this article, we use the R software function “qlnorm” to calculate $S_j^{-1}(v)$.
5. The Burr III distribution:
$$S_j(t) = 1 - (1 + t^{-c_j})^{-k_j}, \quad c_j > 0, \; k_j > 0, \qquad S_i(S_j^{-1}(v)) = 1 - \left[ 1 + \left\{ (1 - v)^{-1/k_j} - 1 \right\}^{c_i / c_j} \right]^{-k_i},$$
where $c_j$ and $k_j$ are shape parameters.
In Figure 2, we present survival curves of the exponential, Weibull, gamma, log-normal, and Burr III distributions with different parameters. These plots show that these distributions can represent a wide range of continuous survival curves encountered in practice.
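The closed-form composites $S_i(S_j^{-1}(v))$ given above for the exponential, Weibull, and Burr III margins can be checked against a direct composition of the survival function with its inverse. The following Python sketch does this for illustration (the article uses R); the function names are hypothetical, and the inverse functions are derived here from the stated survival functions rather than taken from the article.

```python
import math

def weibull_surv(t, lam, k):
    return math.exp(-lam * t ** k)

def weibull_inv(v, lam, k):
    # Solves exp(-lam * t^k) = v for t
    return (-math.log(v) / lam) ** (1.0 / k)

def weibull_composite(v, lam_i, k_i, lam_j, k_j):
    return math.exp(-lam_i * (-math.log(v) / lam_j) ** (k_i / k_j))

def burr3_surv(t, c, k):
    return 1.0 - (1.0 + t ** (-c)) ** (-k)

def burr3_inv(v, c, k):
    # Solves 1 - (1 + t^(-c))^(-k) = v for t
    return ((1.0 - v) ** (-1.0 / k) - 1.0) ** (-1.0 / c)

def burr3_composite(v, c_i, k_i, c_j, k_j):
    inner = ((1.0 - v) ** (-1.0 / k_j) - 1.0) ** (c_i / c_j)
    return 1.0 - (1.0 + inner) ** (-k_i)

def exponential_composite(v, lam_i, lam_j):
    return v ** (lam_i / lam_j)

# Direct composition S_1(S_2^{-1}(v)) versus the closed form, at v = 0.5
v = 0.5
weibull_direct = weibull_surv(weibull_inv(v, 2.0, 1.0), 1.0, 0.5)
weibull_closed = weibull_composite(v, 1.0, 0.5, 2.0, 1.0)
burr3_direct = burr3_surv(burr3_inv(v, 1.0, 1.0), 1.5, 3.0)
burr3_closed = burr3_composite(v, 1.5, 3.0, 1.0, 1.0)
```

The parameter values are the Weibull and Burr III settings from the simulation study in Section 4. For the gamma and log-normal margins no such closed form exists, so a numerical quantile function is needed instead.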
Example 1.
Let the marginals $S_1(t)$ and $S_2(t)$ be exponential distributions with parameters $\lambda_1 = 1$ and $\lambda_2 = 2$. Assume that $T_1$ and $T_2$ are independent. Then, by Theorem 1 with $\tau = \infty$, $p$ is given by
$$p = \int_0^1 C_\theta^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv = \int_0^1 \exp\left( \frac{\lambda_1}{\lambda_2} \log v \right) dv = \frac{\lambda_2}{\lambda_1 + \lambda_2} = \frac{2}{3}.$$
Example 2.
Let the marginals $S_1(t)$ and $S_2(t)$ be exponential distributions with parameters $\lambda_1 = 1$ and $\lambda_2 = 2$. Assume that $C_\theta(u, v)$ is the Clayton copula with parameter $\theta = 3$. Then, Kendall’s $\tau$ is given by
$$\text{Kendall's } \tau = \frac{\theta}{\theta + 2} = 0.6,$$
and by Theorem 1 with $\tau = \infty$, $p$ is given by
$$p = \int_0^1 C_\theta^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv = \int_0^1 v^{-\theta - 1} \left( v^{-\theta \lambda_1 / \lambda_2} + v^{-\theta} - 1 \right)^{-\frac{1}{\theta} - 1} dv = \int_0^1 v^{-4} \left( v^{-3/2} + v^{-3} - 1 \right)^{-4/3} dv = 0.84.$$
When $\tau < \infty$, $p_\tau$ is given by
$$p_\tau = \int_{S_2(\tau)}^1 C^{[0,1]}(S_1(S_2^{-1}(v)), v)\, dv + \frac{1}{2} P(\min(T_1, \tau) = \min(T_2, \tau)) = \int_{e^{-2\tau}}^1 v^{-4} \left( v^{-3/2} + v^{-3} - 1 \right)^{-4/3} dv + \frac{1}{2} \left( e^{3\tau} + e^{6\tau} - 1 \right)^{-1/3}.$$
When $\tau = 0.5$, $p_\tau = 0.68$. Here, we computed the last two equations by numerical integration with the R software version 4.2.3 function “integrate”.
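The two numerical values in Example 2 can be reproduced with a few lines of code. The sketch below is an illustrative Python version (the article uses the R function “integrate”); it applies a simple midpoint rule, which is an assumption of this sketch and not the authors' quadrature, and the function names are hypothetical.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    # Composite midpoint rule; avoids evaluating f at v = 0
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def clayton_cdf(u, v, theta):
    # Clayton copula for theta > 0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def clayton_cond(u, v, theta):
    # dC/dv for the Clayton copula with theta > 0
    return v ** (-theta - 1.0) * (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta - 1.0)

lam1, lam2, theta = 1.0, 2.0, 3.0

def integrand(v):
    u = v ** (lam1 / lam2)  # S1(S2^{-1}(v)) for exponential margins
    return clayton_cond(u, v, theta)

# Unrestricted Mann-Whitney effect (tau = infinity)
p = midpoint_integral(integrand, 0.0, 1.0)

# Restricted follow-up, tau = 0.5
tau = 0.5
s1_tau, s2_tau = math.exp(-lam1 * tau), math.exp(-lam2 * tau)
p_tau = midpoint_integral(integrand, s2_tau, 1.0) + 0.5 * clayton_cdf(s1_tau, s2_tau, theta)

print(round(p, 2), round(p_tau, 2))
```

Rounded to two decimals, the computed values should agree with the $p = 0.84$ and $p_\tau = 0.68$ reported in the example.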
Parameters for the marginal distributions can be estimated by maximum likelihood estimators (MLEs) when the survival data are available in two groups (Section 5).

2.5. Sensitivity Analysis by Copulas

To comprehensively cover the possible dependence structures between $T_1$ and $T_2$, we select the Clayton, Gumbel, Frank, FGM, and GB copulas, and set their parameters. Table 1 contains positive dependence and negative dependence with weak and strong correlations, as well as independence. However, the selection of an appropriate copula may be difficult unless both $T_1$ and $T_2$ are observed for the same subject (e.g., by cross-over designs). This is the case for the data examples (Section 5), where $T_1$ and $T_2$ are observed for different subjects. Let $Z$ be the group indicator ($Z = 1$ when $T_1$ is observed; $Z = 2$ when $T_2$ is observed). What we observe is one of $T_1$ and $T_2$, namely, $T = T_1 \mathbf{1}(Z = 1) + T_2 \mathbf{1}(Z = 2)$. As $T_1$ and $T_2$ are never observed simultaneously, the copula for $T_1$ and $T_2$ is not identifiable [18,19]. This means that copula model selection and goodness-of-fit tests are not feasible. Therefore, as one cannot specify a single copula, we suggest computing $p$ under all these copulas and dependence parameters to see how $p$ changes under various dependence structures. Such sensitivity analyses allow researchers to obtain the bounds of $p$ under the considered dependence structures.
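A sensitivity analysis of this kind amounts to looping over copulas and parameter values and recording the resulting range of $p$. The following Python sketch illustrates the idea for a reduced set of scenarios (independence, Clayton with $\theta \in \{1, 5\}$, and FGM with $\theta \in \{-1, 1\}$) and exponential margins with the illustrative rates $\lambda_1 = 1$, $\lambda_2 = 2$. It is not the web app's implementation; the scenario labels and function names are hypothetical, and the Clayton derivative is coded for $\theta > 0$ only.

```python
def midpoint_integral(f, a, b, n=20000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def cond_independence(u, v):
    return u

def cond_clayton(theta):
    # dC/dv of the Clayton copula, valid here for theta > 0
    return lambda u, v: v ** (-theta - 1.0) * (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta - 1.0)

def cond_fgm(theta):
    # dC/dv of the FGM copula, theta in [-1, 1]
    return lambda u, v: u + theta * u * (1.0 - u) * (1.0 - 2.0 * v)

def p_exponential(cond_cdf, lam1, lam2):
    # p = int_0^1 dC/dv (v^(lam1/lam2), v) dv for exponential margins
    return midpoint_integral(lambda v: cond_cdf(v ** (lam1 / lam2), v), 0.0, 1.0)

scenarios = {
    "independence": cond_independence,
    "Clayton(1)": cond_clayton(1.0),
    "Clayton(5)": cond_clayton(5.0),
    "FGM(-1)": cond_fgm(-1.0),
    "FGM(1)": cond_fgm(1.0),
}

results = {name: p_exponential(h, 1.0, 2.0) for name, h in scenarios.items()}
lower, upper = min(results.values()), max(results.values())

for name, value in results.items():
    print(f"{name:>12}: p = {value:.3f}")
print(f"range of p over scenarios: [{lower:.3f}, {upper:.3f}]")
```

The reported range is only as wide as the set of scenarios that is swept, so it should be read as a bound over the dependence structures considered, not over all possible ones.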

3. Software and Web App

We developed a Shiny-based web app to implement the proposed method for computing p and p τ . The app is available at (https://nkosuke.shinyapps.io/shiny_survival/ accessed on 7 April 2024) and can be used in any environment, including smartphones. Using this app, users can choose a marginal survival distribution, copula, and the relevant parameters to compute p and p τ . Our app is easy to use without knowledge of the R software version 4.2.3. The app works without data, because it does not estimate the parameters of the marginals and copula. All relevant parameters are entered by users. The app can compute p and p τ across different parametric distributions (Section 2.4) and copula models (Section 2.5).

3.1. Input

The web app requires users to select input values to compute $p$ and $p_\tau$. In the input panel on the left-hand side of the app (Figure 3), users can select marginal distributions from the exponential, Weibull, gamma, log-normal, and Burr III distributions, select a copula from the Clayton, Gumbel, Frank, FGM, and GB copulas, set their parameters, and choose the language displayed on the screen. Furthermore, users can set a follow-up time $\tau$.

3.2. Output

This app displays key formulas, survival curves of two groups, the values of p and p τ , and the value of Kendall’s τ . These formulas include marginal and bivariate survival functions and formulas of p and p τ based on the input values. The theoretical values of p and p τ are displayed together with survival curves.

3.3. Example of Using the App

For example, users may set the marginal survival distribution to “Exponential Distribution” with $\lambda_1 = 0.5$, $\lambda_2 = 0.25$, and $\tau = 4.5$, the copula to “Clayton” with $\theta = 1.5$, and the display language to “English”. After pushing the “submit” button, they obtain $p = 0.225$ and $p_\tau = 0.268$. Figure 3 displays the web app in this setting. Appendix C provides further examples for users.

4. Simulation Studies

To show the correctness of the proposed calculator for $p$ and $p_\tau$, we conducted a simulation study. In particular, we tried to confirm the formula of Theorem 1 under a variety of marginal distributions and copulas. For the simulation study, we set the marginal survival functions to be the exponential distributions with $(\lambda_1, \lambda_2) = (1.0, 2.0)$, the Weibull distributions with $(\lambda_1, k_1, \lambda_2, k_2) = (1.0, 0.5, 2.0, 1.0)$, the gamma distributions with $(\lambda_1, k_1, \lambda_2, k_2) = (1.0, 1.5, 2.0, 2.0)$, the log-normal distributions with $(\mu_1, \sigma_1^2, \mu_2, \sigma_2^2) = (0.7, 1.5, 0.3, 2.0)$, and the Burr III distributions with $(c_1, k_1, c_2, k_2) = (1.5, 3.0, 1.0, 1.0)$. Furthermore, we set the copula parameters as follows: the Clayton copula with $\theta = 1.0$, $5.0$, or $10.0$; the Gumbel copula with $\theta = 0.0$ or $4.0$; the Frank copula with $\theta = -5.0$, $1.0$, or $5.0$; the FGM copula with $\theta = -1.0$, $0.0$, or $1.0$; and the GB copula with $\theta = 0.5$ or $1.0$. The follow-up time was set to $\tau = 0.5$, $2.0$, $5.0$, or $\infty$. We chose the copulas and their parameters to cover both positive and negative dependence. We generated pairs $(T_{i1}, T_{i2})$, $i = 1, \ldots, M$, where $M = 100{,}000$, from the bivariate survival function based on the aforementioned setting, and calculated the Monte Carlo simulation values defined as
$$p_{\tau, \mathrm{sim}} = \frac{1}{M} \sum_{i=1}^{M} \left[ \mathbf{1}(\min(T_{i1}, \tau) > \min(T_{i2}, \tau)) + \frac{1}{2} \mathbf{1}(\min(T_{i1}, \tau) = \min(T_{i2}, \tau)) \right].$$
Table 2 shows that the simulation values are nearly equal to the theoretical values computed by the formula of Theorem 1 for every setting. Note that the standard errors (SEs) due to the uncertainty of the Monte Carlo values are less than 0.002 in all settings, which is negligible. As the positive dependence between $T_{i1}$ and $T_{i2}$ becomes stronger, the values of $p$ and $p_\tau$ approach 1. Moreover, even when the values of Kendall’s $\tau$ are the same, $p_\tau$ can differ from one copula model to another because of the distinct characteristics of the copulas. In conclusion, our simulations support the correctness of Theorem 1 and the reliability of the Shiny web app based on it.
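One cell of this simulation can be sketched as follows, for the Clayton copula with $\theta = 3$ and exponential margins with $(\lambda_1, \lambda_2) = (1, 2)$, i.e., the setting of Example 2. This Python sketch is illustrative and is not the authors' supplementary R code: the sampler uses the standard conditional-inversion method for the Clayton copula with $\theta > 0$, and the seed and function names are arbitrary choices.

```python
import math
import random

def sample_clayton(theta, rng):
    # Conditional inversion for theta > 0: draw U, then solve dC/du (V | U) = W for V
    u = 1.0 - rng.random()  # uniform on (0, 1]
    w = 1.0 - rng.random()
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

def simulate_p_tau(theta, lam1, lam2, tau, m, seed):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        u, v = sample_clayton(theta, rng)
        # Survival copula model: U = S1(T1), V = S2(T2) with exponential margins
        t1 = -math.log(u) / lam1
        t2 = -math.log(v) / lam2
        x1, x2 = min(t1, tau), min(t2, tau)
        if x1 > x2:
            total += 1.0
        elif x1 == x2:
            total += 0.5  # ties arise when both times exceed tau
    return total / m

M = 100_000
p_sim = simulate_p_tau(3.0, 1.0, 2.0, math.inf, M, seed=1)
p_tau_sim = simulate_p_tau(3.0, 1.0, 2.0, 0.5, M, seed=1)
print(round(p_sim, 3), round(p_tau_sim, 3))
```

Up to Monte Carlo error of roughly 0.002, the two simulated values should be close to the theoretical values $p = 0.84$ and $p_\tau = 0.68$ from Example 2.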

5. Data Analysis

In this section, we apply our proposed methods to a tongue cancer dataset and a prostate cancer dataset. The two datasets are publicly available and anonymized. The purpose of the data analyses is to show how the proposed methods are implemented and how the results offer new insights beyond Efron’s traditional estimator for the Mann–Whitney effect.
Before analyzing the real datasets, we introduce basic notations and ideas for estimating $p$ by using censored data. We consider survival times $(T_{i1}, T_{i2})$ for two groups, a group indicator $Z_i$ ($Z_i = 1$ for group 1 and $Z_i = 2$ for group 2), and censoring time $C_i$ for $i = 1, \ldots, n$. The sample size for each group is $n_j = \sum_i \mathbf{1}(Z_i = j)$, $j = 1, 2$. With $T_i = T_{i1} \mathbf{1}(Z_i = 1) + T_{i2} \mathbf{1}(Z_i = 2)$, what we observe is $X_i = \min(T_i, C_i)$, $\Delta_i = \mathbf{1}(T_i \le C_i)$, and $Z_i$ for $i = 1, \ldots, n$. In this observation scheme, only one of $T_{i1}$, $T_{i2}$, and $C_i$ is observed for each $i$ ($T_{i1}$ and $T_{i2}$ are never observed simultaneously), making it difficult to identify a copula for $(T_{i1}, T_{i2})$. Without specifying a copula for possibly dependent survival times $(T_{i1}, T_{i2})$, one can still estimate the marginal distributions. Using the exponential distribution, we obtained the MLE of the exponential hazard rate $\hat{\lambda}_j$, $j = 1, 2$, by
$$\hat{\lambda}_j = \frac{\sum_{i=1}^{n} \Delta_i \mathbf{1}(Z_i = j)}{\sum_{i=1}^{n} X_i \mathbf{1}(Z_i = j)}, \quad j = 1, 2.$$
Then, by applying the values of the MLE to the proposed Shiny web app, we obtained the estimators $\hat{p}$ and $\hat{p}_\tau$, where $\tau$ was chosen appropriately (Section 5.1 and Section 5.2). The estimators were computed under various copulas as one cannot specify a single copula (see Section 2.5). On the other hand, under the independence assumption of $T_{i1}$ and $T_{i2}$, Efron’s estimator [1] of $p$ is
$$\hat{p}^{\mathrm{KM}} = -\int_0^\infty \hat{S}_1^{\pm}(t)\, d\hat{S}_2(t), \qquad \hat{p}_\tau^{\mathrm{KM}} = -\int_0^{\tau} \hat{S}_1^{\pm}(t)\, d\hat{S}_2(t) + \frac{1}{2} \hat{S}_1^{\pm}(\tau) \hat{S}_2^{\pm}(\tau),$$
where $\hat{S}_j^{\pm}(t) = [\hat{S}_j(t+) + \hat{S}_j(t-)]/2$ and $\hat{S}_j(t)$ is a Kaplan–Meier (KM) estimator for group $j = 1, 2$. However, these benchmark estimates rely on the independence of the two groups. Therefore, the proposed estimator is useful for examining the sensitivity under a variety of dependence structures via copulas. We apply a jackknife estimator of the standard error (SE) to measure the estimators’ uncertainty, and employ the normal approximation to obtain p-values for testing $H_0: p_\tau = 1/2$ vs. $H_1: p_\tau \neq 1/2$.
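The exponential MLE used above is simply the number of observed events divided by the total follow-up time within each group. A minimal Python sketch, using a made-up toy dataset (the values below are hypothetical and are not taken from the tongue or prostate cancer data), is:

```python
def exponential_rate_mle(times, events, groups, group):
    # lambda_hat_j = (number of events in group j) / (total observed time in group j)
    n_events = sum(d for d, z in zip(events, groups) if z == group)
    total_time = sum(x for x, z in zip(times, groups) if z == group)
    return n_events / total_time

# Hypothetical toy data: X_i (observed time), Delta_i (1 = event), Z_i (group)
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 0, 1, 1]
groups = [1, 1, 2, 2]

lam1_hat = exponential_rate_mle(times, events, groups, 1)
lam2_hat = exponential_rate_mle(times, events, groups, 2)

# Under independence and exponential margins, p = lambda_2 / (lambda_1 + lambda_2)
p_independent = lam2_hat / (lam1_hat + lam2_hat)
```

The estimated rates can then be plugged into the copula-based formulas for $p$ and $p_\tau$, with the independence value serving as the reference point of the sensitivity analysis.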

5.1. Tongue Cancer Data

The tongue dataset is available in the R software version 4.2.3 package KMsurv. It has 80 observations and contains: type (tumor DNA profile: 1 = aneuploid tumor, 2 = diploid tumor), time (time to death or on-study time (weeks)), and death (event indicator: 0 = alive, 1 = dead). It contains $n_1 = 52$ observations in the DNA-aneuploid tumor group ($j = 1$), and $n_2 = 28$ observations in the DNA-diploid tumor group ($j = 2$). We considered the follow-up time $\tau = \min\{ \max_i X_i \Delta_i \mathbf{1}(Z_i = 1), \max_i X_i \Delta_i \mathbf{1}(Z_i = 2) \}$ and obtained $\tau = 167$. The tongue cancer data resulted in $\hat{\lambda}_1 = 0.00736$, $\hat{\lambda}_2 = 0.0130$, and $\hat{p}_\tau^{\mathrm{KM}} = 0.615$. In Figure 4, the KM estimators of each group and the estimated exponential survival curves are plotted.
We conducted sensitivity analyses using a copula-based approach (see Section 2.5). We calculated $\hat{p}$ and $\hat{p}_\tau$ by Theorem 1 under independence and under weak and strong, positive and negative dependence. We calculated $\hat{p}_\tau$ via the web app (Figure 5) under the independence, Clayton, Gumbel, Frank, FGM, and GB copulas with parameters $\theta \in \{1, 5\}$ (Clayton), $\theta = 4$ (Gumbel), $\theta \in \{-20, -5, 5\}$ (Frank), $\theta \in \{-1, 1\}$ (FGM), and $\theta \in \{0.5, 1\}$ (GB). The results for all scenarios are summarized in Table 3. We obtained $\hat{p}_\tau$ ranging from 0.596 to 0.895, which is equivalent to Cohen’s $d$ being greater than or equal to 0.5. Therefore, we concluded that subjects in the DNA-aneuploid tumor group survive longer than those in the DNA-diploid tumor group. This conclusion did not change under any of the dependence structures considered.

5.2. Prostate Cancer Data

The prostate cancer data are available in the R software version 4.2.3 package asaur [45]. They have 14,294 observations and contain grade (moderately differentiated and poorly differentiated), survTime (time from diagnosis to death or last date known alive), and status (event indicator: 0 = censored, 1 = death from prostate cancer). They contain n 1 = 10,988 observations in the moderately differentiated group ( j = 1 ), and n 2 = 3306 observations in the poorly differentiated group ( j = 2 ).
The prostate cancer data resulted in $\hat{\lambda}_1 = 0.000817$, $\hat{\lambda}_2 = 0.00374$, $\tau = 108$, and $\hat{p}^{\mathrm{KM}} = 0.679$. In Figure 6, we plot the KM estimators of each group and the estimated exponential survival curves. We calculated $\hat{p}_\tau$ by Theorem 1 via the web app (Figure 7) under the independence, Clayton, Gumbel, Frank, FGM, and GB copulas with parameters $\theta \in \{1, 5\}$ (Clayton), $\theta = 4$ (Gumbel), $\theta \in \{-20, -5, 5\}$ (Frank), $\theta \in \{-1, 1\}$ (FGM), and $\theta \in \{0.5, 1\}$ (GB). The results for all scenarios are summarized in Table 4. We obtained $\hat{p}_\tau$ ranging from 0.623 to 0.666, which is equivalent to Cohen’s $d$ being equal to 0.5. Therefore, we concluded that subjects in the moderately differentiated group survive longer than those in the poorly differentiated group. The range of $\hat{p}_\tau$ is narrower than that for the tongue cancer dataset, possibly because a large difference between the survival functions is less sensitive to the choice of copula.

6. Conclusions

The Mann–Whitney effect has been widely used in survival analysis, where it provides a meaningful measure of treatment effects on survival outcomes. However, the Mann–Whitney effect may not be interpreted as the true treatment effect under dependence of the two survival times. In this article, we proposed a parametric copula-based approach for estimating the Mann–Whitney effect $p$ under dependence structures for two survival times. We derived the formulas of $p$ under a variety of copulas and marginal survival functions and extended them to $p_\tau$, with $\tau$ as the follow-up time. We also introduced a web-based calculator for $p$ and $p_\tau$. Our simulation studies demonstrate the correctness of the proposed calculator under a variety of parametric marginal survival distributions and copulas. The results of the data analyses show that the proposed method reveals how $p$ and $p_\tau$ may change under various dependence structures and thus enables a sensitivity analysis.
In the examples of real datasets, we obtained $p_\tau$ under the Clayton, Gumbel, Frank, FGM, and GB copulas with varying parameters. The value of $p_\tau$ ranged from 0.596 to 0.895 in the tongue cancer dataset and from 0.623 to 0.666 in the prostate cancer dataset. In both cases, the lower bound of the range exceeded the null value of $1/2$. The results show that, under a variety of dependence structures, the interpretation of the Mann–Whitney effect does not change. This result is consistent with previous studies showing that Hand’s paradox does not occur under a strictly monotonic effect [15,46]. The considered dependence may not affect decision making in clinical research or practice. Although more complex dependence structures with various copulas might be considered, the conclusion may not change significantly.
On computing the Mann–Whitney effect under dependence, there are several previous works (Table 5). However, these works focused on specific combinations of marginals and copulas. The advantage of our article is that it can be applied to various marginals and copulas, not just to a specific one. Our app allows readers to verify the behavior of the Mann–Whitney effect under various dependences.
The main limitation of the present article is that we only discussed the “parametric” approach. Because the proposed method uses extrapolated parameters to compute the Mann–Whitney effect, the result strongly depends on the model assumptions of the marginals and copulas and the characteristics of the extrapolated parameter. However, in practice, researchers may use “semiparametric” or “nonparametric” approaches [17]. In future work, we will examine methods of computing $p$ without parametric assumptions and extend them to nonparametric or semiparametric models such as Cox regression. The anticipated challenge of a nonparametric approach is that the empirical copula, which is a nonparametric approach to copula models, is often nonsmooth and not a genuine copula. Another extension is to include covariates or secondary outcomes, including time-varying effects, in the model, which helps obtain narrow bounds for treatment effects [54,55,56]. Another limitation is that only one-parameter bivariate copulas and noninformative censoring were implemented. Multiparameter or multivariate copulas deserve attention [24,32,57]. There are also several recent works on copula-based approaches for dependent censoring [58,59,60]. If censoring is not independent of survival, the usual MLE and KM estimators are biased (hence, the estimators in Section 5 are all biased). Furthermore, the proposed method for two-sample comparison may be extended to multigroup comparisons in factorial designs by adopting the relative treatment effects of Brunner & Puri and Dobler & Pauly [43,61].

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math12101453/s1, the R code for simulations and data analyses.

Author Contributions

Conceptualization, K.N. and T.E.; Methodology, K.N. and T.E.; Software, K.N. and Y.-C.L.; Resources, G.-Y.L.; Writing—original draft, K.N., R.U. and T.E.; Visualization, K.N.; Supervision, T.E.; Funding acquisition, R.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI, grant numbers 22K11948, 20H04147, and 23K20374.

Data Availability Statement

All data analyzed in Section 5 are publicly available; therefore, written informed consent was not required.

Acknowledgments

We thank the editor and three reviewers for the time they spent commenting on our paper. The reviewers’ comments greatly helped improve the presentation of the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Kendall’s τ

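Under the copula model (2), Kendall’s $\tau$ for $T_1$ and $T_2$ is expressed as
$$\text{Kendall's } \tau = 4 \int_0^1 \int_0^1 C_\theta(u, v) \, C_\theta(du, dv) - 1.$$
Kendall’s $\tau$ of each copula is expressed as follows.
1. The independence copula:
$$\text{Kendall's } \tau = 0.$$
2. The Clayton copula:
$$\text{Kendall's } \tau = \frac{\theta}{\theta + 2}, \quad \theta \in [-1, \infty) \setminus \{0\}.$$
3. The Gumbel copula:
$$\text{Kendall's } \tau = \frac{\theta}{\theta + 1}, \quad \theta \in [0, \infty).$$
4. The Frank copula:
$$\text{Kendall's } \tau = 1 - \frac{4}{\theta} + \frac{4 D_\theta}{\theta}, \quad \text{where } D_\theta = \frac{1}{\theta} \int_0^\theta \frac{x}{\exp(x) - 1} \, dx, \quad \theta \in (-\infty, \infty) \setminus \{0\}.$$
5. The FGM copula:
$$\text{Kendall's } \tau = \frac{2}{9} \theta, \quad \theta \in [-1, 1].$$
6. The GB copula:
$$\text{Kendall's } \tau = 1 - \frac{4}{\theta} \int_0^1 t (1 - \theta \log t) \log(1 - \theta \log t) \, dt, \quad \theta \in [0, 1].$$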
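The closed forms and one-dimensional integrals above are simple to evaluate numerically. The following Python sketch is an illustration only, not the authors' R implementation: the function names and the `midpoint` quadrature helper are hypothetical, and the midpoint rule is chosen merely because it avoids evaluating the integrands at their endpoints.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; f is never evaluated at the endpoints a, b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def tau_clayton(theta):
    return theta / (theta + 2.0)

def tau_gumbel(theta):
    # Parameterization of this appendix: theta = 0 corresponds to independence.
    return theta / (theta + 1.0)

def tau_frank(theta):
    # D_theta = (1/theta) * int_0^theta x / (exp(x) - 1) dx; also valid for theta < 0.
    d_theta = midpoint(lambda x: x / math.expm1(x), 0.0, theta) / theta
    return 1.0 - 4.0 / theta + 4.0 * d_theta / theta

def tau_fgm(theta):
    return 2.0 * theta / 9.0

def tau_gb(theta):
    # Numerical evaluation needs theta > 0; theta -> 0 is the independence limit.
    def integrand(t):
        a = 1.0 - theta * math.log(t)
        return t * a * math.log(a)
    return 1.0 - 4.0 / theta * midpoint(integrand, 0.0, 1.0)

examples = [("Clayton", tau_clayton, 1.0), ("Gumbel", tau_gumbel, 4.0),
            ("Frank", tau_frank, 5.0), ("FGM", tau_fgm, 1.0), ("GB", tau_gb, 1.0)]
for name, fn, theta in examples:
    print(f"{name:8s} theta = {theta:4.1f}  Kendall's tau = {fn(theta):.2f}")
```

Rounded to two decimals, these values can be compared with the parameter choices listed in Table 1.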

Appendix B. Examples of p and pτ with Different Copulas

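Equation (3) with different copulas is computed by the following formulas:
1. The Clayton copula:
$$p = \int_0^1 v^{-\theta - 1} \left[ \{ S_1(S_2^{-1}(v)) \}^{-\theta} + v^{-\theta} - 1 \right]^{-\frac{1}{\theta} - 1} dv.$$
2. The Gumbel copula:
$$p = \int_0^1 \exp\left[ -\left\{ (-\log S_1(S_2^{-1}(v)))^{\theta + 1} + (-\log v)^{\theta + 1} \right\}^{\frac{1}{\theta + 1}} \right] \times \left[ \{ -\log(S_1(S_2^{-1}(v))) \}^{\theta + 1} + \{ -\log(v) \}^{\theta + 1} \right]^{-\frac{\theta}{\theta + 1}} \frac{(-\log(v))^{\theta}}{v} \, dv.$$
3. The Frank copula:
$$p = \int_0^1 \frac{e^{-\theta v} \left\{ e^{-\theta S_1(S_2^{-1}(v))} - 1 \right\}}{e^{-\theta} - 1 + \left\{ e^{-\theta S_1(S_2^{-1}(v))} - 1 \right\} \left( e^{-\theta v} - 1 \right)} \, dv.$$
4. The FGM copula:
$$p = \int_0^1 \left[ S_1(S_2^{-1}(v)) + \theta S_1(S_2^{-1}(v)) \left\{ 1 - S_1(S_2^{-1}(v)) \right\} (1 - 2v) \right] dv.$$
5. The GB copula:
$$p = \int_0^1 S_1(S_2^{-1}(v)) \left\{ 1 - \theta \log S_1(S_2^{-1}(v)) \right\} v^{-\theta \log S_1(S_2^{-1}(v))} \, dv.$$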
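Each formula above integrates the partial derivative $\partial C_\theta(u, v) / \partial v$ along the curve $u = S_1(S_2^{-1}(v))$. The following Python sketch evaluates three of them (Clayton, FGM, GB) for exponential marginals with $(\lambda_1, \lambda_2) = (1.0, 2.0)$, the setting of Table 2. It is a minimal illustration under stated assumptions: the helper names (`midpoint`, `mann_whitney_p`, `dC_dv_*`) are hypothetical, and the midpoint rule stands in for whatever quadrature the app uses.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; f is never evaluated at the endpoints a, b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Partial derivatives dC/dv of each copula, as they appear in the integrands.
def dC_dv_clayton(u, v, theta):
    return v ** (-theta - 1.0) * (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta - 1.0)

def dC_dv_fgm(u, v, theta):
    return u + theta * u * (1.0 - u) * (1.0 - 2.0 * v)

def dC_dv_gb(u, v, theta):
    return u * (1.0 - theta * math.log(u)) * v ** (-theta * math.log(u))

def mann_whitney_p(dC_dv, theta, S1, S2_inv):
    """p = int_0^1 dC/dv(S1(S2^{-1}(v)), v) dv."""
    return midpoint(lambda v: dC_dv(S1(S2_inv(v)), v, theta), 0.0, 1.0)

# Exponential marginals: S_j(t) = exp(-lambda_j * t).
lam1, lam2 = 1.0, 2.0
S1 = lambda t: math.exp(-lam1 * t)
S2_inv = lambda v: -math.log(v) / lam2

print("FGM,     theta = 0:", round(mann_whitney_p(dC_dv_fgm, 0.0, S1, S2_inv), 3))
print("FGM,     theta = 1:", round(mann_whitney_p(dC_dv_fgm, 1.0, S1, S2_inv), 3))
print("Clayton, theta = 1:", round(mann_whitney_p(dC_dv_clayton, 1.0, S1, S2_inv), 3))
print("GB,      theta = 1:", round(mann_whitney_p(dC_dv_gb, 1.0, S1, S2_inv), 3))
```

With $\theta = 0$, the FGM case reduces to independence, where $p = \lambda_2 / (\lambda_1 + \lambda_2) = 2/3$ for exponential marginals; this gives a quick sanity check of the quadrature.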
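Equation (4) with different copulas is computed by the following formulas:
1. The Clayton copula:
$$p_\tau = \int_{S_2(\tau)}^1 v^{-\theta - 1} \left[ \{ S_1(S_2^{-1}(v)) \}^{-\theta} + v^{-\theta} - 1 \right]^{-\frac{1}{\theta} - 1} dv + \frac{1}{2} \left( S_1(\tau)^{-\theta} + S_2(\tau)^{-\theta} - 1 \right)^{-1/\theta}.$$
2. The Gumbel copula:
$$p_\tau = \int_{S_2(\tau)}^1 \exp\left[ -\left\{ (-\log S_1(S_2^{-1}(v)))^{\theta + 1} + (-\log v)^{\theta + 1} \right\}^{\frac{1}{\theta + 1}} \right] \times \left[ \{ -\log(S_1(S_2^{-1}(v))) \}^{\theta + 1} + \{ -\log(v) \}^{\theta + 1} \right]^{-\frac{\theta}{\theta + 1}} \frac{(-\log(v))^{\theta}}{v} \, dv + \frac{1}{2} \exp\left[ -\left\{ (-\log S_1(\tau))^{\theta + 1} + (-\log S_2(\tau))^{\theta + 1} \right\}^{\frac{1}{\theta + 1}} \right].$$
3. The Frank copula:
$$p_\tau = \int_{S_2(\tau)}^1 \frac{e^{-\theta v} \left\{ e^{-\theta S_1(S_2^{-1}(v))} - 1 \right\}}{e^{-\theta} - 1 + \left\{ e^{-\theta S_1(S_2^{-1}(v))} - 1 \right\} \left( e^{-\theta v} - 1 \right)} \, dv + \frac{1}{2} \left[ -\frac{1}{\theta} \log\left\{ 1 + \frac{(e^{-\theta S_1(\tau)} - 1)(e^{-\theta S_2(\tau)} - 1)}{e^{-\theta} - 1} \right\} \right].$$
4. The FGM copula:
$$p_\tau = \int_{S_2(\tau)}^1 \left[ S_1(S_2^{-1}(v)) + \theta S_1(S_2^{-1}(v)) \left\{ 1 - S_1(S_2^{-1}(v)) \right\} (1 - 2v) \right] dv + \frac{1}{2} \left[ S_1(\tau) S_2(\tau) + \theta S_1(\tau) S_2(\tau) (1 - S_1(\tau)) (1 - S_2(\tau)) \right].$$
5. The GB copula:
$$p_\tau = \int_{S_2(\tau)}^1 S_1(S_2^{-1}(v)) \left\{ 1 - \theta \log S_1(S_2^{-1}(v)) \right\} v^{-\theta \log S_1(S_2^{-1}(v))} \, dv + \frac{1}{2} S_1(\tau) S_2(\tau) \exp\left( -\theta \log S_1(\tau) \log S_2(\tau) \right).$$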
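All five expressions for $p_\tau$ share one structure: the integrand of $p$ integrated over $[S_2(\tau), 1]$, plus one half of the copula evaluated at $(S_1(\tau), S_2(\tau))$. The following Python sketch implements that structure for the Gumbel and Frank copulas with exponential marginals, $(\lambda_1, \lambda_2) = (1.0, 2.0)$. The function names and the midpoint quadrature are illustrative assumptions, not the app's actual code.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; f is never evaluated at the endpoints a, b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def gumbel_C(u, v, theta):
    a = (-math.log(u)) ** (theta + 1.0) + (-math.log(v)) ** (theta + 1.0)
    return math.exp(-a ** (1.0 / (theta + 1.0)))

def gumbel_dC_dv(u, v, theta):
    a = (-math.log(u)) ** (theta + 1.0) + (-math.log(v)) ** (theta + 1.0)
    return gumbel_C(u, v, theta) * a ** (-theta / (theta + 1.0)) * (-math.log(v)) ** theta / v

def frank_C(u, v, theta):
    ratio = math.expm1(-theta * u) * math.expm1(-theta * v) / math.expm1(-theta)
    return -math.log1p(ratio) / theta

def frank_dC_dv(u, v, theta):
    eu, ev = math.expm1(-theta * u), math.expm1(-theta * v)
    return math.exp(-theta * v) * eu / (math.expm1(-theta) + eu * ev)

def restricted_p(C, dC_dv, theta, S1, S2, S2_inv, tau):
    """p_tau = int_{S2(tau)}^1 dC/dv(S1(S2^{-1}(v)), v) dv + C(S1(tau), S2(tau)) / 2."""
    integral = midpoint(lambda v: dC_dv(S1(S2_inv(v)), v, theta), S2(tau), 1.0)
    return integral + 0.5 * C(S1(tau), S2(tau), theta)

# Exponential marginals: S_j(t) = exp(-lambda_j * t).
lam1, lam2 = 1.0, 2.0
S1 = lambda t: math.exp(-lam1 * t)
S2 = lambda t: math.exp(-lam2 * t)
S2_inv = lambda v: -math.log(v) / lam2

for tau in (0.5, 2.0, 5.0):
    g = restricted_p(gumbel_C, gumbel_dC_dv, 4.0, S1, S2, S2_inv, tau)
    f = restricted_p(frank_C, frank_dC_dv, 5.0, S1, S2, S2_inv, tau)
    print(f"tau = {tau}: Gumbel(theta=4) p_tau = {g:.3f}, Frank(theta=5) p_tau = {f:.3f}")
```

The printed values can be compared with the "theory" columns of Table 2 for the exponential distribution. Passing the copula and its partial derivative as arguments keeps the restricted integral in one place, so adding another copula only requires two short functions.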

Appendix C. Examples of Using the Shiny Web App

This appendix gives two examples of using the Shiny web app.
Figure A1. The web app showing the results for computing $p$ and $p_\tau$ under the Burr III distributions.
Example A1.
Users set the marginal survival distribution to “Burr III Distribution”, $c_1 = 1.5$, $k_1 = 3.0$, $c_2 = 1.0$, $k_2 = 1.0$, $\tau = 5.0$, the copula to “FGM”, $\theta = 0.5$, and the display language to “English”, and then push the “submit” button. The app returns Kendall’s $\tau = 0.111$, $p = 0.714$, and $p_\tau = 0.719$. Figure A1 displays the web app in this setting.
Figure A2. The web app showing the results for computing $p$ and $p_\tau$ under the log-normal distribution.
Example A2.
Users set the marginal survival distribution to “Log-normal Distribution”, $\log(\mu_1) = 0.7$, $\log(\sigma_1^2) = 1.5$, $\log(\mu_2) = 0.3$, $\log(\sigma_2^2) = 2.0$, $\tau = 5.0$, the copula to “Frank”, $\theta = 2.0$, and the display language to “English”, and then push the “submit” button. The app returns Kendall’s $\tau = 0.214$, $p = 0.588$, and $p_\tau = 0.579$. Figure A2 displays the web app in this setting.
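The numbers in Example A1 can be cross-checked outside the app. The following Python sketch is an illustration under a stated assumption: it takes the Burr III survival function to be $S_j(t) = 1 - (1 + t^{-c_j})^{-k_j}$, a common parameterization that may differ from the one coded in the app. All function names are hypothetical, and the integrals are approximated by a simple midpoint rule.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; f is never evaluated at the endpoints a, b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Assumed Burr III parameterization: S(t) = 1 - (1 + t^(-c))^(-k).
def burr3_survival(t, c, k):
    return 1.0 - (1.0 + t ** (-c)) ** (-k)

def burr3_survival_inverse(v, c, k):
    return ((1.0 - v) ** (-1.0 / k) - 1.0) ** (-1.0 / c)

def fgm_C(u, v, theta):
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def fgm_dC_dv(u, v, theta):
    return u + theta * u * (1.0 - u) * (1.0 - 2.0 * v)

# Inputs of Example A1.
c1, k1, c2, k2 = 1.5, 3.0, 1.0, 1.0
tau, theta = 5.0, 0.5

def integrand(v):
    u = burr3_survival(burr3_survival_inverse(v, c2, k2), c1, k1)
    return fgm_dC_dv(u, v, theta)

s1_tau = burr3_survival(tau, c1, k1)
s2_tau = burr3_survival(tau, c2, k2)

kendall_tau = 2.0 * theta / 9.0
p = midpoint(integrand, 0.0, 1.0)
p_tau = midpoint(integrand, s2_tau, 1.0) + 0.5 * fgm_C(s1_tau, s2_tau, theta)

print(f"Kendall's tau = {kendall_tau:.3f}, p = {p:.3f}, p_tau = {p_tau:.3f}")
```

Under this assumed parameterization, the three printed values can be compared directly with those reported in Example A1.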

References

  1. Efron, B. The two sample problem with censored data. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1967; Volume 4, pp. 831–853.
  2. Rahlfs, V.W.; Zimmermann, H.; Lees, K.R. Effect Size Measures and Their Relationships in Stroke Studies. Stroke 2014, 45, 627–633.
  3. Mann, H.B.; Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 1947, 18, 50–60.
  4. Pocock, S.J.; Ariti, C.A.; Collier, T.J.; Wang, D. The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur. Heart J. 2011, 33, 176–182.
  5. Birnbaum, Z.W. On a use of the Mann-Whitney statistic. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955; University of California Press: Berkeley, CA, USA, 1956; Volume I, pp. 13–17.
  6. Dobler, D.; Pauly, M. Bootstrap- and permutation-based inference for the Mann-Whitney effect for right-censored and tied data. TEST 2018, 27, 639–658.
  7. Emura, T.; Hsu, J. Estimation of the Mann-Whitney effect in the two-sample problem under dependent censoring. Comput. Statist. Data Anal. 2020, 150, 106990.
  8. Biswas, A.; Chakraborty, S.; Mukherjee, M. On estimation of stress-strength reliability with log-Lindley distribution. J. Stat. Comput. Simul. 2021, 91, 128–150.
  9. Rubarth, K.; Sattler, P.; Zimmermann, H.G.; Konietschke, F. Estimation and Testing of Wilcoxon-Mann-Whitney Effects in Factorial Clustered Data Designs. Symmetry 2022, 14, 244.
  10. Hu, J.; Zhuang, Y.; Goldiner, C. Fixed-accuracy confidence interval estimation of P(X<Y) under a geometric-exponential model. Jpn. J. Stat. Data Sci. 2021, 4, 1079–1104.
  11. de la Cruz, R.; Salinas, H.S.; Meza, C. Reliability Estimation for Stress-Strength Model Based on Unit-Half-Normal Distribution. Symmetry 2022, 14, 837.
  12. Patil, D.; Naik-Nimbalkar, U.V.; Kale, M.M. Effect of Dependency on the Estimation of P[Y>X] in Exponential Stress-strength Models. Austrian J. Stat. 2022, 51, 10–34.
  13. Nowak, C.P.; Mütze, T.; Konietschke, F. Group sequential methods for the Mann-Whitney parameter. Stat. Methods Med. Res. 2022, 31, 2004–2020.
  14. Singh, B.; Nayal, A.S.; Tyagi, A. Estimation of P[Y<Z] under Geometric-Lindley model. Ric. Mat. 2023, 1–32.
  15. Hand, D.J. On Comparing Two Treatments. Am. Stat. 1992, 46, 190–192.
  16. Cochran, W.G.; Cox, G.M. Experimental Designs, 2nd ed.; John Wiley & Sons Inc.: New York, NY, USA; Chapman & Hall, Ltd.: London, UK, 1957; p. xiv+617.
  17. Dobler, D.; Möllenhoff, K. A nonparametric relative treatment effect for direct comparisons of censored paired survival outcomes. Stat. Med. 2024; early view.
  18. Fan, Y.; Park, S.S. Sharp bounds on the distribution of treatment effects and their statistical inference. Econom. Theory 2010, 26, 931–951.
  19. Fay, M.P.; Brittain, E.H.; Shih, J.H.; Follmann, D.A.; Gabriel, E.E. Causal estimands and confidence intervals associated with Wilcoxon-Mann-Whitney tests in randomized experiments. Stat. Med. 2018, 37, 2923–2937.
  20. Emura, T.; Matsui, S.; Rondeau, V. Survival Analysis with Correlated Endpoints; SpringerBriefs in Statistics; Springer: Singapore, 2019; p. xvii+118.
  21. Li, D.; Hu, X.J.; Wang, R. Evaluating association between two event times with observations subject to informative censoring. J. Amer. Statist. Assoc. 2023, 118, 1282–1294.
  22. Emura, T.; Pan, C. Parametric likelihood inference and goodness-of-fit for dependently left-truncated data, a copula-based approach. Statist. Pap. 2020, 61, 479–501.
  23. Sklar, M. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 1959, 8, 229–231.
  24. Nelsen, R.B. An Introduction to Copulas, 2nd ed.; Springer Series in Statistics; Springer: New York, NY, USA, 2006; p. xiv+269.
  25. Geenens, G. (Re-)Reading Sklar (1959); A Personal View on Sklar’s Theorem. Mathematics 2024, 12, 380.
  26. Escarela, G.; Carrière, J.F. Fitting competing risks with an assumed copula. Stat. Methods Med. Res. 2003, 12, 333–349.
  27. Petti, D.; Eletti, A.; Marra, G.; Radice, R. Copula link-based additive models for bivariate time-to-event outcomes with general censoring scheme. Comput. Statist. Data Anal. 2022, 175, 107550.
  28. Clayton, D.G. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 1978, 65, 141–151.
  29. Gumbel, E.J. Distributions des valeurs extrêmes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris 1960, 9, 171–173.
  30. Frank, M.J. On the simultaneous associativity of F(x,y) and x + y − F(x,y). Aequationes Math. 1979, 19, 194–226.
  31. Morgenstern, D. Einfache Beispiele zweidimensionaler Verteilungen. Mitteilungsbl. Math. Statist. 1956, 8, 234–235.
  32. Chesneau, C. On the Gumbel-Barnett extended Celebioglu-Cuadras copula. Jpn. J. Stat. Data Sci. 2023, 6, 759–781.
  33. Toparkus, A.; Weißbach, R. Testing Truncation Dependence: The Gumbel-Barnett Copula. arXiv 2024, arXiv:2305.19675.
  34. Schneider, S.; dos Reis, R.C.P.; Gottselig, M.M.F.; Fisch, P.; Knauth, D.R.; Vigo, A. Clayton copula for survival data with dependent censoring: An application to a tuberculosis treatment adherence data. Stat. Med. 2023, 42, 4057–4081.
  35. Sun, T.; Ding, Y. Copula-based semiparametric regression method for bivariate data under general interval censoring. Biostatistics 2021, 22, 315–330.
  36. Moradian, H.; Larocque, D.; Bellavance, F. Survival forests for data with dependent censoring. Stat. Methods Med. Res. 2019, 28, 445–461.
  37. Farzana, W.; Basree, M.M.; Diawara, N.; Shboul, Z.A.; Dubey, S.; Lockhart, M.M.; Hamza, M.; Palmer, J.D.; Iftekharuddin, K.M. Prediction of Rapid Early Progression and Survival Risk with Pre-Radiation MRI in WHO Grade 4 Glioma Patients. Cancers 2023, 15, 4636.
  38. Chen, Y. Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. J. R. Stat. Soc. Ser. B Stat. Methodol. 2010, 72, 235–251.
  39. Shih, J.; Emura, T. Likelihood-based inference for bivariate latent failure time models with competing risks under the generalized FGM copula. Comput. Statist. 2018, 33, 1293–1323.
  40. Kawakami, R.; Michimae, H.; Lin, Y. Assessing the numerical integration of dynamic prediction formulas using the exact expressions under the joint frailty-copula model. Jpn. J. Stat. Data Sci. 2021, 4, 1293–1321.
  41. Shih, J.; Konno, Y.; Chang, Y.; Emura, T. Estimation of a common mean vector in bivariate meta-analysis under the FGM copula. Statistics 2019, 53, 673–695.
  42. Shih, J.; Konno, Y.; Chang, Y.; Emura, T. Copula-Based Estimation Methods for a Common Mean Vector for Bivariate Meta-Analyses. Symmetry 2022, 14, 186.
  43. Dobler, D.; Pauly, M. Factorial analyses of treatment effects under independent right-censoring. Stat. Methods Med. Res. 2020, 29, 325–343.
  44. Emura, T.; Ditzhaus, M.; Dobler, D.; Murotani, K. Factorial survival analysis for treatment effects under dependent censoring. Stat. Methods Med. Res. 2024, 33, 61–79.
  45. Moore, D.F. Applied Survival Analysis Using R; Springer: Berlin/Heidelberg, Germany, 2016; Volume 473.
  46. Greenland, S.; Fay, M.P.; Brittain, E.H.; Shih, J.H.; Follmann, D.A.; Gabriel, E.E.; Robins, J.M. On causal inferences for personalized medicine: How hidden causal assumptions led to erroneous causal claims about the D-value. Am. Statist. 2020, 74, 243–248.
  47. Domma, F.; Giordano, S. A copula-based approach to account for dependence in stress-strength models. Statist. Pap. 2013, 54, 807–826.
  48. Gao, J.; An, Z.; Liu, B. A dependent stress-strength interference model based on mixed copula function. J. Mech. Sci. Technol. 2016, 30, 4443–4446.
  49. de Andrade, B.B.; do Nascimento, A.R.; Rathie, P.N. Parametric and nonparametric inference for the reliability of copula-based stress-strength models. Qual. Reliab. Eng. Int. 2020, 36, 2249–2267.
  50. Rathie, P.N.; de Sena Monteiro Ozelim, L.C.; de Andrade, B.B. Portfolio Management of Copula-Dependent Assets Based on P(Y < X) Reliability Models: Revisiting Frank Copula and Dagum Distributions. Stats 2021, 4, 1027–1050.
  51. James, A.; Chandra, N.; Sebastian, N. Stress-strength reliability estimation for bivariate copula function with Rayleigh marginals. Int. J. Syst. Assur. Eng. Manag. 2023, 14, 196–215.
  52. Shang, L.; Yan, Z. Reliability estimation stress-strength dependent model based on copula function using ranked set sampling. J. Radiat. Res. Appl. Sci. 2024, 17, 100811.
  53. Lima, R.K.; Quintino, F.S.; da Fonseca, T.A.; de Sena Monteiro Ozelim, L.C.; Rathie, P.N.; Saulo, H. Assessing the Impact of Copula Selection on Reliability Measures of Type P(X < Y) with Generalized Extreme Value Marginals. Modelling 2024, 5, 180–200.
  54. Patton, A.J. Modelling asymmetric exchange rate dependence. Internat. Econom. Rev. 2006, 47, 527–556.
  55. Almeida, C.; Czado, C. Efficient Bayesian inference for stochastic time-varying copula models. Comput. Statist. Data Anal. 2012, 56, 1511–1527.
  56. Yin, Y.; Cai, Z.; Zhou, X. Using secondary outcome to sharpen bounds for treatment harm rate in characterizing heterogeneity. Biom. J. 2018, 60, 879–892.
  57. Susam, S.O. A multi-parameter Generalized Farlie-Gumbel-Morgenstern bivariate copula family via Bernstein polynomial. Hacet. J. Math. Stat. 2022, 51, 618–631.
  58. Deresa, N.W.; Van Keilegom, I. A multivariate normal regression model for survival data subject to different types of dependent censoring. Comput. Statist. Data Anal. 2020, 144, 106879.
  59. Jo, J.H.; Gao, Z.; Jung, I.; Song, S.Y.; Ridder, G.; Moon, H.R. Copula graphic estimation of the survival function with dependent censoring and its application to analysis of pancreatic cancer clinical trial. Stat. Methods Med. Res. 2023, 32, 944–962.
  60. Emura, T.; Chen, Y. Analysis of Survival Data with Dependent Censoring: Copula-Based Approaches; Springer: Berlin/Heidelberg, Germany, 2018; Volume 450.
  61. Brunner, E.; Puri, M.L. Nonparametric methods in factorial designs. Statist. Pap. 2001, 42, 1–52.
Figure 1. Scatter plots of 3000 data points generated from the copula distribution with parameter $\theta$.
Figure 2. The plots for parametric survival functions.
Figure 3. The web app showing the results for computing $p$ and $p_\tau$ under the exponential distribution.
Figure 4. KM estimators for the DNA-aneuploid tumor and the DNA-diploid tumor group and the exponential survival curves with the MLEs of exponential hazard rate (darker blue and red lines), $\hat{\lambda}_1 = 0.00736$, $\hat{\lambda}_2 = 0.0130$. The vertical line signifies the follow-up time $\tau = 167$.
Figure 5. Example for the tongue cancer dataset on the app. This setting is marginal distribution: “Exponential”, $\lambda_1 = 0.00736$, $\lambda_2 = 0.0130$, $\tau = 167$, copula: “Clayton”, copula parameter: $\theta = 1$, and language: “English”.
Figure 6. KM estimators for the moderately differentiated and the poorly differentiated group and exponential survival curves with the MLEs of exponential hazard rate (darker blue and red lines), $\hat{\lambda}_1 = 0.000817$, $\hat{\lambda}_2 = 0.00374$. The vertical line signifies the follow-up time $\tau = 108$.
Figure 7. Example for the prostate cancer dataset on the app. This setting is marginal distribution: “Exponential”, $\lambda_1 = 0.000817$, $\lambda_2 = 0.00374$, $\tau = 108$, copula: “Gumbel”, copula parameter: $\theta = 4$, and language: “English”.
Table 1. Selection of copulas and their parameters.

| Copula | $\theta$ | Kendall’s $\tau$ |
|---|---|---|
| Clayton | 1.0 | 0.33 |
| | 5.0 | 0.71 |
| | 10.0 | 0.83 |
| Gumbel | 0.0 | 0.00 |
| | 4.0 | 0.80 |
| Frank | −20.0 | −0.82 |
| | −5.0 | −0.46 |
| | 1.0 | 0.11 |
| | 5.0 | 0.46 |
| FGM | −1.0 | −0.22 |
| | 0.0 | 0.00 |
| | 1.0 | 0.22 |
| GB | 0.5 | −0.21 |
| | 1.0 | −0.36 |
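Table 2. Comparison of the theoretical value and the simulation value (SE < 0.002) for calculating $p_\tau$ defined in Theorem 1. (The exponential, Weibull, gamma, log-normal, and Burr III distributions).

| Distribution | Copula | $\theta$ | Kendall’s $\tau$ | $\tau = 0.5$: $p_{\tau,\text{theory}}$ | $\tau = 0.5$: $p_{\tau,\text{sim}}$ | $\tau = 2$: $p_{\tau,\text{theory}}$ | $\tau = 2$: $p_{\tau,\text{sim}}$ | $\tau = 5$: $p_{\tau,\text{theory}}$ | $\tau = 5$: $p_{\tau,\text{sim}}$ | $\tau = \infty$: $p_{\tau,\text{theory}}$ | $\tau = \infty$: $p_{\tau,\text{sim}}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Exponential | Clayton | 1.0 | 0.33 | 0.645 | 0.643 | 0.737 | 0.738 | 0.744 | 0.746 | 0.744 | 0.746 |
| | | 5.0 | 0.71 | 0.704 | 0.706 | 0.872 | 0.872 | 0.881 | 0.883 | 0.881 | 0.883 |
| | | 10.0 | 0.83 | 0.746 | 0.745 | 0.920 | 0.921 | 0.930 | 0.930 | 0.930 | 0.929 |
| | Gumbel | 0.0 | 0.00 | 0.629 | 0.631 | 0.666 | 0.666 | 0.666 | 0.665 | 0.667 | 0.666 |
| | | 4.0 | 0.80 | 0.798 | 0.799 | 0.961 | 0.961 | 0.970 | 0.969 | 0.970 | 0.968 |
| | Frank | −5.0 | −0.46 | 0.615 | 0.615 | 0.622 | 0.622 | 0.622 | 0.622 | 0.622 | 0.620 |
| | | 1.0 | 0.11 | 0.636 | 0.636 | 0.684 | 0.684 | 0.685 | 0.685 | 0.685 | 0.685 |
| | | 5.0 | 0.46 | 0.674 | 0.674 | 0.771 | 0.768 | 0.773 | 0.772 | 0.773 | 0.775 |
| | FGM | −1.0 | −0.22 | 0.617 | 0.617 | 0.633 | 0.634 | 0.633 | 0.631 | 0.633 | 0.633 |
| | | 0.0 | 0.00 | 0.629 | 0.629 | 0.666 | 0.666 | 0.666 | 0.666 | 0.666 | 0.664 |
| | | 1.0 | 0.22 | 0.642 | 0.641 | 0.699 | 0.697 | 0.700 | 0.702 | 0.700 | 0.698 |
| | GB | 0.5 | −0.21 | 0.623 | 0.624 | 0.642 | 0.643 | 0.642 | 0.641 | 0.642 | 0.640 |
| | | 1.0 | −0.36 | 0.617 | 0.616 | 0.629 | 0.628 | 0.629 | 0.632 | 0.629 | 0.630 |
| Weibull | Clayton | 1.0 | 0.33 | 0.497 | 0.496 | 0.594 | 0.589 | 0.603 | 0.602 | 0.603 | 0.602 |
| | | 5.0 | 0.71 | 0.482 | 0.480 | 0.644 | 0.645 | 0.653 | 0.654 | 0.653 | 0.650 |
| | | 10.0 | 0.83 | 0.472 | 0.472 | 0.645 | 0.645 | 0.654 | 0.653 | 0.654 | 0.652 |
| | Gumbel | 0.0 | 0.00 | 0.511 | 0.509 | 0.560 | 0.562 | 0.562 | 0.562 | 0.562 | 0.563 |
| | | 4.0 | 0.80 | 0.425 | 0.424 | 0.584 | 0.585 | 0.593 | 0.594 | 0.593 | 0.592 |
| | Frank | −5.0 | −0.46 | 0.528 | 0.530 | 0.542 | 0.541 | 0.542 | 0.543 | 0.542 | 0.541 |
| | | 1.0 | 0.11 | 0.505 | 0.504 | 0.566 | 0.565 | 0.569 | 0.572 | 0.569 | 0.568 |
| | | 5.0 | 0.46 | 0.486 | 0.486 | 0.592 | 0.595 | 0.597 | 0.597 | 0.597 | 0.594 |
| | FGM | −1.0 | −0.22 | 0.521 | 0.523 | 0.548 | 0.549 | 0.549 | 0.551 | 0.549 | 0.549 |
| | | 0.0 | 0.00 | 0.511 | 0.509 | 0.560 | 0.563 | 0.562 | 0.562 | 0.562 | 0.564 |
| | | 1.0 | 0.22 | 0.501 | 0.503 | 0.572 | 0.574 | 0.575 | 0.577 | 0.575 | 0.574 |
| | GB | 0.5 | −0.21 | 0.519 | 0.520 | 0.549 | 0.546 | 0.549 | 0.548 | 0.549 | 0.553 |
| | | 1.0 | −0.36 | 0.526 | 0.526 | 0.545 | 0.545 | 0.545 | 0.545 | 0.545 | 0.546 |
| Gamma | Clayton | 1.0 | 0.33 | 0.529 | 0.529 | 0.651 | 0.649 | 0.679 | 0.678 | 0.679 | 0.680 |
| | | 5.0 | 0.71 | 0.530 | 0.528 | 0.763 | 0.763 | 0.809 | 0.810 | 0.809 | 0.810 |
| | | 10.0 | 0.83 | 0.534 | 0.533 | 0.817 | 0.816 | 0.862 | 0.863 | 0.863 | 0.863 |
| | Gumbel | 0.0 | 0.00 | 0.530 | 0.530 | 0.611 | 0.612 | 0.615 | 0.614 | 0.615 | 0.616 |
| | | 4.0 | 0.80 | 0.545 | 0.546 | 0.813 | 0.813 | 0.853 | 0.853 | 0.854 | 0.853 |
| | Frank | −5.0 | −0.46 | 0.532 | 0.530 | 0.584 | 0.584 | 0.584 | 0.583 | 0.584 | 0.584 |
| | | 1.0 | 0.11 | 0.530 | 0.529 | 0.622 | 0.622 | 0.628 | 0.628 | 0.628 | 0.628 |
| | | 5.0 | 0.46 | 0.530 | 0.530 | 0.679 | 0.679 | 0.694 | 0.692 | 0.694 | 0.697 |
| | FGM | −1.0 | −0.22 | 0.531 | 0.533 | 0.591 | 0.591 | 0.592 | 0.591 | 0.592 | 0.592 |
| | | 0.0 | 0.00 | 0.530 | 0.529 | 0.611 | 0.610 | 0.615 | 0.614 | 0.615 | 0.617 |
| | | 1.0 | 0.22 | 0.529 | 0.529 | 0.631 | 0.631 | 0.639 | 0.640 | 0.639 | 0.641 |
| | GB | 0.5 | −0.21 | 0.531 | 0.532 | 0.597 | 0.598 | 0.598 | 0.598 | 0.598 | 0.597 |
| | | 1.0 | −0.36 | 0.532 | 0.532 | 0.589 | 0.592 | 0.590 | 0.588 | 0.589 | 0.588 |
| Log-normal | Clayton | 1.0 | 0.33 | 0.580 | 0.582 | 0.596 | 0.594 | 0.593 | 0.591 | 0.573 | 0.574 |
| | | 5.0 | 0.71 | 0.599 | 0.599 | 0.658 | 0.658 | 0.675 | 0.677 | 0.619 | 0.619 |
| | | 10.0 | 0.83 | 0.618 | 0.617 | 0.715 | 0.715 | 0.756 | 0.755 | 0.684 | 0.684 |
| | Gumbel | 0.0 | 0.00 | 0.574 | 0.575 | 0.576 | 0.578 | 0.569 | 0.570 | 0.564 | 0.562 |
| | | 4.0 | 0.80 | 0.652 | 0.652 | 0.745 | 0.746 | 0.760 | 0.759 | 0.728 | 0.726 |
| | Frank | −5.0 | −0.46 | 0.567 | 0.567 | 0.553 | 0.554 | 0.547 | 0.547 | 0.547 | 0.544 |
| | | 1.0 | 0.11 | 0.577 | 0.576 | 0.584 | 0.586 | 0.578 | 0.580 | 0.571 | 0.570 |
| | | 5.0 | 0.46 | 0.593 | 0.593 | 0.624 | 0.625 | 0.623 | 0.621 | 0.609 | 0.607 |
| | FGM | −1.0 | −0.22 | 0.568 | 0.567 | 0.560 | 0.561 | 0.553 | 0.553 | 0.550 | 0.551 |
| | | 0.0 | 0.00 | 0.574 | 0.574 | 0.576 | 0.576 | 0.569 | 0.569 | 0.564 | 0.560 |
| | | 1.0 | 0.22 | 0.580 | 0.580 | 0.591 | 0.593 | 0.585 | 0.586 | 0.577 | 0.579 |
| | GB | 0.5 | −0.21 | 0.571 | 0.572 | 0.563 | 0.566 | 0.558 | 0.555 | 0.558 | 0.557 |
| | | 1.0 | −0.36 | 0.568 | 0.567 | 0.557 | 0.557 | 0.550 | 0.554 | 0.549 | 0.550 |
| Burr III | Clayton | 1.0 | 0.33 | 0.661 | 0.662 | 0.753 | 0.753 | 0.760 | 0.761 | 0.747 | 0.748 |
| | | 5.0 | 0.71 | 0.665 | 0.665 | 0.819 | 0.820 | 0.883 | 0.883 | 0.863 | 0.865 |
| | | 10.0 | 0.83 | 0.666 | 0.666 | 0.832 | 0.832 | 0.913 | 0.914 | 0.896 | 0.897 |
| | Gumbel | 0.0 | 0.00 | 0.660 | 0.661 | 0.714 | 0.715 | 0.701 | 0.701 | 0.697 | 0.696 |
| | | 4.0 | 0.80 | 0.667 | 0.665 | 0.832 | 0.832 | 0.885 | 0.883 | 0.871 | 0.871 |
| | Frank | −5.0 | −0.46 | 0.658 | 0.658 | 0.663 | 0.662 | 0.650 | 0.649 | 0.650 | 0.653 |
| | | 1.0 | 0.11 | 0.661 | 0.660 | 0.730 | 0.731 | 0.720 | 0.719 | 0.715 | 0.716 |
| | | 5.0 | 0.46 | 0.664 | 0.663 | 0.788 | 0.789 | 0.798 | 0.797 | 0.789 | 0.788 |
| | FGM | −1.0 | −0.22 | 0.658 | 0.658 | 0.683 | 0.683 | 0.666 | 0.666 | 0.665 | 0.664 |
| | | 0.0 | 0.00 | 0.660 | 0.660 | 0.714 | 0.711 | 0.701 | 0.701 | 0.697 | 0.699 |
| | | 1.0 | 0.22 | 0.661 | 0.662 | 0.744 | 0.747 | 0.737 | 0.736 | 0.730 | 0.729 |
| | GB | 0.5 | −0.21 | 0.659 | 0.659 | 0.692 | 0.693 | 0.676 | 0.677 | 0.675 | 0.676 |
| | | 1.0 | −0.36 | 0.658 | 0.658 | 0.673 | 0.671 | 0.659 | 0.659 | 0.659 | 0.659 |

Note: We set the exponential distributions with $(\lambda_1, \lambda_2) = (1.0, 2.0)$, the Weibull distributions with $(\lambda_1, k_1, \lambda_2, k_2) = (1.0, 0.5, 2.0, 1.0)$, the gamma distributions with $(\lambda_1, k_1, \lambda_2, k_2) = (1.0, 1.5, 2.0, 2.0)$, the log-normal distributions with $(\mu_1, \sigma_1^2, \mu_2, \sigma_2^2) = (0.7, 1.5, 0.3, 2.0)$, and the Burr III distributions with $(c_1, k_1, c_2, k_2) = (1.5, 3.0, 1.0, 1.0)$.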
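A simulation check of the kind summarized in Table 2 can be sketched in a few lines of Python. This is an illustration, not the authors' R code from the Supplementary Materials. It assumes that $p_\tau = P(\min(T_1, \tau) > \min(T_2, \tau)) + \frac{1}{2} P(\min(T_1, \tau) = \min(T_2, \tau))$, which is the quantity that the formulas in Appendix B compute, and it draws from the Clayton copula by the standard conditional-inversion method. The function names, sample size, and seed are arbitrary choices.

```python
import math
import random

def sample_clayton(theta, rng):
    """Draw (u, v) from the Clayton copula (theta > 0) by conditional inversion."""
    v = 1.0 - rng.random()  # uniform on (0, 1], avoids v = 0
    w = 1.0 - rng.random()
    u = (v ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

def simulate_p_tau(theta, lam1, lam2, tau, n, seed=1):
    """Monte Carlo estimate of p_tau under the survival Clayton copula model
    with exponential marginals S_j(t) = exp(-lam_j * t)."""
    rng = random.Random(seed)
    score = 0.0
    for _ in range(n):
        u, v = sample_clayton(theta, rng)
        # T_j = S_j^{-1}(U_j), so that P(T1 > s, T2 > t) = C(S1(s), S2(t)).
        t1 = min(-math.log(u) / lam1, tau)
        t2 = min(-math.log(v) / lam2, tau)
        if t1 > t2:
            score += 1.0
        elif t1 == t2:  # both times truncated at tau
            score += 0.5
    return score / n

estimate = simulate_p_tau(theta=1.0, lam1=1.0, lam2=2.0, tau=2.0, n=200000)
print(f"simulated p_tau = {estimate:.3f} (Table 2 theory: 0.737)")
```

With 200,000 replications, the Monte Carlo standard error is roughly 0.001, so agreement with the theoretical value to about two decimals is what one would expect.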
Table 3. Estimates $\hat{p}_\tau$ ($\tau = 167$) for fitting the KM estimator (independent) and with exponential marginal survival distributions (the independent, Clayton, Gumbel, Frank, FGM, GB copulas) for the tongue cancer dataset.

| Copula | Marginal Distribution | $\theta$ | $\hat{p}$ | SE | p-Value | $\hat{p}_\tau$ ($\tau = 167$) | SE | p-Value |
|---|---|---|---|---|---|---|---|---|
| Independent | KM estimator | - | - | - | - | 0.624 | 0.071 | 0.079 |
| Independent | exponential | - | 0.638 | 0.076 | 0.070 | 0.633 | 0.075 | 0.076 |
| Clayton | exponential | 1.0 | 0.709 | 0.096 | 0.029 | 0.676 | 0.096 | 0.067 |
| | | 5.0 | 0.856 | 0.075 | <0.001 | 0.799 | 0.100 | 0.003 |
| Gumbel | exponential | 4.0 | 0.944 | 0.084 | <0.001 | 0.895 | 0.095 | <0.001 |
| Frank | exponential | −20.0 | 0.596 | 0.055 | 0.080 | 0.596 | 0.055 | 0.080 |
| | | −5.0 | 0.600 | 0.057 | 0.081 | 0.600 | 0.057 | 0.081 |
| | | 5.0 | 0.733 | 0.111 | 0.036 | 0.714 | 0.108 | 0.046 |
| FGM | exponential | −1.0 | 0.609 | 0.063 | 0.082 | 0.609 | 0.063 | 0.083 |
| | | 1.0 | 0.666 | 0.090 | 0.063 | 0.658 | 0.088 | 0.072 |
| GB | exponential | 0.5 | 0.617 | 0.066 | 0.077 | 0.617 | 0.066 | 0.078 |
| | | 1.0 | 0.606 | 0.060 | 0.079 | 0.606 | 0.060 | 0.080 |
Table 4. Estimates $p_\tau$ ($\tau = 108$) for fitting the KM estimator (independent) and $p_\tau$ ($\tau = 108$) with the exponential marginal survival distributions (the independent, Clayton, Gumbel, Frank, FGM, GB copulas) for the prostate cancer dataset.

| Copula | Marginal Distribution | $\theta$ | $\hat{p}$ | SE | p-Value | $\hat{p}_\tau$ ($\tau = 108$) | SE | p-Value |
|---|---|---|---|---|---|---|---|---|
| Independent | KM estimator | - | - | - | - | 0.635 | 0.013 | <0.001 |
| Independent | exponential | - | 0.821 | 0.010 | <0.001 | 0.625 | 0.007 | <0.001 |
| Clayton | exponential | 1.0 | 0.889 | 0.008 | <0.001 | 0.626 | 0.007 | <0.001 |
| | | 5.0 | 0.958 | 0.003 | <0.001 | 0.635 | 0.007 | <0.001 |
| Gumbel | exponential | 4.0 | 0.999 | <0.001 | <0.001 | 0.666 | 0.006 | <0.001 |
| Frank | exponential | −20.0 | 0.741 | 0.009 | <0.001 | 0.624 | 0.007 | <0.001 |
| | | −5.0 | 0.753 | 0.010 | <0.001 | 0.624 | 0.007 | <0.001 |
| | | 5.0 | 0.924 | 0.007 | <0.001 | 0.632 | 0.007 | <0.001 |
| FGM | exponential | −1.0 | 0.777 | 0.011 | <0.001 | 0.623 | 0.007 | <0.001 |
| | | 1.0 | 0.865 | 0.010 | <0.001 | 0.626 | 0.007 | <0.001 |
| GB | exponential | 0.5 | 0.786 | 0.010 | <0.001 | 0.624 | 0.007 | <0.001 |
| | | 1.0 | 0.764 | 0.010 | <0.001 | 0.623 | 0.007 | <0.001 |
Table 5. Recent works on the Mann–Whitney effect with the copula models and marginal models.

| Reference | Copula | Marginal Distribution |
|---|---|---|
| Domma & Giordano [47] | FGM, Generalized FGM, Frank | Burr III, Dagum, Singh–Maddala |
| Gao et al. [48] | Mixed (Clayton, Gumbel, Frank) | Empirical |
| de Andrade et al. [49] | Clayton, Gumbel, Frank, Gauss, Plackett | Weibull, Gamma, Log-normal, Dagum |
| Rathie et al. [50] | Frank | Dagum, Log-Dagum |
| James et al. [51] | FGM | Rayleigh |
| Shang & Yan [52] | Clayton | Weibull, Kumaraswamy |
| Lima et al. [53] | Clayton, Frank, Gumbel–Hougaard | Generalized extreme value, Weibull, gamma |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nakazono, K.; Lin, Y.-C.; Liao, G.-Y.; Uozumi, R.; Emura, T. Computation of the Mann–Whitney Effect under Parametric Survival Copula Models. Mathematics 2024, 12, 1453. https://doi.org/10.3390/math12101453

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
