Article

Nonparametric Predictive Inference for Discrete Lifetime Data

by Frank P. A. Coolen 1,*, Tahani Coolen-Maturi 1 and Ali M. Y. Mahnashi 2
1 Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK
2 Department of Mathematics, College of Science, Jazan University, Jazan 45 142, Saudi Arabia
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(22), 3514; https://doi.org/10.3390/math12223514
Submission received: 30 July 2024 / Revised: 2 October 2024 / Accepted: 28 October 2024 / Published: 11 November 2024
(This article belongs to the Special Issue Reliability Analysis and Stochastic Models in Reliability Engineering)

Abstract

This paper presents nonparametric predictive inference for discrete lifetime data. While lifetimes are mostly treated as continuous random variables in statistics, there are scenarios where time observations are recorded as discrete values, for example, in actuarial science, where lifetimes are often recorded as integers in years. The presented method provides lower and upper probabilities for a variety of events of interest involving discrete lifetimes, with examples provided for illustration. Furthermore, the discrete-time situation is considered for inference on the reliability of systems, with discrete-time data for components of different types and using the survival signature to combine inference on components' reliability to quantify the overall system reliability.

1. Introduction

In the real world, the time until an event occurs is typically considered to be a continuous variable. However, in many applications, recorded observations have a discrete nature because they are represented by discrete values. When there are many different possible values, any distinction between the continuous or discrete nature of a variable becomes negligible. Nonetheless, in certain application areas, time is often modelled as a discrete variable with relatively few possible values. This is particularly true for actuarial models, for which typically a cohort of people, either real or hypothetical, is followed over time, and events such as deaths are recorded per year. However, if a person leaves the group for reasons other than death, or other than the specific cause of death under investigation, then right-censoring occurs. In such cases, time is recorded as the person’s age at the time of the event, making time a discrete variable.
For discrete-time data, approaches like those considered in this paper can be visualised as a table with $k$ discrete-time points, say $t_1 < t_2 < \cdots < t_k$. The data include the number of events and the number of right-censored individuals at each discrete-time point, excluding time $t_0$, at which no events or right-censoring are assumed to have occurred. At any given time, we consider how many people are alive, that is, how many people have survived that time, so these are effectively Bernoulli data; we then look ahead and assess how many people will be alive in the future. The actuarial estimator is a nonparametric method for estimating the survival function which explicitly restricts attention to discrete time, with the possible inclusion of right-censored data [1,2,3].
In this paper, we take a similar approach, but we look at it from a predictive perspective using nonparametric predictive inference (NPI), which is inherently predictive. NPI is a statistical method that relies on only a few assumptions, based on Hill's assumption $A_{(n)}$ [4], and uses imprecise probabilities to quantify uncertainty [5,6]. NPI has been adapted to accommodate various types of data and a wide range of applications. For example, NPI has been used with Bernoulli data [7,8], data containing right-censored observations [9,10], bivariate data [11], multinomial data [12,13], and circular data [6]. The NPI approach has also been developed for right-censored observations in survival data [14]. NPI for Bernoulli data [7] will be utilised to develop the NPI alternative to the actuarial estimator, based on the assumption of non-informative right censoring [9,15].
The paper is organized as follows. Section 2 provides a brief introduction to the actuarial estimator of the survival function. Section 3 provides an overview of NPI and its application to Bernoulli and right-censored data. In Section 4, we present NPI as an alternative to the actuarial estimator for right-censored data, deriving lower and upper probabilities for the event that all future observations are greater than a specific discrete time $t_j$. Section 5 focuses on the proposed NPI-based discrete-time reliability function by providing lower and upper probabilities for the event that at least $x$ out of $m$ future observations will survive at discrete time $t_j$. In Section 6, we apply the proposed method to system reliability using survival signatures [16,17] combined with NPI for Bernoulli data [7]. Finally, we conclude with some remarks in Section 7.

2. Actuarial Estimator of the Survival Function

In a discrete-time setting, a common nonparametric method for estimating the survival function is the actuarial estimator. To introduce the actuarial estimator, we first consider $n$ individuals alive at time $t_0$. Let $X_1, X_2, \ldots, X_n$ be positive, exchangeable, discrete random variables representing the individuals' lifetimes, which are assumed to be independent and identically distributed and take values at discrete-time points $t_j$, where $j = 1, \ldots, k$, with $t_1 < t_2 < \cdots < t_k$. We refer to the event of interest as 'death', but it can be any time-related event of interest; for example, in reliability, it will typically be a failure event. The discrete-time hazard function at a specific time $t_j$ is defined as the conditional probability that a randomly selected individual, with lifetime $X_i$, $i = 1, \ldots, n$, will experience the event of interest at time $t_j$ given that this individual did not experience the event prior to $t_j$, such that
$$h_{t_j} = P(X_i = t_j \mid X_i \geq t_j) \tag{1}$$
Let $d_{t_j}$ be the number of individuals who died at time $t_j$ and let $c_{t_j}$ be the number of individuals whose lifetimes are right-censored at time $t_j$. Let $\grave{n}_{t_j}$ be the number of individuals known to be at risk (still alive and uncensored) at time $t_j$, that is $\grave{n}_{t_j} = \grave{n}_{t_{j-1}} - d_{t_{j-1}} - c_{t_{j-1}}$. It is common to assume that all individuals are at risk at $t_0$, so $\grave{n}_{t_0} = n$. Then, the discrete-time hazard function, $h_{t_j}$, at a discrete time $t_j$ can be estimated by the actuarial estimator [1,2,3], as
$$\hat{h}_{t_j} = \frac{d_{t_j}}{\grave{n}_{t_j}} \tag{2}$$
The survival function at time $t_j$ is defined as $S_{t_j} = P(X \geq t_j)$; note that $S_{t_0} = P(X \geq 0) = 1$. The survival function $S_{t_j}$ can be estimated in terms of the actuarial estimator $\hat{h}_{t_l}$ for $l = 1, \ldots, j-1$, by
$$\hat{S}_{t_j} = \prod_{l=1}^{j-1} \frac{\grave{n}_{t_l} - d_{t_l}}{\grave{n}_{t_l}} = \prod_{l=1}^{j-1} \left(1 - \hat{h}_{t_l}\right) \tag{3}$$
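The recursion for the risk set and Equations (2) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `actuarial_estimator` and the small data set are hypothetical (the data are not taken from Table 1), and exact fractions are used so that the products can be checked by hand.

```python
from fractions import Fraction


def actuarial_estimator(n, deaths, censored):
    """Actuarial hazard and survival estimates, following Equations (2) and (3).

    deaths[j-1] is d_{t_j} and censored[j-1] is c_{t_j}, for j = 1, ..., k.
    Returns (hazards, survival), where survival[j-1] estimates
    S_{t_j} = P(X >= t_j), i.e. the product over the earlier times t_1..t_{j-1}.
    """
    at_risk = n                 # everyone is at risk at t_0, with no events there
    surviving = Fraction(1)     # running product of (1 - hazard)
    hazards, survival = [], []
    for d, c in zip(deaths, censored):
        h = Fraction(d, at_risk) if at_risk > 0 else Fraction(0)
        hazards.append(h)
        survival.append(surviving)   # only hazards strictly before t_j enter
        surviving *= 1 - h
        at_risk -= d + c             # risk set for the next time point
    return hazards, survival


# Hypothetical data: n = 9 individuals observed at four discrete times.
hazards, survival = actuarial_estimator(9, deaths=[1, 2, 1, 2], censored=[0, 1, 1, 1])
```

With these hypothetical counts the risk sets are 9, 8, 5 and 3, so the estimated hazards are 1/9, 1/4, 1/5 and 2/3.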
In Section 4, we will explore an alternative to the actuarial estimator under the NPI methodology, using NPI for Bernoulli data [7]. First, a brief introductory overview of NPI, including NPI for Bernoulli data, is provided in Section 3.

3. Nonparametric Predictive Inference (NPI)

Nonparametric predictive inference (NPI) is a frequentist statistical method which requires only a few assumptions, enabled by the use of imprecise probabilities to quantify uncertainty [5,6]. NPI is based on Hill's assumption $A_{(n)}$ [4]. Assume that $X_1, X_2, \ldots, X_n, X_{n+1}$ are real-valued, absolutely continuous, and exchangeable random quantities. Let the ordered observed values of $X_1, \ldots, X_n$ be denoted as $x_1 < x_2 < \cdots < x_n$. To simplify notation, let $x_0 = -\infty$ and $x_{n+1} = \infty$. These $n$ observations divide the real line into $n+1$ intervals $I_j = (x_j, x_{j+1})$, where $j = 0, 1, \ldots, n$. Based on $n$ observations, the assumption $A_{(n)}$ [18] is that the next future observation $X_{n+1}$ is equally likely to fall in each open interval $(x_j, x_{j+1})$, for all $j = 0, 1, \ldots, n$, so
$$P\left(X_{n+1} \in (x_j, x_{j+1})\right) = \frac{1}{n+1} \quad \text{for all } j = 0, 1, \ldots, n \tag{4}$$
The assumption $A_{(n)}$ alone is insufficient for constructing precise probabilities for many events of interest, but it is still useful to derive bounds for probabilities. Repeated application of the assumptions $A_{(n)}, A_{(n+1)}, \ldots, A_{(n+m-1)}$ enables predictive inference for $m \geq 1$ future observations. These assumptions imply that all orderings of the $m$ future observations among the $n$ data observations are equally likely. Based on these assumptions, Coolen [7] introduced NPI for Bernoulli observations, using an assumed latent variable representation for successes and failures as values on the real line separated by a threshold.
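As a small worked illustration of how $A_{(n)}$ leads to bounds rather than a precise probability, consider the hypothetical observations $2, 5, 7, 11$ (chosen here for illustration only, not taken from the paper), so $n = 4$ and each of the five intervals receives probability $1/5$. For the event $X_5 > 6$, the lower probability counts only intervals that lie entirely within $(6, \infty)$, while the upper probability counts all intervals that intersect it:

```latex
\begin{align*}
\underline{P}(X_5 > 6) &= P\big(X_5 \in (7,11)\big) + P\big(X_5 \in (11,\infty)\big) = \tfrac{2}{5}, \\
\overline{P}(X_5 > 6) &= P\big(X_5 \in (5,7)\big) + P\big(X_5 \in (7,11)\big) + P\big(X_5 \in (11,\infty)\big) = \tfrac{3}{5}.
\end{align*}
```

The gap of $1/5$ is the probability mass of the interval $(5, 7)$, whose location relative to the threshold 6 is not determined by $A_{(n)}$.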
Assume that there are $n + m$ exchangeable Bernoulli trials with failure and success as possible outcomes for each trial, and data containing $s$ successes in $n$ trials. Let $X_1^n$ denote the random number of successes in trials $1$ to $n$, and $X_{n+1}^{n+m}$ denote the random number of successes in trials $n+1$ to $n+m$. Coolen [7] presented general formulae for NPI lower and upper probabilities for any event of interest involving $X_{n+1}^{n+m}$, given $X_1^n = s$. The attention here is limited to the results needed in this paper.
Coolen [7] derived the NPI lower and upper probabilities for the event that all $m$ future trials are successes, given data consisting of $s$ successes observed in $n$ trials, for $s \in \{0, 1, \ldots, n\}$. The NPI lower probability for this event is
$$\underline{P}\left(X_{n+1}^{n+m} = m \mid X_1^n = s\right) = \prod_{i=1}^{m} \frac{s + i - 1}{n + i} \tag{5}$$
and the corresponding NPI upper probability is
$$\overline{P}\left(X_{n+1}^{n+m} = m \mid X_1^n = s\right) = \prod_{i=1}^{m} \frac{s + i}{n + i} \tag{6}$$
Based on the general results by Coolen [7], Aboalkhair [19] derived formulae for the NPI lower and upper probabilities for the event that there are at least $r$ successes in $m$ future trials, given $s$ successes observed in $n$ trials. The NPI lower probability for this event is
$$\underline{P}\left(X_{n+1}^{n+m} \geq r \mid X_1^n = s\right) = 1 - \binom{n+m}{n}^{-1} \times \sum_{\ell=0}^{r-1} \binom{s + \ell - 1}{s - 1} \binom{n - s + m - \ell}{n - s} \tag{7}$$
and the corresponding NPI upper probability is
$$\overline{P}\left(X_{n+1}^{n+m} \geq r \mid X_1^n = s\right) = \binom{n+m}{n}^{-1} \times \left[ \binom{s + r}{s} \binom{n - s + m - r}{n - s} + \sum_{\ell=r+1}^{m} \binom{s + \ell - 1}{s - 1} \binom{n - s + m - \ell}{n - s} \right] \tag{8}$$
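Equations (5) to (8) are straightforward to evaluate numerically. The following Python sketch is one possible way to do so; the function names are hypothetical, the helper `binom` assumes the convention that a binomial coefficient is zero outside its usual range, and the case $s = 0$ in Equation (7) is handled separately (the lower probability of at least one future success is then taken to be 0). For $r = m$, the results of Equations (7) and (8) should coincide with Equations (5) and (6), which gives a convenient check.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """Binomial coefficient, taken to be 0 outside 0 <= b <= a (assumed convention)."""
    return comb(a, b) if 0 <= b <= a else 0


def npi_all_successes(n, s, m):
    """Lower and upper probabilities that all m future trials succeed, Eqs (5)-(6)."""
    lower = upper = Fraction(1)
    for i in range(1, m + 1):
        lower *= Fraction(s + i - 1, n + i)
        upper *= Fraction(s + i, n + i)
    return lower, upper


def npi_at_least(n, s, m, r):
    """Lower and upper probabilities of at least r successes in m future trials, Eqs (7)-(8)."""
    if r == 0:
        return Fraction(1), Fraction(1)   # "at least zero successes" is certain
    total = comb(n + m, n)

    def term(l):
        return binom(s + l - 1, s - 1) * binom(n - s + m - l, n - s)

    if s == 0:
        lower = Fraction(0)               # no observed successes: lower bound is 0
    else:
        lower = 1 - Fraction(sum(term(l) for l in range(r)), total)
    upper_sum = binom(s + r, s) * binom(n - s + m - r, n - s)
    upper_sum += sum(term(l) for l in range(r + 1, m + 1))
    return lower, Fraction(upper_sum, total)


# Hypothetical data: s = 7 successes in n = 10 trials, m = 3 future trials.
all_three = npi_all_successes(10, 7, 3)
at_least_two = npi_at_least(10, 7, 3, 2)
```

For these hypothetical values, Equation (5) gives $(7/11)(8/12)(9/13) = 42/143$ and Equation (6) gives $(8/11)(9/12)(10/13) = 60/143$.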
The nature of $A_{(n)}$ results in NPI being a frequentist statistical methodology [4,5,20], which can be interpreted in a way similar to that of posterior predictive methods within Bayesian statistics but without any prior information being included [7,21]. Hill [20] provides detailed discussion of $A_{(n)}$, including comparison with nonparametric Bayesian methods, and Hill [21] presented a formal justification of $A_{(n)}$ within the Bayesian context. This justification, however, is a rather complicated splitting process under finite exchangeability. It is more natural to consider NPI, based on $A_{(n)}$, as a frequentist statistics methodology based on assumed exchangeability of all observations. The exchangeability assumption implies that all orderings of observations are equally likely, and $A_{(n)}$-based inference keeps this property for the orderings of future observations among data observations. This is a relatively weak assumption, but it excludes scenarios with known trends in data, e.g., time series, or other clear patterns in the data which would undermine the assumption that all orderings of the observations are equally likely.

4. NPI-Based Discrete-Time Survival Function

In this section, we introduce an NPI-based alternative to the actuarial estimator for the survival function. We introduce it in two steps: first, we derive the NPI lower and upper probabilities for the conditional survival function, and then we derive the corresponding survival function. In doing so, we need to consider the interdependence between the m future observations; however, we first need to introduce some notation.
Let $X_1, X_2, \ldots, X_n$ be positive, exchangeable, and discrete random variables that take values at discrete-time points $t_j$, where $j = 1, \ldots, k$, with $t_1 < t_2 < \cdots < t_k$. We define $\grave{n}_{t_0} = n$ for the start of the study, when all individuals are alive, and $t_{k+1} = \infty$. Let $n_{t_j}$ be the number of individuals known to be alive at time $t_j$. Also, let $d_{t_j}$ represent the number of observed events at time $t_j$, and $c_{t_j}$ represent the number of censored observations at time $t_j$. We assume that the censored observations occur at discrete times $t_j$, where $j = 1, 2, \ldots, k$. Then, the number of individuals at risk at time $t_j$, denoted by $\hat{n}_{t_j}$, is computed by $\hat{n}_{t_j} = n_{t_{j-1}} - c_{t_j}$. Therefore, the number of individuals at risk at time $t_j$, $\hat{n}_{t_j}$, will decrease at subsequent discrete times. Furthermore, let $X_l^{t_j} > t_j$, for $l = 1, \ldots, \hat{n}_{t_j}$, be the event times for the individuals in the risk set at time $t_j$.
Now, let us consider $X_{n+i}$ for the time of event of the $i$th future individual, for $i = 1, 2, \ldots, m$. We consider the event of interest that all $m$ future observations $X_{n+i}$ survive a specific discrete time $t_j$ given that they survived the earlier discrete time $t_{j-1}$. This event can be denoted as $\bigcap_{i=1}^{m} \{X_{n+i} > t_j \mid X_{n+i} > t_{j-1}\}$.
We consider the survival of all $m$ future observations at time $t_j$ as exchangeable with the survival of the $\hat{n}_{t_j}$ individuals in the risk set at that discrete time. So, we assume that the random quantities $X_{n+1}, X_{n+2}, \ldots, X_{n+m}$, with respect to the event $X_{n+i} > t_j$, $i = 1, \ldots, m$, are exchangeable with $X_1^{t_j}, X_2^{t_j}, \ldots, X_{\hat{n}_{t_j}}^{t_j}$ with respect to the event $X_l^{t_j} > t_j$ for $l = 1, \ldots, \hat{n}_{t_j}$, where $X_l^{t_j}$ are the event times for the individuals in the risk set at time $t_j$.
The NPI lower and upper probabilities for the event $\bigcap_{i=1}^{m} \{X_{n+i} > t_j \mid X_{n+i} > t_{j-1}\}$ can be derived by utilising NPI for Bernoulli data [7] via Equations (5) and (6), respectively. This can be performed by considering the number of individuals known to be alive at time $t_j$, $n_{t_j}$, out of the number of individuals at risk at time $t_j$, $\hat{n}_{t_j}$. Thus, the NPI lower probability for this event is
$$\underline{P}\left(\bigcap_{i=1}^{m} \{X_{n+i} > t_j \mid X_{n+i} > t_{j-1}\}\right) = \prod_{i=1}^{m} \frac{n_{t_j} + i - 1}{\hat{n}_{t_j} + i} \tag{9}$$
and the corresponding NPI upper probability for this event is
$$\overline{P}\left(\bigcap_{i=1}^{m} \{X_{n+i} > t_j \mid X_{n+i} > t_{j-1}\}\right) = \prod_{i=1}^{m} \frac{n_{t_j} + i}{\hat{n}_{t_j} + i} \tag{10}$$
We now consider the event that the $m$ future observations will all exceed $t_j$, that is $\bigcap_{i=1}^{m} \{X_{n+i} > t_j\}$. The NPI lower and upper probabilities for this event can be expressed in terms of the NPI conditional lower and upper probabilities in Equations (9) and (10), respectively, at all times $t_1, t_2, \ldots, t_j$, as follows
$$\underline{P}\left(\bigcap_{i=1}^{m} \{X_{n+i} > t_j\}\right) = \prod_{\ell=1}^{j} \prod_{i=1}^{m} \frac{n_{t_\ell} + i - 1}{\hat{n}_{t_\ell} + i} \tag{11}$$
$$\overline{P}\left(\bigcap_{i=1}^{m} \{X_{n+i} > t_j\}\right) = \prod_{\ell=1}^{j} \prod_{i=1}^{m} \frac{n_{t_\ell} + i}{\hat{n}_{t_\ell} + i} \tag{12}$$
For the special case when we have one future observation $X_{n+1}$ (i.e., $m = 1$), the NPI lower and upper probabilities for the event $X_{n+1} > t_j$ can be directly calculated from Equations (11) and (12), respectively, as
$$\underline{P}(X_{n+1} > t_j) = \prod_{l=1}^{j} \frac{n_{t_l}}{\hat{n}_{t_l} + 1} \tag{13}$$
$$\overline{P}(X_{n+1} > t_j) = \prod_{l=1}^{j} \frac{n_{t_l} + 1}{\hat{n}_{t_l} + 1} \tag{14}$$
The NPI lower and upper probabilities for the event $\bigcap_{i=1}^{m} \{X_{n+i} > t_j\}$, as presented in Equations (11) and (12), take into account the dependence among all these future observations when there is limited information in the form of $n$ observations in the data set. It is of interest to see the effect of taking this dependence carefully into account. For this reason, we will compare our method with the results one would obtain if, mistakenly, when interested in $m$ future observations, one would use the NPI lower and upper probabilities for the event $X_{n+1} > t_j$, presented in Equations (13) and (14), raised to the power of $m$, i.e., $[\underline{P}, \overline{P}]^m (X_{n+1} > t_j)$. This will be demonstrated in Example 2 by studying the impact of ignoring the interdependence between the $m$ future observations, but first, Example 1 is provided to demonstrate our method.
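Equations (11) and (12), and the comparison with the mistaken power-of-$m$ calculation, can be sketched as follows. The function names and the data are hypothetical (the data are not those of Table 1 or Table 2), and the sketch assumes that the number known to be alive is $n_{t_j} = \hat{n}_{t_j} - d_{t_j}$, in line with the Bernoulli description used in Section 5.

```python
from fractions import Fraction


def risk_sets(n, deaths, censored):
    """Return (alive, at_risk): the lists of n_{t_j} and n-hat_{t_j}, j = 1..k.

    Uses n-hat_{t_j} = n_{t_{j-1}} - c_{t_j}, and assumes
    n_{t_j} = n-hat_{t_j} - d_{t_j}, starting from n individuals at t_0.
    """
    alive_prev = n
    alive, at_risk = [], []
    for d, c in zip(deaths, censored):
        n_hat = alive_prev - c
        n_t = n_hat - d
        at_risk.append(n_hat)
        alive.append(n_t)
        alive_prev = n_t
    return alive, at_risk


def npi_survival(alive, at_risk, j, m=1):
    """Lower and upper probabilities that all m future observations exceed t_j, Eqs (11)-(12)."""
    lower = upper = Fraction(1)
    for n_t, n_hat in zip(alive[:j], at_risk[:j]):
        for i in range(1, m + 1):
            lower *= Fraction(n_t + i - 1, n_hat + i)
            upper *= Fraction(n_t + i, n_hat + i)
    return lower, upper


# Hypothetical data: n = 9 individuals observed at four discrete times.
alive, at_risk = risk_sets(9, deaths=[1, 2, 1, 2], censored=[0, 1, 1, 1])

# Joint inference for m = 3 future observations at t_1, versus the mistaken
# approach of cubing the bounds for a single future observation.
joint = npi_survival(alive, at_risk, j=1, m=3)
single = npi_survival(alive, at_risk, j=1, m=1)
naive = (single[0] ** 3, single[1] ** 3)
```

In this hypothetical case the joint bounds are $[6/11, 3/4]$, whereas cubing the single-observation bounds $[4/5, 9/10]$ gives the smaller values $[64/125, 729/1000]$, which illustrates the effect of the positive dependence among the future observations.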
Example 1.
We will start with a simple example involving $n = 9$ observations, which are available at discrete times $t_j$, for $j = 1, 2, 3, 4$. At each time point, we have the number of observed events, $d_{t_j}$; the number of censored individuals, $c_{t_j}$; the number of individuals known to be alive at time $t_j$, $n_{t_j}$; and the number of individuals at risk at time $t_j$, $\hat{n}_{t_j}$. It is important to note that $\hat{n}_{t_j}$ is computed differently from $\grave{n}_{t_j}$; for example, $\hat{n}_{t_2} = 7$ but $\grave{n}_{t_2} = 8$ (see Section 2). The data are shown in the first six columns of Table 1.
The hazard function, $h_{t_j}$, at a discrete time $t_j$ can be estimated by using the actuarial estimator with Equation (2). Then, the estimated probability of surviving $t_j$, for $j = 1, 2, 3, 4$, is derived using Equation (3). These results are presented in the seventh and eighth columns of Table 1.
Next, we apply the NPI alternative to the actuarial estimator, leading to the NPI lower and upper probabilities for the event $\bigcap_{i=1}^{m} \{X_{9+i} > t_j\}$, as given by Equations (11) and (12), respectively. These are calculated for the discrete-time points $t_1$, $t_2$, $t_3$, and $t_4$, for different numbers of future observations, i.e., for $m \in \{1, 3, 10, 15\}$. It is worth noting that at the start of the study at time $t_0$, no events or censorings have been recorded, so $\underline{P}\left(\bigcap_{i=1}^{m} \{X_{9+i} > t_0\}\right) = \overline{P}\left(\bigcap_{i=1}^{m} \{X_{9+i} > t_0\}\right) = 1$.
Based on the results in Table 1, we observe that the difference between the NPI upper probability and the NPI lower probability is quite small at time $t_1$ for all considered numbers of future observations and becomes larger later on. This increase in difference is influenced by two effects: there are fewer individuals in the risk set $\hat{n}_{t_j}$ at later times $t_2$, $t_3$, and $t_4$, and products of lower and upper probabilities are taken, such that each term (i.e., time point) adds to the imprecision.
When we compare the results from our proposed method for $m = 1$ future observation with those resulting from estimating the survival function based on the actuarial estimator, we find that the $\hat{S}_{t_j}$ values, based on using the actuarial estimator, fall between our NPI lower and upper probabilities for $X_{10} > t_j$, but they are closer to the upper probability values.
Example 2.
The dataset used in this example was also utilised by Berkson and Gage [22] as well as by Lawless [23] and Yan [15]. It comprises 374 observations, of which 95 are right-censored, and the remaining are event times measured at 10 discrete times in years. The dataset is summarised in the first three columns of Table 2.
By using Equations (11) and (12), we obtain the NPI lower and upper probabilities for the event $\bigcap_{i=1}^{m} \{X_{n+i} > t_j\}$ for $m \in \{1, 2, 3, 10\}$ future observations at specific time points. The results are summarised in Table 2.
To understand the impact of considering the dependence among future observations, we compare our results $[\underline{P}, \overline{P}]\left(\bigcap_{i=1}^{5} \{X_{374+i} > t_j\}\right)$ for five future observations with those obtained by erroneously considering only the first future observation ($X_{375} > t_j$) raised to the power of 5, i.e., $[\underline{P}, \overline{P}]^5 (X_{375} > t_j)$. Due to the positive dependence among the future observations, $X_{375}$, $X_{376}$, $X_{377}$, $X_{378}$, and $X_{379}$, our correct NPI lower and upper probabilities for the event $\bigcap_{i=1}^{5} \{X_{374+i} > t_j\}$ are greater than those obtained using the mistaken approach (taking the lower and upper probabilities for $X_{375} > t_j$ raised to the power of 5). Although the imprecisions (differences between the upper and lower probabilities) are small, they would become more noticeable for more than five future observations due to the positive dependence among all future observations.

5. NPI-Based Discrete-Time Reliability Function

In this section, we introduce NPI lower and upper probabilities for the event that at least $x$ out of $m$ future observations will survive at discrete time $t_j$. Let $N_{t_j}$ denote the number of future observations out of $m$ that survive at discrete time $t_j$. Given $\hat{n}_{t_j}$ Bernoulli trials, with $\hat{n}_{t_j} - d_{t_j}$ observations surviving at time $t_j$, we aim to derive the NPI lower and upper probabilities for the event $N_{t_j} \geq x$, where $x$ can take values in the set $\{0, 1, \ldots, m\}$.
The NPI upper probability for the event $N_{t_j} \geq x$ is derived by utilising Equation (8), as
$$\overline{P}(N_{t_j} \geq x) = \sum_{y=x}^{m} \overline{P}(N_{t_j} \geq x \mid N_{t_{j-1}} = y) \left[ \overline{P}(N_{t_{j-1}} \geq y) - \overline{P}(N_{t_{j-1}} \geq y+1) \right] \tag{15}$$
The terms on the right-hand side of Equation (15) are all derived by applying Equation (8), where for the terms involving $N_{t_{j-1}}$ the data consist of $\hat{n}_{t_{j-1}}$ Bernoulli trials, with $\hat{n}_{t_{j-1}} - d_{t_{j-1}}$ observations surviving at time $t_{j-1}$, leading to
$$\overline{P}(N_{t_{j-1}} \geq y) - \overline{P}(N_{t_{j-1}} \geq y+1) = \binom{\hat{n}_{t_{j-1}} + m}{\hat{n}_{t_{j-1}}}^{-1} \binom{(\hat{n}_{t_{j-1}} - d_{t_{j-1}}) + y}{\hat{n}_{t_{j-1}} - d_{t_{j-1}}} \binom{\hat{n}_{t_{j-1}} - (\hat{n}_{t_{j-1}} - d_{t_{j-1}}) + m - y - 1}{\hat{n}_{t_{j-1}} - (\hat{n}_{t_{j-1}} - d_{t_{j-1}}) - 1} \tag{16}$$
where $y \in \{0, 1, \ldots, m\}$ is a number of future observations. It is important to point out that for the case $y = m$, the NPI upper probability for the event $N_{t_{j-1}} \geq y + 1$ is equal to 0.
The NPI lower probability for the event $N_{t_j} \geq x$, for $x \in \{0, 1, \ldots, m\}$, is derived by utilising Equation (7), as
$$\underline{P}(N_{t_j} \geq x) = \sum_{y=x}^{m} \underline{P}(N_{t_j} \geq x \mid N_{t_{j-1}} = y) \left[ \underline{P}(N_{t_{j-1}} \geq y) - \underline{P}(N_{t_{j-1}} \geq y+1) \right] \tag{17}$$
The terms on the right-hand side of Equation (17) are all derived by applying Equation (7), where for the terms involving $N_{t_{j-1}}$ the data consist of $\hat{n}_{t_{j-1}}$ Bernoulli trials, with $\hat{n}_{t_{j-1}} - d_{t_{j-1}}$ observations surviving at time $t_{j-1}$, leading to
$$\underline{P}(N_{t_{j-1}} \geq y) - \underline{P}(N_{t_{j-1}} \geq y+1) = \binom{\hat{n}_{t_{j-1}} + m}{\hat{n}_{t_{j-1}}}^{-1} \binom{(\hat{n}_{t_{j-1}} - d_{t_{j-1}}) + y - 1}{(\hat{n}_{t_{j-1}} - d_{t_{j-1}}) - 1} \binom{\hat{n}_{t_{j-1}} - (\hat{n}_{t_{j-1}} - d_{t_{j-1}}) + m - y}{\hat{n}_{t_{j-1}} - (\hat{n}_{t_{j-1}} - d_{t_{j-1}})} \tag{18}$$
where $y \in \{0, 1, \ldots, m\}$ is a number of future observations. It should be remarked that the NPI lower probability for the event $N_{t_{j-1}} \geq y + 1$ for the case $y = m$ is equal to 0.
It is worth noting that the lower and upper probabilities for the event $N_{t_j} \geq x$ when $x = 0$ are both equal to 1, regardless of the values of $y$. Therefore, only the results for $x \in \{1, \ldots, m\}$ will be reported hereafter.
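The closed-form differences in Equations (16) and (18) act as weights over the possible numbers $y$ of surviving future observations. A minimal Python sketch of these weights is given below; the function names and the numbers are hypothetical, and the helper `binom` assumes the convention $\binom{-1}{-1} = 1$ (and 0 for other out-of-range arguments) so that the boundary cases with no deaths or no survivors in the data are covered. Since each set of weights is a sequence of successive differences running from 1 down to 0, it should sum to one.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """Binomial coefficient with the assumed conventions C(-1, -1) = 1
    and C(a, b) = 0 whenever b < 0 or b > a otherwise."""
    if a == b:
        return 1
    return comb(a, b) if 0 <= b <= a else 0


def upper_weight(n_hat, d, m, y):
    """Difference of upper probabilities for N >= y and N >= y + 1, as in Eq (16)."""
    s = n_hat - d                      # number surviving in the data
    return Fraction(binom(s + y, s) * binom(d + m - y - 1, d - 1),
                    comb(n_hat + m, n_hat))


def lower_weight(n_hat, d, m, y):
    """Difference of lower probabilities for N >= y and N >= y + 1, as in Eq (18)."""
    s = n_hat - d
    return Fraction(binom(s + y - 1, s - 1) * binom(d + m - y, d),
                    comb(n_hat + m, n_hat))


# Hypothetical data: 9 individuals at risk, 1 death, m = 3 future observations.
upper_weights = [upper_weight(9, 1, 3, y) for y in range(4)]
lower_weights = [lower_weight(9, 1, 3, y) for y in range(4)]
```

For $y = m = 3$ the weights reduce to the all-survive probabilities of Equations (5) and (6) with $s = 8$ and $n = 9$, namely $6/11$ and $3/4$.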
Example 3.
In this example, we illustrate the method presented in Section 5 using a simple example involving nine observations, available at discrete times $t_j$ for $j = 1, 2, 3, 4$ (data are summarized in Table 3).
Table 3 shows the NPI lower and upper probabilities for the event $N_{t_j} \geq x \mid N_{t_{j-1}} = y$, where $x \in \{0, 1, 2, 3\}$ and $y \in \{0, 1, 2, 3\}$, with $x \leq y$. For $x = 0$, the NPI lower and upper probabilities are equal to 1 for all $y \in \{0, 1, 2, 3\}$ and at all $t_j$, because the event that at least zero out of $y$ future observations survive at discrete time $t_j$ is certain. Note that some cells in Table 3 are empty, because the event that at least $x$ out of $y$ future observations will survive at discrete time $t_j$ is only considered for $x \leq y$. From Table 3, we can also observe that at a specific discrete time $t_j$, the NPI lower and upper probabilities decrease in $x$ when everything else is constant and increase in $y$ when everything else is constant.
Meanwhile, Table 4 presents the NPI lower and upper probabilities for the event $N_{t_j} \geq x$ for $x \in \{1, 2, 3\}$ future observations; again, the lower and upper probabilities are equal to 1 when $x = 0$. From Table 4, we can see that the difference between the NPI lower and upper probabilities decreases in $x$ while everything else is held constant at each discrete time $t_j$. Without any further added assumptions, the values of the NPI lower probabilities at $t_4$ are 0 for $x \in \{1, 2, 3\}$, whereas the NPI upper probabilities are positive.

6. Application to System Reliability Using Survival Signatures

In Section 5, we derived the NPI lower and upper probabilities for the event that at least $x$ out of $m$ future observations will survive at discrete time $t_j$. In this section, we will utilise these probabilities to assess system reliability, considering single or multiple types of components, using survival signatures. Essentially, the results from Section 5 will be employed to derive lower and upper probabilities for the discrete-time system reliability event $T_S > t_j$, where $T_S$ denotes the random failure time of the system. We will combine the concept of survival signature [16,24] with the proposed method in Section 5. First, we will give a brief overview of survival signatures, where the values of the survival signatures are assumed to be given. Then, we will demonstrate the application of the proposed methods to the reliability of some discrete-time systems with both single and multiple types of components.

6.1. The Survival Signature

The signature has been introduced to evaluate the reliability of systems consisting of only one type of component and is used to model the structure of a system, separating this from the random failure times of the components [17]. The NPI method is used in order to learn about the components within the system, based on data consisting of failure times for components that are exchangeable with those within the system. We therefore assume that such data are available, such as those obtained from testing or previous use of the components [16,17]. In the literature, the assumption of exchangeability is often replaced by the stronger assumption of independent and identically distributed (iid) component failure times [25]. Taking into account a system consisting of $m$ components with exchangeable failure times, Samaniego [26,27] introduced the system signature as a tool for reliability assessment for systems consisting of components of a single type. However, the use of signatures becomes very complicated in the case of quantifying the reliability of systems with multiple types of components. Coolen and Coolen-Maturi [24] introduced an alternative concept called the 'survival signature'. The idea of the survival signature is to generalise the signature to systems with multiple types of components. When quantifying the reliability of systems with only one type of component, the survival signature is closely related to the signature [16,24]. The NPI methodology has been introduced for system reliability using the survival signature via lower and upper survival functions for the failure time $T_S$ of a system consisting of multiple types of components [16], combined with NPI for Bernoulli data [7].
For a system with $m$ exchangeable components, we consider the state vector $\underline{x} = (x_1, x_2, \ldots, x_m) \in \{0, 1\}^m$, where for each $i$, $x_i = 1$ if the $i$th component functions and $x_i = 0$ if it does not. The structure function $\phi : \{0, 1\}^m \to \{0, 1\}$ is defined for all possible state vectors $\underline{x}$, so that $\phi(\underline{x}) = 1$ if the system functions and $\phi(\underline{x}) = 0$ if the system does not function. Throughout this section, the system is assumed to be coherent, which means that the structure function $\phi(\underline{x})$ is not decreasing in any of the components of $\underline{x}$, so the functioning of the system cannot be improved by worse performance of one or more of its components. Furthermore, we assume that the system functions if all its components function, so $\phi(\underline{1}) = 1$, and the system fails if all its components fail, so $\phi(\underline{0}) = 0$.
For a system consisting only of $m$ exchangeable components, the survival signature, denoted by $\Phi(l)$, for $l = 1, \ldots, m$, is defined as the probability that the system functions given that precisely $l$ of its components function [24]. For coherent systems, $\Phi(l)$ is an increasing function of $l$, and we assume that $\Phi(0) = 0$ and $\Phi(m) = 1$. There are $\binom{m}{l}$ state vectors $\underline{x}$ with precisely $l$ components $x_i = 1$, so with $\sum_{i=1}^{m} x_i = l$; the set of these state vectors is denoted by $S_l$. Under the iid assumption for the failure times of the $m$ components, all these state vectors are equally likely to occur [24]. Thus, the survival signature $\Phi(l)$ can be computed as follows [24]
$$\Phi(l) = \binom{m}{l}^{-1} \sum_{\underline{x} \in S_l} \phi(\underline{x}) \tag{19}$$
Let $C(t) \in \{0, 1, \ldots, m\}$ represent the number of components that function at time $t > 0$ in a system with a single type of component. So, the probability that the system functions at time $t > 0$ is
$$P(T_S > t) = \sum_{l=0}^{m} \Phi(l) P(C(t) = l) \tag{20}$$
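Equations (19) and (20) can be illustrated by brute-force enumeration of the state vectors. The sketch below is illustrative only: the three-component system (component 1 in series with a parallel pair of components 2 and 3) is a hypothetical example rather than the system of Figure 1, and the binomial form of $P(C(t) = l)$ assumes iid components that each function at time $t$ with a given probability `p`.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def survival_signature(m, phi):
    """Survival signature Phi(0), ..., Phi(m) by enumerating state vectors, Eq (19)."""
    signature = []
    for l in range(m + 1):
        functioning = 0
        for on in combinations(range(m), l):      # positions of the l working components
            x = [1 if i in on else 0 for i in range(m)]
            functioning += phi(x)
        signature.append(Fraction(functioning, comb(m, l)))
    return signature


def phi(x):
    """Hypothetical structure function: component 1 in series with (2 parallel 3)."""
    return int(x[0] and (x[1] or x[2]))


def system_survival(signature, p):
    """P(T_S > t) from Eq (20), assuming iid components that each function
    at time t with probability p, so that C(t) is binomial."""
    m = len(signature) - 1
    return sum(signature[l] * comb(m, l) * p**l * (1 - p)**(m - l)
               for l in range(m + 1))


signature = survival_signature(3, phi)
reliability = system_survival(signature, Fraction(1, 2))
```

For this hypothetical system, two of the three state vectors with exactly two functioning components keep the system working, so $\Phi(2) = 2/3$; with `p = 1/2` the result agrees with the direct calculation $\tfrac{1}{2} \times \tfrac{3}{4} = \tfrac{3}{8}$.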
For a system consisting of $K \geq 2$ types of components, the survival signature, denoted by $\Phi(l_1, \ldots, l_K)$ for $l_k = 0, \ldots, m_k$, is defined as the probability that a system functions given that precisely $l_k$ of its components of type $k$ function, for each $k \in \{1, 2, \ldots, K\}$. There are $\binom{m_k}{l_k}$ state vectors $\underline{x}^k$ with precisely $l_k$ of its $m_k$ components $x_i^k = 1$, so with $\sum_{i=1}^{m_k} x_i^k = l_k$; we denote the set of these state vectors for components of type $k$ by $S_l^k$. In addition, let $S_{l_1, \ldots, l_K}$ denote the set of these state vectors for the whole system for which $\sum_{i=1}^{m_k} x_i^k = l_k$, $k \in \{1, 2, \ldots, K\}$. Under the iid assumption for the failure times of the $m_k$ components of type $k$, all these state vectors $\underline{x}^k$ are equally likely to occur. Thus, the survival signature $\Phi(l_1, \ldots, l_K)$ can be computed as follows [24]
$$\Phi(l_1, \ldots, l_K) = \left[ \prod_{k=1}^{K} \binom{m_k}{l_k}^{-1} \right] \times \sum_{\underline{x} \in S_{l_1, \ldots, l_K}} \phi(\underline{x}) \tag{21}$$
Let $C_k(t) \in \{0, 1, \ldots, m_k\}$ represent the number of components of type $k$ in the system which function at time $t > 0$. So, the probability that the system functions at time $t > 0$ is
$$P(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l_1, \ldots, l_K) \, P\left( \bigcap_{k=1}^{K} \{C_k(t) = l_k\} \right) \tag{22}$$
Assuming that the failure times of components of different types are independent, while exchangeability is assumed for the failure times of components of the same type [16], the survival function for $T_S$ can be written as
$$P(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l_1, \ldots, l_K) \prod_{k=1}^{K} P(C_k(t) = l_k) \tag{23}$$

6.2. Discrete-Time System Reliability

In this section, we will apply the proposed method to system reliability in the case of discrete time. We will combine the concept of the survival signature $\Phi(l)$ (as reviewed in Section 6.1) with the results obtained in Section 5 to present lower and upper probabilities for the event $T_S > t_j$ for a system that consists of either a single type or multiple types of components. These lower and upper probabilities represent the survival functions at discrete-time points $t_j$.
At a specific time $t_j$, let $\hat{n}_{t_j}$ represent the number of components for which test failure data are available, and let $d_{t_j}$ represent the number of components that failed at time $t_j$. Therefore, $\hat{n}_{t_j} - d_{t_j}$ is the number of components of this type that are still functioning at time $t_j$ [16,17]. Additionally, let $N_{t_j} \in \{0, 1, \ldots, m\}$ denote the number of components in the system out of $m$ that are still functioning at a discrete time $t_j$.
We obtain the NPI lower and upper probabilities for the event that $T_S > t_j$ for a system consisting of a single type of components, using the survival signature $\Phi(\ell)$ combined with the proposed method in Section 5, as follows
$$\underline{P}(T_S > t_j) = \sum_{\ell=0}^{m} \Phi(\ell) \, \overline{D}(N_{t_j} = \ell) \tag{24}$$
and
$$\overline{P}(T_S > t_j) = \sum_{\ell=0}^{m} \Phi(\ell) \, \underline{D}(N_{t_j} = \ell) \tag{25}$$
where $\overline{D}(N_{t_j} = \ell)$ and $\underline{D}(N_{t_j} = \ell)$ are derived from Equations (18) and (16), respectively, so
$$\overline{D}(N_{t_j} = \ell) = \underline{P}(N_{t_j} \geq \ell) - \underline{P}(N_{t_j} \geq \ell + 1) = \binom{\hat{n}_{t_j} + m}{\hat{n}_{t_j}}^{-1} \binom{(\hat{n}_{t_j} - d_{t_j}) + \ell - 1}{(\hat{n}_{t_j} - d_{t_j}) - 1} \binom{\hat{n}_{t_j} - (\hat{n}_{t_j} - d_{t_j}) + m - \ell}{\hat{n}_{t_j} - (\hat{n}_{t_j} - d_{t_j})} \tag{26}$$
and
$$\underline{D}(N_{t_j} = \ell) = \overline{P}(N_{t_j} \geq \ell) - \overline{P}(N_{t_j} \geq \ell + 1) = \binom{\hat{n}_{t_j} + m}{\hat{n}_{t_j}}^{-1} \binom{(\hat{n}_{t_j} - d_{t_j}) + \ell}{\hat{n}_{t_j} - d_{t_j}} \binom{\hat{n}_{t_j} - (\hat{n}_{t_j} - d_{t_j}) + m - \ell - 1}{\hat{n}_{t_j} - (\hat{n}_{t_j} - d_{t_j}) - 1} \tag{27}$$
We now consider a system consisting of K 2 types of components with m k components of k { 1 , 2 , , K } , with k = 1 K m k = m . For a specific time t j , let n ^ t j k denote the number of components of type k for which test failure data are available, and let d t j k denote the numbers of components that failed at time t j ; therefore, n ^ t j k d t j k is the number of components of type k that are still functioning at time t j [16,17]. The failure times of components of different types are assumed to be independent, while failure times of components of the same type are assumed to be exchangeable [16]. Let N t j k { 0 , 1 , , m k } denote the number of components of type k in the system out of m k that are still functioning at a discrete time t j , k = 1 , 2 , , K .
The NPI lower and upper probabilities for the event $T_S > t_j$ of a system consisting of multiple types of components, using the survival signature $\Phi(l)$ combined with the proposed method in Section 5, are as follows:
$$\underline{P}(T_S > t_j) = \sum_{\ell_1=0}^{m_1} \cdots \sum_{\ell_K=0}^{m_K} \Phi(\ell_1, \ldots, \ell_K) \prod_{k=1}^{K} \overline{D}(N_{t_j}^k = \ell_k)$$
and
$$\overline{P}(T_S > t_j) = \sum_{\ell_1=0}^{m_1} \cdots \sum_{\ell_K=0}^{m_K} \Phi(\ell_1, \ldots, \ell_K) \prod_{k=1}^{K} \underline{D}(N_{t_j}^k = \ell_k)$$
where $\overline{D}(N_{t_j}^k = \ell_k)$ and $\underline{D}(N_{t_j}^k = \ell_k)$ for $\ell_k \in \{0, 1, \ldots, m_k\}$ are derived from Equations (16) and (18), respectively, thus
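$$\overline{D}(N_{t_j}^k = \ell_k) = \underline{P}(N_{t_j}^k \ge \ell_k) - \underline{P}(N_{t_j}^k \ge \ell_k + 1) = \binom{\hat{n}_{t_j}^k + m_k}{\hat{n}_{t_j}^k}^{-1} \binom{(\hat{n}_{t_j}^k - d_{t_j}^k) + \ell_k - 1}{(\hat{n}_{t_j}^k - d_{t_j}^k) - 1} \binom{\hat{n}_{t_j}^k - (\hat{n}_{t_j}^k - d_{t_j}^k) + m_k - \ell_k}{\hat{n}_{t_j}^k - (\hat{n}_{t_j}^k - d_{t_j}^k)}$$
and
$$\underline{D}(N_{t_j}^k = \ell_k) = \overline{P}(N_{t_j}^k \ge \ell_k) - \overline{P}(N_{t_j}^k \ge \ell_k + 1) = \binom{\hat{n}_{t_j}^k + m_k}{\hat{n}_{t_j}^k}^{-1} \binom{(\hat{n}_{t_j}^k - d_{t_j}^k) + \ell_k}{\hat{n}_{t_j}^k - d_{t_j}^k} \binom{\hat{n}_{t_j}^k - (\hat{n}_{t_j}^k - d_{t_j}^k) + m_k - \ell_k - 1}{\hat{n}_{t_j}^k - (\hat{n}_{t_j}^k - d_{t_j}^k) - 1}$$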
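For several component types, the same masses are multiplied across types and summed over all combinations of functioning components. The sketch below illustrates this under stated assumptions: the names `comb_ext`, `mass` and `system_bounds_multi` are hypothetical, the survival signature is passed as a dictionary keyed by tuples $(\ell_1, \ldots, \ell_K)$, and binomial coefficients with lower index $-1$ are taken to be 1 only when the upper index is also $-1$.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod


def comb_ext(a: int, b: int) -> int:
    """Binomial coefficient; assumed convention: C(a, -1) is 1 if a == -1, else 0."""
    if b < 0:
        return 1 if a == b else 0
    return comb(a, b)


def mass(l: int, n_hat: int, d: int, m: int, for_lower: bool) -> Fraction:
    """Mass on N = l used in the lower (for_lower=True) or upper probability."""
    s, f = n_hat - d, d  # functioning and failed components at t_j
    if for_lower:
        num = comb_ext(s + l - 1, s - 1) * comb(f + m - l, f)
    else:
        num = comb(s + l, s) * comb_ext(f + m - l - 1, f - 1)
    return Fraction(num, comb(n_hat + m, n_hat))


def system_bounds_multi(phi, m, n_hat, d):
    """Lower and upper probability of T_S > t_j for K independent types.

    phi maps tuples (l_1, ..., l_K) to survival signature values;
    m, n_hat and d are sequences of length K, one entry per type.
    """
    bounds = []
    for for_lower in (True, False):
        total = Fraction(0)
        for ls in product(*(range(mk + 1) for mk in m)):
            weight = prod(
                mass(l, n, dk, mk, for_lower)
                for l, n, dk, mk in zip(ls, n_hat, d, m)
            )
            total += phi[ls] * weight
        bounds.append(float(total))
    return tuple(bounds)


# Survival signature of Table 6; entries not listed here are zero.
nonzero = {(1, 2): Fraction(1, 9), (1, 3): Fraction(3, 9),
           (2, 2): Fraction(4, 9), (2, 3): Fraction(6, 9),
           (3, 0): 1, (3, 1): 1, (3, 2): 1, (3, 3): 1}
phi = {(a, b): nonzero.get((a, b), 0) for a in range(4) for b in range(4)}

# First row of Table 7: type 1 has n_hat = 9, d = 2; type 2 has n_hat = 10, d = 3.
lower, upper = system_bounds_multi(phi, m=(3, 3), n_hat=(9, 10), d=(2, 3))
print(round(lower, 4), round(upper, 4))  # 0.55 0.7118
```

The number of terms grows as $\prod_k (m_k + 1)$, so plain enumeration is adequate for small systems such as the one in Figure 2 but would need restructuring for systems with many components per type.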
Next, we apply the results presented above to the reliability of discrete-time systems consisting of a single type of component (see Example 4) and of multiple types of components (see Example 5).
Example 4.
The system depicted in Figure 1 is used in this example; it was also used by Coolen and Coolen-Maturi [28]. We examine the reliability of a discrete-time system with $m = 5$ exchangeable components and survival signature values $\Phi(0) = 0$, $\Phi(1) = 0$, $\Phi(2) = 0.6$, $\Phi(3) = 0.9$, $\Phi(4) = 1$, and $\Phi(5) = 1$. We analyse two datasets of different sizes, one with $n = 10$ observations and the other with $n = 20$ observations. These datasets include failure events and right-censored observations for discrete times $t_1$ to $t_5$. Table 5 presents NPI lower and upper probabilities for $T_S > t_j$ at these discrete times, based on the survival signature values provided and the results in Section 6.2.
Comparing the results in Table 5, the imprecision (the difference between the upper and lower probability) for both sample sizes is smallest at time $t_1$ and increases at later times, owing to fewer observations in the risk set. Additionally, the differences between the lower and upper probabilities for $T_S > t_j$ with $n = 20$ observations are smaller than those with $n = 10$ observations at every time point. So, the imprecision for $T_S > t_j$ decreases as the dataset size increases, i.e., as more data become available.
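As a check on the arithmetic, the first row of Table 5 for $n = 10$ can be worked out by hand from the single-type formulas of Section 6.2, with $\hat{n}_{t_1} = 10$, $d_{t_1} = 2$ and $m = 5$, so that $\binom{15}{10} = 3003$. The terms for $\ell = 0, 1$ vanish because $\Phi(0) = \Phi(1) = 0$. This is only an illustrative calculation based on the tabulated inputs:

$$\underline{P}(T_S > t_1) = \frac{0.6 \binom{9}{7}\binom{5}{2} + 0.9 \binom{10}{7}\binom{4}{2} + \binom{11}{7}\binom{3}{2} + \binom{12}{7}\binom{2}{2}}{3003} = \frac{216 + 648 + 990 + 792}{3003} = \frac{2646}{3003} \approx 0.8811$$

$$\overline{P}(T_S > t_1) = \frac{0.6 \binom{10}{8}\binom{4}{1} + 0.9 \binom{11}{8}\binom{3}{1} + \binom{12}{8}\binom{2}{1} + \binom{13}{8}\binom{1}{1}}{3003} = \frac{108 + 445.5 + 990 + 1287}{3003} = \frac{2830.5}{3003} \approx 0.9426$$

Both values agree with Table 5, giving the imprecision $0.9426 - 0.8811 = 0.0615$.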
Example 5.
In this example, we examine a system with $K = 2$ types of components, types 1 and 2, depicted in Figure 2. Coolen et al. [24] utilised this system to demonstrate NPI for the system survival time. The survival signature for this system can be found in Table 6. We focus on the data provided in Table 7 for the two types, with $m_1 = m_2 = 3$ components of each type in the system; each type has 10 observations, i.e., $n_1 = n_2 = 10$, including failure events and right-censored observations, for discrete times $t_1$, $t_2$, and $t_3$. The table also includes the NPI lower and upper probabilities for $T_S > t_j$ at discrete times $t_1$, $t_2$, and $t_3$, based on the given survival signature values and the results in Section 6.2.
When considering the NPI approach for real-valued data, it is typical for the lower probability for $X_{n+1} > t$ in a specific interval to be less than or equal to the upper probability for $X_{n+1} > t$ in the next interval. This is evident in the results of [29]. However, NPI for the discrete-time approach indicates that this may not always be the case, as observed in the results of Table 7, where $\underline{P}(T_S > t_1) > \overline{P}(T_S > t_2)$, and in the results of Table 5 (for $n = 20$), where $\underline{P}(T_S > t_3) > \overline{P}(T_S > t_4)$. Many of the findings presented in this paper suggest that, for discrete-time cases, this discrepancy may arise due to multiple failures occurring between discrete-time points.
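The ordering described above can be checked mechanically against the tabulated values. The short sketch below uses a hypothetical helper, `disjoint_steps`, that lists every index $j$ for which the lower probability at $t_j$ exceeds the upper probability at $t_{j+1}$; the input lists are copied from Tables 5 and 7, with lower probabilities of 0 where the tables report 0.

```python
def disjoint_steps(lower, upper):
    """Return the 1-based indices j with lower[t_j] > upper[t_{j+1}]."""
    return [j + 1 for j in range(len(lower) - 1) if lower[j] > upper[j + 1]]


# Table 7 (two component types), times t_1..t_3.
table7_lower = [0.5500, 0.1412, 0.0]
table7_upper = [0.7118, 0.3189, 0.1478]

# Table 5 (single component type), times t_1..t_5.
table5_n10_lower = [0.8811, 0.7765, 0.6190, 0.3857, 0.0]
table5_n10_upper = [0.9426, 0.8909, 0.8095, 0.7810, 0.5833]
table5_n20_lower = [0.9177, 0.8344, 0.7552, 0.4810, 0.0]
table5_n20_upper = [0.9465, 0.8921, 0.8559, 0.7333, 0.3857]

print(disjoint_steps(table7_lower, table7_upper))          # [1]
print(disjoint_steps(table5_n10_lower, table5_n10_upper))  # []
print(disjoint_steps(table5_n20_lower, table5_n20_upper))  # [1, 3, 4]
```

On these values the effect appears in Table 7 and in the $n = 20$ columns of Table 5, but not in the $n = 10$ columns, where the intervals at consecutive time points still overlap.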

7. Concluding Remarks

This paper introduced an alternative predictive approach to the actuarial estimator in the context of discrete-time data. The proposed NPI method provides lower and upper probabilities for the event that all future observations survive at a discrete-time point. The proposed method, based on NPI for Bernoulli data, is developed to derive the NPI lower and upper probabilities for the event that a specific number of future Bernoulli trials survive out of multiple future trials considered. Additionally, this development has been applied to system reliability with single and multiple types of components at discrete-time points, in conjunction with the survival signature method. The methods presented in this paper can be applied to various applications in system reliability; in particular, their use to support decisions in practical scenarios leads to interesting topics for future research.

Author Contributions

Methodology, F.P.A.C., T.C.-M. and A.M.Y.M.; Formal analysis, F.P.A.C., T.C.-M. and A.M.Y.M.; Investigation, F.P.A.C., T.C.-M. and A.M.Y.M.; Writing—original draft, F.P.A.C., T.C.-M. and A.M.Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Singer, J.D.; Willett, J.B. It’s about time: Using discrete-time survival analysis to study duration and the timing of events. J. Educ. Stat. 1993, 18, 155–195.
  2. Allison, P.D. Discrete-time methods for the analysis of event histories. Sociol. Methodol. 1982, 13, 61–98.
  3. Masyn, K.E. Discrete-Time Survival Mixture Analysis for Single and Recurrent Events Using Latent Variables. Ph.D. Thesis, University of California, Los Angeles, CA, USA, 2003. Available online: https://www.statmodel.com/download/masyndissertation.pdf (accessed on 29 July 2024).
  4. Hill, B.M. Posterior distribution of percentiles: Bayes’ theorem for sampling from a population. J. Am. Stat. Assoc. 1968, 63, 677–691.
  5. Augustin, T.; Coolen, F.P.A. Nonparametric predictive inference and interval probability. J. Stat. Plan. Inference 2004, 124, 251–272.
  6. Coolen, F.P.A. On nonparametric predictive inference and objective Bayesianism. J. Logic Lang. Inf. 2006, 15, 21–47.
  7. Coolen, F.P.A. Low structure imprecise predictive inference for Bayes’ problem. Stat. Probab. Lett. 1998, 36, 349–357.
  8. Coolen, F.P.A.; Coolen-Schrijner, P. Nonparametric predictive subset selection for proportions. Stat. Probab. Lett. 2006, 76, 1675–1684.
  9. Coolen, F.P.A.; Yan, K.J. Nonparametric predictive inference with right-censored data. J. Stat. Plan. Inference 2004, 126, 25–54.
  10. Coolen, F.P.A.; Yan, K.J. Nonparametric Predictive Comparison of Two Groups of Lifetime Data. In Proceedings of the 3rd International Symposium on Imprecise Probabilities and Their Applications, Carlton Scientific, Lugano, Switzerland, 14–17 July 2003; pp. 148–161.
  11. Coolen-Maturi, T.A.; Coolen, F.P.A.; Muhammad, N. Predictive inference for bivariate data: Combining nonparametric predictive inference for marginals with an estimated copula. J. Stat. Theory Pract. 2016, 10, 515–538.
  12. Baker, R. Multinomial Nonparametric Predictive Inference: Selection, Classification and Subcategory Data. Ph.D. Thesis, University of Durham, Durham, UK, 2010. Available online: https://maths.durham.ac.uk/stats/people/fc/thesis-RB.pdf (accessed on 29 July 2024).
  13. Coolen, F.P.A.; Augustin, T. Learning from multinomial data: A nonparametric predictive alternative to the Imprecise Dirichlet Model. In Proceedings of the ISIPTA 4th International Symposium on Imprecise Probabilities and Their Applications, Pittsburgh, PA, USA, 20–23 July 2005; pp. 125–134.
  14. Janurová, K.; Briš, R. A nonparametric approach to medical survival data: Uncertainty in the context of risk in mortality analysis. Reliab. Eng. Syst. Saf. 2014, 125, 145–152.
  15. Yan, K.J. Nonparametric Predictive Inference with Right-Censored Data. Ph.D. Thesis, Durham University, Durham, UK, 2002. Available online: https://maths.durham.ac.uk/stats/people/fc/thesis-KJY.pdf (accessed on 29 July 2024).
  16. Coolen, F.P.A.; Coolen-Maturi, T.; Al-Nefaiee, A.H. Nonparametric predictive inference for system reliability using the survival signature. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2014, 228, 437–448.
  17. Al-Nefaiee, A.H. Nonparametric Predictive Inference for System Failure Time. Ph.D. Thesis, University of Durham, Durham, UK, 2014. Available online: https://maths.durham.ac.uk/stats/people/fc/thesis-AAN.pdf (accessed on 29 July 2024).
  18. Hill, B.M. Bayesian nonparametric prediction and statistical inference. In Bayesian Analysis in Statistics and Econometrics; Springer: New York, NY, USA, 1992; pp. 43–94.
  19. Aboalkhair, A.M. Nonparametric Predictive Inference for System Reliability. Ph.D. Thesis, University of Durham, Durham, UK, 2012. Available online: https://maths.dur.ac.uk/stats/people/fc/thesis-AA.pdf (accessed on 29 July 2024).
  20. Hill, B.M. De Finetti’s Theorem, Induction, and Bayesian nonparametric predictive inference (with discussion). In Bayesian Analysis in Statistics and Econometrics; Springer: Berlin/Heidelberg, Germany, 1988; pp. 211–241.
  21. Hill, B.M. Parametric models for A(n): Splitting processes and mixtures. J. R. Stat. Soc. Ser. B 1993, 55, 423–433.
  22. Berkson, J.; Gage, R.P. Calculation of survival rates for cancer. Mayo Clin. 1950, 25, 270–286.
  23. Lawless, J.F. Statistical Models and Methods for Lifetime Data; Wiley: New York, NY, USA, 1982.
  24. Coolen, F.P.A.; Coolen-Maturi, T. Generalizing the signature to systems with multiple types of components. In Complex Systems and Dependability; Springer: Berlin/Heidelberg, Germany, 2013; pp. 115–130.
  25. Samaniego, F.J. System Signatures and Their Applications in Engineering Reliability; Springer: Berlin/Heidelberg, Germany, 2007.
  26. Samaniego, F.J. On closure of the IFR class under formation of coherent systems. IEEE Trans. Reliab. 1985, 34, 69–72.
  27. Navarro, J.; Samaniego, F.J.; Balakrishnan, N.; Bhattacharya, D. On the application and extension of system signatures in engineering reliability. Nav. Res. Logist. 2008, 55, 313–327.
  28. Coolen, F.P.A.; Coolen-Maturi, T. Predictive inference for system reliability after common-cause component failures. Reliab. Eng. Syst. Saf. 2015, 135, 27–33.
  29. Coolen-Maturi, T.; Mahnashi, A.M.; Coolen, F. Nonparametric Predictive Inference for Two Future Observations with Right-Censored Data. Math. Methods Stat. 2024, in press.
Figure 1. System with a single type of $m = 5$ components for Example 4.
Figure 2. System with 2 types of components for Example 5.
Table 1. The actuarial estimator for the survival function and the NPI lower and upper probabilities for the event $\bigcap_{i=1}^{m} \{X_{9+i} > t_j\}$, $m \in \{1, 3, 10, 15\}$, Example 1.

| $t_j$ | $d_{t_j}$ | $c_{t_j}$ | $\hat{n}_{t_j}$ | $n_{t_j}$ | $\grave{n}_{t_j}$ | $1 - \hat{h}_{t_j}$ | $\hat{S}_{t_j}$ | $\underline{P}$ ($m=1$) | $\overline{P}$ ($m=1$) | $\underline{P}$ ($m=3$) | $\overline{P}$ ($m=3$) | $\underline{P}$ ($m=10$) | $\overline{P}$ ($m=10$) | $\underline{P}$ ($m=15$) | $\overline{P}$ ($m=15$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 1 | 0 | 9 | 8 | 9 | 0.889 | 0.889 | 0.8000 | 0.9000 | 0.6667 | 0.9167 | 0.4211 | 0.9474 | 0.3333 | 0.9583 |
| $t_2$ | 2 | 1 | 7 | 5 | 8 | 0.750 | 0.667 | 0.5000 | 0.6750 | 0.3333 | 0.7333 | 0.1238 | 0.8359 | 0.0758 | 0.8712 |
| $t_3$ | 2 | 1 | 4 | 2 | 5 | 0.600 | 0.400 | 0.2000 | 0.4050 | 0.0952 | 0.5238 | 0.0177 | 0.7165 | 0.0080 | 0.7795 |
| $t_4$ | 0 | 1 | 1 | 1 | 2 | 1.000 | 0.400 | 0.1000 | 0.4050 | 0.0238 | 0.5238 | 0.0016 | 0.7165 | 0.0005 | 0.7795 |
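Table 2. NPI lower and upper probabilities for $\bigcap_{i=1}^{m} \{X_{374+i} > t_j\}$, $m \in \{1, 2, 5, 10\}$, and $[\underline{P}, \overline{P}]^5 (X_{375} > t_j)$ (Example 2).

| $t_j$ | $d_{t_j}$ | $c_{t_j}$ | $\hat{n}_{t_j}$ | $n_{t_j}$ | $\underline{P}$ ($m=1$) | $\overline{P}$ ($m=1$) | $\underline{P}$ ($m=2$) | $\overline{P}$ ($m=2$) | $\underline{P}$ ($m=5$) | $\overline{P}$ ($m=5$) | $\underline{P}$ ($m=10$) | $\overline{P}$ ($m=10$) | $[\underline{P}]^5$ | $[\overline{P}]^5$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 90 | 0 | 374 | 284 | 0.757 | 0.760 | 0.574 | 0.578 | 0.2513 | 0.2557 | 0.0644540 | 0.0667235 | 0.2486 | 0.2536 |
| $t_2$ | 76 | 0 | 284 | 208 | 0.553 | 0.557 | 0.306 | 0.311 | 0.0527 | 0.0549 | 0.0029253 | 0.0031739 | 0.0517 | 0.0536 |
| $t_3$ | 51 | 0 | 208 | 157 | 0.415 | 0.421 | 0.173 | 0.178 | 0.0128 | 0.0138 | 0.0001793 | 0.0002070 | 0.0123 | 0.0132 |
| $t_4$ | 25 | 12 | 145 | 120 | 0.341 | 0.349 | 0.117 | 0.123 | 0.0049 | 0.0055 | 0.0000269 | 0.0000336 | 0.0046 | 0.0051 |
| $t_5$ | 20 | 5 | 115 | 95 | 0.280 | 0.289 | 0.079 | 0.084 | 0.0018 | 0.0022 | 0.0000040 | 0.0000055 | 0.0017 | 0.0020 |
| $t_6$ | 7 | 9 | 86 | 79 | 0.254 | 0.266 | 0.065 | 0.071 | 0.0011 | 0.0014 | 0.0000016 | 0.0000025 | 0.0011 | 0.0013 |
| $t_7$ | 4 | 9 | 70 | 66 | 0.236 | 0.251 | 0.056 | 0.063 | 0.0008 | 0.0011 | 0.0000008 | 0.0000014 | 0.0007 | 0.0010 |
| $t_8$ | 1 | 3 | 63 | 62 | 0.229 | 0.247 | 0.053 | 0.061 | 0.0007 | 0.0010 | 0.0000006 | 0.0000012 | 0.0006 | 0.0009 |
| $t_9$ | 3 | 5 | 57 | 54 | 0.213 | 0.234 | 0.046 | 0.055 | 0.0005 | 0.0008 | 0.0000003 | 0.0000008 | 0.0004 | 0.0007 |
| $t_{10}$ | 2 | 5 | 49 | 47 | 0.200 | 0.225 | 0.040 | 0.051 | 0.0004 | 0.0006 | 0.0000002 | 0.0000005 | 0.0003 | 0.0006 |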
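Table 3. NPI lower and upper probabilities for the event $N_{t_j} \ge x \mid N_{t_{j-1}} = y$ with $x \le y$.

| $t_j$ | $d_{t_j}$ | $c_{t_j}$ | $n_{t_j}$ | $\hat{n}_{t_j}$ | $\hat{n}_{t_j} - d_{t_j}$ | $y$ | $\underline{P}$ ($x=1$) | $\overline{P}$ ($x=1$) | $\underline{P}$ ($x=2$) | $\overline{P}$ ($x=2$) | $\underline{P}$ ($x=3$) | $\overline{P}$ ($x=3$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 1 | 0 | 8 | 9 | 8 | 1 | 0.8000 | 0.9000 | | | | |
| | | | | | | 2 | 0.9455 | 0.9818 | 0.6545 | 0.8182 | | |
| | | | | | | 3 | 0.9818 | 0.9955 | 0.8727 | 0.9545 | 0.5455 | 0.7500 |
| $t_2$ | 2 | 1 | 5 | 7 | 5 | 1 | 0.6250 | 0.7500 | | | | |
| | | | | | | 2 | 0.8333 | 0.9167 | 0.4167 | 0.5833 | | |
| | | | | | | 3 | 0.9167 | 0.9667 | 0.6667 | 0.8167 | 0.2917 | 0.4667 |
| $t_3$ | 2 | 1 | 2 | 4 | 2 | 1 | 0.4000 | 0.6000 | | | | |
| | | | | | | 2 | 0.6000 | 0.8000 | 0.2000 | 0.4000 | | |
| | | | | | | 3 | 0.7143 | 0.8857 | 0.3714 | 0.6286 | 0.1143 | 0.2857 |
| $t_4$ | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0.5000 | | | | |
| | | | | | | 2 | 0 | 0.6667 | 0 | 0.3333 | | |
| | | | | | | 3 | 0 | 0.7500 | 0 | 0.5000 | 0 | 0.2500 |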
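Table 4. NPI lower and upper probabilities for the event $N_{t_j} \ge x$.

| $t_j$ | $d_{t_j}$ | $c_{t_j}$ | $n_{t_j}$ | $\hat{n}_{t_j}$ | $\hat{n}_{t_j} - d_{t_j}$ | $\underline{P}$ ($x=1$) | $\overline{P}$ ($x=1$) | $\underline{P}$ ($x=2$) | $\overline{P}$ ($x=2$) | $\underline{P}$ ($x=3$) | $\overline{P}$ ($x=3$) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 1 | 0 | 8 | 9 | 8 | 0.9625 | 0.9955 | 0.7883 | 0.8832 | 0.4091 | 0.7500 |
| $t_2$ | 2 | 1 | 5 | 7 | 5 | 0.8409 | 0.9432 | 0.5000 | 0.7318 | 0.1591 | 0.3500 |
| $t_3$ | 2 | 1 | 2 | 4 | 2 | 0.5334 | 0.7834 | 0.1833 | 0.4334 | 0.0333 | 0.1333 |
| $t_4$ | 1 | 1 | 0 | 1 | 0 | 0 | 0.5714 | 0 | 0.2571 | 0 | 0.0714 |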
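Table 5. NPI lower and upper probabilities for $T_S > t_j$, for the system in Figure 1, with $n = 10$ and $n = 20$, Example 4.

| $t_j$ | $d_{t_j}$ ($n=10$) | $c_{t_j}$ | $\hat{n}_{t_j}$ | $\hat{n}_{t_j} - d_{t_j}$ | $\underline{P}$ | $\overline{P}$ | $\overline{P} - \underline{P}$ | $d_{t_j}$ ($n=20$) | $c_{t_j}$ | $\hat{n}_{t_j}$ | $\hat{n}_{t_j} - d_{t_j}$ | $\underline{P}$ | $\overline{P}$ | $\overline{P} - \underline{P}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 2 | 0 | 10 | 8 | 0.8811 | 0.9426 | 0.0615 | 4 | 0 | 20 | 16 | 0.9177 | 0.9465 | 0.0288 |
| $t_2$ | 2 | 1 | 7 | 5 | 0.7765 | 0.8909 | 0.1144 | 4 | 2 | 14 | 10 | 0.8344 | 0.8921 | 0.0577 |
| $t_3$ | 2 | 0 | 5 | 3 | 0.6190 | 0.8095 | 0.1905 | 3 | 1 | 9 | 6 | 0.7552 | 0.8559 | 0.1007 |
| $t_4$ | 1 | 1 | 2 | 1 | 0.3857 | 0.7810 | 0.3953 | 2 | 2 | 4 | 2 | 0.4810 | 0.7333 | 0.2523 |
| $t_5$ | 1 | 0 | 1 | 0 | 0 | 0.5833 | 0.5833 | 2 | 0 | 2 | 0 | 0 | 0.3857 | 0.3857 |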
Table 6. Survival signature of the system in Figure 2 (Example 5).

| $(\ell_1, \ell_2)$ | $\Phi(\ell_1, \ell_2)$ | $(\ell_1, \ell_2)$ | $\Phi(\ell_1, \ell_2)$ |
|---|---|---|---|
| $(0, 0)$ | 0 | $(2, 0)$ | 0 |
| $(0, 1)$ | 0 | $(2, 1)$ | 0 |
| $(0, 2)$ | 0 | $(2, 2)$ | 4/9 |
| $(0, 3)$ | 0 | $(2, 3)$ | 6/9 |
| $(1, 0)$ | 0 | $(3, 0)$ | 1 |
| $(1, 1)$ | 0 | $(3, 1)$ | 1 |
| $(1, 2)$ | 1/9 | $(3, 2)$ | 1 |
| $(1, 3)$ | 3/9 | $(3, 3)$ | 1 |
Table 7. NPI lower and upper probabilities for $T_S > t_j$, for the system in Figure 2 with two types of components and $m_1 = m_2 = 3$, Example 5.

| $t_j$ | $d_{t_j}^1$ | $c_{t_j}^1$ | $\hat{n}_{t_j}^1$ | $\hat{n}_{t_j}^1 - d_{t_j}^1$ | $d_{t_j}^2$ | $c_{t_j}^2$ | $\hat{n}_{t_j}^2$ | $\hat{n}_{t_j}^2 - d_{t_j}^2$ | $\underline{P}(T_S > t_j)$ | $\overline{P}(T_S > t_j)$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $t_1$ | 2 | 1 | 9 | 7 | 3 | 0 | 10 | 7 | 0.5500 | 0.7118 |
| $t_2$ | 3 | 2 | 5 | 2 | 3 | 1 | 6 | 3 | 0.1412 | 0.3189 |
| $t_3$ | 2 | 0 | 2 | 0 | 2 | 1 | 2 | 0 | 0 | 0.1478 |
