1. Introduction
Process capability indices (PCIs) are widely used in industry as they are considered practical tools for continuous quality improvement. They provide numerical measures of the ability of a process to produce output within the specification limits on a quality characteristic of interest. The four most broadly used PCIs are $C_p$ (Juran [1]), $C_{pk}$ (Kane [2]), $C_{pm}$ (Chan, Cheng, and Spiring [3], Hsiang and Taguchi [4]) and $C_{pmk}$ (Choi and Owen [5], Pearn, Kotz, and Johnson [6]), and are defined as follows:
$$C_p = \frac{U - L}{6\sigma}, \quad C_{pk} = \min\left\{\frac{U - \mu}{3\sigma}, \frac{\mu - L}{3\sigma}\right\}, \quad C_{pm} = \frac{U - L}{6\sqrt{\sigma^2 + (\mu - T)^2}}, \quad C_{pmk} = \min\left\{\frac{U - \mu}{3\sqrt{\sigma^2 + (\mu - T)^2}}, \frac{\mu - L}{3\sqrt{\sigma^2 + (\mu - T)^2}}\right\},$$
where U and L are the upper and the lower specification limits, respectively; T is the target value of the process; $\mu$ is the process mean; and $\sigma$ is the process standard deviation. If $\mu$ and $\sigma$ are unknown, one can replace them with the sample mean $\bar{X}$ and the sample standard deviation S, respectively. The properties, the estimation methods and the confidence limits of PCIs have been examined by many authors, such as Rodriguez [7], Kotz and Johnson [8,9], Kotz and Lovelace [10], and Pearn and Kotz [11].
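As a minimal sketch, the four indices can be computed directly from their standard textbook definitions. The function name `classical_pcis` and the numerical values below are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import math


def classical_pcis(mu, sigma, lsl, usl, target):
    """Return (Cp, Cpk, Cpm, Cpmk) from the standard textbook definitions."""
    if sigma <= 0 or usl <= lsl:
        raise ValueError("need sigma > 0 and usl > lsl")
    # Spread around the target: combines variability and off-target bias.
    tau = math.sqrt(sigma ** 2 + (mu - target) ** 2)
    # Distance from the mean to the closer specification limit.
    nearest = min(usl - mu, mu - lsl)
    cp = (usl - lsl) / (6 * sigma)
    cpk = nearest / (3 * sigma)
    cpm = (usl - lsl) / (6 * tau)
    cpmk = nearest / (3 * tau)
    return cp, cpk, cpm, cpmk


# Hypothetical process, for illustration only.
cp, cpk, cpm, cpmk = classical_pcis(mu=11.0, sigma=1.0, lsl=4.0, usl=16.0, target=10.0)
```

In this illustration the mean is shifted away from the target, so only the first index is unaffected, while the other three decrease.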
In 1991, the Automotive Industry Action Group (AIAG) recommended using the PCIs $C_p$ and $C_{pk}$ when the process is in control, with the process standard deviation estimated using the formula $\hat{\sigma} = \bar{R}/d_2$ (s-short-term). On the other hand, when the process is not in control, AIAG recommended using the process performance indices (PPIs) $P_p$ and $P_{pk}$, given by
$$P_p = \frac{U - L}{6s}, \quad P_{pk} = \min\left\{\frac{U - \bar{X}}{3s}, \frac{\bar{X} - L}{3s}\right\},$$
where s is computed using $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}$ (s-long-term). The main difference between PCIs and PPIs is that PCIs are used to predict the capability of a process to produce parts conforming to specifications, while PPIs are used to evaluate the behaviour of a process. Note that for a stable process, the differences between $C_p$ ($C_{pk}$) and $P_p$ ($P_{pk}$) are negligible (Montgomery [12]).
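The distinction between the two standard deviation estimates can be sketched as follows. This is an illustration under stated assumptions: the short-term estimate is taken to be the average moving range of span two divided by the control-chart constant d2 = 1.128 (the form used for individual observations), and the long-term estimate is the usual sample standard deviation. The function names and the data are hypothetical.

```python
import statistics

D2_SPAN_2 = 1.128  # control-chart constant d2 for ranges of two observations


def sigma_short_term(x):
    """Average moving range of span 2 divided by d2 (assumed short-term form)."""
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    return statistics.fmean(moving_ranges) / D2_SPAN_2


def sigma_long_term(x):
    """Overall sample standard deviation (n - 1 in the denominator)."""
    return statistics.stdev(x)


def capability_pair(x, lsl, usl, sigma):
    """Two-sided and one-sided-minimum index for a given sigma estimate."""
    centre = statistics.fmean(x)
    return (usl - lsl) / (6 * sigma), min(usl - centre, centre - lsl) / (3 * sigma)


# Hypothetical drifting series: small point-to-point changes, large overall spread.
drifting = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cp, cpk = capability_pair(drifting, 0.0, 10.0, sigma_short_term(drifting))
pp, ppk = capability_pair(drifting, 0.0, 10.0, sigma_long_term(drifting))
```

For a drifting series such as this one, the short-term estimate is much smaller than the long-term one, so the PCIs exceed the PPIs; for a stable series the two pairs would be close.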
The use of the four well-known PCIs presupposes that the process is in control and that the quality characteristic of interest is normally distributed. However, there are many processes in which the distribution of the quality characteristic is continuous but non-normal. Many authors have studied PCIs for continuous non-normal distributions. Clements [13] used a percentile method (known as “Clements’ method”) to calculate the $C_p$ and $C_{pk}$ indices for non-normal Pearsonian populations. Pearn and Kotz [14] applied Clements’ method for computing the $C_{pm}$ and $C_{pmk}$ indices. Rivera, Hubele and Lawrence [15] studied the effect of several transformation methods on the estimate of the $C_{pk}$ index for non-normal data. Pearn and Chen [16] proposed a modification of Clements’ method in order to calculate the four well-known PCIs for non-normal Pearsonian populations. Castagliola [17] proposed a method based on Burr’s distributions to estimate the $C_p$ and $C_{pk}$ indices independently of the real process distribution. Hosseinifard et al. [18] proposed a root transformation technique to estimate PCIs of right-skewed processes. Hosseinifard, Abbasi and Niaki [19] studied and compared various methods for estimating PCIs of non-normal processes.
In practice, there are many processes in which the quality characteristic of interest is discrete and, unfortunately, the above studies cannot be used. Few studies have focused on PCIs for discrete quality characteristics. Bothe [
20] computed the
index based on the percentage of nonconforming parts of a process with attribute data. Yeh and Bhattacharya [
21] proposed a PCI that is related to the probability of non-conformance of a process and can be applied to continuous as well as discrete distributions. Borges and Ho [
22] proposed a PCI based on the process fraction defective. Perakis and Xekalaki [
23,
24] proposed a PCI based on the proportion of conformance of the process, which can be used for continuous or discrete processes. Maiti, Saha and Nanda [
25] defined a generalized index, named $C_{py}$, based on the ratio of the proportion of specification conformance to the proportion of desired conformance, which can be used for continuous as well as discrete processes. Moreover, they introduced two modifications of the $C_{py}$ index, the $C_{pyk}$ and $C_{pTk}$ indices, for studying off-centered and off-target processes, respectively. Maravelakis [
26] proposed the
Q transformation technique to estimate the four well-known PCIs for Poisson and binomial data. Dey and Saha [
27] considered three bootstrap confidence intervals of the
index using different methods of estimation for the logistic exponential distribution. Pal and Gauri [
28] studied several approaches for estimating the PCIs
and
of a binomial process. In addition, Gauri and Pal [
29] assessed the relative goodness of some generalized PCIs in quantifying the capability of a Poisson or binomial process.
Motivated by the work of Maravelakis [
26] and of Pal and Gauri [
28], in this paper, we extend the transformation techniques for Poisson and binomial data and also apply them to negative binomial data in order to compute the four classical PCIs and the two PPIs. Moreover, we compare these indices with other existing PCIs for discrete data to draw conclusions about the proposed methodology. Finally, we illustrate the application of the transformation techniques to real data from Poisson, binomial and negative binomial processes.
The paper is organized as follows. In
Section 2, we present the transformation techniques for Poisson, binomial and negative binomial data, while a brief overview of existing PCIs for discrete data is given in
Section 3.
Section 4 presents the results of a simulation study that was performed to compare the indices using different transformation methods. In
Section 5, we present three illustrative examples, one for each discrete distribution. Finally, we conclude the paper in
Section 6.
3. A Brief Overview of PCIs for Discrete Distributions
Yeh and Bhattacharya [
21] proposed an index based on the proportion of nonconforming products. This index is defined as follows:
where
,
and
,
are the expected proportions of nonconforming products that the manufacturer can tolerate on
L and
U, respectively.
Perakis and Xekalaki [
23] proposed an index that can be used for continuous as well as for discrete distributions and is defined as follows:
where
and
is the minimum allowable proportion of conformance, usually equal to 0.9973. Perakis and Xekalaki [
24] investigated the above index for discrete data where only one specification limit is set. More specifically, if only the
U is set, then the index is denoted by
while if only the
L is set, the index is denoted by
. The two indices are defined as follows:
Alternative forms of the two indices for a Poisson process can be obtained by using the chi-square distribution. Thus, they can be written as follows:
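For reference, the rewriting rests on the standard identity linking the Poisson cdf to the chi-square distribution. The symbols k and c below are generic (c is the Poisson mean, k a nonnegative integer) and are not copied from the paper's own equations.

```latex
P(X \le k) \;=\; \sum_{j=0}^{k} \frac{e^{-c}\, c^{j}}{j!}
\;=\; P\!\left(\chi^{2}_{2(k+1)} > 2c\right),
\qquad k = 0, 1, 2, \ldots
```

Any index expressed through Poisson probabilities of the form P(X ≤ U) or P(X ≥ L) can therefore be written with chi-square tail probabilities instead.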
Maiti, Saha and Nanda [
25] proposed an index based on the process yield, which is defined as follows:
where
and
is the desired yield equal to
.
For an off-centered process, i.e.,
, where
is the cumulative distribution function (cdf), the index is defined as follows:
where
M is the median of the distribution and
for a centered process. We notice that for a centered process,
, while if
, the process median is off-centered.
For generalized asymmetric tolerance, i.e.,
, the index is defined as follows:
We point out that when .
The above PCIs have been studied for various non-normal distributions as well as for the Poisson distribution. However, these indices can easily be extended to other discrete data following a binomial or negative binomial distribution by using the appropriate cdf.
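The following sketch shows how indices of this kind can be evaluated from any discrete cdf. It relies on assumptions that should be checked against the original references: the nonconformance-based index is taken as the smaller of the two ratios of tolerated to observed tail proportions, the conformance-based index as (1 − p0)/(1 − p), and the yield-based index as p/p0, with p = P(L ≤ X ≤ U). The function names, the default tolerances and the example values are hypothetical.

```python
import math


def poisson_cdf(k, c):
    """P(X <= k) for X ~ Poisson(c); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-c)  # P(X = 0)
    total = term
    for j in range(1, int(k) + 1):
        term *= c / j    # P(X = j) from P(X = j - 1)
        total += term
    return min(total, 1.0)


def discrete_indices(cdf, lsl, usl, alpha_l=0.00135, alpha_u=0.00135, p0=0.9973):
    """Return (nonconformance-based, conformance-based, yield-based) indices."""
    below = cdf(lsl - 1)      # P(X < L)
    above = 1.0 - cdf(usl)    # P(X > U)
    p = 1.0 - below - above   # P(L <= X <= U)
    ratio_l = alpha_l / below if below > 0 else math.inf
    ratio_u = alpha_u / above if above > 0 else math.inf
    c_f = min(ratio_l, ratio_u)
    c_pc = (1.0 - p0) / (1.0 - p) if p < 1 else math.inf
    c_py = p / p0
    return c_f, c_pc, c_py


# Hypothetical Poisson process with mean 4 and limits L = 1, U = 10.
c_f, c_pc, c_py = discrete_indices(lambda k: poisson_cdf(k, 4.0), 1, 10)
```

Passing a binomial or negative binomial cdf instead of `poisson_cdf` gives the corresponding extension without changing `discrete_indices`.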
When the value of the distribution parameter is unknown, it has to be estimated in order to compute the indices. In most cases, maximum likelihood estimation (MLE) is used. If are independent observations from a Poisson process, the MLE of c is given by , while if the observations are from a binomial process, the MLE of p is given by . Finally, for a negative binomial process, the MLE of p is given by .
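A minimal sketch of the three estimators is given below. The parametrizations are assumptions, since the paper's own formulas are not reproduced here: Poisson counts with mean c, binomial counts out of a known common sample size n, and negative binomial observations counted as the total number of trials until the r-th event with r known. All function names are hypothetical.

```python
def mle_poisson(counts):
    """MLE of the Poisson mean c: the sample mean of the counts."""
    return sum(counts) / len(counts)


def mle_binomial(counts, n):
    """MLE of p when each count is out of n items (n known, assumed constant)."""
    return sum(counts) / (n * len(counts))


def mle_negative_binomial(trials, r):
    """MLE of p when each observation is the number of trials until the r-th event."""
    return r * len(trials) / sum(trials)
```

For example, `mle_binomial([5, 10, 15], 50)` pools 30 nonconforming items out of 150 inspected, giving 0.2.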
4. A Simulation Study
In this section, a simulation study is performed to evaluate the PCIs and PPIs for different Poisson, binomial and negative binomial processes. Six different processes are considered for each distribution: two centered, two off-centered and two off-target processes. For the centered processes, the process target is approximately equal to the median, and the
U and
L are chosen so that the condition
is satisfied. For the off-centered processes, the values of specification limits are chosen so that
. Finally, under off-target processes, the process target is such that
. These process scenarios are also considered in Maiti, Saha and Nanda [
25].
In our simulation study, random samples from each discrete distribution are generated for specified parameter values and for samples of sizes and 100, in order to compute the mean and the standard deviation of each index estimator. To estimate an index, the appropriate transformation is first applied to the data. Then, the specification limits and the target value are transformed using the same transformation. Next, the transformed specification limits and target value, together with the mean and the standard deviation of the transformed data, are used to estimate the index. As has an indirect association with and , with and with , comparisons among them and the classical PCIs are also presented. The estimators of the , , and are obtained using the MLE method. Moreover, in order to make the latter indices comparable with the classical PCIs, we use and . All computations are carried out in R version 4.3.0 with 10,000 iterations.
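The procedure can be sketched for a single Poisson process as follows. This is an illustration, not the authors' R code. It assumes Anscombe's transformation in the common form 2·sqrt(x + 3/8), which may differ from the paper's own expression by a constant factor, uses the sample standard deviation of the transformed data, and estimates only one index. The Poisson generator is Knuth's multiplication method, adequate for moderate means. All names and parameter values are hypothetical.

```python
import math
import random
import statistics


def anscombe(x):
    """Anscombe's variance-stabilizing transformation (assumed form)."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


def poisson_draw(rng, c):
    """One Poisson(c) variate by Knuth's multiplication method."""
    limit, k, prod = math.exp(-c), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_cpk(c, lsl, usl, n, iterations=1000, seed=1):
    """Mean and standard deviation of the estimated index over repeated samples."""
    rng = random.Random(seed)
    low, high = anscombe(lsl), anscombe(usl)  # limits go through the same transformation
    estimates = []
    for _ in range(iterations):
        y = [anscombe(poisson_draw(rng, c)) for _ in range(n)]
        centre, spread = statistics.fmean(y), statistics.stdev(y)
        if spread > 0:  # skip degenerate samples with no variation
            estimates.append(min(high - centre, centre - low) / (3 * spread))
    return statistics.fmean(estimates), statistics.stdev(estimates)
```

Calling `simulate_cpk` with two different values of `n` and comparing the returned standard deviations mirrors the sample-size comparison reported in the tables.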
The results of the simulation study are presented in
Table 1,
Table 2,
Table 3,
Table 4,
Table 5 and
Table 6. The first two processes in each table are centered, the other two are off-centered and the last two are off-target. The first columns in these tables show the values of the parameters of each distribution, the specification limits and the target value of the processes. In each process, the first row shows the mean of the index estimators and the second row shows their standard deviations.
The tables show that, for each process, the differences in the mean values of the PCI and PPI estimators under different transformation methods are minor. More specifically, the differences are very small when applying Anscombe’s and Freeman–Tukey’s transformations for Poisson processes and Freeman–Tukey’s and Chen’s transformations for binomial processes. Moreover, for all processes, the values of are very close to the corresponding values of , while for almost all of them, the values of differ from those of and . This can be explained as follows: the index is based on the ratio of the allowable proportion of nonconforming products to the observed proportion of nonconforming products, while the index is based on the ratio of the observed proportion of conformance to the allowable proportion of conformance. In other words, the index is the reciprocal of the complementary proportions of the index. For off-centered processes, the values of indices are slightly smaller than those of . Furthermore, for binomial and negative binomial off-target processes, the values of indices are very close to the corresponding indices, while for Poisson processes, the differences are larger, but both indicate incapable processes. Finally, the values of $P_p$ and $P_{pk}$ are approximately equal to those of $C_p$ and $C_{pk}$, respectively. Thus, the processes can be considered stable.
In addition, we study the impact of the sample size and the distribution parameters on the results. For a given process, the differences in the mean values of each index estimator across sample sizes are negligible. This suggests that a small sample size (say
) is sufficient to draw safe conclusions about the process capability/performance. On the other hand, the value of the standard deviation of each index estimator for
is always higher than the corresponding value for
. Note that the standard deviations of
,
and
are smaller than those of the classical PCI estimators. Moreover, for given values of
U,
L and
T, the capability of the process depends on the parameters of distribution. For example, consider the second and the fifth processes in
Table 1, where
, while
for the second process and
for the fifth one. Recall that the second process is centered and the fifth process is off-target. Table 1 shows that the differences in the mean and standard deviation values of the $C_p$ and $P_p$ indices for the two processes are negligible, while the corresponding differences are significant for the other three classical PCIs and $P_{pk}$. This was expected, as the $C_p$ and $P_p$ indices do not depend on the mean value of the process.
5. Applications
To illustrate the application of the above transformations, three examples are presented. The values of the PCIs and PPIs are computed with particular emphasis on Anscombe’s transformation for Poisson and negative binomial data and the Freeman–Tukey transformation for binomial data. The indices can be calculated similarly using the other transformation techniques. The , and indices are also computed for comparison. In the following examples, we present the tools and guide practitioners on how to apply the proposed methodology. Control charts are provided to confirm that the process is in control. In addition, we apply the transformation techniques to compute the PCIs and PPIs, and we discuss the results afterwards.
Example 1. We use the data on the number of nonconformities in samples of 100 printed circuit boards, which are described in more detail in Chapter 7 of Montgomery [12]. The dataset consists of 46 samples and is presented in Table 7. The first 26 samples are from Table 7.7 of Montgomery [12], and the last 20 samples are from Table 7.8 of Montgomery [12]. Moreover, it is assumed that the target value is nonconformities in each sample, and the upper and lower specification limits are and nonconformities in each sample, respectively. After a retrospective data analysis, two values (in bold) in
Table 7 were eliminated due to assignable causes. Fitting the Poisson distribution to the remaining 44 samples, we find that
is the MLE (
p-value = 0.4669). In
Figure 1, the
c-chart is presented. Note that the control limits of the
c-chart are computed with
As no point falls outside the control limits and no systematic pattern can be seen in the
c-chart, we can conclude that the process is in control. In addition, we compute all the
’s values,
using Anscombe’s transformation given by (2). The mean of the transformed data is
. In order to estimate the standard deviation of the transformed data, we compute their moving range. Hence, the standard deviation
is given by
An I-MR chart is presented in Figure 2.
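The c-chart check described in this example can be sketched with the usual three-sigma limits for counts, where the centre line is the estimated Poisson mean and the lower limit is truncated at zero. The function names and the numbers in the comments are hypothetical, not the values of Table 7.

```python
import math


def c_chart_limits(c_bar):
    """Three-sigma c-chart limits (LCL, CL, UCL) for a Poisson mean c_bar."""
    half_width = 3.0 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar, c_bar + half_width


def out_of_control(counts, c_bar):
    """1-based sample numbers whose counts fall outside the limits."""
    lcl, _, ucl = c_chart_limits(c_bar)
    return [i for i, x in enumerate(counts, start=1) if x < lcl or x > ucl]


# Illustration: with c_bar = 16 the limits are 4, 16 and 28.
limits = c_chart_limits(16.0)
```

An empty list from `out_of_control` covers only the first half of the in-control check; runs or other systematic patterns still need to be inspected on the chart.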
The control limits for the
I-chart are computed with
and those for the MR-chart are computed with
From the I-MR chart in Figure 2, we confirm that the process is in control. The last step before computing the four PCIs is to transform the values of
T,
U and
L, using transformation (2), and as a result, we have
,
and
. The values of the four PCIs are computed with
The long-term standard deviation of the transformed data is computed with
where
are the transformed data. Thus,
. The values of the two PPIs are equal to
Similarly, we compute the values of the indices with the other transformation techniques as well as the values of the
,
and
indices. The results are presented in
Table 8. From this table, we conclude that the differences in the values of the indices applying different transformation techniques are very small. Moreover, the values of
,
and
are very close to the corresponding values of
,
and
, while the values of $P_p$ and $P_{pk}$ are approximately equal to those of $C_p$ and $C_{pk}$. Thus, the process can be considered stable. As the PCIs are higher than one, except for
, one should conclude that the process is capable. Consequently, the transformation techniques give similar results, and any of them can be applied to draw conclusions about the capability of the process.
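The I-MR limits for the transformed data can be sketched as follows, assuming the conventional constants for moving ranges of span two (d2 = 1.128 and D4 = 3.267). The function name and the short series used in the comment are hypothetical.

```python
import statistics

D2 = 1.128  # converts the average moving range into a sigma estimate
D4 = 3.267  # upper-limit factor for the moving range chart (span 2)


def i_mr_limits(y):
    """Return (LCL, CL, UCL) for the individuals chart and the MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(y, y[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    centre = statistics.fmean(y)
    half_width = 3.0 * mr_bar / D2  # three short-term standard deviations
    return {
        "I": (centre - half_width, centre, centre + half_width),
        "MR": (0.0, mr_bar, D4 * mr_bar),
    }


# Illustration with a made-up transformed series.
limits = i_mr_limits([1.0, 2.0, 1.0, 2.0, 1.0])
```

The quantity `mr_bar / D2` is the same short-term standard deviation that enters the PCIs, so the chart and the indices are built from one estimate.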
Example 2. We use the data on the number of nonconforming cans of frozen orange juice in samples of size cans from an industrial process, which, like the previous example, are described in detail in Chapter 7 of Montgomery [12]. The dataset consists of 40 samples and is presented in Table 9. It is also assumed that the target value is nonconforming cans in each sample and that the upper and lower specification limits are and nonconforming cans per sample, respectively. Fitting the binomial distribution to these data, the MLE of the parameter p is 0.109 (p-value = 0.9999).
In
Figure 3, the
p-chart is presented. Note that the control limits of the
p-chart are computed with
From the
p-chart in
Figure 3, we conclude that the process is in control, as no point falls outside the control limits and no systematic pattern can be seen. In addition, we compute all the
’s values,
using Freeman–Tukey’s transformation, given in (7). The mean and standard deviation of the transformed data are
and
, respectively. An I-MR chart for the transformed data is presented in Figure 4.
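The two computations used so far in this example can be sketched as below. The p-chart limits follow the usual three-sigma rule for a proportion with constant sample size. The Freeman–Tukey transformation is written in its double-arcsine form, which is an assumption here, since the paper's own expression may include a different scaling. Function names and values are hypothetical.

```python
import math


def p_chart_limits(p_bar, n):
    """Three-sigma p-chart limits (LCL, CL, UCL), truncated to [0, 1]."""
    half_width = 3.0 * math.sqrt(p_bar * (1.0 - p_bar) / n)
    return max(0.0, p_bar - half_width), p_bar, min(1.0, p_bar + half_width)


def freeman_tukey_binomial(x, n):
    """Double-arcsine transformation of a binomial count x out of n (assumed form)."""
    return (math.asin(math.sqrt(x / (n + 1)))
            + math.asin(math.sqrt((x + 1) / (n + 1))))


# The specification limits and the target go through the same function as the data.
lcl, cl, ucl = p_chart_limits(0.5, 100)
```

Because the transformation is increasing in x, the transformed lower limit, target and upper limit keep their original order.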
The control limits for the
I-chart are calculated as
and those for the MR-chart are calculated with
From the I-MR chart in Figure 4, we confirm that the process is in control. The transformed values of
T,
U and
L are
,
and
, respectively, and the values of the four PCIs are computed with
The long-term standard deviation of the transformed data is equal to
. Thus, the values of the PPIs are
The values of the indices with the other transformation techniques as well as the values of the
,
and
indices are presented in
Table 10. We conclude that the differences in the values of the PCIs applying different transformation techniques are very small. The values of
,
and
are close to the values of
,
and
, respectively. As
and
and of course
and
are lower than one, one should conclude that the process is incapable. Moreover, the differences between $C_p$ and $P_p$ as well as $C_{pk}$ and $P_{pk}$ are significant. Thus, the process is unstable. As in the previous example, the transformation techniques give similar results, and any of them can be applied to draw conclusions about the capability/performance of the process. However, one should be careful, as
and
are larger than one while
,
,
and
are smaller than one; thus, one should evaluate all the indices to draw safe conclusions about the capability/performance of the process.
Example 3. We assume a process in which we count the total number of inspected products until nonconforming items are found. It is also known that the probability p of observing a nonconforming item is equal to 0.1; the target value is inspected products; and the upper and lower specification limits are and inspected products, respectively. The dataset can be found in Chapter 3 of Xie, Goh, and Kuralmani [42], and it is presented row by row in Table 11. The control limits of a CCC-
r chart are given as the solution of the following equations
where
is the acceptable risk of a false alarm. In this example, the control limits of the CCC-5 chart under a standard false alarm probability level
are
,
and
.
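A sketch of how such limits can be obtained numerically is given below. It assumes that the plotted statistic is the number of items inspected until the r-th nonconforming one, that the limits are the α/2, 0.5 and 1 − α/2 quantiles of the corresponding negative binomial distribution, and that each limit is rounded to the smallest integer whose cdf reaches the required probability. The rounding convention and the function names are assumptions.

```python
def ccc_r_cdf(x, r, p):
    """P(X <= x), where X is the number of items inspected until the r-th nonconforming one."""
    if x < r:
        return 0.0
    q = 1.0 - p
    term = p ** r  # P(X = r)
    total = term
    for i in range(r + 1, x + 1):
        term *= (i - 1) / (i - r) * q  # P(X = i) from P(X = i - 1)
        total += term
    return total


def ccc_r_quantile(prob, r, p):
    """Smallest integer x >= r with P(X <= x) >= prob."""
    q = 1.0 - p
    x, term = r, p ** r
    total = term
    while total < prob:
        x += 1
        term *= (x - 1) / (x - r) * q
        total += term
    return x


def ccc_r_limits(r, p, alpha=0.0027):
    """Return (LCL, CL, UCL) as quantiles at alpha/2, 0.5 and 1 - alpha/2."""
    return (ccc_r_quantile(alpha / 2.0, r, p),
            ccc_r_quantile(0.5, r, p),
            ccc_r_quantile(1.0 - alpha / 2.0, r, p))
```

With r = 1 the distribution reduces to the geometric case, which gives a quick way to check the recursion against the closed form 1 − (1 − p)^x.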
In
Figure 5, the CCC-5 chart is presented, from which we conclude that the process is in control, as no point falls outside the control limits and no systematic pattern can be seen. In addition, we compute the
values,
using Anscombe’s transformation, given in (14). The mean and standard deviation of the transformed data are
and
, respectively. An I-MR chart for the transformed data is presented in Figure 6.
The control limits for the
I-chart are computed with
and those for the MR-chart are computed with
From the I-MR chart in Figure 6, we note that one point, that of sample 39, is plotted above the UCL in the MR-chart. However, no assignable cause was identified, and therefore, we consider the process to be in control. The transformed values of
T,
U and
L are
,
and
, respectively, and the values of the four PCIs are computed with
The long-term standard deviation of the transformed data is equal to
. Thus, the two PPIs are computed with
The values of the indices applying the two transformation methods are presented in
Table 12. Moreover, the values of
,
and
are also presented in the same table. We note that the value of parameter
L using the Box–Cox transformation is equal to −0.5415. Minor differences are observed in the
and
values applying the two transformation methods. However, the values of the indices indicate that the process is incapable and unstable. In this example, there are small differences in the computation of the four PCIs and two PPIs. However, similar to the previous example, one should evaluate all of the indices as only
and
are larger than one for Anscombe’s transformation, while
,
and
are larger than one for the Q transformation.
As a general conclusion from the above three applications, a practitioner should compute all four PCIs and both PPIs in order to draw safe conclusions about the capability and the performance of the process. It should also be pointed out that if a practitioner evaluates only or , then they may wrongly conclude that the process is capable. Similarly, one should evaluate not only the but also the index to draw conclusions about the behaviour of the process. The results from different transformation methods are similar. Therefore, any of the methods can be used to compute the PCIs and PPIs. We also note that the conclusions about the capability and the performance of the processes obtained by applying the transformation techniques are the same as those obtained by computing the indices for non-normal distributions; thus, practitioners do not have to know the indices for non-normal data.
6. Conclusions
Process capability and performance indices, named PCIs and PPIs, respectively, are practical and useful tools for continuous quality improvement. There are various methods in the literature to estimate PCIs for non-normal continuous processes. In the case of discrete processes, new PCIs have been defined by researchers, but their use causes difficulties for many practitioners. In the present paper, we proposed various transformation techniques for data following Poisson, binomial or negative binomial distributions. The transformed data are then used to compute the well-known PCIs and PPIs. Through a simulation study, we demonstrated that the differences in the indices under different transformation methods are very small. Comparisons with other PCIs for discrete data showed that the proposed methodology can safely be applied without any misinterpretation of the process capability. Thus, we suggest that practitioners who are not familiar with indices for non-normal data use the transformation methodology in order to obtain safe conclusions about the process capability and performance. Finally, we applied the transformation techniques in three illustrative examples for data following different distributions.
For future research, we recommend the extension of these transformation techniques to multivariate discrete processes. Moreover, indices could be proposed for data following zero-inflated distributions or the Conway–Maxwell Poisson distribution, which encompasses the geometric, Poisson and Bernoulli distributions as special cases.