1. Introduction
As medical knowledge continues to expand rapidly, healthcare providers face significant challenges in thoroughly evaluating and analyzing the necessary data to make well-informed decisions [1,2,3]. The complexity of these challenges is further heightened by the variety of findings presented in different studies, which are sometimes conflicting. Meta-analysis, also known as research synthesis or integration, has become an effective tool for addressing these issues. This method applies rigorous statistical techniques to aggregate the results from multiple individual studies, thereby combining their findings [2,4]. Meta-analysis has gained widespread attention across numerous scientific fields, such as education, the social sciences, and medicine. For example, in education it has been used to consolidate research on the effectiveness of coaching in improving Scholastic Aptitude Test (SAT) scores in both the verbal and mathematical sections [5]. In the social sciences, it has been used to synthesize studies on gender differences in quantitative, verbal, and visual-spatial abilities [6]. In healthcare, meta-analysis has been particularly valuable during the COVID-19 pandemic, enhancing our understanding of the virus and informing public health strategies [7,8].
The challenge of combining two or more unbiased estimators is a common issue in applied statistics, with significant implications across various fields. A notable example of this problem occurred when Meier [9] was tasked with making inferences about the mean albumin level in plasma protein in human subjects using data from four separate experiments. Similarly, Eberhardt et al. [10] faced a scenario where they needed to draw conclusions about the mean selenium content in non-fat milk powder by integrating results from four different methods across four experiments.
Most of the early research on drawing inferences about the common mean $\mu$ focuses on point estimation and theoretical decision rules regarding $\mu$. Graybill and Deal were among the first researchers to study the estimation of $\mu$ [11]. Since then, numerous works have built upon and expanded their initial work [12,13,14,15,16,17,18], along with the related references. Conversely, Meier [9] developed a method for estimating the confidence interval for $\mu$. In addition, refs. [19,20] have devised approximate confidence intervals. The properties of such estimators have attracted substantial attention in the literature. Sinha et al. [2] derived an unbiased estimator of the variance of the Graybill-Deal estimator, and Krishnamoorthy and Moore [21] considered this in the prediction problem of linear regression.
In some cases, researchers encounter situations where prior information on the population mean is available, whether through pre-test information or historical data. Pretest (or preliminary test) and shrinkage estimators leverage such preliminary information to improve parameter estimation accuracy. These estimators borrow strength from both the sample data and the pre-test information, resulting in higher efficiency and reliability than traditional estimators. Bancroft [22] and Stein [23] introduced and extensively examined the preliminary test shrinkage estimator. Their method has influenced numerous advancements and applications in statistics and has established a basis for the use of shrinkage estimators in contemporary statistical practice [24,25,26]. Thompson [27] proposed a shrinkage technique given as $\hat{\theta}_{S} = q\,\hat{\theta} + (1-q)\,\theta_0$ (accept $H_0: \theta = \theta_0$) and $\hat{\theta}_{S} = \hat{\theta}$ (reject $H_0$), with $\theta_0$ as the prior guess. This was aimed at improving the current estimator of a parameter $\theta$ to estimate the mean, thereby reducing the mean square error (MSE) of the uniform minimum-variance unbiased estimator (UMVUE) for the population mean. It has been observed that the shrinkage estimator performs better than the conventional estimator when the prior guess aligns closely with the true value of the parameter. Consequently, instead of treating $q$ as a constant value in the shrinkage estimator, it is advisable to regard it as a weight ranging between 0 and 1 [27]. In this context, $q$ can be viewed as a continuous function of certain pertinent statistics, anticipating that its value will decrease consistently as the deviation of the estimator from the prior guess increases.
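To illustrate the shrinkage idea, the following sketch uses an illustrative smooth weight function of our own (not one proposed in the literature cited above) in which $q$ decays toward 0 as the data approach the prior guess and toward 1 as they move away:

```python
import math

def shrinkage_estimate(xbar, se, mu0, q=None):
    """Shrinkage estimate q*xbar + (1 - q)*mu0 toward a prior guess mu0.

    If q is not supplied, an illustrative smooth weight is used: it is
    small when xbar is close to mu0 (strong shrinkage toward the guess)
    and approaches 1 as the standardized deviation grows (no shrinkage).
    """
    if q is None:
        z = (xbar - mu0) / se        # standardized deviation from the guess
        q = 1.0 - math.exp(-z * z)   # q near 0 close to mu0, near 1 far away
    return q * xbar + (1.0 - q) * mu0

# Data close to the prior guess are pulled strongly toward it;
# data far from the guess are left essentially untouched.
est_near = shrinkage_estimate(xbar=10.05, se=0.5, mu0=10.0)
est_far = shrinkage_estimate(xbar=13.0, se=0.5, mu0=10.0)
```

The particular decay function is an assumption made for illustration; any continuous weight that decreases toward the prior guess exhibits the same qualitative behavior.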
This preliminary test approach has been widely used in statistics [24,25,26]. Khan et al. [24] deployed a preliminary test for estimating the mean of a univariate normal population with an unknown variance. Shih et al. [25] proposed a class of general pretest estimators for the univariate normal mean which includes numerous existing estimators, such as pretest, shrinkage, Bayes, and empirical Bayes estimators. In the context of meta-analysis, Taketomi et al. [26] proposed simultaneous estimation of individual means using the James–Stein shrinkage estimators, which improved upon individual studies' estimators. The literature shows that when prior information is available, shrinkage estimators for parameters of various distributions tend to outperform standard estimators in terms of MSE, especially when the prior guess is close to the true value [22,27,28].
The use of prior information in estimating the common mean has several significant advantages. First, it allows researchers to leverage valuable past knowledge, whether from historical data, expert opinion, or preliminary investigations, improving the accuracy of the estimation process. Second, the resulting estimators tend to strike a compromise between bias and variance, yielding estimates that are more efficient than traditional unbiased estimators, particularly in circumstances with small sample sizes. However, there is limited research on preliminary test-based point estimators of the common mean. Therefore, in this study we propose a preliminary test estimator for the common mean with unknown and unequal variances. To identify the ideal estimator, we examine the properties of the proposed preliminary test estimator, including its theoretical basis and performance-based criteria such as bias and MSE.
2. Background
To define the current problem, we assume there are $k$ independent normal populations with a common mean $\mu$, but with unknown and potentially unequal variances $\sigma_1^2, \ldots, \sigma_k^2$. We assume we have independent and identically distributed (i.i.d.) observations $X_{i1}, \ldots, X_{in_i}$ from $N(\mu, \sigma_i^2)$, $i = 1, \ldots, k$, and we define the sample means $\bar{X}_i$ and sample variances $S_i^2$ as

$$\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i\right)^2,$$

where $i = 1, \ldots, k$. Note that these statistics $\bar{X}_1, S_1^2, \ldots, \bar{X}_k, S_k^2$ are all mutually independent. Again, it can be noted that $(\bar{X}_1, S_1^2, \ldots, \bar{X}_k, S_k^2)$ are minimal sufficient statistics for $(\mu, \sigma_1^2, \ldots, \sigma_k^2)$ but not complete [29]. As a result, one cannot obtain the uniformly minimum variance unbiased estimator (UMVUE), if it exists, by applying the standard Rao-Blackwell theorem to an unbiased estimator of $\mu$. For the case of $k$ populations in which the variances are fully known, $\mu$ can be readily estimated as

$$\hat{\mu} = \frac{\sum_{i=1}^{k} (n_i/\sigma_i^2)\,\bar{X}_i}{\sum_{i=1}^{k} (n_i/\sigma_i^2)}.$$

This estimator, $\hat{\mu}$, is the UMVUE, the best linear unbiased estimator (BLUE), and the maximum likelihood estimator (MLE). In the context of our current problem, where the population variances are unknown and possibly unequal, the most appealing unbiased estimator for $\mu$ is the Graybill-Deal (GD) estimator [11], which is

$$\hat{\mu}_{GD} = \frac{\sum_{i=1}^{k} (n_i/S_i^2)\,\bar{X}_i}{\sum_{i=1}^{k} (n_i/S_i^2)}.$$

In the case of two samples, GD [11] first demonstrated that the unbiased estimator $\hat{\mu}_{GD}$ in Equation (6) has a lower variance than either sample mean, provided that both sample sizes exceed 10.
Khatri and Shah [30] proposed an exact variance formula for $\hat{\mu}_{GD}$, which is complex and not easily applied. To tackle this inference issue, Meier [9] derived a first-order approximation of the variance of $\hat{\mu}_{GD}$, given by

$$\operatorname{Var}\left(\hat{\mu}_{GD}\right) = \frac{1}{\sum_{i=1}^{k} \lambda_i}\left[1 + 4\sum_{i=1}^{k} \frac{c_i\left(1 - c_i\right)}{n_i - 1}\right] + O\!\left(\sum_{i=1}^{k} \frac{1}{(n_i - 1)^2}\right),$$

where $\lambda_i = n_i/\sigma_i^2$ and $c_i = \lambda_i/\sum_{j=1}^{k}\lambda_j$. A few years later, Sinha [31] developed an unbiased estimator for the variance of $\hat{\mu}_{GD}$ that takes the form of a convergent series. A first-order approximation of this estimator is

$$\widehat{\operatorname{Var}}_{1}\left(\hat{\mu}_{GD}\right) = \frac{1}{\sum_{i=1}^{k} \hat{\lambda}_i}\left[1 + 4\sum_{i=1}^{k} \frac{\hat{c}_i\left(1 - \hat{c}_i\right)}{n_i + 1}\right],$$

where $\hat{\lambda}_i = n_i/S_i^2$ and $\hat{c}_i = \hat{\lambda}_i/\sum_{j=1}^{k}\hat{\lambda}_j$. The above estimator is comparable to Meier's [9] approximate estimator, defined as

$$\widehat{\operatorname{Var}}_{2}\left(\hat{\mu}_{GD}\right) = \frac{1}{\sum_{i=1}^{k} \hat{\lambda}_i}\left[1 + 4\sum_{i=1}^{k} \frac{\hat{c}_i\left(1 - \hat{c}_i\right)}{n_i - 1}\right].$$

The "classical" meta-analysis variance estimator, $\widehat{\operatorname{Var}}_{3}\left(\hat{\mu}_{GD}\right)$, is given as

$$\widehat{\operatorname{Var}}_{3}\left(\hat{\mu}_{GD}\right) = \frac{1}{\sum_{i=1}^{k} \hat{\lambda}_i}.$$

The approximate variance estimator proposed by Hartung [32], $\widehat{\operatorname{Var}}_{4}\left(\hat{\mu}_{GD}\right)$, is given as

$$\widehat{\operatorname{Var}}_{4}\left(\hat{\mu}_{GD}\right) = \frac{1}{k - 1}\sum_{i=1}^{k} \hat{c}_i\left(\bar{X}_i - \hat{\mu}_{GD}\right)^2.$$
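As a concrete illustration of the Graybill-Deal estimator and the "classical" meta-analysis variance estimator, consider the following short numerical sketch (function and variable names are ours):

```python
import numpy as np

def graybill_deal(xbar, s2, n):
    """Graybill-Deal common-mean estimate with estimated weights n_i / S_i^2,
    together with the 'classical' variance estimate 1 / sum(n_i / S_i^2)."""
    xbar, s2, n = (np.asarray(a, dtype=float) for a in (xbar, s2, n))
    w = n / s2                        # estimated precision of each sample mean
    mu_gd = np.sum(w * xbar) / np.sum(w)
    var_classical = 1.0 / np.sum(w)
    return mu_gd, var_classical

# Two studies with equal sizes: the study with the smaller sample variance
# receives the larger weight, so the pooled mean is pulled toward 10.0.
mu_gd, var_c = graybill_deal(xbar=[10.0, 12.0], s2=[1.0, 4.0], n=[20, 20])
```

With weights $20/1 = 20$ and $20/4 = 5$, the pooled estimate is $(20 \cdot 10 + 5 \cdot 12)/25 = 10.4$ with classical variance estimate $1/25 = 0.04$.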
3. Proposed Preliminary Test Estimator
It is reasonable to test a null hypothesis when uncertain non-sample prior information is available. A preliminary test estimator is a two-step procedure that estimates a key parameter using the results of a preliminary test. To estimate $\mu$, we consider the hypothesis

$$H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \neq \mu_0.$$

Our proposed preliminary test estimator for $\mu$ is as follows:

$$\hat{\mu}_{PT} = \varphi\,\hat{\mu}_{GD} + (1 - \varphi)\,\mu_0,$$

where $\hat{\mu}_{GD}$ is an unbiased estimator of $\mu$. Shih et al. [25] defined $\varphi$ to be a test function with $\varphi = 1$ (reject $H_0$), $\varphi = 0$ (fail to reject $H_0$), and $0 \le \varphi \le 1$. For $0 < \varphi < 1$, a randomized test is obtained. The rejection of, or failure to reject, $H_0$ will be based on the t statistic. A standard notation for a t statistic based on a sample of size $n$ is $t_{n-1}$. We can refer to the value of $t$ computed from a specific set of data as the observed value of our test statistic, and we reject $H_0$ when $|t| \ge t_{n-1,\alpha/2}$, where $n - 1$ is the degrees of freedom and $\alpha$ is the Type-I error level. A test for $H_0$ based on a $p$-value, on the other hand, is based on $p = \Pr\left(|T_{n-1}| \ge |t|\right)$, and we reject $H_0$ at level $\alpha$ if $p \le \alpha$. We let $\nu = n - 1$ as usual. Then $T_{\nu}$ stands for the central t variable with $\nu$ degrees of freedom, and $t_{\nu,\alpha}$ stands for the upper $\alpha$ percentile of $T_{\nu}$. The general preliminary test estimator [33] can be defined as

$$\hat{\mu}_{PT} = \varphi\,\hat{\mu}_{GD} + (1 - \varphi)\,\mu_0.$$

The estimator can also be written as

$$\hat{\mu}_{PT} = \hat{\mu}_{GD}\,I[H_0 \text{ rejected}] + \mu_0\,I[H_0 \text{ not rejected}].$$

In this study, we focus only on the case where $\varphi \in \{0, 1\}$. We can define our proposed preliminary test estimator with unknown variance as

$$\hat{\mu}_{PT} = \hat{\mu}_{GD}\,I[A] + \mu_0\left(1 - I[A]\right),$$

where $I[A]$ is the indicator function defined as $I[A] = 1$ if A is true and $I[A] = 0$ if A is false. A random $p$-value, which has a uniform $U(0,1)$ distribution under $H_0$, is defined as $P_i = \Pr\left(|T_{n_i - 1}| \ge |t_i|\right)$, where $t_i$ is the t statistic from the $i$-th sample. Most suggested tests for $H_0$ are based on the $t_i$ and $P_i$ values. To simplify the notation, we will denote an observed $p$-value by small $p$ and a random $p$-value by large $P$. In our context, we have independent t statistics, $t_1, \ldots, t_k$, and also independent $p$-values, $P_1, \ldots, P_k$. In the following, we suggest various test procedures for testing $H_0$ based on suitable combinations of the $t_i$ and $P_i$ [34]. Depending on the test procedure we use, the rejection set A will be defined and used to compute the bias and MSE of the preliminary test estimator of the common mean $\mu$.
3.1. P-Value Based Exact Tests
Suppose $P_1, \ldots, P_k$ are independent $p$-values obtained from $k$ continuous distributions of test statistics; then, when the individual hypothesis is true, each $P_i$ is uniformly distributed over the interval $(0, 1)$. We test the joint null hypothesis $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$. Five $p$-value-based exact tests based on the $P_i$'s from $k$ independent studies, as available in the literature, are listed below.
3.1.1. Tippett’s Test
Suppose $P_{(1)} \le P_{(2)} \le \cdots \le P_{(k)}$ are the ordered values of independent $p$-values. Then $H_0$ is rejected if $P_{(1)} < c$. If the overall significance level is $\alpha$, then $c = 1 - (1 - \alpha)^{1/k}$. Interestingly, this test is equivalent to the test based on $\min_i P_i$ suggested by Cohen and Sackrowitz [35]. This test was proposed by Tippett [36] and is also called the union-intersection test.
3.1.2. Wilkinson’s Test
Wilkinson [37] provided a generalization of Tippett's test, where $P_{(1)} \le \cdots \le P_{(k)}$ are the ordered $p$-values with $P_{(1)}$ the smallest $p$-value, using $P_{(r)}$ as a test statistic. The common mean null hypothesis $H_0$ will be rejected if $P_{(r)} < d_{r,\alpha}$, where $P_{(r)}$ follows a beta distribution with parameters $r$ and $k - r + 1$ under the null hypothesis and $d_{r,\alpha}$ satisfies $\Pr\left(P_{(r)} < d_{r,\alpha}\right) = \alpha$. This generates a series of tests for various values of $r$.
3.1.3. Inverse Normal Test
Stouffer et al. [38] reported that the inverse normal test procedure involves transforming the $p$-values to the corresponding standard normal quantiles. The test statistic is defined as $Z = \frac{1}{\sqrt{k}}\sum_{i=1}^{k} \Phi^{-1}(P_i)$, where $\Phi$ is the standard normal cumulative distribution function (CDF). The common mean null hypothesis $H_0$ will be rejected if $Z < -z_{\alpha}$, where $z_{\alpha}$ denotes the upper $\alpha$ level cut-off point of the standard normal distribution.
3.1.4. Fisher's Inverse $\chi^2$-Test
Fisher [39] noted that the test statistic $\psi = -2\sum_{i=1}^{k} \ln P_i$ has a $\chi^2$ distribution with $2k$ degrees of freedom when $H_0$ is true. This procedure uses the $\chi^2$ distribution to combine the $k$ independent $p$-values. The common mean null hypothesis $H_0$ will be rejected if $\psi > \chi^2_{2k,\alpha}$, where $\chi^2_{2k,\alpha}$ denotes the upper $\alpha$ critical value of the $\chi^2$-distribution with $2k$ degrees of freedom.
3.1.5. The Logit Test
This exact test procedure, which involves transforming each $p$-value into a logit, was proposed by Mudholkar and George [40]. The test statistic is defined as

$$G = -\sum_{i=1}^{k} \ln\!\left(\frac{P_i}{1 - P_i}\right)\Bigg/\sqrt{\frac{k\pi^2(5k + 2)}{3(5k + 4)}},$$

where $G$ approximately follows Student's t-distribution with $5k + 4$ degrees of freedom. The common mean null hypothesis $H_0$ is rejected if $G > t_{5k+4,\alpha}$.
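Several of the $p$-value combination rules above can be sketched as follows (a minimal illustration using scipy for the reference distributions; Wilkinson's test follows similarly via the Beta($r$, $k - r + 1$) quantile):

```python
import numpy as np
from scipy import stats

def combine_pvalues(p, alpha=0.05):
    """Reject/fail-to-reject decisions for four exact p-value combination tests."""
    p = np.asarray(p, dtype=float)
    k = p.size
    # Tippett: reject if the smallest p-value is below 1 - (1 - alpha)^(1/k).
    tippett = p.min() < 1.0 - (1.0 - alpha) ** (1.0 / k)
    # Inverse normal (Stouffer): reject for a small normalized probit sum.
    z = stats.norm.ppf(p).sum() / np.sqrt(k)
    inverse_normal = z < -stats.norm.ppf(1.0 - alpha)
    # Fisher: -2 * sum(log p) is chi-square with 2k df under H0.
    fisher = -2.0 * np.log(p).sum() > stats.chi2.ppf(1.0 - alpha, 2 * k)
    # Logit (Mudholkar-George): scaled logit sum is approximately t with 5k+4 df.
    scale = np.sqrt(k * np.pi ** 2 * (5 * k + 2) / (3.0 * (5 * k + 4)))
    logit = -np.log(p / (1.0 - p)).sum() / scale > stats.t.ppf(1.0 - alpha, 5 * k + 4)
    return {"tippett": tippett, "inverse_normal": inverse_normal,
            "fisher": fisher, "logit": logit}

# Four studies, each with strong evidence against H0: every rule rejects.
decisions = combine_pvalues([0.001, 0.002, 0.004, 0.003])
```

With $p$-values this small, all four rules reject; the rules differ mainly in how they trade sensitivity to one very small $p$-value (Tippett) against consistent moderate evidence across studies (inverse normal, Fisher, logit).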
3.2. Exact Tests
3.2.1. Modified t
Fairweather [41] suggested using a weighted linear combination of the $t_i$'s, namely $T_t = \sum_{i=1}^{k} w_i t_i$, where the weights $w_i$ are taken inversely proportional to the variances of the $t_i$'s, with $\operatorname{Var}(t_i) = (n_i - 1)/(n_i - 3)$. The null hypothesis $H_0$ is rejected if $T_t > f_{\alpha}$, where $f_{\alpha}$ is the upper $\alpha$ critical value computed by simulation.
3.2.2. Modified F
Jordan and Krishnamoorthy [42] suggested using a linear combination of the $F_i$'s, namely $T_F = \sum_{i=1}^{k} u_i F_i$, where $F_i = n_i\left(\bar{X}_i - \mu_0\right)^2/S_i^2$ follows an $F$ distribution with $1$ and $n_i - 1$ degrees of freedom, with the weights $u_i$ taken inversely proportional to the variances of the $F_i$'s for $i = 1, \ldots, k$. The null hypothesis $H_0$ will be rejected if $T_F > c_{\alpha}$, where $c_{\alpha}$ is the upper $\alpha$ critical value computed by simulation.
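Since the null distributions of such weighted combinations are not available in closed form, their critical values are obtained by simulation, as noted above. A minimal sketch for a weighted sum of independent Student-t variables (the equal weights here are illustrative, not Fairweather's exact choice) is:

```python
import numpy as np

rng = np.random.default_rng(2024)

def simulated_critical_value(df, weights, alpha=0.05, n_sim=200_000):
    """Upper-alpha critical value of T = sum_i w_i * t_i by Monte Carlo,
    where the t_i are independent Student-t variables with df[i] degrees
    of freedom (no closed-form null distribution is available)."""
    draws = np.column_stack([rng.standard_t(d, size=n_sim) for d in df])
    stat = draws @ np.asarray(weights, dtype=float)
    return float(np.quantile(stat, 1.0 - alpha))

# Four studies; equal illustrative weights.
crit = simulated_critical_value(df=[9, 11, 14, 9], weights=[0.25] * 4)
```

Increasing `n_sim` shrinks the Monte Carlo error of the estimated quantile at the usual $O(1/\sqrt{n_{\text{sim}}})$ rate.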
3.3. Properties of the Proposed Preliminary Test Estimator
3.3.1. Bias
The bias of the proposed preliminary test estimator is equal to $\operatorname{Bias}\left(\hat{\mu}_{PT}\right) = E\left(\hat{\mu}_{PT}\right) - \mu$, where

$$E\left(\hat{\mu}_{PT}\right) = E\left\{\hat{\mu}_{GD}\,I[A]\right\} + \mu_0\Pr\left[A^{c}\right].$$

Given that the rejection of $H_0$ and the estimator $\hat{\mu}_{GD}$ both depend upon the sample means $\bar{X}_i$ and sample variances $S_i^2$, it may be concluded that $I[A]$ and $\hat{\mu}_{GD}$ are not mutually independent, so the expectation above does not factorize.
3.3.2. Mean Square Error
The MSE of $\hat{\mu}_{PT}$ can be expressed as

$$\operatorname{MSE}\left(\hat{\mu}_{PT}\right) = E\left(\hat{\mu}_{PT} - \mu\right)^2 = E\left\{\left(\hat{\mu}_{GD} - \mu\right)^2 I[A]\right\} + \left(\mu_0 - \mu\right)^2 \Pr\left[A^{c}\right].$$
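Because $I[A]$ and the Graybill-Deal estimator are dependent, closed-form evaluation of the bias and MSE is difficult, and Monte Carlo simulation is a natural route. The sketch below (our own illustration) uses Fisher's combination of the $k$ one-sample t-test $p$-values as the rejection rule A:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def pretest_bias_mse(mu, mu0, sigmas, ns, alpha=0.05, n_sim=4000):
    """Monte Carlo bias and MSE of mu_PT = mu_GD * I[A] + mu0 * (1 - I[A]),
    where A = {H0: mu = mu0 rejected by Fisher's combination of the k
    one-sample t-test p-values}."""
    k = len(ns)
    crit = stats.chi2.ppf(1.0 - alpha, 2 * k)
    est = np.empty(n_sim)
    for b in range(n_sim):
        xbar = np.empty(k)
        s2 = np.empty(k)
        pval = np.empty(k)
        for i, (sig, n) in enumerate(zip(sigmas, ns)):
            x = rng.normal(mu, sig, size=n)
            xbar[i] = x.mean()
            s2[i] = x.var(ddof=1)
            t = (xbar[i] - mu0) / np.sqrt(s2[i] / n)
            pval[i] = 2.0 * stats.t.sf(abs(t), n - 1)
        w = np.asarray(ns) / s2
        mu_gd = np.sum(w * xbar) / np.sum(w)
        reject = -2.0 * np.log(pval).sum() > crit
        est[b] = mu_gd if reject else mu0
    return est.mean() - mu, np.mean((est - mu) ** 2)

# Under H0 (mu = mu0) the estimator usually returns mu0, so its MSE is small.
bias, mse = pretest_bias_mse(mu=10.0, mu0=10.0, sigmas=[1.0, 2.0], ns=[10, 10])
```

Swapping in a different combination test from Section 3.1 or 3.2 only changes the line that computes `reject`.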
5. Application in Biological Research
To demonstrate the practical applicability of the proposed preliminary test estimator, we analyzed data from four experiments used to estimate the percentage of albumin in plasma protein of normal human subjects. This dataset is reported in Meier [9] and appears in Table 4. For this dataset, previous studies focusing on the testing problem [44,45] have compared the various test procedures for testing $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$.
In our scenario, we could consider 59.50 as our non-sample prior information and apply our proposed preliminary test estimator to address this issue. According to the findings presented in Table 5, the estimated mean $\hat{\mu}_{PT}$ derived from the $p$-value based tests (including Tippett's, Wilkinson's (for the values of $r$ considered), the inverse normal, and Fisher's tests) notably integrates the non-sample prior information.
In our second application of the proposed preliminary test estimator, we analyzed the data from four experiments on non-fat milk powder. This dataset is reported by Eberhardt et al. [10] and appears in Table 6. We can compute values of $\hat{\mu}_{PT}$ for different values of $\mu_0$ with fixed sampling values, based on the $p$-value based and modified exact tests. The resulting values are shown in Table 7.
The findings presented in Table 7 suggest that when $\mu_0$ is below 110.00, $H_0$ is rejected and hence $\hat{\mu}_{PT} = \hat{\mu}_{GD}$. However, when $\mu_0$ falls within the range of 110.00 to 110.50, tests including Tippett's, Wilkinson's (for several choices of $r$), Fisher's, and the logit tests do not reject the null hypothesis $H_0$, indicating an estimated common mean $\hat{\mu}_{PT}$ equal to 110.00, whereas the other tests reject $H_0$, estimating the common mean $\hat{\mu}_{PT}$ equal to 109.60. For larger values of $\mu_0$, tests based on Wilkinson's (for some choices of $r$) and the modified F tests also fail to reject $H_0$, with an estimated $\hat{\mu}_{PT}$ equal to 110.00. Both the inverse normal test and the modified t test accepted the null hypothesis for various values of $\mu_0$. This may be because the inverse normal test transforms $p$-values into z-scores and combines them, whereas the modified t test adjusts the traditional t test procedure to address specific issues such as heteroscedasticity or small sample sizes.
From the above results, we do not intend to make any broad conclusions here, but our simulation results suggest that our proposed preliminary test estimator based on Tippett's, Wilkinson's (for several choices of $r$), Fisher's, and the logit tests is feasible and could be applied to this specific case if prior information about the population mean is available.
6. Conclusions
The past decade has witnessed increased interest in estimating unknown quantities using data from multiple independent yet non-homogeneous samples. This approach finds application across various domains, as evidenced by the diverse range of applications discussed in the recent book by Sinha et al. [2]. In this study, we introduce a preliminary test estimator that integrates non-sample prior information. Our simulations indicate that the proposed estimator exhibits distinct advantages in certain scenarios, particularly when dealing with very small sample sizes. Notably, the proposed estimator significantly reduces MSE values compared to traditional unbiased estimators, especially when the prior guess $\mu_0$ is in proximity to the true common mean $\mu$. Moreover, the performance of the proposed estimator, when based on Tippett's, Wilkinson's (for several choices of $r$), Fisher's, and the logit tests, surpasses that of $\hat{\mu}_{GD}$, particularly in cases involving very small sample sizes. For substantial sample sizes, the suggested estimator, deploying the inverse normal and modified F tests, demonstrated consistent and dependable performance, with only a small MSE discrepancy compared to the MSE of the unbiased estimator. Consequently, we advocate for the adoption of the proposed estimator to enhance the accuracy of estimating $\mu$. Nevertheless, no universally optimal estimator performs best across all scenarios. Consequently, it becomes crucial to select an appropriate estimator tailored to each specific scenario. The decision on which estimator to employ relies on the objectives of the research, making it challenging to devise a purely statistical strategy for selection. Our findings in this article suggest that, through careful application to real meta-analyses, the proposed estimator exhibits promising potential.
This article primarily considered the scenario under the general preliminary test estimator whereby $\varphi \in \{0, 1\}$. Extensions of this work could explore cases where $0 < \varphi < 1$ through the introduction of a randomized test, in which the probability function $\varphi$ is treated as a shrinkage parameter. Consequently, the proposed estimator would transition to a non-randomized form [25]. Additionally, it is pertinent to highlight that this study focuses on the univariate common mean of multiple normal populations. Future extensions could broaden the scope to encompass multiple responses, such as a bivariate common mean.