Enhancing Efficiency: Halton Draws in the Generalized True Random Effects Model

Bernstein, David H.

doi:10.3390/econometrics12040032


Department of Economics, University of Miami, Coral Gables, FL 33146, USA

Econometrics 2024, 12(4), 32; https://doi.org/10.3390/econometrics12040032

Submission received: 26 July 2024 / Revised: 9 October 2024 / Accepted: 23 October 2024 / Published: 6 November 2024


Abstract

This paper measures the impact of the number of Halton draws in excess of $\lceil \sqrt{n} \rceil$ on technical efficiency in the generalized true random effects (four-component) stochastic frontier model estimated by simulated maximum likelihood. A substantial set of Monte Carlo simulations demonstrates that increasing the number of Halton draws to $\lceil n^{3/4} \rceil$ ($\lceil n^{2/3} \rceil$) decreases the mean squared error of the total technical efficiency estimates by 6.1 (4.9) percent. Furthermore, increasing the number of Halton draws either improves or has no detrimental impact on correlation, mean squared error, relative bias, and upward bias for persistent, transient, and total technical efficiency. An energy sector application is included to demonstrate how these issues can arise in practice and how increasing Halton draws can improve parameter and efficiency estimates in empirical work.

Keywords: stochastic frontier; Halton sequences; panel data

1. Introduction

The generalized true random effects (GTRE) model log likelihood contains an integral with no closed-form solution, which can be difficult to estimate directly.1 This integral can be estimated by simulation via the Butler and Moffitt (1982) formulation, with the ultimate goal of obtaining precise efficiency estimates. Technical efficiency estimates have been utilized by a wide swath of applied research since the seminal cross-sectional model by Aigner et al. (1977) and Meeusen and van Den Broeck (1977).2 The four-component panel data model (Colombi 2010; Colombi et al. 2011; Colombi et al. 2014; Kumbhakar et al. 2014; Filippini and Greene 2016), which builds upon the earlier cross-sectional form, allows for the parsing of persistent and transient inefficiency, adding depth to efficiency studies. This paper identifies an under-appreciated issue of integral estimation in the efficiency literature, and it presents practitioners with a practical guide on both the integral estimation itself and the selection of the number of Halton draws above $\lceil \sqrt{n} \rceil$ therein.3

By increasing the number of Halton draws in the GTRE model estimated by simulated maximum likelihood above $\lceil \sqrt{n} \rceil$, the impact on the time-varying, persistent, and total technical efficiency values can be assessed. Greene (2003) deployed 200 Halton draws for a sample size of 158 ($\lceil n^{1.047} \rceil$) for the normal-gamma model, while Bernstein (2020) used 537 Halton draws for a sample size of 3553 ($\lceil n^{0.769} \rceil$) for the four-component model. Rather than relying on the ad hoc use of relatively large numbers ($\gg \lceil \sqrt{n} \rceil$), this work quantifies the benefit of increasing the number of Halton draws, which is relevant given the computational cost associated with increasing draws.

Our simulations indicate that increasing the number of Halton draws to $\lceil n^{3/4} \rceil$ ($\lceil n^{2/3} \rceil$) decreases the mean squared error of the total technical efficiency estimates by 6.1 (4.9) percent. Similarly, increasing the number of Halton draws decreases the mean squared error of the transient technical efficiency estimates by 2.1 (1.7) percent and decreases the mean squared error of the persistent technical efficiency estimates by 3.2 (1.5) percent. These simulations also indicate either improvement or no negative (statistical) impact for correlation, mean squared error, relative bias, and upward bias4 for persistent, transient, and total technical efficiency (on average, other things equal).

The remainder of this paper proceeds as follows: Section 2 describes the foundational concepts for Halton sequences, the methods of Filippini and Greene (2016), and the results for the bias and noise of the simulated estimators. Section 3 builds on the framework of Badunenko and Kumbhakar (2016) and extends it by considering the number of Halton draws. Section 4 considers an empirical energy sector example based on Bernstein (2020) with additional techniques deployed in the estimation. Section 5 concludes and suggests future areas of exploration.

2. Methods

Numerical integral estimation is a pervasive problem in economics and elsewhere, encompassing numerous decision points and trade-offs. Although this work focuses on the GTRE in the efficiency literature, this section may be of interest to econometric practitioners of the mixed logit and discrete response models more broadly, or, generally, to researchers estimating an integral with no closed-form solution via the Butler and Moffitt (1982) formulation. In this section, a formulation and discussion of Halton sequences is provided, followed by their specific application to the four-component stochastic frontier model estimated by simulated maximum likelihood and by general theoretical properties of simulated estimators.

2.1. Halton Sequences

The Halton sequence, developed in Halton (1960), is a quasi-random sequence of numbers that approximates draws from a uniform density on the closed interval from zero to one.5 From the universality of the uniform, Halton draws can be mapped into any continuous distribution for numerical methods, including integral approximation.

Halton sequences ($h_{1}, \dots, h_{n}$) are constructed by beginning with a prime number greater than or equal to 2. Given the starting value of 2, one simply begins with the sequence $\{h_{1}, h_{2}\} = \{0, 1/2\}$, then adds $1/2^{2} = 1/4$ to that, in order to obtain $\{h_{3}, h_{4}\} = \{1/4, 3/4\}$. Then, one simply adds $1/2^{3} = 1/8$ to the entire sequence $\{h_{1}, \dots, h_{4}\}$ to obtain $\{h_{5}, \dots, h_{8}\} = \{1/8, 5/8, 3/8, 7/8\}$, and so on. The general case for a Halton sequence for prime $k$ is

$$h_{t+1} = \left\{ h_{t},\; h_{t} + \frac{1}{k^{t}},\; h_{t} + \frac{2}{k^{t}},\; \dots,\; h_{t} + \frac{k-1}{k^{t}} \right\},$$

(1)

where each subsequence is a cycle of length $k$ (Train 2009).
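The construction just described can be reproduced with the radical-inverse form of the Halton sequence. The following is a minimal Python sketch (the paper itself works in R with randtoolbox); the function names `radical_inverse` and `halton` are illustrative and do not come from the paper, and exact fractions are used only so that the output can be compared directly with the values in the text.

```python
from fractions import Fraction


def radical_inverse(index: int, base: int) -> Fraction:
    """Mirror the base-`base` digits of `index` about the radix point."""
    result = Fraction(0)
    scale = Fraction(1, base)
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


def halton(n_points: int, base: int = 2) -> list:
    """First `n_points` Halton points for a prime `base`, starting at 0."""
    return [radical_inverse(i, base) for i in range(n_points)]


first_eight = halton(8, base=2)
print([str(h) for h in first_eight])
# ['0', '1/2', '1/4', '3/4', '1/8', '5/8', '3/8', '7/8']
```

The first eight points for prime 2 match the sets $\{0, 1/2\}$, $\{1/4, 3/4\}$, and $\{1/8, 5/8, 3/8, 7/8\}$ listed above. Passing a different prime as `base` gives the sequence for that prime.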

The advantage of Halton sampling is illustrated in Figure 1, for sample sizes of $N \in \{150, 300, 600, 1000\}$. The Halton draws are mapped to the Gaussian distribution and are generated from the R package randtoolbox, using the halton call. The R built-in package stats creates ordinary normal draws using the rnorm call. Visual inspection of Figure 1 illustrates that Halton draws approximate the normal distribution better than ordinary draws.

In order to reduce correlation between/among Halton draws, the usual practice is to discard the first several draws in excess of the largest prime number used to generate the sequence(s) and to scramble the sequence(s) order, especially when moving beyond the one-dimensional (single-sequence) case (Train 2009). Additionally, for problems requiring more than one sequence, each sequence is generated with a different prime to reduce correlation. For the two-dimensional Halton draws in the simulation in Section 3, antithetic variates (see Section 4 for discussion) were not utilized, no draws were discarded or scrambled, and the 7th and 8th primes (17 and 19) were selected as the seeds for $r_{i}$ and $h_{i}$, respectively. However, for the empirical exercise in Section 4.1, the first and second primes (2 and 3) were selected, and antithetic variates, discarding, and scrambling were all explored.

In a maximum simulated likelihood (MSL) context, there are different views in the literature as to whether draws should differ from one observation to the next (McFadden 1989; Train 2009) or if draws should be the same from observation to observation. McFadden (1989) suggests using several independent draws for each observation, to reduce “simulation chatter” and increase efficiency.6 Hence, the MSL in the McFadden (1989) approach is evaluated at more points and, therefore, may converge differently than MSL using the same draws, making the two approaches difficult to compare directly while holding draws fixed. A benefit to using the same draws for each observation is that the results are invariant to the order selection of the draws, which is not the case when the draws differ for some or all observations. In this context (and generally), more MSL evaluations improve the simulated log likelihood but increase computational intensity and model run times. Therefore, to economize on time and computational resources, this work utilized the same Halton draws for each observation.

2.2. The Generalized True Random Effects Model

The generalized true random effects (GTRE) model is a four-component model of the form

$$Y_{it} = e^{\alpha_{0}} \left( \prod_{j=1}^{K} X_{j,it}^{\alpha_{j}} \right) e^{r_{i} - h_{i} + v_{it} - u_{it}}.$$

(2)

The usual panel residuals are distributed normally with variance $\sigma_{v}^{2}$, where $v_{it} \sim N(0, \sigma_{v}^{2})$, and firm heterogeneity is $r_{i} \sim N(0, \sigma_{r}^{2})$. The persistent component of inefficiency, $h_{i}$, is distributed half-normal with pre-truncation variance $\sigma_{h}^{2}$, and the time-varying component, $u_{it}$, is distributed half-normal with pre-truncation variance $\sigma_{u}^{2}$. After taking the natural logarithm of both sides, where $\log(X) = x$, Equation (2) becomes

$$y_{it} = \alpha_{0} + \sum_{j=1}^{K} \alpha_{j} x_{j,it} + r_{i} - h_{i} + v_{it} - u_{it} = \alpha_{0} + \sum_{j=1}^{K} \alpha_{j} x_{j,it} + \epsilon_{it}.$$

(3)

The vectorized version of Equation (3) is estimated by simulated log likelihood (see Filippini and Greene (2016)), as follows:

$$\begin{aligned} \ln L_{s}(\alpha, \lambda, \sigma, \sigma_{r}, \sigma_{h}) = \sum_{i=1}^{N} \ln \frac{1}{R} \sum_{r=1}^{R} \Bigg\{ \prod_{t=1}^{T} \Bigg[ & \frac{2}{\sigma}\, \phi\!\left( \frac{y_{it} - \alpha' x_{it} - \sigma_{r} r_{ir} - \sigma_{h} |h_{ir}|}{\sigma} \right) \cdot \\ & \Phi\!\left( - \frac{(y_{it} - \alpha' x_{it} - \sigma_{r} r_{ir} - \sigma_{h} |h_{ir}|)\, \lambda}{\sigma} \right) \Bigg] \Bigg\}. \end{aligned}$$

(4)

In Equation (4), $\phi(z)$ and $\Phi(z)$ describe the normal PDF and CDF, respectively; $\lambda = \sigma_{u} / \sigma_{v}$ and $\sigma = (\sigma_{v}^{2} + \sigma_{u}^{2})^{1/2}$. The normal half-normal density becomes a conditional density with the inclusion of $r_{i}$ and $h_{i}$; thus, to obtain the unconditional density, integration is required. The summation over $R$ replaces an integral that eliminates $r_{i}$ and $h_{i}$ via MSL estimation and the law of large numbers. Hence, these plant-specific effects are simulated by Halton sequences, which greatly improves upon the ordinary normal draws, as illustrated in Figure 1.7 MSL estimation of the GTRE is a consistent, asymptotically normal, and efficient computational method for integral estimation, with a crucial decision point being the number of draws.
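The idea of replacing an integral with an average over $R$ draws can be seen on a toy problem with a known answer. The sketch below is not the GTRE likelihood: it approximates $E[\exp(\sigma_{r} r)]$ for $r \sim N(0, 1)$, whose exact value is $\exp(\sigma_{r}^{2} / 2)$, using Halton points mapped through the inverse normal CDF. The function names, the choice of integrand, and the value $\sigma_{r} = 0.5$ are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def radical_inverse(index, base):
    """Halton point for `index` in the given prime `base`."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


def halton_normal_draws(n_draws, base=2):
    """Standard normal draws obtained from Halton points via the inverse CDF.

    Index 0 is skipped because it maps to the point 0, whose normal
    quantile is minus infinity.
    """
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(radical_inverse(i, base)) for i in range(1, n_draws + 1)]


def simulated_expectation(g, n_draws, base=2):
    """Average of g over the draws: the sum over R that replaces the integral."""
    draws = halton_normal_draws(n_draws, base)
    return sum(g(r) for r in draws) / n_draws


sigma_r = 0.5  # illustrative value only
exact = math.exp(sigma_r ** 2 / 2)
for n_draws in (15, 63, 255):
    approx = simulated_expectation(lambda r: math.exp(sigma_r * r), n_draws)
    print(n_draws, round(approx, 4), round(abs(approx - exact), 4))
```

In the GTRE setting the averaged quantity is the product over $t$ inside Equation (4), and two sequences (one per random effect) are needed, but the mechanism is the same: more draws give a closer approximation at a higher computational cost.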

The technical efficiency estimation for the GTRE, as implemented in Filippini and Greene (2016), follows Colombi (2010) and Kumbhakar et al. (2014), where

$$E\left( \exp\left( t \cdot (h_{i}, u_{i1}, \dots, u_{iT})' \right) \mid \epsilon_{i} \right) = \left[ \frac{\bar{\Phi}_{T+1}(R \hat{\epsilon}_{i} + \Lambda t', \Lambda)}{\bar{\Phi}_{T+1}(R \hat{\epsilon}_{i}, \Lambda)} \right] \exp\left( t R \hat{\epsilon}_{i} + \frac{1}{2} t \Lambda t' \right),$$

(5)

for $i = 1, \dots, N$, $t = (-1, 0, \dots, 0), (0, -1, 0, \dots, 0), \dots, (0, \dots, 0, -1)$, and $R = \Lambda A' \Sigma^{-1}$, with $\Lambda = [V^{-1} + A' \Sigma^{-1} A]^{-1}$, $A = -[1_{T} \mid I_{T}]$, and $\Sigma = \hat{\sigma}_{v}^{2} I_{T} + \hat{\sigma}_{r}^{2} 1_{T} 1_{T}'$. Additionally,

$$V = \begin{pmatrix} \hat{\sigma}_{h}^{2} & 0_{T}' \\ 0_{T} & \hat{\sigma}_{u}^{2} I_{T} \end{pmatrix}$$

and $\bar{\Phi}_{T+1}(X, \Sigma_{X})$ is the joint probability that a $(T+1)$-dimensional normal vector is in the non-negative orthant, given mean $X$ and variance $\Sigma_{X}$. The computation of Equation (5) uses multivariate normal integration.8

2.3. Bias and Noise of Simulated Estimators

To better understand the theoretical properties of the MSL, begin by setting $\theta$ to the parameter vector so that $l_{s}(\theta) = \partial \ln L_{s}(\theta) / \partial \theta$. Next, let $\theta^{*}$ be the parameter values that maximize $\ln L_{s}$, as $l_{s}(\theta^{*}) = 0$. The simulated gradient can be decomposed by adding and subtracting the true gradient at $\theta^{*}$ and the expectation of the simulated gradient at $\theta^{*}$ to arrive at the following:

$$l_{s}(\theta^{*}) = l(\theta^{*}) + \underbrace{[E_{r} l_{s}(\theta^{*}) - l(\theta^{*})]}_{=\ \text{simulation bias}} + \underbrace{[l_{s}(\theta^{*}) - E_{r} l_{s}(\theta^{*})]}_{=\ \text{simulation noise}}$$

(6)

Equation (6) (Train 2009) shows that the simulated gradient can be decomposed into simulation bias and simulation noise. The simulation bias normalized by the sample size can be shown to equal $Z \sqrt{n} R^{-1}$, where $Z$ is a constant. Thus, for the simulation bias to vanish asymptotically, $R$ must increase faster than $\sqrt{n}$. For a finite sample, the simulation bias decreases with $R$. The simulation noise can be shown to be asymptotically normal with mean zero and variance $S (n R)^{-1}$, where $S$ is a constant (Train 2009). Hence, for a given finite sample, simulation noise continues to decrease as $R$ rises. Thus, the choice of $R$ impacts the parameter estimates in practice, as is illustrated by the Taylor expansion:

$$\hat{\theta} = \theta^{*} - \left( \frac{\partial}{\partial \theta} l_{s}(\theta^{*}) \right)^{-1} \cdot l_{s}(\theta^{*})$$

(7)

The estimator in Equation (7) (Train 2009) can be shown to be asymptotically normal with the same limiting distribution as the maximum likelihood estimator, yet the finite sample properties are clearly impacted by the bias and noise in Equation (6). While the impact of this bias and noise is known for the parameters themselves, the impact on the efficiency estimates of the GTRE is the novel focus of this work.
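To make the role of the exponent concrete, the two rates stated in this subsection can be evaluated for a draw count of the form $R = n^{p}$, ignoring the ceiling. This is only a worked restatement of the expressions already given, not an additional result; the symbol $p$ is notation introduced here for illustration.

```latex
\begin{align*}
\text{normalized simulation bias:} \quad & Z \sqrt{n}\, R^{-1} = Z\, n^{1/2 - p}, \\
\text{simulation noise variance:} \quad & S\, (n R)^{-1} = S\, n^{-(1 + p)}.
\end{align*}
% p = 1/2: the bias term stays at the constant Z as n grows.
% p = 2/3: the bias term shrinks like n^{-1/6}.
% p = 3/4: the bias term shrinks like n^{-1/4}.
```

Under this reading, every exponent above $1/2$ drives the bias term to zero, and larger exponents do so faster, at the price of more likelihood evaluations.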

3. Simulation Results

The simulation framework followed Badunenko and Kumbhakar (2016). A Cobb–Douglas production frontier with two inputs, $X_{1}$ and $X_{2}$, $Y = e^{\alpha_{0}} X_{1}^{\alpha_{1}} X_{2}^{1 - \alpha_{1}}$, was deployed. The production frontier was constant returns to scale with $\alpha_{0} = 0.3$ and $\alpha_{1} = 0.4$. The covariates, $X_{1}$ and $X_{2}$, were drawn from truncated exponential distributions with truncation $\log(2)$ and $\log(10)$, respectively.

The $2^{4} = 16$ different scenarios in Table 1 (see also Table 1 in Badunenko and Kumbhakar (2016)) were utilized for the variances and pre-truncation variances in $\{\sigma_{r}^{2}, \sigma_{h}^{2}, \sigma_{u}^{2}, \sigma_{v}^{2}\}$. The $\sigma$-values were either 0.04 or 0.20, which made the $\lambda$, $\lambda_{0}$, and $\Lambda$ values either 0.2, 1, or 5. Each scenario had 1000 Monte Carlo trials, with $N \in \{50, 100, 500\}$ and $t \in \{3, 6, 10\}$, and a maximum of 1000 function evaluations for each individual trial maximization.9 Letting $n = N \cdot t$ and $\lceil \cdot \rceil$ be the ceiling operator, the Halton sequence lengths considered were the following: $H_{1/2}(n) = \lceil n^{1/2} \rceil$, $H_{13/24}(n) = \lceil n^{13/24} \rceil$, $H_{7/12}(n) = \lceil n^{7/12} \rceil$, $H_{2/3}(n) = \lceil n^{2/3} \rceil$, and $H_{3/4}(n) = \lceil n^{3/4} \rceil$.
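The Halton lengths for each $(N, t)$ cell follow mechanically from these definitions. The short Python sketch below tabulates them; the helper name `halton_length` and the integer-comparison approach are illustrative choices made here (to sidestep floating-point rounding when $n^{p}$ is an exact integer) and are not taken from the paper's code.

```python
def halton_length(n, num, den):
    """Ceiling of n**(num/den), computed with exact integer comparisons.

    Returns the smallest positive integer m such that m**den >= n**num.
    """
    target = n ** num
    m = max(1, round(n ** (num / den)))  # floating-point starting guess
    while m ** den < target:
        m += 1
    while m > 1 and (m - 1) ** den >= target:
        m -= 1
    return m


# Exponents of the five Halton settings, stored as (numerator, denominator).
EXPONENTS = {
    "H_1/2": (1, 2),
    "H_13/24": (13, 24),
    "H_7/12": (7, 12),
    "H_2/3": (2, 3),
    "H_3/4": (3, 4),
}

for N in (50, 100, 500):
    for t in (3, 6, 10):
        n = N * t
        lengths = {name: halton_length(n, *pq) for name, pq in EXPONENTS.items()}
        print(N, t, n, lengths)
```

For example, the smallest design cell ($N = 50$, $t = 3$, so $n = 150$) gives a baseline of 13 draws, since $\sqrt{150} \approx 12.25$.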

The importance of this research with respect to run-times is illustrated in Figure 2. The log of the median run-times (in seconds) was plotted against the nine combinations of $(t, N)$ for variance scenario 5, which represented $\sigma_{v} = 0.2$, $\sigma_{u} = \sigma_{r} = \sigma_{h} = 0.04$. For small sample sizes, the time difference for practical applications was very small, but for larger sample sizes the difference was enormous. In scenario 5, for example, when $N = 50$ and $t = 3$, median $H_{1/2}$ took nearly 6 s and $H_{3/4}$ took about 11 s, resulting in over 83 percent more time to run $H_{3/4}$. However, for scenario 5, when $N = 500$ and $t = 10$, median $H_{1/2}$ took 131 s and $H_{3/4}$ took 1231 s, implying that $H_{3/4}$ resulted in over 841 percent more run-time.10 In practical applications, when there is not a known DGP and the panel might be unbalanced or there are outliers in the data, these run-times tend to grow substantially, as was the case in Section 4.

Figure 3 highlights that additional Halton draws were most beneficial in small samples. It presents the median MSE for total technical efficiency by Halton space. The MSE levels were most separated for small sample sizes, and the separation tended to decrease for the largest sample sizes, as the theory suggests.

Table 2 demonstrates that increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) decreased the mean squared error (MSE) of the total technical efficiency estimates by 6.1 (4.9) percent.11 Similarly, increasing the number of Halton draws decreased the mean squared error of the transient technical efficiency estimates by 2.1 (1.7) percent, and increasing the number of Halton draws to $H_{3/4}$ decreased the mean squared error of the persistent technical efficiency estimates by 3.2 percent. The results for relative bias, upward bias, and correlation all showed a very similar pattern to MSE and are, therefore, included in the appendix, for the interested reader.

Table 2 and Tables A1, A2, and A3 present meta-regressions of the $16 \times 3 \times 3 \times 1000 \times 5 = 720{,}000$ simulations. This design is similar to that of Andor et al. (2023), which advocates for internal meta-regressions in large-scale Monte Carlo simulations.

4. Empirical Exercise

An empirical exercise was considered, to illustrate how issues surrounding Halton draws might manifest in practice. This section also introduces some best practices for Halton draws, including discarding, scrambling, the inverse error function, and antithetic draws (abbreviated as “original and enhancements” or simply “enhanced”) into the original simulation estimation procedure/code (abbreviated as “original”). Discarding was performed by removing the first 1000 draws for both $r_{i}$ and $h_{i}$. Scrambling was introduced by taking 9999 samples of the Halton draw for $r_{i}$ and then selecting the one with the smallest absolute value of correlation with the $h_{i}$.12 The square root of two times the inverse error function ($\sqrt{2}\, \operatorname{erf}^{-1}(\cdot)$) was utilized for the $h_{i}$ rather than taking the absolute value of the standard normal draws to enhance precision, as $h_{i}$ was distributed half-normal. Finally, antithetic draws of the form $-r_{i}$ were considered by evaluating the likelihood at $-r_{i}$ for each $i$ and then taking a simple average with the likelihood evaluated at the usual $r_{i}$.13
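Three of these enhancements (discarding, the inverse-error-function mapping, and antithetic averaging) can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the paper's R code: all function names are hypothetical, scrambling is omitted, and the half-normal mapping relies on the identity $\sqrt{2}\, \operatorname{erf}^{-1}(u) = \Phi^{-1}((1 + u)/2)$ because the Python standard library provides an inverse normal CDF but no inverse error function.

```python
import math
from statistics import NormalDist

INV_CDF = NormalDist().inv_cdf


def radical_inverse(index, base):
    """Halton point for `index` in the given prime `base`."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


def halton_points(n_draws, base, discard=0):
    """Halton points after dropping the first `discard` indices.

    Index 0 (the point 0) is always skipped so that every point lies in (0, 1).
    """
    start = discard + 1
    return [radical_inverse(i, base) for i in range(start, start + n_draws)]


def normal_draw(u):
    """Standard normal draw from a uniform point u in (0, 1)."""
    return INV_CDF(u)


def half_normal_draw(u):
    """Half-normal draw sqrt(2) * erfinv(u), via Phi^{-1}((1 + u) / 2)."""
    return INV_CDF((1.0 + u) / 2.0)


def antithetic_average(f, r_draws):
    """Average of f over each draw r and its mirror image -r."""
    return sum(0.5 * (f(r) + f(-r)) for r in r_draws) / len(r_draws)


# Primes 2 and 3 and a discard of 1000 follow the empirical setup described above;
# the draw count of 50 is arbitrary.
r = [normal_draw(u) for u in halton_points(50, base=2, discard=1000)]
h = [half_normal_draw(u) for u in halton_points(50, base=3, discard=1000)]

print(min(h) > 0.0)                         # half-normal draws are positive
print(antithetic_average(lambda x: x, r))   # odd functions average out
```

In an actual estimation, `f` would be the simulated likelihood contribution for observation $i$ evaluated at a given $r_{i}$, so each antithetic pair doubles the number of likelihood evaluations.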

4.1. Production Function Application

A Cobb–Douglas production function from Bernstein (2020) was utilized on a panel of United States natural gas plants. The data came from the FERC Form 1: Electric Utility Annual Report. In order to demonstrate the importance of having a sufficiently large number of draws, the following model was estimated:

$$y_{it} = \alpha_{0} + \sum_{j=1}^{4} \alpha_{j} x_{j,it} + \sum_{m=1}^{3} \beta_{m} c_{m,it} + \underbrace{r_{i} - h_{i} + v_{it} - u_{it}}_{=\ \epsilon_{it}},$$

(8)

with Halton spaces utilized for powers from $\lceil n^{50/100} \rceil$ (baseline $\lceil \sqrt{n} \rceil$) to $\lceil n^{105/100} \rceil$ with all $5/100$ unit increments in between, as well as the setting of $\lceil n^{0.769} \rceil$ ($\lceil n^{0.77} \rceil$, henceforth) from Bernstein (2020).14 In this example, $y_{it}$, $x_{j,it}$, and $c_{m,it}$ were all in natural logarithms, with the upper case variables in levels. Furthermore, $Y_{it}$ was the plant output in kWh, and the production factors $X_{j,it}$ included plant capacity MW (h) ($X_{1} = K$), number of employees ($X_{2} = L$), natural gas in Mcf ($X_{3} = NG$), and barrels of oil ($X_{4} = Oil$). Three control variables $c_{m,it}$ were also included.15 The sample size was 3553, which was an unbalanced panel with 448 identified plants and time periods ranging from 2 to 23, with an average of 7.9 reporting years per entity. Summary statistics for the data are provided in Table 3.

For each model run, the bobyqa optimizer of the minqa package was utilized with 1,000,000 maximum function evaluations and, subsequently, the optim call in the stats package calculated the numerical Hessian for computing standard errors.16 For the simulations, the optim call was utilized with a maximum of 1000 function evaluations.

In Table 4, the efficiency parameters are presented for each set of Halton draws ($\lceil n^{x/100} \rceil$) along with the average persistent (TE-p) and transient (TE-tr) technical efficiencies and the number of hours each model took to run for both the original and enhanced settings. The parameters $\sigma$ and $\lambda$ were estimated extremely well from model to model (as were the $\alpha$ and $\beta$ parameters, excluding the intercept). The consistent estimation of $\sigma$ and $\lambda$ from run to run allowed for nearly identical estimates of the transient technical efficiency, with a minimum correlation of 0.964 for the original and 0.991 for the enhanced in the 13-by-13 Pearson correlation matrices. The estimate of $\sigma_{r}$ varied up to 22 percent for the original and 9 percent for the enhanced from the smallest to the largest case, while $\sigma_{h}$ was zero twice in the original and zero thrice in the enhanced, making the estimation of the persistent TE more variable from run to run. Interestingly, $\sigma_{h}$ had a much tighter range of 0.00–0.32 in the enhanced model relative to 0.00–0.63 in the original, $\sigma_{r}$ had a tighter range of 0.68–0.74 in the enhanced model relative to 0.67–0.82 in the original, $\sigma$ was nearly the same for all runs, and $\lambda$ only differed somewhat in the two smallest Halton settings for the enhanced estimation.

In Figure 4, six parameter plots are provided, in order to illustrate issues of measuring certain parameters, with solid parameter lines and 95 percent confidence bands in dotted lines.17 The figures highlight the strong ability of all the models to tightly estimate the coefficient for the log of NG, and, generally, all the $\alpha$s and $\beta$s look similar.18 The models’ difficulty in estimating the intercept is very striking for lower values of Halton draws, and this illustrates the importance of precisely estimating $\sigma_{h}$ and $\sigma_{r}$ for the constant term, as the composed error term does not have zero mean. In every original plot, the slope of the parameter appears to level off towards zero, indicating that $\lceil n^{95/100} \rceil$ sufficed. For the enhanced estimation, $\sigma_{r}$, $\sigma_{h}$, and the intercept do not appear to level off, as a zero for $\sigma_{h}$ repeatedly appears, likely due to a local maximum of the log likelihood function that is very close to nonzero values for $\sigma_{h}$.19 Consequently, in this setup, it is recommended to utilize the estimates that maximize the log likelihood. The original case of $\lceil n^{95/100} \rceil$ and $\lceil n^{100/100} \rceil$ for the enhanced each maximized the log likelihood, as highlighted in Table 4 (this is denoted as $x[\max]$, so that for the original $x[\max] = 95$, and $x[\max] = 100$ for the enhanced).

Considering the case of $\lceil n^{x[\max]/100} \rceil$ Halton draws as the true (oracle) efficiency values for the original setting, the $\lceil \sqrt{n} \rceil$ case yielded an MSE of the transient TE of 0.00038, an upward bias of 0.35, a relative bias of $-0.002$, and a correlation of 0.98. As for the persistent TE, there was generally a larger divergence, as the MSE was 0.014, the upward bias was 0.00, the relative bias was $-0.16$, and the correlation was 0.97. Similarly, for the enhanced setting where $x[\max] = 100$, the MSE of the transient TE was 0.000098, the upward bias was 0.97, the relative bias was 0.013, and the correlation was 1.00. As for the persistent TE, the MSE was 0.00017, the upward bias was 0.00, the relative bias was $-0.016$, and the correlation was 1.00. This result is strongly suggestive of the superior performance of the enhanced estimation methodologies in recovering root-$n$ results similar to the $x[\max]$ case.
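The four comparison measures used here can be computed from two vectors of efficiency scores. The Python sketch below is a hedged illustration only: the exact definitions are given in the paper's footnotes and are not reproduced in this section, so the code assumes relative-deviation definitions in the spirit of Badunenko and Kumbhakar (2016), namely relative bias as the mean of $\widehat{TE}/TE - 1$, MSE as the mean of its square, upward bias as the share of estimates above the benchmark, and the Pearson correlation. The function name and the sample numbers are made up.

```python
import math


def efficiency_metrics(estimate, benchmark):
    """Compare efficiency estimates with benchmark ("oracle") values.

    Assumed definitions (see the lead-in): deviations are taken relative
    to the benchmark, and upward bias is the share of overestimates.
    """
    n = len(benchmark)
    rel_dev = [e / b - 1.0 for e, b in zip(estimate, benchmark)]

    mean_e = sum(estimate) / n
    mean_b = sum(benchmark) / n
    cov = sum((e - mean_e) * (b - mean_b) for e, b in zip(estimate, benchmark))
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    var_b = sum((b - mean_b) ** 2 for b in benchmark)

    return {
        "mse": sum(d * d for d in rel_dev) / n,
        "relative_bias": sum(rel_dev) / n,
        "upward_bias": sum(e > b for e, b in zip(estimate, benchmark)) / n,
        "correlation": cov / math.sqrt(var_e * var_b),
    }


# Made-up scores, for illustration only.
oracle_te = [0.80, 0.50, 0.90]
root_n_te = [0.88, 0.45, 0.90]
print(efficiency_metrics(root_n_te, oracle_te))
```

Under these assumed definitions, an upward bias of 0.97 means that 97 percent of the $\lceil \sqrt{n} \rceil$ estimates lie above their benchmark values, while an upward bias of 0.00 means that none do.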

Large discrepancies in TE estimates could be problematic for European energy efficiency regulators, as are general discrepancies writ large. Andor et al. (2019) notes that regulators prefer to overestimate firm efficiency measures because if “the efficiency of such a firm is underestimated, the consequence is a cost saving requirement, which is too strict. In the long run this could lead to insolvency through no fault of the firm”. As is shown herein for the enhanced case, the upward bias of the transient TE was 0.97 for the $\lceil \sqrt{n} \rceil$ case, which implied that the majority of the $\lceil \sqrt{n} \rceil$ estimates were likely overestimated. As for the persistent TE for both the original and enhanced cases, all of the $\lceil \sqrt{n} \rceil$ estimates were underestimates. Hence, from the Andor et al. (2019) regulatory perspective, some combination approach may be desirable to increase the likelihood of overestimated TE measures.

Figure 5 highlights the differences between the three technical efficiency estimates from the case of $\lceil n^{x[\max]/100} \rceil$ Halton draws on the y-axis and $\lceil \sqrt{n} \rceil$ on the x-axis. The transient TE estimates were very highly correlated and tended to stay on the $y = x$ line, as they had very low relative bias, amounting to just 0.20 percent, on average, for the original and 1.27 percent for the enhanced. For the original, a half dozen or so estimates appeared far away from the $y = x$ line, which is not surprising in a sample of 3553 observations, yet these were eliminated when the enhancements were utilized. For the persistent TE, it is clear why upward bias was zero, as full separation of the data mass from the 45 degree line is noticeable, albeit less pronounced for the enhanced case. Overall, the enhanced estimation shows much tighter efficiency estimates across all three types of TE from the root-$n$ to the “oracle” settings relative to the original estimation.

5. Conclusions

This paper measured the impact of the number of Halton draws in excess of $\lceil \sqrt{n} \rceil$ on three technical efficiency measures in the four-component stochastic frontier setting. Increasing the number of Halton draws to $\lceil n^{3/4} \rceil$ ($\lceil n^{2/3} \rceil$) decreased the mean squared error of the total technical efficiency estimates by 6.1 (4.9) percent, and either improved (↑/↓) or had no damaging impact on correlation (↑), mean squared error (↓), relative bias (↓), and upward bias (↓) for the three technical efficiency measures. These findings suggest that efficiency metrics are improved by increasing the number of Halton draws well beyond what is required for consistency, especially for small samples. As a result, practitioners should implement best practices for simulating integrals such as discarding and scrambling, and they should balance the time it takes them to simulate with the precision they require for their analysis, as illustrated in the empirical example.

Future work might examine higher dimensions for Halton sequences such as the mixed logit model or the performance of the (deterministic) quadrature for integral approximations in the present setting. Another avenue of exploration is the presence and absence of determinants of inefficiency and the impact of scrambling, discarding, and antithetic draws on the simulation results herein. Furthermore, examining more draws per observation and utilizing other quasi-random sequences, such as the Sobol sequences or Hammersley sets, in lieu of Halton sequences would provide new lines of inquiry.

Funding

This research received no external funding.

Data Availability Statement

Data available on request from the author.

Acknowledgments

I thank the participants in the North American Productivity Workshop 2023 for useful comments. I am also grateful for the comments and suggestions from Chris Parmeter, Alecia Cassidy, the Econometrics Editorial Office, and three anonymous referees. Joshua Y. Katz is thanked for his helpful research assistance. The opinions and views offered here are my own and are not necessarily those of the United States, the Federal Energy Regulatory Commission, the individual Commissioners or members of the Commission staff. Any errors are mine alone.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

The appendix includes the relative bias, upward bias, and correlation meta-regressions (Andor et al. 2023) for the 720,000 persistent, transient, and total technical efficiency estimates in Table A1, Table A2 and Table A3. Additional empirical results are also available for different combinations of the original simulation code augmented by discards and antithetic draws.

Table A1 demonstrates that increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) decreased the relative bias of the total technical efficiency estimates by 0.016 (0.014), where the $H_{1/2}$ base case average was 0.022. Increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) decreased the relative bias of the persistent technical efficiency estimates by 0.015 (0.013), with an $H_{1/2}$ average of 0.016, and increasing the number of Halton draws to $H_{3/4}$ decreased the relative bias of transient technical efficiency estimates by 0.0002, with an $H_{1/2}$ average of 0.0069.

Table A2 demonstrates that increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) decreased the upward bias of the total technical efficiency estimates by 0.048 (0.044), where the $H_{1/2}$ base case average was 0.53. Increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) decreased the upward bias of the persistent technical efficiency estimates by 0.072 (0.064), with an $H_{1/2}$ average of 0.57.

Table A3 demonstrates that increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) increased the correlation of the total technical efficiency estimates by 0.014 (0.010), where the $H_{1/2}$ base case average was 0.53. Increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) increased the correlation of the transient technical efficiency estimates by 0.002 (0.001), with an $H_{1/2}$ average of 0.47, and increasing the number of Halton draws to $H_{3/4}$ ($H_{2/3}$) increased the correlation of the persistent technical efficiency estimates by 0.029 (0.022), with an $H_{1/2}$ average of 0.37.

Table A1. Internal meta-regressions: Three sources of relative bias.

	Persistent	Transient	Total
N = 100, t = 3	0.014 ***	0.003 ***	0.017 ***
	(0.0004)	(0.0002)	(0.0005)
N = 100, t = 6	0.003 ***	0.001 ***	0.003 ***
	(0.0004)	(0.0002)	(0.0004)
N = 50, t = 10	−0.006 ***	0.003 ***	−0.003 ***
	(0.0004)	(0.0002)	(0.0004)
N = 50, t = 3	0.005 ***	0.006 ***	0.011 ***
	(0.0004)	(0.0003)	(0.001)
N = 50, t = 6	−0.003 ***	0.003 ***	0.0004
	(0.0004)	(0.0002)	(0.0005)
N = 500, t = 10	0.001 ***	−0.001 ***	0.001 *
	(0.0003)	(0.0002)	(0.0004)
N = 500, t = 3	0.015 ***	−0.0001	0.015 ***
	(0.0003)	(0.0002)	(0.0004)
N = 500, t = 6	−0.0003	−0.001 ***	−0.001 ***
	(0.0003)	(0.0002)	(0.0004)
$H_{13 / 24}$	−0.006 ***	−0.00004	−0.006 ***
	(0.0003)	(0.0002)	(0.0004)
$H_{7 / 12}$	−0.005 ***	−0.0002	−0.006 ***
	(0.0003)	(0.0002)	(0.0004)
$H_{2 / 3}$	−0.013 ***	−0.0003 *	−0.014 ***
	(0.0003)	(0.0002)	(0.0003)
$H_{3 / 4}$	−0.015 ***	−0.0002	−0.016 ***
	(0.0003)	(0.0002)	(0.0003)
$Λ = 1$	0.022 ***	−0.008 ***	0.015 ***
	(0.0002)	(0.0002)	(0.0003)
$Λ = 5$	0.033 ***	−0.019 ***	0.014 ***
	(0.0004)	(0.0002)	(0.0004)
$λ_{0} = 1$	0.064 ***	0.0004 ***	0.065 ***
	(0.0002)	(0.0001)	(0.0003)
$λ_{0} = 5$	0.042 ***	0.001 ***	0.042 ***
	(0.0003)	(0.0002)	(0.0004)
$λ = 1$	0.004 ***	0.034 ***	0.039 ***
	(0.0002)	(0.0002)	(0.0003)
$λ = 5$	0.010 ***	0.022 ***	0.032 ***
	(0.0003)	(0.0002)	(0.0004)
Constant	−0.054 ***	−0.009 ***	−0.064 ***
	(0.0004)	(0.0003)	(0.001)
Observations	720,000	720,000	720,000
R²	0.155	0.117	0.116
Adjusted R²	0.155	0.117	0.116
Residual Std. Error (df = 719981)	0.076	0.047	0.092
F Statistic (df = 18; 719981)	7347 ***	5287 ***	5229 ***
Base case average $H_{1 / 2}$	0.016	0.0069	0.0226

Notes: Robust standard errors clustered at the replication level in parentheses. Significance levels are the following: * p < 0.1; ** p < 0.05; *** p < 0.01. The base case in these regressions was $H_{1 / 2}$, $λ = λ_{0} = Λ = 0.2$, and N = 100, t = 10.

Table A2. Internal meta-regressions: Three sources of upward bias.

	Persistent	Transient	Total
N = 100, t = 3	0.067 ***	0.032 ***	0.045 ***
	(0.002)	(0.001)	(0.001)
N = 100, t = 6	0.014 ***	0.009 ***	0.009 ***
	(0.002)	(0.001)	(0.001)
N = 50, t = 10	−0.017 ***	0.023 ***	−0.004 ***
	(0.002)	(0.001)	(0.001)
N = 50, t = 3	0.033 ***	0.064 ***	0.038 ***
	(0.002)	(0.001)	(0.001)
N = 50, t = 6	0.007 ***	0.034 ***	0.008 ***
	(0.002)	(0.001)	(0.001)
N = 500, t = 10	0.012 ***	−0.016 ***	−0.0004
	(0.002)	(0.001)	(0.001)
N = 500, t = 3	0.037 ***	−0.004 ***	0.030 ***
	(0.002)	(0.001)	(0.001)
N = 500, t = 6	−0.004 **	−0.014 ***	−0.008 ***
	(0.002)	(0.001)	(0.001)
$H_{13 / 24}$	−0.033 ***	0.001	−0.020 ***
	(0.001)	(0.001)	(0.001)
$H_{7 / 12}$	−0.025 ***	0.0003	−0.018 ***
	(0.001)	(0.001)	(0.001)
$H_{2 / 3}$	−0.064 ***	0.0002	−0.044 ***
	(0.001)	(0.001)	(0.001)
$H_{3 / 4}$	−0.072 ***	0.001	−0.048 ***
	(0.001)	(0.001)	(0.001)
$Λ = 1$	0.012 ***	0.031 ***	0.052 ***
	(0.001)	(0.001)	(0.001)
$Λ = 5$	−0.038 ***	0.042 ***	0.039 ***
	(0.002)	(0.001)	(0.001)
$λ_{0} = 1$	0.220 ***	−0.037 ***	0.161 ***
	(0.001)	(0.001)	(0.001)
$λ_{0} = 5$	0.103 ***	−0.076 ***	0.068 ***
	(0.001)	(0.001)	(0.001)
$λ = 1$	−0.016 ***	0.031 ***	0.100 ***
	(0.001)	(0.001)	(0.001)
$λ = 5$	−0.038 ***	0.022 ***	0.099 ***
	(0.001)	(0.001)	(0.001)
Constant	0.436 ***	0.497 ***	0.313 ***
	(0.002)	(0.002)	(0.002)
Observations	720,000	720,000	720,000
R²	0.080	0.016	0.086
Adjusted R²	0.080	0.016	0.086
Residual Std. Error (df = 719981)	0.334	0.276	0.282
F Statistic (df = 18; 719981)	3460 ***	636 ***	3742 ***
Base case average $H_{1 / 2}$	0.5674	0.5214	0.5337

Notes: Robust standard errors clustered at the replication level in parentheses. Significance levels are the following: * p < 0.1; ** p < 0.05; *** p < 0.01. The base case in these regressions was $H_{1 / 2}$, $λ = λ_{0} = Λ = 0.2$, and N = 100, t = 10.

Table A3. Internal meta-regressions: Three sources of correlation.

	Persistent	Transient	Total
N = 100, t = 3	−0.084 ***	−0.052 ***	−0.038 ***
	(0.001)	(0.0003)	(0.001)
N = 100, t = 6	−0.030 ***	−0.015 ***	−0.017 ***
	(0.001)	(0.0002)	(0.001)
N = 50, t = 10	0.004 ***	0.002 ***	−0.010 ***
	(0.001)	(0.0002)	(0.001)
N = 50, t = 3	−0.066 ***	−0.066 ***	−0.046 ***
	(0.001)	(0.0004)	(0.001)
N = 50, t = 6	−0.033 ***	−0.015 ***	−0.025 ***
	(0.001)	(0.0002)	(0.001)
N = 500, t = 10	−0.0003	0.0001	0.021 ***
	(0.001)	(0.0001)	(0.001)
N = 500, t = 3	−0.073 ***	−0.044 ***	−0.021 ***
	(0.001)	(0.0002)	(0.001)
N = 500, t = 6	−0.007 ***	−0.011 ***	0.014 ***
	(0.001)	(0.0001)	(0.001)
$H_{13 / 24}$	0.015 ***	0.002 ***	0.006 ***
	(0.001)	(0.0002)	(0.0004)
$H_{7 / 12}$	0.007 ***	0.001 ***	0.005 ***
	(0.001)	(0.0002)	(0.0004)
$H_{2 / 3}$	0.022 ***	0.001 ***	0.010 ***
	(0.001)	(0.0002)	(0.0004)
$H_{3 / 4}$	0.029 ***	0.002 ***	0.014 ***
	(0.001)	(0.0002)	(0.0004)
$Λ = 1$	0.137 ***	−0.049 ***	−0.176 ***
	(0.001)	(0.0002)	(0.0003)
$Λ = 5$	0.295 ***	−0.082 ***	−0.032 ***
	(0.001)	(0.0003)	(0.001)
$λ_{0} = 1$	0.172 ***	0.046 ***	0.130 ***
	(0.0005)	(0.0002)	(0.0003)
$λ_{0} = 5$	0.557 ***	0.079 ***	0.444 ***
	(0.001)	(0.0002)	(0.0005)
$λ = 1$	0.097 ***	0.335 ***	0.167 ***
	(0.001)	(0.0002)	(0.0004)
$λ = 5$	0.190 ***	0.728 ***	0.483 ***
	(0.001)	(0.0002)	(0.0004)
Constant	−0.065 ***	0.143 ***	0.256 ***
	(0.001)	(0.0003)	(0.001)
Observations	720,000	720,000	720,000
R²	0.696	0.965	0.828
Adjusted R²	0.696	0.965	0.828
Residual Std. Error (df = 719981)	0.177	0.052	0.116
F Statistic (df = 18; 719981)	91,686 ***	1,097,208 ***	192,541 ***
Base case average $H_{1 / 2}$	0.3666	0.4681	0.5274

Notes: Robust standard errors clustered at the replication level in parentheses. Significance levels are the following: * p < 0.1; ** p < 0.05; *** p < 0.01. The base case in these regressions was $H_{1 / 2}$, $λ = λ_{0} = Λ = 0.2$, and N = 100, t = 10.

Table A4. Selected coefficients, TE averages, and run times: including other settings.

x	$⌈ n^{x / 100} ⌉$	$λ$	$σ$	$σ_{r}$	$σ_{h}$	TE-p	TE-tr	Hours	log ℓ
Original
50	60	1.38 ***	0.57 ***	0.81 ***	0.63 ***	0.63	0.73	0.57	−2707.63
55	90	1.39 ***	0.57 ***	0.82 ***	0.49 ***	0.69	0.73	0.63	−2717.54
60	136	1.40 ***	0.57 ***	0.72 ***	0.47 ***	0.70	0.73	0.77	−2690.55
65	204	1.39 ***	0.57 ***	0.74 ***	0.46 ***	0.71	0.73	1.02	−2688.93
70	306	1.39 ***	0.57 ***	0.81 ***	0.00	1.00	0.73	1.18	−2688.94
75	461	1.39 ***	0.57 ***	0.81 ***	0.00	1.00	0.73	2.11	−2690.57
77	537	1.41 ***	0.57 ***	0.74 ***	0.29 ***	0.80	0.73	2.51	−2673.41
80	693	1.41 ***	0.57 ***	0.73 ***	0.34 *	0.77	0.73	4.07	−2674.24
85	1043	1.40 ***	0.57 ***	0.73 ***	0.24 ***	0.83	0.73	5.81	−2667.48
90	1569	1.40 ***	0.57 ***	0.73 ***	0.24 ***	0.83	0.73	7.20	−2668.59
95	2361	1.40 ***	0.57 ***	0.67 ***	0.39 ***	0.75	0.73	10.79	−2655.32
100	3553	1.40 ***	0.57 ***	0.67 ***	0.41 ***	0.74	0.73	21.48	−2656.65
105	5348	1.40 ***	0.57 ***	0.67 ***	0.40 ***	0.74	0.73	20.91	−2657.81
Original and discards
50	60	1.33 ***	0.56 ***	0.75 ***	0.34 ***	0.77	0.73	0.50	−2665.92
55	90	1.39 ***	0.57 ***	0.74 ***	0.25 ***	0.82	0.73	0.54	−2657.04
60	136	1.39 ***	0.57 ***	0.74 ***	0.20 ***	0.86	0.73	0.81	−2663.08
65	204	1.38 ***	0.57 ***	0.73 ***	0.20 ***	0.86	0.73	0.93	−2661.53
70	306	1.39 ***	0.57 ***	0.74 ***	0.23 ***	0.84	0.73	1.23	−2662.50
75	461	1.39 ***	0.57 ***	0.74 ***	0.21 ***	0.85	0.73	1.81	−2664.80
77	537	1.39 ***	0.57 ***	0.74 ***	0.22 ***	0.84	0.73	2.25	−2665.28
80	693	1.39 ***	0.57 ***	0.74 ***	0.21 ***	0.85	0.73	3.41	−2666.52
85	1043	1.40 ***	0.57 ***	0.74 ***	0.20 **	0.86	0.73	4.29	−2667.67
90	1569	1.40 ***	0.57 ***	0.68 ***	0.33 ***	0.78	0.73	6.68	−2654.69
95	2361	1.40 ***	0.57 ***	0.67 ***	0.39 ***	0.75	0.73	10.56	−2655.53
100	3553	1.40 ***	0.57 ***	0.67 ***	0.39 ***	0.74	0.73	13.62	−2656.49
105	5348	1.40 ***	0.57 ***	0.67 ***	0.38 ***	0.75	0.73	18.78	−2657.66
Original and discards and antithetic draws
50	60	1.31 ***	0.56 ***	0.71 ***	0.05	0.96	0.73	1.03	−2663.30
55	90	1.38 ***	0.57 ***	0.72 ***	0.18	0.87	0.73	0.88	−2656.84
60	136	1.38 ***	0.57 ***	0.71 ***	0.16 **	0.88	0.73	1.24	−2660.55
65	204	1.38 ***	0.57 ***	0.71 ***	0.22 *	0.84	0.73	1.80	−2658.58
70	306	1.39 ***	0.57 ***	0.72 ***	0.21 ***	0.85	0.73	3.27	−2659.19
75	461	1.39 ***	0.57 ***	0.72 ***	0.21 ***	0.85	0.73	4.14	−2660.35
77	537	1.39 ***	0.57 ***	0.72 ***	0.20 ***	0.85	0.73	5.58	−2660.92
80	693	1.39 ***	0.57 ***	0.72 ***	0.20 ***	0.85	0.73	6.00	−2662.10
85	1043	1.39 ***	0.57 ***	0.72 ***	0.21 **	0.85	0.73	6.94	−2663.24
90	1569	1.40 ***	0.57 ***	0.70 ***	0.22 **	0.84	0.73	11.86	−2656.61
95	2361	1.40 ***	0.57 ***	0.70 ***	0.24	0.83	0.73	20.52	−2657.83
100	3553	1.40 ***	0.57 ***	0.65 ***	0.37 ***	0.76	0.73	28.32	−2652.80
105	5348	1.40 ***	0.57 ***	0.66 ***	0.37 ***	0.76	0.73	47.28	−2653.92

Notes: Significance levels are given by 0.01 ***, 0.05 **, and 0.10 * from the Student’s t-distribution on 3553 − 12 = 3541 degrees of freedom; x refers to the x in $⌈ n^{x / 100} ⌉$. The largest log likelihood (log ℓ) values are highlighted in pink in each set of simulations.

In Table A4, the discards are shown to reduce the run time for most models (11 of the 13 values of x), with the largest savings at the higher draw counts. For example, the original simulation code augmented by discards reduced the run time of the $x = 100$ case by over 36 percent (from 21.48 to 13.62 hours). These run-time gains disappear once antithetic draws are added to the discards, which is not surprising, as the likelihood function was evaluated twice as often in this implementation.
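As a rough illustration of the two modifications compared in Table A4, the Python sketch below builds a base-2 Halton sequence, skips an initial block of points (the discards), and appends the antithetic counterpart 1 − u of every retained draw. It is not the `sfm` implementation: the discard count (10), the base, the use of the half-normal quantile, and all helper names are assumptions made for this example.

```python
import math
from statistics import NormalDist


def radical_inverse(index: int, base: int = 2) -> float:
    """Element `index` (1-based) of the van der Corput sequence in `base`."""
    value, weight = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * weight
        weight /= base
    return value


def halton_draws(n_draws: int, base: int = 2, n_discard: int = 0) -> list:
    """Return `n_draws` Halton points after skipping the first `n_discard`."""
    start = n_discard + 1
    return [radical_inverse(i, base) for i in range(start, start + n_draws)]


def with_antithetic(uniforms: list) -> list:
    """Append the mirror image 1 - u of every draw (doubles the evaluations)."""
    return uniforms + [1.0 - u for u in uniforms]


def half_normal_mean(uniforms: list) -> float:
    """Simulated E[h] for h = |Z|, using the half-normal quantile of each draw."""
    std_normal = NormalDist()
    draws = [std_normal.inv_cdf((1.0 + u) / 2.0) for u in uniforms]
    return sum(draws) / len(draws)


plain = halton_draws(64)
discarded = halton_draws(64, n_discard=10)  # assumed discard count
enhanced = with_antithetic(discarded)       # 128 points instead of 64

exact = math.sqrt(2.0 / math.pi)            # true mean of a half-normal
estimate = half_normal_mean(enhanced)
print(len(plain), len(enhanced))
print(round(exact, 4), round(estimate, 4))
```

Every retained draw gains a partner, so the simulated integrand is evaluated at twice as many points, which is in line with the longer run times of the third panel. Under the folded transformation |Φ⁻¹(u)|, the draws u and 1 − u map to the same value, which is why this sketch pairs the antithetic draws with the half-normal quantile instead.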

Notes

1	Full information maximum likelihood estimation can be complex to implement and time-consuming for the GTRE.
2	With 399,000 results for “stochastic frontier analysis” on Google Scholar (conducted on 9 October 2024), the field is mature and very broadly applied. For the four-component model, Martini et al. (2024) measured persistent and transient productive efficiency in the African airline industry, Badunenko and Kumbhakar (2017) measured the effects of regulation in the banking sector, and Bernstein (2020) measured persistent and transient efficiency in the electricity sector. As the four-component model has been developed and coded more recently, the scope of stochastic frontier models can more readily be seen in earlier models of the Aigner et al. (1977) variety. In the cross-sectional (and pooled-cross-sectional) realm, there are many examples of applications of stochastic frontier analysis. For instance, Cullinane and Song (2006) estimated the technical efficiency of container ports in the United Kingdom, Mastromarco and Ghosh (2009) measured the technical efficiency of real GDP in developing countries, Lin and Long (2015) measured the energy efficiency of China’s chemical industry, Fenn et al. (2008) measured the efficiency of European insurance companies, Charoenrat and Harvie (2014) measured the efficiency of small- and medium-sized enterprises in the Thai manufacturing sector, Kraft and Tırtıroğlu (1998) measured bank efficiency in Croatia, D’Errico (2024) measured environmental–economic efficiency in OECD countries, Lamb and Tee (2024) applied stochastic frontier methods in modeling investment performance of the FTSE, S&P, and FTSE, and Last and Wetzel (2010) measured the efficiency of German public theaters. Last and Wetzel (2010) also estimated the true random effects panel data model, with and without the Mundlak formulation. The broad scope of these cross-sectional examples could readily be applied to panel models, assuming data availability. See Parmeter and Kumbhakar (2014) and Kumbhakar et al. (2020) for a review of recent advances.
3	$⌈ \cdot ⌉$ denotes the ceiling operator.
4	Andor et al. (2019) discuss issues surrounding upward bias from a regulatory perspective.
5	Halton draws are selected herein, as utilized by Filippini and Greene (2016); however, other quasi-random sequences could be examined, including the Sobol sequences or Hammersley sets.
6	See Lee (1992) for asymptotic theory and simulations utilizing the independent draws approach for discrete response models.
7	The Halton sequences are generated via the `halton` call in the `randtoolbox` package for `R` (R Core Team 2024). The sequences are then passed through the `qnorm` call via the `stats` package, as explained in the `halton` call R Documentation, and the absolute value is taken for the $h_{i}$ ’s, as shown in Equation (4). This approach follows Filippini and Greene (2016), as they note on page 192. An anonymous referee pointed out that a more direct approach would be to transform the Halton draws via the half-normal quantile function $\sqrt{2} \erf^{- 1} (F)$ as a way to possibly reduce clustering of the “folded” draws and to potentially improve performance. The benefits of this approach are explored in Section 4.
8	This integration is implemented via the `ptmvnorm` call in the `tmvtnorm` package. Equation (5) is the panel analogue to the cross-sectional approach of Battese and Coelli (1995).
9	Computations were conducted in `R`, in a simulation version of the `sfm()` package. The general package is available at https://github.com/davidhbernstein/sfm.
10	The total implied run-time for the 720,000 simulations herein was 3.99 years, of which $H_{3 / 4}$ took 47 percent of the time. These simulations were generally run in parallel with at least eight cores and on multiple machines, dramatically decreasing the overall time to obtain these results. The results for $H_{13 / 24}$ ran with a slightly different setup, causing it to take more time than expected.
11	These percentage estimates were significant at the 0.01 percent level in each regression with 720,000 observations. The log-linear regression coefficients in Table 2 are transformed via $(\exp (β) - 1) \times 100$ for more precise percentage interpretations in the text.
12	The correlation was taken after $r_{i}$ was transformed to standard normal draws and $h_{i}$ was transformed to half-normal draws via $\sqrt{2} \erf^{- 1} (\cdot)$ , which was implemented via the `erfinv` command in the `pracma` package.
13	These enhancements are now implemented in version 4.0 of the `sfm` package for the GTRE model. The random samples are generated with the base `R` `sample` command.
14	Results from Bernstein (2020) are successfully replicated. The original model can be found in Bernstein (2020) Table 11 column “GTRE”.
15	Information on the control variables can be found in Bernstein (2020), but is omitted from this discussion.
16	No model runs required more than one million function evaluations.
17	The confidence intervals are computed in the usual way via $\hat{β} \pm 1.96 \times SE (\hat{β})$. For $x = 105$ in the enhanced case, standard errors were not calculable from the numerical Hessian for $σ_{h}$, $σ_{r}$, and the intercept, so they are omitted from Figure 4. Bootstrap methods could be utilized to compute standard errors in practice, but are not utilized herein.
18	These additional parameter plots are omitted, as they are very similar to the ln NG case, but they are available upon request.
19	If this occurred in practice, particle swarm optimization techniques could be utilized to more effectively maximize the log likelihood, as was carried out in Bernstein (2020). For this demonstration, however, such methodologies are not utilized, given the time constraints.
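The two ways of turning uniform draws into half-normal draws described in Notes 7 and 12 can be compared in a short Python sketch. This is an illustration under stated assumptions, not the author's R code: the function names and the small grid of uniforms are hypothetical, and the half-normal quantile $\sqrt{2} \erf^{- 1} (F)$ is computed through the equivalent expression $Φ^{- 1} ((1 + F) / 2)$ because the Python standard library has no inverse error function.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def folded_draw(u: float) -> float:
    """Normal quantile followed by an absolute value (the 'folded' approach)."""
    return abs(STD_NORMAL.inv_cdf(u))


def half_normal_quantile(u: float) -> float:
    """sqrt(2) * erfinv(u), written as the normal quantile at (1 + u) / 2."""
    return STD_NORMAL.inv_cdf((1.0 + u) / 2.0)


uniforms = [0.125, 0.25, 0.5, 0.75, 0.875]  # illustrative grid, not Halton output
folded = [folded_draw(u) for u in uniforms]
direct = [half_normal_quantile(u) for u in uniforms]

for u, f, d in zip(uniforms, folded, direct):
    print(f"u = {u:.3f}  folded = {f:.4f}  direct = {d:.4f}")

# Consistency check: erf(q / sqrt(2)) recovers the uniform that produced q.
recovered = [math.erf(q / math.sqrt(2.0)) for q in direct]
```

Folding sends u and 1 − u to the same value, so the folded draws are not monotone in the underlying uniform, whereas the direct quantile is strictly increasing in u.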

References

Aigner, Dennis, C. A. Knox Lovell, and Peter Schmidt. 1977. Formulation and estimation of stochastic frontier production models. Journal of Econometrics 6: 21–37.
Andor, Mark A., Christopher Parmeter, and Stephan Sommer. 2019. Combining uncertainty with uncertainty to get certainty? Efficiency analysis for regulation purposes. European Journal of Operational Research 274: 240–52.
Andor, Mark A., David H. Bernstein, Christopher F. Parmeter, and Stephan Sommer. 2023. Internal Meta-Analysis for Monte Carlo Simulations. Essen: RUHR Economic Papers.
Badunenko, Oleg, and Subal C. Kumbhakar. 2016. When, where and how to estimate persistent and transient efficiency in stochastic frontier panel data models. European Journal of Operational Research 255: 272–87.
Badunenko, Oleg, and Subal C. Kumbhakar. 2017. Economies of scale, technical change and persistent and time-varying cost efficiency in Indian banking: Do ownership, regulation and heterogeneity matter? European Journal of Operational Research 260: 789–803.
Battese, George E., and Timothy J. Coelli. 1995. A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics 20: 325–32.
Bernstein, David H. 2020. An updated assessment of technical efficiency and returns to scale for U.S. electric power plants. Energy Policy 147: 111896.
Butler, John S., and Robert Moffitt. 1982. A Computationally Efficient Quadrature Procedure for the One-Factor Multinomial Probit Model. Econometrica 50: 761–64.
Charoenrat, Teerawat, and Charles Harvie. 2014. The efficiency of SMEs in Thai manufacturing: A stochastic frontier analysis. Economic Modelling 43: 372–93.
Colombi, Roberto. 2010. A skew normal stochastic frontier model for panel data. In Proceedings of the 45-th Scientific Meeting of the Italian Statistical Society. Padua: CLEUP.
Colombi, Roberto, Gianmaria Martini, and Giorgio Vittadini. 2011. A Stochastic Frontier Model with Short-Run and Long-Run Inefficiency Random Effects. Working Paper Series; Bergamo: Department of Economics and Technology Management, University of Bergamo.
Colombi, Roberto, Subal C. Kumbhakar, Gianmaria Martini, and Giorgio Vittadini. 2014. Closed-Skew Normality in Stochastic Frontiers with Individual Effects and Long/Short-Run Efficiency. Journal of Productivity Analysis 42: 123–36.
Cullinane, Kevin, and Dong-Wook Song. 2006. Estimating the Relative Efficiency of European Container Ports: A Stochastic Frontier Analysis. Research in Transportation Economics 16: 85–115.
D’Errico, Maria Chiara. 2024. Sustainable economic growth and energy security nexus: A stochastic frontier analysis across OECD countries. Energy Economics 132: 107447.
Fenn, Paul, Dev Vencappa, Stephen Diacon, Paul Klumpes, and Chris O’Brien. 2008. Market structure and the efficiency of European insurance companies: A stochastic frontier analysis. Journal of Banking & Finance 32: 86–100.
Filippini, Massimo, and William Greene. 2016. Persistent and transient productive inefficiency: A maximum simulated likelihood approach. Journal of Productivity Analysis 45: 187–96.
Greene, William. 2003. Simulated Likelihood Estimation of the Normal-Gamma Stochastic Frontier Function. Journal of Productivity Analysis 19: 179–90.
Halton, John H. 1960. On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numerische Mathematik 2: 84–90.
Kraft, Evan, and Doğan Tırtıroğlu. 1998. Bank Efficiency in Croatia: A Stochastic-Frontier Analysis. Journal of Comparative Economics 26: 282–300.
Kumbhakar, Subal C., Christopher F. Parmeter, and Valentin Zelenyuk. 2020. Stochastic Frontier Analysis: Foundations and Advances I. In Handbook of Production Economics. Singapore: Springer, pp. 1–40.
Kumbhakar, Subal C., Gudbrand Lien, and J. Brian Hardaker. 2014. Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming. Journal of Productivity Analysis 41: 321–37.
Lamb, John D., and Kai-Hong Tee. 2024. Using stochastic frontier analysis instead of data envelopment analysis in modelling investment performance. Annals of Operations Research 332: 891–907.
Last, Anne-Kathrin, and Heike Wetzel. 2010. The efficiency of German public theaters: A stochastic frontier analysis approach. Journal of Cultural Economics 34: 89–110.
Lee, Lung-Fei. 1992. On Efficiency of Methods of Simulated Moments and Maximum Simulated Likelihood Estimation of Discrete Response Models. Econometric Theory 8: 518–52.
Lin, Boqiang, and Houyin Long. 2015. A stochastic frontier analysis of energy efficiency of China’s chemical industry. Journal of Cleaner Production 87: 235–44.
Martini, Gianmaria, Flavio Porta, and Davide Scotti. 2024. Persistent and transient productive efficiency in the African airline industry. Journal of Productivity Analysis 61: 259–78.
Mastromarco, Camilla, and Sucharita Ghosh. 2009. Foreign Capital, Human Capital, and Efficiency: A Stochastic Frontier Analysis for Developing Countries. World Development 37: 489–502.
McFadden, Daniel. 1989. A Method of Simulated Moments for Estimation of Discrete Response Models without Numerical Integration. Econometrica 57: 995–1026.
Meeusen, Wim, and Julien van Den Broeck. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review 18: 435–44.
Parmeter, Christopher F., and Subal C. Kumbhakar. 2014. Efficiency Analysis: A Primer on Recent Advances. Foundations and Trends in Econometrics 7: 191–385.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna: R Core Team.
Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge, MA: Cambridge University Press, vol. 2.

Figure 1. Four comparisons: halton (randtoolbox package), rnorm (stats package), and the true Gaussian density, generated in R.

Figure 2. Run-times for scenario 5 ($σ_{v} = 0.2$ and $σ_{u} = σ_{r} = σ_{h} = 0.04$).

Figure 3. Median total technical efficiency for scenario 16 ($σ_{v} = σ_{u} = σ_{r} = σ_{h} = 0.2$).

Figure 4. Parameter Plots.

Figure 5. Technical efficiency plots.

Table 1. Monte Carlo $σ$-combinations, as in Table 1 in Badunenko and Kumbhakar (2016).

Scenario	$σ_{h}$	$σ_{u}$	$σ_{r}$	$σ_{v}$	$λ_{0} = σ_{h} / σ_{r}$	$λ = σ_{u} / σ_{v}$	$Λ = σ_{h} / σ_{u}$
1	$0.04$	$0.04$	$0.04$	$0.04$	1	1	1
2	$0.04$	$0.2$	$0.04$	$0.04$	1	5	$0.2$
3	$0.2$	$0.04$	$0.04$	$0.04$	5	1	5
4	$0.2$	$0.2$	$0.04$	$0.04$	5	5	1
5	$0.04$	$0.04$	$0.04$	$0.2$	1	$0.2$	1
6	$0.04$	$0.2$	$0.04$	$0.2$	1	1	$0.2$
7	$0.2$	$0.04$	$0.04$	$0.2$	5	$0.2$	5
8	$0.2$	$0.2$	$0.04$	$0.2$	5	1	1
9	$0.04$	$0.04$	$0.2$	$0.04$	$0.2$	1	1
10	$0.04$	$0.2$	$0.2$	$0.04$	$0.2$	5	$0.2$
11	$0.2$	$0.04$	$0.2$	$0.04$	1	1	5
12	$0.2$	$0.2$	$0.2$	$0.04$	1	5	1
13	$0.04$	$0.04$	$0.2$	$0.2$	$0.2$	$0.2$	1
14	$0.04$	$0.2$	$0.2$	$0.2$	$0.2$	1	$0.2$
15	$0.2$	$0.04$	$0.2$	$0.2$	1	$0.2$	5
16	$0.2$	$0.2$	$0.2$	$0.2$	1	1	1
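The layout of Table 1 can be reproduced with a short Python sketch: each of the four standard deviations takes the value 0.04 or 0.2, and the three ratios follow from the column definitions $λ_{0} = σ_{h} / σ_{r}$, $λ = σ_{u} / σ_{v}$, and $Λ = σ_{h} / σ_{u}$. The function and key names are hypothetical, and the loop order is an assumption chosen so that the scenario numbering matches the table.

```python
from itertools import product

LOW, HIGH = 0.04, 0.2


def build_scenarios() -> list:
    """Enumerate the 16 sigma-combinations and the implied ratios."""
    scenarios = []
    # product() varies the last element fastest, so sigma_u changes first,
    # then sigma_h, then sigma_v, then sigma_r, as in Table 1.
    grid = product((LOW, HIGH), repeat=4)
    for number, (s_r, s_v, s_h, s_u) in enumerate(grid, start=1):
        scenarios.append({
            "scenario": number,
            "sigma_h": s_h,
            "sigma_u": s_u,
            "sigma_r": s_r,
            "sigma_v": s_v,
            "lambda_0": round(s_h / s_r, 6),  # persistent signal-to-noise
            "lambda": round(s_u / s_v, 6),    # transient signal-to-noise
            "Lambda": round(s_h / s_u, 6),    # persistent relative to transient
        })
    return scenarios


scenarios = build_scenarios()
for row in scenarios:
    print(row)
```

Each ratio can only take the values 0.2, 1, or 5, which is why the meta-regressions use indicator variables for the values 1 and 5 with 0.2 as the base case.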

Table 2. Internal meta-regressions: Three sources of log(mse).

	Persistent	Transient	Total
N = 100, t = 3	0.265 ***	0.430 ***	0.259 ***
	(0.006)	(0.005)	(0.005)
N = 100, t = 6	0.106 ***	0.154 ***	0.104 ***
	(0.006)	(0.005)	(0.005)
N = 50, t = 10	0.148 ***	0.097 ***	0.142 ***
	(0.006)	(0.005)	(0.005)
N = 50, t = 3	0.353 ***	0.589 ***	0.401 ***
	(0.006)	(0.005)	(0.005)
N = 50, t = 6	0.244 ***	0.275 ***	0.245 ***
	(0.006)	(0.005)	(0.005)
N = 500, t = 10	−0.167 ***	−0.156 ***	−0.198 ***
	(0.005)	(0.004)	(0.005)
N = 500, t = 3	0.054 ***	0.171 ***	−0.013 **
	(0.006)	(0.005)	(0.005)
N = 500, t = 6	−0.131 ***	−0.048 ***	−0.161 ***
	(0.005)	(0.004)	(0.005)
$H_{13 / 24}$	−0.030 ***	−0.011 ***	−0.040 ***
	(0.004)	(0.004)	(0.004)
$H_{7 / 12}$	0.003	−0.011 ***	−0.020 ***
	(0.005)	(0.004)	(0.004)
$H_{2 / 3}$	−0.016 ***	−0.017 ***	−0.050 ***
	(0.004)	(0.004)	(0.004)
$H_{3 / 4}$	−0.033 ***	−0.021 ***	−0.063 ***
	(0.004)	(0.004)	(0.004)
$Λ = 1$	1.167 ***	−1.257 ***	−0.070 ***
	(0.004)	(0.003)	(0.003)
$Λ = 5$	2.208 ***	−2.563 ***	0.305 ***
	(0.005)	(0.003)	(0.004)
$λ_{0} = 1$	−0.316 ***	0.532 ***	−0.043 ***
	(0.004)	(0.003)	(0.003)
$λ_{0} = 5$	−1.025 ***	1.154 ***	−0.653 ***
	(0.005)	(0.003)	(0.004)
$λ = 1$	0.358 ***	−0.694 ***	0.003
	(0.004)	(0.003)	(0.003)
$λ = 5$	0.845 ***	−1.406 ***	−0.349 ***
	(0.004)	(0.003)	(0.004)
Constant	−6.782 ***	−4.741 ***	−4.905 ***
	(0.007)	(0.005)	(0.006)
Observations	720,000	720,000	720,000
R²	0.197	0.337	0.118
Adjusted R²	0.197	0.337	0.118
Residual Std. Error (df = 719981)	1.200	0.955	1.040
F Statistic (df = 18; 719981)	9827 ***	20,372 ***	5342 ***
Base case average $H_{1 / 2}$	0.0095	0.0049	0.0117

Notes: Robust standard errors clustered at the replication level in parentheses. Significance levels are the following: * p < 0.1; ** p < 0.05; *** p < 0.01. The base case in these regressions is $H_{1 / 2}$, $λ = λ_{0} = Λ = 0.2$, and N = 100, t = 10.
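Because the dependent variable in Table 2 is log(mse), the coefficients on the Halton indicators are read as percentage changes after the transformation $(\exp (β) - 1) \times 100$ described in Note 11. The Python sketch below applies that transformation to the Total column; the dictionary labels and the function name are hypothetical.

```python
import math


def percent_change(beta: float) -> float:
    """Exact percentage change implied by a log-linear coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients copied from the Total column of Table 2.
total_log_mse = {
    "H_13/24": -0.040,
    "H_7/12": -0.020,
    "H_2/3": -0.050,
    "H_3/4": -0.063,
}

for label, beta in total_log_mse.items():
    print(f"{label}: {percent_change(beta):.1f} percent")
```

For $H_{3 / 4}$ and $H_{2 / 3}$, the transformation yields reductions of 6.1 and 4.9 percent, the figures quoted in the abstract.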

Table 3. Summary statistics for natural gas electric generating plants from 1994–2016.

	Y	K	L	Oil	NG
units	kWh/10³	MW (h)	employees	barrels/10³	Mcf/10³
mean	483,178	432	26	76	6072
sd	576,553	421	28	619	10,965
min	1018	6	1	0	2
max	2,145,656	2764	214	24,331	154,002
Observations $= 3553$ (unbalanced panel)

Notes: Y, Oil, and NG are each divided by 10³. MW stands for megawatts, kWh for kilowatt-hours, h for hour, and Mcf for thousands of cubic feet.

Table 4. Selected coefficients, TE averages, and run times.

x	$⌈ n^{x / 100} ⌉$	$λ$	$σ$	$σ_{r}$	$σ_{h}$	TE-p	TE-tr	Hours	log ℓ
Original
50	60	1.38 ***	0.57 ***	0.81 ***	0.63 ***	0.63	0.73	0.57	−2707.63
55	90	1.39 ***	0.57 ***	0.82 ***	0.49 ***	0.69	0.73	0.63	−2717.54
60	136	1.40 ***	0.57 ***	0.72 ***	0.47 ***	0.70	0.73	0.77	−2690.55
65	204	1.39 ***	0.57 ***	0.74 ***	0.46 ***	0.71	0.73	1.02	−2688.93
70	306	1.39 ***	0.57 ***	0.81 ***	0.00	1.00	0.73	1.18	−2688.94
75	461	1.39 ***	0.57 ***	0.81 ***	0.00	1.00	0.73	2.11	−2690.57
77	537	1.41 ***	0.57 ***	0.74 ***	0.29 ***	0.80	0.73	2.51	−2673.41
80	693	1.41 ***	0.57 ***	0.73 ***	0.34 *	0.77	0.73	4.07	−2674.24
85	1043	1.40 ***	0.57 ***	0.73 ***	0.24 ***	0.83	0.73	5.81	−2667.48
90	1569	1.40 ***	0.57 ***	0.73 ***	0.24 ***	0.83	0.73	7.20	−2668.59
95	2361	1.40 ***	0.57 ***	0.67 ***	0.39 ***	0.75	0.73	10.79	−2655.32
100	3553	1.40 ***	0.57 ***	0.67 ***	0.41 ***	0.74	0.73	21.48	−2656.65
105	5348	1.40 ***	0.57 ***	0.67 ***	0.40 ***	0.74	0.73	20.91	−2657.81
Original and Enhancements
50	60	1.28 ***	0.56 ***	0.71 ***	0.32 ***	0.78	0.74	0.90	−2661.90
55	90	1.36 ***	0.57 ***	0.73 ***	0.12 **	0.91	0.73	0.79	−2656.36
60	136	1.39 ***	0.57 ***	0.74 ***	0.00	1.00	0.73	0.90	−2662.65
65	204	1.40 ***	0.57 ***	0.73 ***	0.08	0.94	0.73	1.58	−2660.55
70	306	1.40 ***	0.57 ***	0.73 ***	0.07	0.95	0.73	2.32	−2661.22
75	461	1.40 ***	0.57 ***	0.74 ***	0.00	1.00	0.73	3.72	−2663.09
77	537	1.39 ***	0.57 ***	0.72 ***	0.20 *	0.86	0.73	5.04	−2661.04
80	693	1.40 ***	0.57 ***	0.73 ***	0.11 *	0.92	0.73	6.18	−2663.44
85	1043	1.40 ***	0.57 ***	0.73 ***	0.14 **	0.90	0.73	6.73	−2664.02
90	1569	1.40 ***	0.57 ***	0.72 ***	0.14 *	0.90	0.73	12.43	−2657.93
95	2361	1.41 ***	0.57 ***	0.70 ***	0.21 **	0.85	0.73	16.31	−2657.61
100	3553	1.40 ***	0.57 ***	0.68 ***	0.30 ***	0.80	0.73	27.60	−2654.04
105	5348	1.40 ***	0.57 ***	0.70	0.00	1.00	0.73	38.64	−2655.33

Notes: Significance levels are given by 0.01 ***, 0.05 **, and 0.10 * from the Student’s t-distribution on 3553 − 12 = 3541 degrees of freedom; x refers to the x in $⌈ n^{x / 100} ⌉$. “Enhancements” refers to discards, antithetic draws, $\sqrt{2} \erf^{- 1} (\cdot)$, and shuffling. Appendix Table A4 provides further testing of modifications made to the original case. The largest log likelihood (log ℓ) values are highlighted in pink in each set of simulations.
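The second column of Table 4 follows from the ceiling rule $⌈ n^{x / 100} ⌉$ with n = 3553 observations. A minimal Python sketch of that rule is given below; the function name is hypothetical.

```python
import math


def halton_draw_count(n_obs: int, x: float) -> int:
    """Number of Halton draws implied by the rule ceil(n ** (x / 100))."""
    return math.ceil(n_obs ** (x / 100.0))


N_OBS = 3553  # observations in the unbalanced panel (Table 3)

for x in (50, 75, 100):
    print(x, halton_draw_count(N_OBS, x))
```

The values x = 50, 75, and 100 return the tabulated 60, 461, and 3553 draws. The x = 77 row is left out of the loop because the tabulated 537 is not what the rule returns for x = 77 exactly, which may reflect rounding of x in the table.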

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
