Non-Asymptotic Confidence Sets for Circular Means

Hotz, Thomas; Kelma, Florian; Wieditz, Johannes

doi:10.3390/e18100375


Non-Asymptotic Confidence Sets for Circular Means^†

by

Thomas Hotz

^*,‡,

Florian Kelma

^‡ and

Johannes Wieditz

^‡

Institut für Mathematik, Technische Universität Ilmenau, 98684 Ilmenau, Germany

^*

Author to whom correspondence should be addressed.

^†

This paper is an extended version of our paper published in Proceedings of the 2nd International Conference on Geometric Science of Information, Palaiseau, France, 28–30 October 2015; Nielsen, F., Barbaresco, F., Eds.; Lecture Notes in Computer Science, Volume 9389; Springer International Publishing: Cham, Switzerland, 2015; pp. 635–642.

^‡

These authors contributed equally to this work.

Entropy 2016, 18(10), 375; https://doi.org/10.3390/e18100375

Submission received: 15 July 2016 / Revised: 10 October 2016 / Accepted: 13 October 2016 / Published: 20 October 2016

(This article belongs to the Special Issue Differential Geometrical Theory of Statistics)


Abstract

The mean of data on the unit circle is defined as the minimizer of the average squared Euclidean distance to the data. Based on Hoeffding’s mass concentration inequalities, non-asymptotic confidence sets for circular means are constructed which are universal in the sense that they require no distributional assumptions. These are then compared with asymptotic confidence sets in simulations and for a real data set.

Keywords:

directional data; circular mean; universal confidence sets; non-asymptotic confidence sets; mass concentration inequalities; Hoeffding’s inequality

MSC:

62H11; 62G15

1. Introduction

In applications, data assuming values on the circle, i.e., circular data, arise frequently, examples being measurements of wind directions, or time of the day that patients are admitted to a hospital unit. We refer to the literature, e.g., [1,2,3,4,5], for an overview of statistical methods for circular data, in particular the ones described in this section.

Here, we will concern ourselves with the arguably simplest statistic, the mean. However, given that a circle does not carry a vector space structure, i.e., there is neither a natural addition of points on the circle nor can one divide them by a natural number, what should the meaning of “mean” be?

In order to simplify the exposition, we specifically consider the unit circle in the complex plane,

S^{1} = {z \in C : | z | = 1}

, and we assume the data can be modelled as independent random variables

Z_{1}, \dots, Z_{n}

which are identically distributed as the random variable Z taking values in

S^{1}

. In the literature, however, the circle is often taken to lie in the real plane

R^{2}

, i.e., while we denote the point on the circle corresponding to an angle

θ \in (- π, π]

by

\exp (i θ) = \cos (θ) + i \sin (θ) \in C

one may take it to be

(\cos θ, \sin θ) \in R^{2}

.

Of course, C is a real vector space, so the Euclidean sample mean

{\bar{Z}}_{n} = \frac{1}{n} \sum_{k = 1}^{n} Z_{k} \in C

is well-defined. However, unless all

Z_{k}

take identical values, it will (by the strict convexity of the closed unit disc) lie inside the circle, i.e., its modulus

| {\bar{Z}}_{n} |

will be less than 1. Though

{\bar{Z}}_{n}

cannot be taken as a mean on the circle, if

{\bar{Z}}_{n} \neq 0

, one might say that it specifies a direction; this leads to the idea of calling

{\bar{Z}}_{n} / | {\bar{Z}}_{n} |

the circular sample mean of the data.

Observing that the Euclidean sample mean is the minimiser of the sum of squared distances, this can be put in the more general framework of Fréchet means [6]: define the set of circular sample means to be

{\hat{μ}}_{n} = \underset{ζ \in S^{1}}{argmin} \sum_{k = 1}^{n} {| Z_{k} - ζ |}^{2},

(1)

and analogously define the set of circular population means of the random variable Z to be

μ = \underset{ζ \in S^{1}}{argmin} E {| Z - ζ |}^{2} .

(2)

Then, as usual, the circular sample means are the circular population means with respect to the empirical distribution of

Z_{1}, \dots, Z_{n}

.

The circular population mean can be related to the Euclidean population mean

E Z

by noting that

E {| Z - ζ |}^{2} = E {| Z - E Z |}^{2} + {| E Z - ζ |}^{2}

(in statistics, this is called the bias-variance decomposition), so that

μ = \underset{ζ \in S^{1}}{argmin} {| E Z - ζ |}^{2}

(3)

is the set of points on the circle closest to

E Z

. It follows that μ is unique if and only if

E Z \neq 0

in which case it is given by

μ = E Z / | E Z |

, the orthogonal projection of

E Z

onto the circle; otherwise, i.e., if

E Z = 0

, the set of circular population means is all of

S^{1}

. We consider the information that the circular population mean is not unique (for example, but not exclusively, because Z is uniformly distributed over the circle) to be relevant; it thus should be inferred from the data as well. Analogously,

{\hat{μ}}_{n}

is either all of

S^{1}

or uniquely given by

{\bar{Z}}_{n} / | {\bar{Z}}_{n} |

according to whether

{\bar{Z}}_{n}

is 0 or not. Note that

{\bar{Z}}_{n} \neq 0

a.s. if Z is continuously distributed on the circle, even if

E Z = 0

.

{\bar{Z}}_{n}

is what is known as the vector resultant, while

{\bar{Z}}_{n} / | {\bar{Z}}_{n} |

is sometimes referred to as the mean direction.
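The definitions above translate directly into a few lines of code. The following is a minimal sketch, assuming the data are given as angles in radians; the function name `circular_sample_mean` and the numerical tolerance `tol` for deciding that the Euclidean mean vanishes are illustrative choices, not part of the original text.

```python
import cmath
import math

def circular_sample_mean(angles_rad, tol=1e-12):
    """Return (mean_direction, z_bar) for angles given in radians.

    mean_direction is None when the Euclidean sample mean z_bar is
    (numerically) zero, i.e. when every point of the circle is a
    circular sample mean. The tolerance is an assumption of this sketch.
    """
    z = [cmath.exp(1j * theta) for theta in angles_rad]   # points on S^1
    z_bar = sum(z) / len(z)                               # Euclidean sample mean in C
    if abs(z_bar) < tol:
        return None, z_bar
    return z_bar / abs(z_bar), z_bar                      # projection onto the circle

# Usage: three directions close to 0 degrees, one of them given as 350 degrees.
mu_hat, z_bar = circular_sample_mean([math.radians(a) for a in (350.0, 10.0, 20.0)])
mean_angle_deg = math.degrees(cmath.phase(mu_hat))
```

Averaging the raw angles 350, 10 and 20 would give a misleading value near 127 degrees, whereas the mean direction stays close to the data; this is the reason for projecting the Euclidean mean instead of averaging angles.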

The expected squared distances minimised in Equation (2) are given by the metric inherited from the ambient space C; therefore, μ is also called the set of extrinsic population means. If we measured distances intrinsically along the circle, i.e., using arc-length instead of chordal distance, we would obtain what is called the set of intrinsic population means. We will not consider the latter in the following, see e.g., [7] for a comparison and [8,9] for generalizations of these concepts.

Our aim is to construct confidence sets for the circular population mean μ that form a superset of μ with a certain (so-called) coverage probability that is required to be not less than some pre-specified confidence level

1 - α

for

α \in (0, 1)

.

The classical approach is to construct an asymptotic confidence interval where the coverage probability converges to

1 - α

when n tends to infinity. This can be done as follows: since Z is a bounded random variable,

\sqrt{n} ({\bar{Z}}_{n} - E Z)

converges to a bivariate normal distribution when identifying C with

R^{2}

. Now, assume

E Z \neq 0

so μ is unique. Then, the orthogonal projection is differentiable in a neighbourhood of

E Z

, so the δ-method (see e.g., [1] (p. 111) or [4] (Lemma 3.1)) can be applied and one easily obtains

\sqrt{n} Arg (μ^{- 1} {\hat{μ}}_{n}) \overset{D}{\to} N (0, \frac{E {(Im (μ^{- 1} Z))}^{2}}{{| E Z |}^{2}}),

(4)

where

Arg : C \ {0} \to (- π, π] \subset R

denotes the argument of a complex number (it is defined arbitrarily at

0 \in C

), while multiplying with

μ^{- 1}

rotates such that

E Z = μ

is mapped to

0 \in (- π, π]

, see e.g., [4] (Proposition 3.1) or [7] (Theorem 5). Estimating the asymptotic variance and applying Slutsky’s lemma, one arrives at the asymptotic confidence set

C_{A} = {ζ \in S^{1} : | Arg (ζ^{- 1} {\hat{μ}}_{n}) | < δ_{A}}

provided

{\hat{μ}}_{n}

is unique, where the angle determining the interval is given by

δ_{A} = \frac{q_{1 - \frac{α}{2}}}{n | {\bar{Z}}_{n} |} \sqrt{\sum_{k = 1}^{n} {(Im ({\hat{μ}}_{n}^{- 1} Z_{k}))}^{2}},

(5)

with

q_{1 - \frac{α}{2}}

denoting the

(1 - \frac{α}{2})

-quantile of the standard normal distribution

N (0, 1)

.
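To make Equation (5) concrete, here is a small sketch of how the opening angle of the asymptotic confidence set could be computed. It assumes the sample is available as a list of complex numbers of modulus 1; the function name `asymptotic_opening_angle` is hypothetical, and the normal quantile is taken from Python's standard library.

```python
import cmath
import math
from statistics import NormalDist

def asymptotic_opening_angle(z, alpha):
    """Opening angle delta_A of Equation (5), in radians.

    z is a list of unit-modulus complex numbers; raises if the circular
    sample mean is not unique (Euclidean sample mean equal to zero).
    """
    n = len(z)
    z_bar = sum(z) / n
    if z_bar == 0:
        raise ValueError("circular sample mean is not unique")
    mu_hat = z_bar / abs(z_bar)
    q = NormalDist().inv_cdf(1 - alpha / 2)        # (1 - alpha/2)-quantile of N(0, 1)
    # Im(mu_hat^{-1} Z_k) is the component of Z_k perpendicular to mu_hat.
    perp_sq = sum((zk / mu_hat).imag ** 2 for zk in z)
    return q / (n * abs(z_bar)) * math.sqrt(perp_sq)

# Usage: 200 observations at +10 degrees and 200 at -10 degrees.
sample = [cmath.exp(1j * math.radians(10))] * 200 + [cmath.exp(-1j * math.radians(10))] * 200
delta_a_deg = math.degrees(asymptotic_opening_angle(sample, 0.05))
```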

There are two major drawbacks to the use of asymptotic confidence intervals: firstly, by definition, they do not guarantee a coverage probability of at least

1 - α

for finite n, so the coverage probability for a fixed distribution and sample size may be much smaller. Indeed, Simulation 2 in Section 4 demonstrates that, even for

n = 100

, the coverage probability may be as low as

64 %

when constructing the asymptotic confidence set for

1 - α = 90 %

. Secondly, they assume that

E Z \neq 0

, so they are not applicable to all distributions on the circle. Since in practice it is unknown whether this assumption holds, one would have to test the hypothesis

E Z = 0

, possibly again by an asymptotic test, and construct the confidence set conditioned on this hypothesis having been rejected, setting

C_{A} = S^{1}

otherwise. However, this sequential procedure would require some adaptation taking the pre-test into account (cf. e.g., [10])—we come back to this point in Section 5—and it is not commonly implemented in practice.

We therefore aim to construct non-asymptotic confidence sets for μ, guaranteeing coverage with at least the desired probability for any sample size n, which in addition are universal in the sense that they do not make any distributional assumptions about the circular data besides them being independent and identically distributed. It has been shown in [7] that this is possible; however, the confidence sets that were constructed there were far too large to be of use in practice. Nonetheless, we start by varying that construction in Section 2 but using Hoeffding’s inequality instead of Chebyshev’s as in [7]. Considerable improvements are possible if one takes the variance

E {(Im (μ^{- 1} Z))}^{2}

“perpendicular to

E Z

” into account; this is achieved by a second construction in Section 3. Of course, the latter confidence sets will still be conservative but Proposition 2(iv) shows that they are (for

1 - α = 95 %

) only a factor

\sim \frac{3}{2}

longer than the asymptotic ones when the sample size n is large. We further illustrate and compare those confidence sets in simulations and for an application to real data in Section 4, discussing the results obtained in Section 5.

2. Construction Using Hoeffding’s Inequality

We will construct a confidence set as the acceptance region of a series of tests. This idea has been used before for the construction of confidence sets for the circular population mean [7] (Section 6); however, we will modify that construction by replacing Chebyshev’s inequality—which is too conservative here—by three applications of Hoeffding’s inequality [11] (Theorem 1): if

U_{1}, \dots, U_{n}

are independent random variables taking values in the bounded interval

[a, b]

with

- \infty < a < b < \infty,

then

{\bar{U}}_{n} = \frac{1}{n} \sum_{k = 1}^{n} U_{k}

with

E {\bar{U}}_{n} = ν

fulfills

P ({\bar{U}}_{n} - ν \geq t) \leq {[{(\frac{ν - a}{ν - a + t})}^{ν - a + t} {(\frac{b - ν}{b - ν - t})}^{b - ν - t}]}^{\frac{n}{b - a}}

(6)

for any

t \in (0, b - ν)

. The bound on the right-hand side—denoted

β (t)

—is continuous and strictly decreasing in t (as expected; see Appendix A) with

β (0) = 1

and

\lim_{t \to b - ν} β (t) = {(\frac{ν - a}{b - a})}^{n}

whence a unique solution

t = t (γ, ν, a, b)

to the equation

β (t) = γ

exists for any

γ \in ({(\frac{ν - a}{b - a})}^{n}, 1)

. Moreover,

t (γ, ν, a, b)

is strictly decreasing in

γ .

Furthermore,

ν + t (γ, ν, a, b)

is strictly increasing in ν (see Appendix A again), which is also to be expected. While there is no closed form expression for

t (γ, ν, a, b)

, it can without difficulty be determined numerically.
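One possible way to determine the critical value numerically is bisection, which is justified by the strict monotonicity of the bound in t. The sketch below is only an illustration under that assumption: the function names `hoeffding_beta` and `solve_t`, the explicit sample-size argument `n`, and the fixed number of bisection steps are choices made here, and the code assumes a < ν < b.

```python
import math

def hoeffding_beta(t, nu, a, b, n):
    """Right-hand side of Equation (6) for 0 <= t <= b - nu (with a < nu < b)."""
    rest = b - nu - t
    log_beta = (nu - a + t) * math.log((nu - a) / (nu - a + t))
    if rest > 0:                        # the second term tends to 0 as t -> b - nu
        log_beta += rest * math.log((b - nu) / rest)
    return math.exp(n / (b - a) * log_beta)

def solve_t(gamma, nu, a, b, n, iterations=200):
    """Bisection for the unique t in (0, b - nu) with beta(t) = gamma."""
    if not ((nu - a) / (b - a)) ** n < gamma < 1:
        raise ValueError("gamma is outside the range where a solution exists")
    lo, hi = 0.0, b - nu
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if hoeffding_beta(mid, nu, a, b, n) > gamma:   # beta is decreasing in t
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage: level 0.05 / 4 for variables in [-1, 1] with mean 0 and n = 100.
t_example = solve_t(0.05 / 4, 0.0, -1.0, 1.0, 100)
```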

Note that the estimate

β (t) \leq \exp (- 2 n t^{2} / {(b - a)}^{2})

(7)

is often used and called Hoeffding’s inequality [11]. While this would allow us to solve explicitly for t, we prefer to work with β as it is sharper, especially for ν close to b as well as for large t. Nonetheless, it shows that the tail bound

β (t)

tends to zero as fast as if using the central limit theorem which is why it is widely applied for bounded variables, see e.g., [12].
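As a worked illustration of the explicit solution that Equation (7) permits, one can set the simpler bound equal to a level γ. The symbol t_{(7)} is introduced here only for this illustration; since β(t) is bounded by the right-hand side of Equation (7) and is decreasing in t, the value obtained is an upper bound for t(γ, ν, a, b).

```latex
\exp\!\left(-\frac{2 n t^{2}}{(b-a)^{2}}\right) = \gamma
\quad\Longleftrightarrow\quad
t_{(7)} = (b-a)\sqrt{\frac{-\ln \gamma}{2n}},
\qquad
t(\gamma, \nu, a, b) \le t_{(7)} .
% Example: a = -1, b = 1, \gamma = \alpha/4 gives
% t_{(7)} = \sqrt{-2 \ln(\alpha/4) / n},
% which is approximately 0.296 for \alpha = 0.05 and n = 100.
```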

Now, for any

ζ \in S^{1}

, we will test the hypothesis that ζ is a circular population mean. This hypothesis is equivalent to saying that there is some

λ \in [0, 1]

such that

E Z = λ ζ

. Multiplication by

ζ^{- 1}

then rotates

E Z

onto the non-negative real axis:

E ζ^{- 1} Z = λ \geq 0

.

Now, fix ζ and consider

X_{k} = Re (ζ^{- 1} Z_{k})

,

Y_{k} = Im (ζ^{- 1} Z_{k})

for

k = 1, \dots, n

which may be viewed as the projection of

Z_{1}, \dots, Z_{n}

onto the line in the direction of ζ and onto the line perpendicular to it. Both are sequences of independent random variables taking values in

[- 1, 1]

with

E X_{k} = λ

and

E Y_{k} = 0

under the hypothesis. They thus fulfill the conditions for Hoeffding’s inequality with

a = - 1

,

b = 1

and

ν = λ

or 0, respectively.

We will first consider the case of non-uniqueness of the circular mean, i.e.,

μ = S^{1}

, or equivalently

λ = 0

. Then, the critical value

s_{0} = t (\frac{α}{4}, 0, - 1, 1)

is well-defined for any

\frac{α}{4} > 2^{- n},

and we get

P ({\bar{X}}_{n} \geq s_{0}) \leq \frac{α}{4}

, and also, by considering

- X_{1}, \dots, - X_{n}

, that

P (- {\bar{X}}_{n} \geq s_{0}) \leq \frac{α}{4}

. Analogously,

P (| {\bar{Y}}_{n} | \geq s_{0}) \leq 2 \frac{α}{4} = \frac{α}{2}

. We conclude that

P (| {\bar{Z}}_{n} | \geq \sqrt{2} s_{0}) = P (| {\bar{X}}_{n} |^{2} + {| {\bar{Y}}_{n} |}^{2} \geq 2 s_{0}^{2}) \leq P (| {\bar{X}}_{n} |^{2} \geq s_{0}^{2}) + P (| {\bar{Y}}_{n} |^{2} \geq s_{0}^{2}) \leq α .

Rejecting the hypothesis

μ = S^{1}

, i.e.,

E Z = 0

, if

| {\bar{Z}}_{n} | \geq \sqrt{2} s_{0}

thus leads to a test whose probability of false rejection is at most α (see Figure 1). Of course, one may work with

| {\bar{X}}_{n} |^{2} \geq s_{0}^{2}

and

| {\bar{Y}}_{n} |^{2} \geq s_{0}^{2}

as criteria for rejection; however, we prefer working with

| {\bar{Z}}_{n} | \geq \sqrt{2} s_{0}

since it is independent of the chosen

ζ .

In the case of uniqueness of the circular mean, i.e., for the hypothesis

λ > 0

, we use the monotonicity of

ν + t (γ, ν, a, b)

in ν and obtain

P ({\bar{X}}_{n} \leq - s_{0}) = P (- {\bar{X}}_{n} \geq t (\frac{α}{4}, 0, - 1, 1)) \leq P (- {\bar{X}}_{n} \geq - λ + t (\frac{α}{4}, - λ, - 1, 1)) \leq \frac{α}{4}

as well. For the direction perpendicular to the direction of ζ (see Figure 2), however, we may now work with

\frac{3}{8} α

, so for

s_{p} = t (\frac{3}{8} α, 0, - 1, 1)

—which is well-defined whenever

s_{0}

is since

\frac{3}{8} α > \frac{α}{4} > 2^{- n}

—we obtain

P ({\bar{Y}}_{n} \geq s_{p}) + P ({\bar{Y}}_{n} \leq - s_{p}) \leq 2 \cdot \frac{3}{8} α .

Rejecting if

{\bar{X}}_{n} \leq - s_{0}

or

| {\bar{Y}}_{n} | \geq s_{p}

, then, will happen with probability at most

\frac{α}{4} + 2 \cdot \frac{3}{8} α = α

under the hypothesis

μ = ζ

. In case that we already rejected the hypothesis

μ = S^{1}

, i.e., if

| {\bar{Z}}_{n} | \geq \sqrt{2} s_{0}

, ζ will not be rejected if and only if

{\bar{X}}_{n} > s_{0} > 0

and

| {\bar{Y}}_{n} | < s_{p} < s_{0}

which is then equivalent to

| Arg (ζ^{- 1} {\bar{Z}}_{n}) | = \arcsin (| {\bar{Y}}_{n} | / | {\bar{Z}}_{n} |) < \arcsin (s_{p} / | {\bar{Z}}_{n} |) = δ_{H}

(see Figure 3).

Define

C_{H}

as all ζ which we could not reject, i.e.,

C_{H} = \begin{cases} S^{1}, & \text{if } α \leq 2^{- n + 2} \text{ or } | {\bar{Z}}_{n} | \leq \sqrt{2} s_{0}, \\ \{ζ \in S^{1} : | Arg (ζ^{- 1} {\hat{μ}}_{n}) | < δ_{H}\}, & \text{otherwise.} \end{cases}

(8)
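The construction summarised in Equation (8) can be sketched in code as follows. This is an illustration under stated assumptions rather than the authors' implementation: the helper names `critical_value` and `hoeffding_confidence_set` are hypothetical, the critical values are found by bisection, and Equation (6) is specialised to a = -1, b = 1, ν = 0, which is all that is needed here.

```python
import cmath
import math

def critical_value(gamma, n, iterations=200):
    """t(gamma, 0, -1, 1): solve beta(t) = gamma for Equation (6) with a=-1, b=1, nu=0."""
    def log_beta(t):
        value = -(1.0 + t) * math.log(1.0 + t)
        if t < 1.0:                         # (1 - t) * ln(1 - t) tends to 0 as t -> 1
            value -= (1.0 - t) * math.log(1.0 - t)
        return 0.5 * n * value
    lo, hi, target = 0.0, 1.0, math.log(gamma)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_beta(mid) > target:          # the bound is decreasing in t
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hoeffding_confidence_set(z, alpha):
    """Return (mu_hat, delta_H) as in Equation (8), or None if C_H is the whole circle."""
    n = len(z)
    if alpha <= 2.0 ** (2 - n):             # s_0 is not defined in this case
        return None
    s0 = critical_value(alpha / 4, n)
    sp = critical_value(3 * alpha / 8, n)
    z_bar = sum(z) / n
    r = abs(z_bar)
    if r <= math.sqrt(2) * s0:              # hypothesis E Z = 0 cannot be rejected
        return None
    return z_bar / r, math.asin(sp / r)     # centre and opening angle (radians)

# Usage: 200 observations at +10 degrees and 200 at -10 degrees.
sample = [cmath.exp(1j * math.radians(10))] * 200 + [cmath.exp(-1j * math.radians(10))] * 200
center, delta_h = hoeffding_confidence_set(sample, 0.05)
```

Returning `None` for the trivial set keeps the two cases of Equation (8) clearly separated for the caller.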

Then, we obtain the following result:

Proposition 1.

Let

Z_{1}, \dots, Z_{n}

be random variables taking values on the unit circle

S^{1},

α \in (0, 1)

, and let

C_{H}

be defined as in Equation (8).

(i): $C_{H}$ is a $(1 - α)$ -confidence set for the circular population mean set. In particular, if $E Z = 0$ , i.e., the circular population mean set equals $S^{1}$ , then $| {\bar{Z}}_{n} | > \sqrt{2} s_{0}$ with probability at most $α,$ so indeed $C_{H} = S^{1}$ with probability at least $1 - α .$
(ii): $s_{0}$ and $s_{p}$ are of order $n^{- \frac{1}{2}}$ .
(iii): If $E Z \neq 0,$ then $δ_{H}$ is of order $n^{- \frac{1}{2}}$ in probability, so in particular $δ_{H} \to 0$ in probability, and the probability of obtaining the trivial confidence set, i.e., $P (C_{H} = S^{1}) = P (| {\bar{Z}}_{n} | \leq \sqrt{2} s_{0})$ , goes to 0 exponentially fast.

Proof.

(i) holds by construction.

For (ii), recall Equation (7), from which we obtain the estimates

\frac{α}{4} \leq \exp (- n s_{0}^{2} / 2)

resp.

\frac{3}{8} α \leq \exp (- n s_{p}^{2} / 2)

, implying that

s_{0}

and

s_{p}

are of order

n^{- \frac{1}{2}};

the same holds stochastically for

δ_{H}

since

{\bar{Z}}_{n} \to E Z

a.s. Regarding the second statement of (iii), if μ is unique, consider

ζ = - μ

; then,

τ = E {\bar{X}}_{n} < 0

and

- \sqrt{2} s_{0}

is eventually greater than

\frac{τ}{2}

and also

α > 2^{- n + 2}

eventually. Hence, the probability of obtaining the trivial confidence set

C_{H} = S^{1}

is eventually bounded by

P (ζ \in C_{H}) \leq P ({\bar{X}}_{n} > - s_{0}) \leq P ({\bar{X}}_{n} > \frac{τ}{2}) = P ({\bar{X}}_{n} - E {\bar{X}}_{n} > - \frac{τ}{2}) \leq \exp (- n τ^{2} / 8)

, and thus will go to zero exponentially fast as n tends to infinity. ☐

3. Estimating the Variance

From the central limit theorem for

{\hat{μ}}_{n}

in case of unique μ, cf. Equation (4), we see that the asymptotic variance of

{\hat{μ}}_{n}

gets small if

| E Z |

is close to 1 (then

E Z

is close to the boundary

S^{1}

of the unit disc, which is possible only if the distribution is very concentrated) or if the variance

E {(Im (μ^{- 1} Z))}^{2}

in the direction perpendicular to μ is small (if the distribution were concentrated on

\pm μ

, this variance would be zero and

{\hat{μ}}_{n}

would equal μ with large probability). While

δ_{H}

(

| {\bar{Z}}_{n} |

being the denominator of its sine) takes the former into account, the latter has not been exploited yet. To do so, we need to estimate

E {(Im (μ^{- 1} Z))}^{2}

.

Consider

V_{n} = \frac{1}{n} \sum_{k = 1}^{n} Y_{k}^{2}

which, under the hypothesis that the corresponding ζ is the unique circular population mean, has expectation

σ^{2} = Var (Y_{k}) = E {(Im (ζ^{- 1} Z))}^{2}

. Now,

1 - V_{n} = \frac{1}{n} \sum_{k = 1}^{n} (1 - Y_{k}^{2})

is the mean of n independent random variables taking values in

[0, 1]

and having expectation

1 - σ^{2}

. By another application of Equation (6), we obtain

P (σ^{2} \geq V_{n} + t) = P (1 - V_{n} \geq 1 - σ^{2} + t) \leq \frac{α}{4}

for

t = t (\frac{α}{4}, 1 - σ^{2}, 0, 1)

, the latter existing if

\frac{α}{4} > {(1 - σ^{2})}^{n}

. Since

1 - σ^{2} + t (\frac{α}{4}, 1 - σ^{2}, 0, 1)

increases with

1 - σ^{2}

, there is a minimal

σ^{2}

for which

1 - V_{n} \geq 1 - σ^{2} + t (\frac{α}{4}, 1 - σ^{2}, 0, 1)

holds and becomes an equality; we denote it by

\hat{σ^{2}} = V_{n} + t (\frac{α}{4}, 1 - \hat{σ^{2}}, 0, 1)

. Inserting into Equation (6), it by construction fulfills

\frac{α}{4} = {[{(\frac{1 - \hat{σ^{2}}}{1 - V_{n}})}^{1 - V_{n}} {(\frac{\hat{σ^{2}}}{V_{n}})}^{V_{n}}]}^{n} .

(9)

It is easy to see that the right-hand side depends continuously on and is strictly decreasing in

\hat{σ^{2}} \in [V_{n}, 1]

(see Appendix A), thereby traversing the interval

[0, 1]

so that one can again solve the equation numerically. We then may, with an error probability of at most

\frac{α}{4}

, use

\hat{σ^{2}}

as an upper bound for

σ^{2}

. Note that

\hat{σ^{2}} > V_{n}

exists if

\frac{α}{4} > {(1 - \hat{σ^{2}})}^{n} .

The latter is fulfilled for any

V_{n} < 1

since Equation (9) is equivalent to

\frac{α}{4} = {(1 - \hat{σ^{2}})}^{n} {[\underbrace{(\frac{1}{1 - V_{n}})}_{> 1} \underbrace{{(\frac{1 - \hat{σ^{2}}}{1 - V_{n}})}^{- V_{n}}}_{> 1} \underbrace{{(\frac{\hat{σ^{2}}}{V_{n}})}^{V_{n}}}_{> 1}]}^{n} .

For

V_{n} = 1

, let

\hat{σ^{2}} = 1

be the trivial bound.
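A numerical solution of Equation (9) can again be obtained by bisection, since its right-hand side is strictly decreasing on the interval from V_n to 1. The sketch below works on the logarithmic scale to avoid underflow for large n; the names `log_xi` and `variance_upper_bound` are illustrative, and the convention that the factor with exponent V_n equals 1 when V_n = 0 is an assumption of this sketch.

```python
import math

def log_xi(s, v, n):
    """Logarithm of the right-hand side of Equation (9); s stands for sigma-hat squared."""
    if s >= 1.0:
        return -math.inf
    value = (1.0 - v) * math.log((1.0 - s) / (1.0 - v))
    if v > 0.0:                             # convention: (s / v)^v = 1 for v = 0
        value += v * math.log(s / v)
    return n * value

def variance_upper_bound(v, n, alpha, iterations=200):
    """Solve Equation (9) for sigma-hat squared in [v, 1]; trivial bound 1 if v = 1."""
    if v >= 1.0:
        return 1.0
    target = math.log(alpha / 4)
    lo, hi = v, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_xi(mid, v, n) > target:      # right-hand side is decreasing in s
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage: V_n = sin^2(10 degrees), n = 400, alpha = 0.05.
v_example = math.sin(math.radians(10)) ** 2
sigma2_hat = variance_upper_bound(v_example, 400, 0.05)
```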

With such an upper bound on its variance, we now can get a better estimate for

P ({\bar{Y}}_{n} > t)

. Indeed, one may use another inequality by Hoeffding [11] (Theorem 3): the mean

{\bar{W}}_{n} = \frac{1}{n} \sum_{k = 1}^{n} W_{k}

of a sequence

W_{1}, \dots, W_{n}

of independent random variables taking values in

(- \infty, 1]

, each having zero expectation as well as variance

ρ^{2}

fulfills

P ({\bar{W}}_{n} \geq w) \leq {[{(1 + \frac{w}{ρ^{2}})}^{- ρ^{2} - w} {(1 - w)}^{w - 1}]}^{\frac{n}{1 + ρ^{2}}}

(10)

\leq \exp (- n w [(1 + \frac{ρ^{2}}{w}) \ln (1 + \frac{w}{ρ^{2}}) - 1])

(11)

for any

w \in (0, 1) .

Again, an elementary calculation (analogous to Lemma A1) shows that the right-hand side of Equation (10) is strictly decreasing in w, continuously ranging between 1 and

{(\frac{ρ^{2}}{1 + ρ^{2}})}^{n}

as w varies in

(0, 1)

, so that there exists a unique

w = w (γ, ρ^{2})

for which the right-hand side equals γ, provided

γ \in ({(\frac{ρ^{2}}{1 + ρ^{2}})}^{n}, 1)

. Moreover, the right-hand side increases with

ρ^{2}

(as expected), so that

w (γ, ρ^{2})

is increasing in

ρ^{2}

, too (cf. Appendix A).
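The quantity w(γ, ρ²) can be computed in the same manner as the critical values of Section 2. The following sketch assumes ρ² > 0 and uses bisection on the logarithm of the bound in Equation (10); the function names `log_bound` and `solve_w` and the explicit argument `n` are hypothetical.

```python
import math

def log_bound(w, rho2, n):
    """Logarithm of the right-hand side of Equation (10) for 0 <= w <= 1, rho2 > 0."""
    value = -(rho2 + w) * math.log1p(w / rho2)
    if w < 1.0:                             # (w - 1) * ln(1 - w) tends to 0 as w -> 1
        value += (w - 1.0) * math.log(1.0 - w)
    return n / (1.0 + rho2) * value

def solve_w(gamma, rho2, n, iterations=200):
    """Bisection for the unique w in (0, 1) at which the bound equals gamma."""
    if not (rho2 / (1.0 + rho2)) ** n < gamma < 1:
        raise ValueError("gamma is outside the range where a solution exists")
    lo, hi, target = 0.0, 1.0, math.log(gamma)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_bound(mid, rho2, n) > target:   # the bound is decreasing in w
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage: a small variance bound gives a much smaller critical value than rho2 = 1.
w_small = solve_w(0.05 / 4, 0.03, 400)
w_trivial = solve_w(0.05 / 4, 1.0, 400)
```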

Therefore, under the hypothesis that the corresponding ζ is the unique circular population mean,

P (| {\bar{Y}}_{n} | \geq w (\frac{α}{4}, σ^{2})) \leq 2 \frac{α}{4} = \frac{α}{2}

. Now, since

P (w (\frac{α}{4}, σ^{2}) \geq w (\frac{α}{4}, \hat{σ^{2}})) = P (σ^{2} \geq \hat{σ^{2}}) \leq \frac{α}{4}

, setting

s_{V} = w (\frac{α}{4}, \hat{σ^{2}})

we get

P (| {\bar{Y}}_{n} | \geq s_{V}) \leq \frac{3}{4} α

. Note that

\frac{ρ^{2}}{1 + ρ^{2}}

increases with

ρ^{2}

, so in case

s_{0}

exists,

\hat{σ^{2}} \leq 1

implies

\frac{α}{4} > 2^{- n} \geq {(\frac{\hat{σ^{2}}}{1 + \hat{σ^{2}}})}^{n}

, i.e., the existence of

s_{V}

.

Following the construction for

C_{H}

from Section 2, we can again obtain a confidence set for μ with coverage probability at least

1 - α

as shown in our previous article [13]. In practice, however, this confidence set is hard to calculate since

\hat{σ^{2}} = \hat{σ^{2}} (ζ)

has to be calculated for every

ζ \in S^{1} .

Though these confidence sets can be approximated by using a grid as in [13], we suggest using a simultaneous upper bound for the variance of

Im ζ^{- 1} Z_{k}

.

We obtain a (conservative) connected, symmetric confidence set

C_{V} \subseteq C_{H}

by testing

ζ \in C_{H}

with

\hat{σ_{\max}^{2}} = \sup_{ζ \in C_{H}} \hat{σ^{2}}

as a common upper bound for the variance perpendicular to any

ζ \in C_{H}

. Note that

\hat{σ_{\max}^{2}}

can be obtained as the solution of Equation (9) with

{\tilde{V}}_{n} = \sup_{ζ \in C_{H}} \frac{1}{n} \sum_{k = 1}^{n} {(Im ζ^{- 1} Z_{k})}^{2} .

Furthermore, we can shorten

C_{V}

by iteratively redefining

{\tilde{V}}_{n} = \sup_{ζ \in C_{V}} \frac{1}{n} \sum_{k = 1}^{n} {(Im ζ^{- 1} Z_{k})}^{2}

and recalculating

C_{V}

(see Algorithm 1). The resulting opening angle will be denoted by

δ_{V} = \arcsin \frac{s_{V}}{| {\bar{Z}}_{n} |} .

Algorithm 1: Algorithm for computation of

C_{V}

.
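Since the listing of Algorithm 1 is not reproduced in this text, the following is only a sketch of the iteration as described in the preceding paragraphs, not the authors' algorithm. All function names are hypothetical; the supremum over the current arc is approximated on a finite grid (an assumption that may slightly underestimate it), the stopping rule is a choice made here, and the critical values s_0 and s_p are obtained from Equation (10) with ρ² = 1, which coincides with Equation (6) for a = -1, b = 1, ν = 0.

```python
import cmath
import math

def bisect_decreasing(f, lo, hi, target, iterations=200):
    """Solve f(x) = target by bisection, for f strictly decreasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_w(gamma, rho2, n):
    """w(gamma, rho2): the bound of Equation (10) equals gamma (log scale)."""
    def log_bound(w):
        value = -(rho2 + w) * math.log1p(w / rho2)
        if w < 1.0:
            value += (w - 1.0) * math.log(1.0 - w)
        return n / (1.0 + rho2) * value
    return bisect_decreasing(log_bound, 0.0, 1.0, math.log(gamma))

def variance_upper_bound(v, n, alpha):
    """Solution of Equation (9) in [v, 1]; trivial bound 1 if v = 1."""
    if v >= 1.0:
        return 1.0
    def log_xi(s):
        if s >= 1.0:
            return -math.inf
        value = (1.0 - v) * math.log((1.0 - s) / (1.0 - v))
        return n * (value + (v * math.log(s / v) if v > 0.0 else 0.0))
    return bisect_decreasing(log_xi, v, 1.0, math.log(alpha / 4))

def sup_perp_variance(z, mu_hat, delta, grid=91):
    """Grid approximation of sup over |Arg(zeta^{-1} mu_hat)| <= delta of mean Im(zeta^{-1} Z_k)^2."""
    best = 0.0
    for j in range(grid):
        zeta = mu_hat * cmath.exp(1j * delta * (2.0 * j / (grid - 1) - 1.0))
        best = max(best, sum((zk / zeta).imag ** 2 for zk in z) / len(z))
    return best

def variance_confidence_set(z, alpha, max_iter=50, tol=1e-10):
    """Return (mu_hat, delta_V) in radians, or None if the set is the whole circle."""
    n = len(z)
    if alpha <= 2.0 ** (2 - n):
        return None
    s0 = solve_w(alpha / 4, 1.0, n)         # equals t(alpha/4, 0, -1, 1)
    sp = solve_w(3 * alpha / 8, 1.0, n)     # equals t(3*alpha/8, 0, -1, 1)
    z_bar = sum(z) / n
    r = abs(z_bar)
    if r <= math.sqrt(2) * s0:
        return None
    mu_hat = z_bar / r
    delta = math.asin(sp / r)               # start from the opening angle of C_H
    for _ in range(max_iter):
        sigma2_max = variance_upper_bound(sup_perp_variance(z, mu_hat, delta), n, alpha)
        new_delta = min(delta, math.asin(solve_w(alpha / 4, sigma2_max, n) / r))
        converged = delta - new_delta < tol
        delta = new_delta                   # the arc never grows, so C_V stays inside C_H
        if converged:
            break
    return mu_hat, delta

# Usage: 200 observations at +10 degrees and 200 at -10 degrees.
sample = [cmath.exp(1j * math.radians(10))] * 200 + [cmath.exp(-1j * math.radians(10))] * 200
center_v, delta_v = variance_confidence_set(sample, 0.05)
```

Taking the minimum with the previous angle makes the sequence of arcs nested, so the iteration terminates; a closed-form expression for the supremum could replace the grid if exactness matters.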

Proposition 2.

Let

Z_{1}, \dots, Z_{n}

be random variables taking values on the unit circle

S^{1},

and let

α \in (0, 1) .

(i): The set $C_{V}$ resulting from Algorithm 1 is a $(1 - α)$ -confidence set for the circular population mean set. In particular, if $E Z = 0$ , i.e., the circular population mean set equals $S^{1}$ , then $| {\bar{Z}}_{n} | > \sqrt{2} s_{0}$ with probability at most $α,$ so indeed $C_{V} = S^{1}$ with probability of at least $1 - α .$
(ii): $s_{V}$ is of order $n^{- \frac{1}{2}}$ .
(iii): If $E Z \neq 0,$ i.e., if the circular population mean is unique, then $δ_{V}$ is of order $n^{- \frac{1}{2}}$ in probability, so in particular $δ_{V} \to 0$ in probability, and the probability of obtaining a trivial confidence set, i.e., $P (C_{V} = S^{1}) = P (| {\bar{Z}}_{n} | \leq \sqrt{2} s_{0})$ , goes to 0 exponentially fast.
(iv): If $E Z \neq 0$ , then

$\underset{n \to \infty}{lim sup} \frac{δ_{V}}{δ_{A}} \leq \frac{\sqrt{- 2 \ln \frac{α}{4}}}{q_{1 - \frac{α}{2}}} a . s .$

with $q_{1 - \frac{α}{2}}$ denoting the $(1 - \frac{α}{2})$ -quantile of the standard normal distribution $N (0, 1) .$
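The constant on the right-hand side of (iv) is easy to evaluate for a given α. The short sketch below does so with Python's standard library; the function name `asymptotic_length_ratio` is introduced here for illustration only.

```python
import math
from statistics import NormalDist

def asymptotic_length_ratio(alpha):
    """Upper bound from (iv) on the limiting ratio delta_V / delta_A."""
    q = NormalDist().inv_cdf(1 - alpha / 2)     # (1 - alpha/2)-quantile of N(0, 1)
    return math.sqrt(-2.0 * math.log(alpha / 4)) / q

ratio_95 = asymptotic_length_ratio(0.05)        # about 1.51, i.e. roughly 3/2
```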

Proof.

Again, (i) follows by construction, while (iii) is shown as in Proposition 1.

For (ii), note that

s_{V} \leq s_{0}

since the bound in Equation (10) for

ρ^{2} = 1

agrees with the bound in Equation (6) for

a = - 1,

b = 1

and

ν = 0,

thus

s_{V}

and

δ_{V}

are at least of the order

n^{- \frac{1}{2}} .

For (iv), we will use the estimate in Equation (11). Recall that

\ln (1 + x) = x - \frac{x^{2}}{2} + o (x^{2})

; therefore, for large n and hence small

s_{V}

a.s.

\begin{aligned} \frac{α}{4} & \leq \exp (- n s_{V} [(1 + \frac{\hat{σ_{\max}^{2}}}{s_{V}}) (\frac{s_{V}}{\hat{σ_{\max}^{2}}} - \frac{s_{V}^{2}}{2 {(\hat{σ_{\max}^{2}})}^{2}} + o (s_{V}^{2})) - 1]) \\ & = \exp (- n [\frac{s_{V}^{2}}{2 \hat{σ_{\max}^{2}}} + o (s_{V}^{2})]), \end{aligned}

thus

s_{V} \leq \sqrt{- 2 \hat{σ_{\max}^{2}} \ln (\frac{α}{4}) / n} + o (n^{- \frac{1}{2}}) .

Additionally,

\arcsin x = x + o (x)

for x close to 0 which gives

δ_{V} = s_{V} / | {\bar{Z}}_{n} | + o (s_{V}) \leq \sqrt{- 2 \hat{σ_{\max}^{2}} \ln \frac{α}{4}} / (\sqrt{n} | {\bar{Z}}_{n} |) + o (n^{- \frac{1}{2}})

a.s.

Furthermore,

\hat{σ_{\max}^{2}} \to σ^{2}

a.s. for

n \to \infty

, and we obtain

\underset{n \to \infty}{lim sup} \frac{δ_{V}}{δ_{A}} \leq \frac{\sqrt{- 2 \ln \frac{α}{4}}}{q_{1 - \frac{α}{2}}} a . s .

since

δ_{A} = \frac{q_{1 - \frac{α}{2}}}{\sqrt{n} | {\bar{Z}}_{n} |} \underbrace{\sqrt{\frac{1}{n} \sum_{k = 1}^{n} {(Im ({\hat{μ}}_{n}^{- 1} Z_{k}))}^{2}}}_{\to \sqrt{σ^{2}}}

(see Equation (5)). ☐

4. Simulation and Application to Real Data

We will compare the asymptotic confidence set

C_{A}

, the confidence set

C_{H}

constructed directly using Hoeffding’s inequality in Section 2, and the confidence set

C_{V}

resulting from Algorithm 1 by reporting their corresponding opening angles

δ_{A}

,

δ_{H}

, and

δ_{V}

in degrees (

^{\circ}

) as well as their coverage frequencies in simulations.

All computations have been performed using our own code based on the software package R (version 2.15.3) [14] .

4.1. Simulation 1: Two Points of Equal Mass at $\pm 10^{\circ}$

First, we consider a rather favourable situation:

n = 400

independent draws from the distribution with

P (Z = \exp (10 π i / 180)) = P (Z = \exp (- 10 π i / 180)) = \frac{1}{2}

. Then, we have

| E Z | = E Z = \cos (10 π / 180) \approx 0.985

, implying that the data are highly concentrated,

μ = 1

is unique, and the variance of Z in the direction of μ is 0; there is only variation perpendicular to μ, i.e., in the direction of the imaginary axis (see Figure 4).

Table 1 shows the results based on 10,000 repetitions for a nominal coverage probability of

1 - α = 95 %

: the average

δ_{H}

is about

3.5

times larger than

δ_{V}

, which is about twice as large as

δ_{A}

. As expected, the asymptotics are rather precise in this situation:

C_{A}

did cover the true mean in about

95 %

of the cases, which implies that the other confidence sets are quite conservative; indeed

C_{H}

and

C_{V}

covered the true mean in all repetitions. One may also note that the angles varied only a little between repetitions.

4.2. Simulation 2: Three Points Placed Asymmetrically

Secondly, we consider a situation which has been designed to show that even a considerably large sample size (

n = 100

) does not guarantee approximate coverage for the asymptotic confidence set

C_{A}

: the distribution of Z is concentrated on three points,

ξ_{j} = \exp (θ_{j} π i / 180)

,

j = 1, 2, 3

with weights

ω_{j} = P (Z = ξ_{j})

chosen such that

E Z = | E Z | = 0.9

(implying a small variance and

μ = 1

),

ω_{1} = 1 %

and

Arg ξ_{1} > 0

, while

Arg ξ_{2}, Arg ξ_{3} < 0

. In numbers,

θ_{1} \approx 25.8

,

θ_{2} \approx - 0.3

, and

θ_{3} \approx - 179.7

(in

^{\circ}

) while

ω_{2} \approx 94 %

, and

ω_{3} \approx 5 %

(see Figure 5).
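As a plausibility check of this design, one can recompute the Euclidean population mean from the rounded angles and weights quoted above. Because the published values are rounded, the result is only expected to be close to 0.9 with an argument near zero; the variable names are illustrative.

```python
import cmath
import math

angles_deg = (25.8, -0.3, -179.7)     # rounded angles theta_1, theta_2, theta_3
weights = (0.01, 0.94, 0.05)          # rounded weights omega_1, omega_2, omega_3

# E Z = sum_j omega_j * exp(i * theta_j)
ez = sum(w * cmath.exp(1j * math.radians(t)) for w, t in zip(weights, angles_deg))
modulus = abs(ez)                                 # close to 0.9
direction_deg = math.degrees(cmath.phase(ez))     # close to 0 degrees
```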

The results based on 10,000 repetitions are shown in Table 2 where a nominal coverage probability of

1 - α = 90 %

was prescribed. Clearly,

C_{A}

with its coverage probability of less than

64 %

performs quite poorly while the others are conservative;

δ_{V} \approx 5^{\circ}

still appears small enough to be useful in practice, though.

4.3. Real Data: Movements of Ants

Fisher [3] (Example 4.4) describes a data set of the directions 100 ants took in response to an illuminated target placed at

180^{\circ}

for which it may be of interest to know whether the ants indeed (on average) move towards that target (see [15] for the original publication). The data set is available as Ants_radians within the R package CircNNTSR [16].

The circular sample mean for this data set is about

- {176.9}^{\circ}

; for a nominal coverage probability of

1 - α = 95 %

, one gets

δ_{H} \approx {27.3}^{\circ}

,

δ_{V} \approx {20.5}^{\circ}

, and

δ_{A} \approx {9.6}^{\circ}

so that all confidence sets contain

\pm 180^{\circ}

(see Figure 6). The data set’s concentration is not very high, however, so the circular population mean could—according to

C_{V}

—also be

- {156.4}^{\circ}

or

{162.6}^{\circ}

.
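The endpoints quoted above follow from adding and subtracting the opening angle and wrapping the result back to the interval from -180 to 180 degrees. A minimal sketch of that arithmetic, using only the values reported in this section (the helper `wrap_deg` is hypothetical):

```python
def wrap_deg(angle):
    """Map an angle in degrees to the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0     # lands in [-180, 180)
    return 180.0 if wrapped == -180.0 else wrapped

mean_deg = -176.9                                  # circular sample mean reported above
half_widths = {"H": 27.3, "V": 20.5, "A": 9.6}     # opening angles reported above

# Endpoints of each confidence arc, wrapped to (-180, 180].
endpoints = {k: (wrap_deg(mean_deg - d), wrap_deg(mean_deg + d))
             for k, d in half_widths.items()}

# The target direction of 180 degrees lies in an arc if its angular
# distance to the sample mean is smaller than the opening angle.
contains_target = {k: abs(wrap_deg(180.0 - mean_deg)) < d
                   for k, d in half_widths.items()}
```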

5. Discussion

We have derived two confidence sets,

C_{H}

and

C_{V}

, for the set of circular population means. Both guarantee coverage for any finite sample size without making any assumptions on the distribution of the data (besides that they are independent and identically distributed) at the cost of potentially being quite conservative; they are non-asymptotic and universal in this sense. Judging from the simulations and the real data set,

C_{V}

—which estimates the variance perpendicular to the mean direction—appears to be preferable over

C_{H}

(as expected) and small enough to be useful in practice.

While the asymptotic confidence set’s opening angle is less than half (asymptotically about

2 / 3

for

α = 5 %

) of the one for

C_{V}

in our simulations and application, it has the drawback that even for a sample size of

n = 100

, it may fail to give a coverage probability close to the nominal one; in addition, one has to assume that the circular population mean is unique. Of course, one could also devise an asymptotically justified test for the latter but this would entail a correction for multiple testing (for example working with

\frac{α}{2}

each time), which would also render the asymptotic confidence set conservative.

Further improvements would require sharper “universal” mass concentration inequalities taking the first or the first two moments into account; however, this is beyond the scope of this article.

Acknowledgments

T. Hotz wishes to thank Stephan Huckemann from the Georgia Augusta University of Göttingen for fruitful discussions concerning the first construction of confidence regions described in Section 2. We acknowledge support for the Article Processing Charge by the German Research Foundation and the Open Access Publication Fund of the Technische Universität Ilmenau. F. Kelma acknowledges support by the Klaus Tschira Stiftung, gemeinnützige Gesellschaft, Projekt 03.126.2016.

Author Contributions

All authors contributed to the theoretical and numerical results as well as to the writing of the article. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs of Monotonicity

Lemma A1.

β (t) = {[{(\frac{ν - a}{ν - a + t})}^{ν - a + t} {(\frac{b - ν}{b - ν - t})}^{b - ν - t}]}^{\frac{n}{b - a}}

is strictly decreasing in

t .

Proof.

We show the equivalent statement that

\tilde{β} (t) = \ln [{(\frac{ν - a}{ν - a + t})}^{ν - a + t} {(\frac{b - ν}{b - ν - t})}^{b - ν - t}]

is strictly decreasing in t:

\begin{aligned} \frac{d}{d t} \tilde{β} (t) & = \frac{d}{d t} ((\ln (ν - a) - \ln (ν - a + t)) (ν - a + t) + (\ln (b - ν) - \ln (b - ν - t)) (b - ν - t)) \\ & = \ln (ν - a) - \ln (ν - a + t) - \frac{1}{ν - a + t} (ν - a + t) - \ln (b - ν) + \ln (b - ν - t) + \frac{1}{b - ν - t} (b - ν - t) \\ & = \ln (\underbrace{\frac{b - ν - t}{b - ν}}_{< 1} \cdot \underbrace{\frac{ν - a}{ν - a + t}}_{< 1}) < 0 . \end{aligned}

Hence,

\tilde{β} (t)

and thus

β (t)

are strictly decreasing in

t .

☐

Lemma A2. Let $t = t(γ, ν, a, b)$ be the solution to the equation $β(t) = γ$. Then, $ν + t$ is strictly increasing in $ν$.

Proof. $t$ is the solution of the equation

$$(ν - a + t) \ln \left( \frac{ν - a}{ν - a + t} \right) + (b - ν - t) \ln \left( \frac{b - ν}{b - ν - t} \right) = \frac{b - a}{n} \ln γ. \tag{A1}$$

The derivatives of the left-hand side of Equation (A1) w.r.t. $ν$ and $t$ exist and are continuous. Furthermore, the derivative w.r.t. $t$ does not vanish for any $t \in (0, b - ν)$, cf. the proof of Lemma A1, whence the derivative $t' = \frac{dt}{dν}$ exists by the implicit function theorem. When differentiating Equation (A1) with respect to $ν$, one obtains

$$\begin{aligned} & (1 + t') \ln \left( \frac{ν - a}{ν - a + t} \right) + (ν - a + t) \left( \frac{1}{ν - a} - \frac{1 + t'}{ν - a + t} \right) \\ & \quad - (1 + t') \ln \left( \frac{b - ν}{b - ν - t} \right) + (b - ν - t) \left( - \frac{1}{b - ν} + \frac{1 + t'}{b - ν - t} \right) = 0, \end{aligned}$$

or equivalently

$$(1 + t') \Big[ \underbrace{\ln \left( \frac{ν - a}{ν - a + t} \right)}_{< 0} - \underbrace{\ln \left( \frac{b - ν}{b - ν - t} \right)}_{> 0} \Big] = \frac{t (a - b)}{(ν - a)(b - ν)} < 0,$$

whence $1 + t' = \frac{d}{dν} (ν + t) > 0$ finishes the proof. ☐
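
The statement of Lemma A2 can be illustrated numerically. The sketch below solves $β(t) = γ$ by bisection, which is justified because $β$ is strictly decreasing by Lemma A1, and then checks that $ν + t$ grows with $ν$. Bisection is a choice made here for illustration; the article does not prescribe a solver. All parameter values and the helper names `log_beta` and `solve_t` are hypothetical.

```python
import math

def log_beta(t, nu, a, b, n):
    """Logarithm of beta(t) from Lemma A1, assuming a < nu < b and 0 <= t < b - nu."""
    inner = ((nu - a + t) * math.log((nu - a) / (nu - a + t))
             + (b - nu - t) * math.log((b - nu) / (b - nu - t)))
    return n / (b - a) * inner

def solve_t(gamma, nu, a, b, n, iters=200):
    """Solve beta(t) = gamma on (0, b - nu) by bisection on the log scale."""
    target = math.log(gamma)
    lo, hi = 0.0, (b - nu) * (1.0 - 1e-12)  # stay strictly inside the domain
    if log_beta(hi, nu, a, b, n) > target:
        raise ValueError("no solution in (0, b - nu) for these parameters")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_beta(mid, nu, a, b, n) > target:
            lo = mid  # beta still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical illustration values (assumptions, not from the article).
a, b, n, gamma = -1.0, 1.0, 50, 0.05
nus = [-0.5 + 0.1 * k for k in range(11)]
upper_ends = [nu + solve_t(gamma, nu, a, b, n) for nu in nus]

# Lemma A2 predicts that nu + t(nu) is strictly increasing in nu.
is_strictly_increasing = all(x < y for x, y in zip(upper_ends, upper_ends[1:]))
print(is_strictly_increasing)
```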

Lemma A3. The function

$$ξ(\hat{σ^{2}}) = \left[ \left( \frac{1 - \hat{σ^{2}}}{1 - V_{n}} \right)^{1 - V_{n}} \left( \frac{\hat{σ^{2}}}{V_{n}} \right)^{V_{n}} \right]^{n}$$

is strictly decreasing in $\hat{σ^{2}} \in [V_{n}, 1]$.

Proof. We show the equivalent statement that $n^{-1} \ln ξ(\hat{σ^{2}})$ is strictly decreasing in $\hat{σ^{2}}$:

$$\begin{aligned} \frac{d}{d \hat{σ^{2}}} \left[ n^{-1} \ln ξ(\hat{σ^{2}}) \right] &= \frac{d}{d \hat{σ^{2}}} \Big[ (1 - V_{n}) \big( \ln(1 - \hat{σ^{2}}) - \ln(1 - V_{n}) \big) + V_{n} \big( \ln(\hat{σ^{2}}) - \ln(V_{n}) \big) \Big] \\ &= - \underbrace{\frac{1 - V_{n}}{1 - \hat{σ^{2}}}}_{> 1} + \underbrace{\frac{V_{n}}{\hat{σ^{2}}}}_{< 1} < 0. \end{aligned}$$

☐
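
A quick numerical illustration of Lemma A3: the sketch below evaluates $ξ$ on a grid in $[V_{n}, 1)$ and checks that the values decrease, starting from $ξ(V_{n}) = 1$. The values chosen for `v_n` and `n` are hypothetical and serve only as an example.

```python
import math

def xi(s2, v_n, n):
    """Evaluate xi from Lemma A3 for v_n <= s2 < 1, assuming 0 < v_n < 1."""
    log_inner = ((1.0 - v_n) * (math.log(1.0 - s2) - math.log(1.0 - v_n))
                 + v_n * (math.log(s2) - math.log(v_n)))
    return math.exp(n * log_inner)

# Hypothetical illustration values (assumptions, not from the article).
v_n, n = 0.3, 40

# Grid of 100 points in [v_n, 1); the right endpoint is excluded to avoid log(0).
grid = [v_n + k * (1.0 - v_n) / 100 for k in range(100)]
values = [xi(s2, v_n, n) for s2 in grid]

# Lemma A3 predicts that consecutive values strictly decrease.
is_strictly_decreasing = all(x > y for x, y in zip(values, values[1:]))
print(is_strictly_decreasing)
```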

Lemma A4. Let $w = w(γ, ρ^{2})$ be the solution of the equation

$$\left[ \left( 1 + \frac{w}{ρ^{2}} \right)^{- ρ^{2} - w} (1 - w)^{w - 1} \right]^{\frac{n}{1 + ρ^{2}}} = γ.$$

Then, $w$ is increasing in $ρ^{2}$.

Proof. $w$ is the solution of the equation

$$\frac{ρ^{2} + w}{1 + ρ^{2}} \ln \left( 1 + \frac{w}{ρ^{2}} \right) + \frac{1 - w}{1 + ρ^{2}} \ln(1 - w) = - \frac{\ln γ}{n}. \tag{A2}$$

The derivatives of the left-hand side of Equation (A2) w.r.t. $ρ^{2}$ and $w$ exist and are continuous. Furthermore, the derivative w.r.t. $w$ does not vanish for any $w \in (0, 1)$: this derivative is

$$\frac{1}{1 + ρ^{2}} \left[ \ln \left( 1 + \frac{w}{ρ^{2}} \right) + \frac{ρ^{2} + w}{ρ^{2} \left( 1 + \frac{w}{ρ^{2}} \right)} - \ln(1 - w) - 1 \right] = \frac{1}{1 + ρ^{2}} \left[ \ln \left( 1 + \frac{w}{ρ^{2}} \right) - \ln(1 - w) \right],$$

vanishing if and only if $1 + \frac{w}{ρ^{2}} = 1 - w$, i.e., if and only if $w \left( 1 + \frac{1}{ρ^{2}} \right) = 0$, which does not happen for $w, ρ^{2} > 0$.

Now, the derivative $w' = \frac{dw}{dρ^{2}}$ exists by the implicit function theorem. When differentiating Equation (A2) with respect to $ρ^{2}$, one obtains

$$\begin{aligned} & \frac{(1 + w')(1 + ρ^{2}) - (ρ^{2} + w)}{(1 + ρ^{2})^{2}} \ln \left( 1 + \frac{w}{ρ^{2}} \right) + \underbrace{\frac{ρ^{2} + w}{1 + ρ^{2}} \cdot \frac{\frac{w'}{ρ^{2}} - \frac{w}{ρ^{4}}}{1 + \frac{w}{ρ^{2}}}}_{= \frac{w' ρ^{2} - w}{ρ^{2} (1 + ρ^{2})}} \\ & \quad - \frac{w' (1 + ρ^{2}) + (1 - w)}{(1 + ρ^{2})^{2}} \ln(1 - w) - \frac{w'}{1 + ρ^{2}} = 0, \end{aligned}$$

or equivalently

$$w' \Big[ \underbrace{\ln \left( 1 + \frac{w}{ρ^{2}} \right) - \ln(1 - w)}_{> 0} \Big] = \frac{w}{ρ^{2}} - \frac{1 - w}{1 + ρ^{2}} \ln \left( \frac{ρ^{2} + w}{ρ^{2} (1 - w)} \right).$$

Hence, $w' \geq 0$ if and only if $\frac{w}{ρ^{2}} \geq \frac{1 - w}{1 + ρ^{2}} \ln \left( \frac{ρ^{2} + w}{ρ^{2} (1 - w)} \right)$, which holds since

$$\ln \left( \frac{ρ^{2} + w}{ρ^{2} (1 - w)} \right) = \ln \left( 1 + \frac{w (1 + ρ^{2})}{ρ^{2} (1 - w)} \right) \leq \frac{w}{ρ^{2}} \cdot \frac{1 + ρ^{2}}{1 - w},$$

finishing the proof. ☐
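
Lemma A4 can likewise be checked numerically. The sketch below solves Equation (A2) for $w$ by bisection, using the fact shown in the proof that the left-hand side is strictly increasing in $w$ on $(0, 1)$, and verifies that the solution does not decrease as $ρ^{2}$ grows. Bisection, the helper names `lhs_a2` and `solve_w`, and all parameter values are assumptions made for illustration only.

```python
import math

def lhs_a2(w, rho2):
    """Left-hand side of Equation (A2) for 0 <= w < 1 and rho2 > 0."""
    return ((rho2 + w) / (1.0 + rho2) * math.log(1.0 + w / rho2)
            + (1.0 - w) / (1.0 + rho2) * math.log(1.0 - w))

def solve_w(gamma, rho2, n, iters=200):
    """Solve Equation (A2) for w in (0, 1) by bisection."""
    target = -math.log(gamma) / n
    lo, hi = 0.0, 1.0 - 1e-12  # stay strictly below w = 1 to avoid log(0)
    if lhs_a2(hi, rho2) < target:
        raise ValueError("no solution in (0, 1) for these parameters")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs_a2(mid, rho2) < target:
            lo = mid  # left-hand side still too small, root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical illustration values (assumptions, not from the article).
gamma, n = 0.05, 100
rho2_grid = [0.05 * k for k in range(1, 21)]  # 0.05, 0.10, ..., 1.00
ws = [solve_w(gamma, rho2, n) for rho2 in rho2_grid]

# Lemma A4 predicts that w is increasing in rho^2.
is_increasing = all(x <= y for x, y in zip(ws, ws[1:]))
print(is_increasing)
```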

References

1. Mardia, K.V. Directional Statistics; Academic Press: London, UK, 1972.
2. Watson, G.S. Statistics on Spheres; Wiley: New York, NY, USA, 1983.
3. Fisher, N.I. Statistical Analysis of Circular Data; Cambridge University Press: Cambridge, UK, 1993.
4. Jammalamadaka, S.R.; SenGupta, A. Topics in Circular Statistics; Series on Multivariate Analysis; World Scientific: Singapore, 2001.
5. Mardia, K.V.; Jupp, P.E. Directional Statistics; Wiley: New York, NY, USA, 2000.
6. Fréchet, M. Les éléments aléatoires de nature quelconque dans un espace distancié. Annales de l’Institut Henri Poincaré 1948, 10, 215–310. (In French)
7. Hotz, T. Extrinsic vs. Intrinsic Means on the Circle. In Proceedings of the 1st Conference on Geometric Science of Information, Paris, France, 28–30 October 2013; Lecture Notes in Computer Science, Volume 8085. Springer-Verlag: Heidelberg, Germany, 2013; pp. 433–440.
8. Afsari, B. Riemannian L^p center of mass: Existence, uniqueness, and convexity. Proc. Am. Math. Soc. 2011, 139, 655–673.
9. Arnaudon, M.; Miclo, L. A stochastic algorithm finding p-means on the circle. Bernoulli 2016, 22, 2237–2300.
10. Leeb, H.; Pötscher, B.M. Model selection and inference: Facts and fiction. Econ. Theory 2005, 21, 21–59.
11. Hoeffding, W. Probability Inequalities for Sums of Bounded Random Variables. J. Am. Stat. Assoc. 1963, 58, 13–30.
12. Boucheron, S.; Lugosi, G.; Massart, P. Concentration Inequalities: A Nonasymptotic Theory of Independence; Oxford University Press: Oxford, UK, 2013.
13. Hotz, T.; Kelma, F.; Wieditz, J. Universal, Non-asymptotic Confidence Sets for Circular Means. In Proceedings of the 2nd International Conference on Geometric Science of Information, Palaiseau, France, 28–30 October 2015; Nielsen, F., Barbaresco, F., Eds.; Lecture Notes in Computer Science, Volume 9389. Springer International Publishing: Cham, Switzerland, 2015; pp. 635–642.
14. R Core Team. R: A Language and Environment for Statistical Computing, version 2.15.3; R Foundation for Statistical Computing: Vienna, Austria, 2013.
15. Jander, R. Die optische Richtungsorientierung der Roten Waldameise (Formica Rufa L.). Zeitschrift für Vergleichende Physiologie 1957, 40, 162–238. (In German)
16. Fernandez-Duran, J.J.; Gregorio-Dominguez, M.M. CircNNTSR: An R Package for the Statistical Analysis of Circular Data Using Nonnegative Trigonometric Sums (NNTS) Models, version 2.1. 2013.

Figure 1. The construction for the test of the hypothesis $μ = S^{1}$, or equivalently $E Z = 0$.

Figure 2. The construction for the test of the hypothesis $E Z = λ ζ$ with $λ > 0$.

Figure 3. The critical ${\bar{Z}}_{n}$ regarding the rejection of $ζ$. $δ_{H}$ bounds the angle between ${\hat{μ}}_{n}$ and any accepted $ζ$.

Figure 4. Two points of equal mass at $\pm 10^{\circ}$ and their Euclidean mean.

Figure 5. Three points placed asymmetrically with different masses and their Euclidean mean.

Figure 6. Ant data placed at increasing radii to visually resolve ties; in addition, the circular mean direction as well as confidence sets $C_{H}$, $C_{V}$, and $C_{A}$ are shown.

**Table 1.** Results for simulation 1 (two points of equal mass at $\pm 10^{\circ}$) based on 10,000 repetitions with $n = 400$ observations each: average observed $δ_{H}$, $δ_{V}$, and $δ_{A}$ (with corresponding standard deviation), as well as frequency (with corresponding standard error) with which $μ = 1$ was covered by $C_{H}$, $C_{V}$, and $C_{A}$, respectively; the nominal coverage probability was $1 - α = 95\%$.

| Confidence Set | Mean δ (±s.d.) | Coverage Frequency (±s.e.) |
| --- | --- | --- |
| $C_{H}$ | $8.2^{\circ}$ ($\pm 0.0005^{\circ}$) | $100.0\%$ ($\pm 0.0\%$) |
| $C_{V}$ | $2.4^{\circ}$ ($\pm 0.0025^{\circ}$) | $100.0\%$ ($\pm 0.0\%$) |
| $C_{A}$ | $1.0^{\circ}$ ($\pm 0.0019^{\circ}$) | $94.8\%$ ($\pm 0.2\%$) |

**Table 2.** Results for simulation 2 (three points placed asymmetrically) based on 10,000 repetitions with $n = 100$ observations each: average observed $δ_{H}$, $δ_{V}$, and $δ_{A}$ (with corresponding standard deviation), as well as frequency (with corresponding standard error) with which $μ = 1$ was covered by $C_{H}$, $C_{V}$, and $C_{A}$, respectively; the nominal coverage probability was $1 - α = 90\%$.

| Confidence Set | Mean δ (±s.d.) | Coverage Frequency (±s.e.) |
| --- | --- | --- |
| $C_{H}$ | $16.5^{\circ}$ ($\pm 0.85^{\circ}$) | $100.0\%$ ($\pm 0.0\%$) |
| $C_{V}$ | $5.0^{\circ}$ ($\pm 0.38^{\circ}$) | $100.0\%$ ($\pm 0.0\%$) |
| $C_{A}$ | $0.4^{\circ}$ ($\pm 0.28^{\circ}$) | $62.8\%$ ($\pm 0.5\%$) |
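
The standard errors reported for the coverage frequencies in Tables 1 and 2 appear consistent with the usual binomial formula $\sqrt{\hat{p} (1 - \hat{p}) / N}$ for $N = 10{,}000$ repetitions. That this formula was used is an assumption made here, not something the tables state. A minimal sketch of the arithmetic, with the hypothetical helper `binomial_se`:

```python
import math

def binomial_se(p_hat, reps):
    """Standard error of an observed frequency p_hat over reps independent repetitions."""
    return math.sqrt(p_hat * (1.0 - p_hat) / reps)

reps = 10_000  # number of repetitions stated in the captions of Tables 1 and 2

# Coverage frequencies of C_A as reported in the two tables.
reported = [("Table 1, C_A", 0.948), ("Table 2, C_A", 0.628)]

for label, coverage in reported:
    se_percent = 100.0 * binomial_se(coverage, reps)
    print(f"{label}: s.e. = {se_percent:.1f}%")
# prints:
# Table 1, C_A: s.e. = 0.2%
# Table 2, C_A: s.e. = 0.5%
```

For the sets with an observed coverage of $100.0\%$ the same formula gives a standard error of zero, matching the $\pm 0.0\%$ entries.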

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
