1. Introduction
Consider the process
defined on a probability space
as a solution to the stochastic partial differential equation
with initial and boundary conditions
where
and
is an unknown parameter, whereas
Q is the covariance operator for the Wiener process
, so that
with
being a cylindrical Brownian motion in
. It is a standard fact (see, e.g., [
1]) that, provided
Q is nuclear,
where
are independent standard Brownian motions and
is a complete orthonormal system in
, which consists of eigenvectors of
Q. We denote by
the eigenvalue corresponding to
. For simplicity, we consider a special covariance operator
and a complete orthonormal system
with
In this case, the corresponding eigenvalues
are
, that is,
We define a solution
to the problem (
1) as a formal sum (see [
1])
where the Fourier coefficient
satisfies the following Ornstein–Uhlenbeck dynamics:
with initial condition
Here, the
are determined by
It can be shown (see [
1]) that
belongs to
;
together with its derivative in
, it vanishes at 0 and 1, and its norm in
is continuous in
In addition,
is the only solution to (
1) with the above properties. Let
be the finite-dimensional subspace of
generated by
. The likelihood ratio of the projection of the solution
onto the subspace
(see [
2,
3])
can be expressed as follows:
where
denotes the probability measure on
generated by the
.
By maximizing the log likelihood ratio with respect to the parameter
, we obtain the Maximum Likelihood Estimator (MLE)
for
based on
as follows:
Moreover, using (
2) and (
3), we can write
Recently, several papers have provided explicit upper bounds in the Kolmogorov distance on the rate of convergence in the central limit theorem for estimators of coefficients in stochastic Gaussian models; see, e.g., [
4,
5,
6,
7,
8].
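The estimator above is easiest to digest numerically. The following minimal sketch simulates the Fourier modes as Ornstein–Uhlenbeck processes (with exact transitions) and evaluates a discretised version of the MLE. It is illustrative only: it assumes the heat-equation eigenvalues λ_k = π²k², unit noise covariance (q_k = 1), and zero initial condition, none of which are restated here, and the function name `mle_theta` and all numerical parameters are ours.

```python
import numpy as np

def mle_theta(theta_true, N, T=1.0, n_steps=5000, seed=3):
    """Simulate the first N Fourier modes of the solution (each an
    Ornstein-Uhlenbeck process, sampled with exact transitions) and
    evaluate a discretised maximum likelihood estimator of the drift:
        theta_hat_N = -(sum_k lam_k int_0^T u_k du_k)
                      / (sum_k lam_k^2 int_0^T u_k(t)^2 dt).
    The numerator uses Ito's formula int u du = (u(T)^2 - u(0)^2 - T)/2,
    the denominator a left-point Riemann sum."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    lams = (np.pi * np.arange(1, N + 1)) ** 2   # assumed eigenvalues pi^2 k^2
    a = np.exp(-theta_true * lams * dt)          # exact one-step OU decay
    s = np.sqrt((1 - a**2) / (2 * theta_true * lams))  # one-step noise std
    u = np.zeros(N)                              # zero initial condition
    den = 0.0
    for _ in range(n_steps):
        den += np.sum(lams**2 * u**2) * dt       # sum_k lam_k^2 int u_k^2 dt
        u = a * u + s * rng.normal(size=N)
    num = np.sum(lams * (u**2 - T)) / 2.0        # sum_k lam_k int u_k du_k
    return -num / den

theta_hat = mle_theta(theta_true=2.0, N=10)
```

For moderate N the estimate is already close to the true drift, in line with the consistency as N grows discussed below.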
The purpose of this paper is to derive upper bounds in the Wasserstein distance on the rate of convergence of the distribution of the MLE
when
and/or
Upper bounds in the Kolmogorov distance for the central limit theorem of the MLE
, as
and
T fixed, are provided in [
4,
9]. Let us describe what has been proved in this direction. In [
9], Mishra and Prakasa Rao proved that there exists a constant
depending on
and
T such that, for any
and
, depending on
and
T,
where
denotes a standard normal random variable and the normalizing factor
is
Moreover, in ([
9], Remark 4.4), Mishra and Prakasa Rao proved that, if
, then the upper bound given by (
5) is of order
, and in that case the bound can be made of order
by choosing
. However, if
(for example,
i.e.,
for all
), then the upper bound in (
5) is given by
In this case, we notice that the upper bound of the Kolmogorov distance given by (
6) does not show that the normal approximation of the MLE
holds. Hence, a sharper upper bound is needed to establish the normal approximation via the Kolmogorov distance. This problem was solved by Kim and Park in [
4], where they improved the bound in (
5) to one that converges to zero when
and
T fixed, using techniques based on a combination of Malliavin calculus and Stein’s method. More precisely, they proved, in the case when
, that, for sufficiently large
N, there exists a constant
depending on
and
T such that
where the normalizing factor
is given by
The goal of this paper is to provide Berry–Esseen bounds in Wasserstein distance for the MLE
when
and/or
. Let us first recall that the estimator
is strongly consistent and asymptotically normal in three asymptotic regimes: for the two cases
and
T fixed, and
and
N fixed, see, for instance, [
10] and references therein, and for the case when both
, see [
11]. However, the study of the asymptotic distribution of an estimator is, in general, of limited practical use unless the rate of convergence is known. To the best of our knowledge, no Berry–Esseen bounds are known for the MLE
in terms of Wasserstein distance when
and/or
. Recall that, if
are two real-valued integrable random variables, then the Wasserstein distance between the law of
X and the law of
Y is given by
where
is the set of all Lipschitz functions with Lipschitz constant
.
In what follows, in order to simplify the notation, we set and hence for all . The following are the main results of this paper.
Case 1:
and
T fixed. Then, there exists a positive constant
depending only on
and
T such that, for every
,
In particular, as
,
Case 2:
and
N fixed. Then, there exists a positive constant
depending only on
and
N such that, for every
,
In particular, as
,
Case 3:
and
. Then, there exists a positive constant
depending only on
such that, for every
and
,
In particular, as
,
Remark 1. Note that, in Case 1 ( and T fixed), we obtain an upper bound in the Wasserstein distance for the normal approximation of the MLE , while the upper bound in the Kolmogorov distance obtained by Kim and Park [4] is . The rest of the paper is organized as follows:
Section 2 contains some preliminaries presenting the tools needed from the analysis on Wiener space, including Wiener chaos calculus and Malliavin calculus. In
Section 3, we derive upper bounds for the rate of convergence of the distribution of the MLE
when
and/or
, see Theorem 1. We also include in this section a lemma that plays an important role in the proof of Theorem 1.
2. Preliminaries
In this section, we recall some elements from the analysis on Wiener space and the Malliavin calculus for Gaussian processes that we will need in the paper. For more details, we refer the reader to [
12,
13]. Let
and let
be a Wiener process, that is, a centered Gaussian family of random variables on a probability space
such that
. In this case, we denote
and
for every
.
The Wiener chaos of order p is defined as the closure in of the linear span of the random variables , where and is the Hermite polynomial of degree p.
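The Hermite polynomials appearing in this definition can be generated by the standard three-term recurrence H₀ = 1, H₁(x) = x, H_{p+1}(x) = x·H_p(x) − p·H_{p−1}(x). The sketch below (illustrative only) implements the recurrence and checks the orthogonality relation E[H_p(Z)H_q(Z)] = p!·1_{p=q} by Monte Carlo.

```python
import numpy as np

def hermite(p, x):
    """Probabilists' Hermite polynomial H_p(x), via the recurrence
    H_0 = 1, H_1(x) = x, H_{p+1}(x) = x*H_p(x) - p*H_{p-1}(x)."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), x.copy()
    if p == 0:
        return h_prev
    for k in range(1, p):
        h_prev, h = h, x * h - k * h_prev
    return h

# H_2(x) = x^2 - 1 and H_3(x) = x^3 - 3x:
x = np.linspace(-2.0, 2.0, 9)
h2, h3 = hermite(2, x), hermite(3, x)

# Monte Carlo check of orthogonality E[H_p(Z) H_q(Z)] = p! if p = q, else 0:
z = np.random.default_rng(0).normal(size=1_000_000)
m22 = np.mean(hermite(2, z) ** 2)             # close to 2! = 2
m23 = np.mean(hermite(2, z) * hermite(3, z))  # close to 0
```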
Multiple Wiener–Itô integral. The multiple Wiener stochastic integral
with respect to
W of order
p is defined as an isometry between the Hilbert space
(symmetric tensor product) equipped with the norm
and the Wiener chaos of order
p under
’s norm, that is, the multiple Wiener stochastic integral of order
p:
is a linear isometry defined by
.
The Wiener chaos expansion. Let
; then, there exists a unique sequence of functions
in
such that
where the terms
are all mutually orthogonal in
and
Product formula and contractions. Let
p,
be integers and
and
; then,
where
is the contraction of
f and
g of order
r, which is an element of
defined by
Its symmetrization is denoted by
, where the symmetrization
of a function
f is defined by
where the sum runs over all permutations
of
. The special case for
in (
10) is particularly handy, and can be written in its symmetrized form:
where
means the tensor product of
f and
g.
Hypercontractivity property in Wiener chaos. Fix
. For any
, there exists
depending only on
p and
q such that, for every
,
It should be noted that the constants
above are known with some precision when
: by ([
12], Corollary 2.8.14),
.
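As a quick sanity check of hypercontractivity with the explicit constant c_{p,q} = (q − 1)^{p/2}, take X = H₂(Z), an element of the second chaos, and q = 4; then E[X²] = 2 and E[X⁴] = 60, so the inequality reads 60^{1/4} ≤ 3·√2. A Monte Carlo version of this check (illustrative only, with our choice of sample size and seed):

```python
import numpy as np

# X = H_2(Z) = Z^2 - 1 lies in the second Wiener chaos (p = 2).
# Hypercontractivity with c_{p,q} = (q - 1)^{p/2} gives
#     E[|X|^q]^{1/q} <= (q - 1)^{p/2} * E[X^2]^{1/2}.
rng = np.random.default_rng(1)
z = rng.normal(size=1_000_000)
x = z**2 - 1

p, q = 2, 4
lq_norm = np.mean(np.abs(x) ** q) ** (1 / q)       # exact value: 60**0.25
bound = (q - 1) ** (p / 2) * np.mean(x**2) ** 0.5  # exact value: 3*sqrt(2)
```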
Optimal fourth moment theorem. Let
Z denote the standard normal law. Consider a sequence
, such that
and
, and assume
converges to a normal law in distribution, which is equivalent to
(this equivalence, proved originally in [
14], is known as the
fourth moment theorem). Then, we have the optimal estimate for total variation distance
, known as the optimal fourth moment theorem, proved in [
15]. This optimal estimate also holds with Wasserstein distance
, see ([
16], Remark 2.2), as follows: there exist two constants
depending only on the sequence
X but not on
n, such that
Moreover, we recall that, for a standardized random variable
X, i.e., with
and
, the third and fourth cumulants are, respectively,
Fix
and an integer
. Recall that, if
and
with
are independent standard Brownian motions, then, for every
, the multiple integral
is defined by
and
Moreover, if
, then the third and fourth cumulants for
satisfy the following (see (6.2) and (6.6) in [
17], respectively):
and
Throughout the paper, denotes a standard normal random variable, while denotes a normal variable with mean and variance .
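For a centered random variable with unit variance, these cumulant formulas reduce to κ₃(X) = E[X³] and κ₄(X) = E[X⁴] − 3. The sketch below (ours, for illustration) estimates both from a sample; the standardised second-chaos variable (Z² − 1)/√2 has κ₃ = 2√2 and κ₄ = 12, while a Gaussian has κ₃ = κ₄ = 0.

```python
import numpy as np

def cumulants_3_4(x):
    """Third and fourth cumulants of a standardised sample:
    for E[X] = 0 and E[X^2] = 1 they reduce to
    k3(X) = E[X^3]  and  k4(X) = E[X^4] - 3."""
    x = np.asarray(x, dtype=float)
    x = (x - x.mean()) / x.std()     # enforce the standardisation
    return np.mean(x**3), np.mean(x**4) - 3.0

z = np.random.default_rng(0).normal(size=1_000_000)
# Standardised second-chaos variable: k3 = 2*sqrt(2), k4 = 12 (non-Gaussian).
k3, k4 = cumulants_3_4((z**2 - 1) / np.sqrt(2))
# Gaussian sample: both cumulants vanish.
g3, g4 = cumulants_3_4(z)
```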
3. Berry–Esseen Bounds for the MLE
Recall that, in what follows, in order to simplify the notation, we set
and hence
for all
In this case, since Equation (
2) is linear, it can be solved explicitly, yielding the following formula:
Let us introduce the following sequences:
and
Combining (
4) and (
18), we have, for every
,
Using (
14), we can write
where
On the other hand, using the product formula (
11),
Thus, for every
,
This and the linearity of
imply
According to (
20)–(
22), we can write, for every
,
Lemma 1. Let and , where is a Wiener process. Let denote the sigma-field generated by W, that is, . Then, for every , where the function is increasing and hence for all . Furthermore, for every , there exists a positive constant depending only on θ and such that where the processes , are given by (17). Proof. We use arguments similar to those in ([
16], Proposition 6.3). Let
. Using the fact that, for every
,
is independent of
, we have
Moreover, since
for all
, the function
is increasing. Thus, the proof of (
24) is complete.
Let us now prove (
25). Fix
, and let
m be a positive integer such that
. Using Hölder’s inequality, we have, for all
,
Using the fact that if
, then, almost surely,
, we obtain
Applying the Carbery–Wright inequality, there is a universal constant
such that, for any
, we can write
We now use (
24) for
and the fact that, for any fixed
, the function
is increasing on
. Moreover, since
is positive and continuous on
and
as
, we have
. Combining these facts, we get, for every
, that
Therefore, combining (
27)–(
29), we deduce that, for every
,
Consequently, it follows from (
26) and (
30) that, for all
,
which completes the proof of (
25). □
Theorem 1. Suppose that . Let be the MLE given by (3), and let be the normalizing factor given by (19). Then, there exists a positive constant depending only on θ such that, for any integer and any real number , where Z is a standard normal random variable. Consequently, the estimates (7)–(9) follow. Proof. It follows from (
21) that
Combining (
19) and (
32), we get
Notice also that, from (
32), we have
Moreover, since
belongs to
, it follows from (
12) and (
34) that
On the other hand, since
we obtain
Therefore, using (
19), (
22), (
33), (
34) and (
36), there exists a positive constant
depending only on
such that, for every
,
Using (
16), straightforward calculations lead to
Combining (
13), (
38) and (
39), there exists a positive constant
depending only on
such that, for every
,
It follows from (
20) that
On the other hand, from (
25), we have
Using (
35), (
37) and (
40)–(
42), there exists a positive constant
depending only on
such that, for every
,
Therefore, the desired result is obtained. □
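The statement of Theorem 1 can be illustrated by simulation: generate many independent copies of the first N Fourier modes, form the MLE for each, and standardise the errors by the square root of the observed Fisher information; the resulting sample should be approximately standard normal. The sketch below is a numerical illustration only, not part of the proof; it again assumes eigenvalues λ_k = π²k² and unit noise covariance, and all function names and parameters are ours.

```python
import numpy as np

def standardised_errors(theta, N, reps, T=1.0, n_steps=5000, seed=7):
    """Simulate `reps` independent copies of the first N Fourier modes
    (exact OU transitions) and return the MLE errors standardised by
    the square root of the observed Fisher information,
        (theta_hat_N - theta) * sqrt(sum_k lam_k^2 int_0^T u_k^2 dt),
    which should be approximately N(0, 1)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    lams = (np.pi * np.arange(1, N + 1)) ** 2      # assumed eigenvalues
    a = np.exp(-theta * lams * dt)
    s = np.sqrt((1 - a**2) / (2 * theta * lams))
    u = np.zeros((reps, N))
    den = np.zeros(reps)
    for _ in range(n_steps):
        den += (lams**2 * u**2).sum(axis=1) * dt   # observed Fisher information
        u = a * u + s * rng.normal(size=(reps, N))
    num = (lams * (u**2 - T)).sum(axis=1) / 2.0    # Ito: int u du = (u(T)^2 - T)/2
    theta_hat = -num / den
    return (theta_hat - theta) * np.sqrt(den)

errs = standardised_errors(theta=1.0, N=5, reps=200)
# errs should have mean ~ 0 and standard deviation ~ 1.
```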
In this paper, we study the rate of convergence in the central limit theorem for the maximum likelihood estimator of the drift coefficient of a stochastic partial differential equation, based on continuous-time observations of the Fourier coefficients
of the solution, over some finite interval of time
. We provide explicit upper bounds in the Wasserstein distance on the rate of convergence when
and/or
. In the case when
T is fixed and
, the upper bounds obtained here are sharper than the Kolmogorov-distance bounds given by Mishra and Prakasa Rao [
9] and Kim and Park [
4].