1. Introduction
In 1950–1951, Laurent Schwartz published the two-volume work Théorie des Distributions [1, 2], where he provided a convenient formalism for the theory of distributions. The purpose of this paper is to present a self-contained account of the main ideas, results, techniques, and proofs that underlie the approach to distribution theory that is central to aspects of quantum mechanics and infinite dimensional analysis. This approach develops the structure of the space of Schwartz test functions by utilizing the operator
$$T = -\frac{d^2}{dx^2} + \frac{x^2}{4} + \frac{1}{2}.$$
This operator arose in quantum mechanics as the Hamiltonian for a harmonic oscillator and, in that context as well as in white noise analysis, the operator N = T − 1 is called the number operator. The physical context provides additional useful mathematical tools such as creation and annihilation operators, which we examine in detail.
In this paper we gather under one roof all the essential notions of this approach to the test function space $\mathcal{S}(\mathbb{R}^d)$. The relevant notions concerning topological vector spaces are presented so that the reader need not wade through the many voluminous works available on this subject. We also describe in brief the origins of the relevant notions in quantum mechanics.
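We present
the essential notions and results concerning topological vector spaces;
a detailed analysis of the creation operator C, the annihilation operator A, the number operator N and the harmonic oscillator Hamiltonian T;
a detailed account of the Schwartz space $\mathcal{S}(\mathbb{R}^d)$, and its topology, as a decreasing intersection of subspaces $\mathcal{S}_p(\mathbb{R}^d)$, for $p \in \{0, 1, 2, \ldots\}$:
$$\mathcal{S}(\mathbb{R}^d) = \bigcap_{p \ge 0} \mathcal{S}_p(\mathbb{R}^d) \subset \cdots \subset \mathcal{S}_2(\mathbb{R}^d) \subset \mathcal{S}_1(\mathbb{R}^d) \subset \mathcal{S}_0(\mathbb{R}^d) = L^2(\mathbb{R}^d);$$
an exact characterization of the functions in the space $\mathcal{S}_p(\mathbb{R})$;
a summary of notions from spectral theory and quantum mechanics.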
Our exposition of the properties of T and of $\mathcal{S}(\mathbb{R}^d)$ follows Simon’s paper [3], but we provide more detail, and our notational conventions follow those now standard in infinite-dimensional distribution theory.
The classic work on spaces of smooth functions and their duals is that of Schwartz [1, 2]. Our purpose is to present a concise and coherent account of the essential ideas and results of the theory. Many of the results that we discuss can be found in other works such as [1–6], which is not meant to be a comprehensive list. We have presented portions of this material previously in [7], but also provide it here for convenience. The approach we take has a direct counterpart in the theory of distributions over infinite dimensional spaces [8, 9].
2. Basic Notions and Framework
In this section we summarize the basic notions, notation, and results that we discuss in more detail in later sections. Here, and later in this paper, we work mainly with the case of functions of one variable and then describe the generalization to the multi-dimensional case.
We use the letter W to denote the set of all non-negative integers:
$$W = \{0, 1, 2, \ldots\}.$$
2.1. The Schwartz Space
The Schwartz space $\mathcal{S}(\mathbb{R})$ is the linear space of all functions $f : \mathbb{R} \to \mathbb{C}$ which have derivatives of all orders and which satisfy the condition
$$p_{a,b}(f) \overset{\text{def}}{=} \sup_{x \in \mathbb{R}} |x^a f^{(b)}(x)| < \infty$$
for all $a, b \in W = \{0, 1, 2, \ldots\}$. The finiteness condition for all $a \ge 1$ and $b \in W$ implies that $x^a f^{(b)}(x)$ actually goes to 0 as $|x| \to \infty$, for all $a, b \in W$, and so functions of this type are said to be rapidly decreasing.
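As a numerical illustration (a minimal sketch, not part of the original argument), the quantity $\sup_x |x^a f(x)|$, i.e., the case b = 0 of the finiteness condition above, can be estimated on a grid for the Gaussian $f(x) = e^{-x^2/2}$, a standard example of a rapidly decreasing function. The function names and grid parameters are illustrative assumptions; the closed form $(a/e)^{a/2}$ comes from elementary calculus, since $x^a e^{-x^2/2}$ is maximized at $x = \sqrt{a}$.

```python
import math

def sup_xa_gaussian(a, grid_max=12.0, steps=120_001):
    """Grid estimate of sup_x |x^a exp(-x^2/2)| over [-grid_max, grid_max]."""
    best = 0.0
    for i in range(steps):
        x = -grid_max + 2.0 * grid_max * i / (steps - 1)
        best = max(best, abs(x) ** a * math.exp(-x * x / 2.0))
    return best

def closed_form(a):
    """Exact supremum: the maximum of x^a exp(-x^2/2) is attained at x = sqrt(a)."""
    return 1.0 if a == 0 else (a / math.e) ** (a / 2.0)

if __name__ == "__main__":
    for a in range(6):
        print(a, sup_xa_gaussian(a), closed_form(a))
```

Every value is finite, however large a is taken, which is the content of the finiteness condition for this particular f.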
2.2. The Schwartz Topology
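The functions $p_{a,b}$ are semi-norms on the vector space $\mathcal{S}(\mathbb{R})$, in the sense that
$$p_{a,b}(f + g) \le p_{a,b}(f) + p_{a,b}(g)$$
and
$$p_{a,b}(zf) = |z|\, p_{a,b}(f)$$
for all $f, g \in \mathcal{S}(\mathbb{R})$, and $z \in \mathbb{C}$. For this semi-norm, an open ball of radius r centered at some $f \in \mathcal{S}(\mathbb{R})$ is given by
$$\{g \in \mathcal{S}(\mathbb{R}) : p_{a,b}(g - f) < r\}.$$
Thus each $p_{a,b}$ specifies a topology $\tau_{a,b}$ on $\mathcal{S}(\mathbb{R})$. A set is open according to $\tau_{a,b}$ if it is a union of open balls.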
One way to generate the standard Schwartz topology τ on $\mathcal{S}(\mathbb{R})$ is to "combine" all the topologies $\tau_{a,b}$. We will demonstrate how to generate a "smallest" topology containing all the sets of $\tau_{a,b}$ for all a, b ∈ W. However, there is a different approach to the topology on $\mathcal{S}(\mathbb{R})$ that is very useful, which we describe in detail below.
2.3. The Operator T
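The operator
$$T = -\frac{d^2}{dx^2} + \frac{x^2}{4} + \frac{1}{2}$$
plays a very useful role in working with the Schwartz space. As we shall see, there is an orthonormal basis $\{\phi_n\}_{n \in W}$ of $L^2(\mathbb{R}, dx)$, consisting of eigenfunctions $\phi_n$ of T:
$$T\phi_n = (n+1)\phi_n.$$
The functions $\phi_n$, called the Hermite functions, are actually in the Schwartz space $\mathcal{S}(\mathbb{R})$. Let B be the bounded linear operator on $L^2(\mathbb{R})$ given on each $f \in L^2(\mathbb{R})$ by
$$Bf = \sum_{n \in W} (n+1)^{-1} \langle f, \phi_n \rangle \phi_n.$$
It is readily checked that the right side does converge and, in fact,
$$\|Bf\|_{L^2}^2 = \sum_{n \in W} (n+1)^{-2} |\langle f, \phi_n \rangle|^2 \le \|f\|_{L^2}^2.$$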
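Note that B and T are inverses of each other on the linear span $\mathcal{L}$ of the vectors $\phi_n$:
$$TBf = f \quad \text{and} \quad BTf = f \quad \text{for all } f \in \mathcal{L},$$
where $\mathcal{L}$ denotes the set of all finite linear combinations of the $\phi_n$.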
2.4. The L2 Approach
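For any $p \ge 0$, the image of $B^p$ consists of all $f \in L^2(\mathbb{R})$ for which
$$\sum_{n \in W} (n+1)^{2p} |\langle f, \phi_n \rangle|^2 < \infty.$$
This is a subspace $\mathcal{S}_p(\mathbb{R})$ of $L^2(\mathbb{R})$, and on $\mathcal{S}_p(\mathbb{R})$ there is an inner-product $\langle \cdot, \cdot \rangle_p$ given by
$$\langle f, g \rangle_p = \sum_{n \in W} (n+1)^{2p} \langle f, \phi_n \rangle \langle \phi_n, g \rangle,$$
which makes it a Hilbert space, having the linear span of the vectors $\phi_n$, and hence also $\mathcal{S}(\mathbb{R})$, as a dense subspace. We will see later that functions in $\mathcal{S}_p(\mathbb{R})$ are p-times differentiable.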
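We will prove that the intersection $\bigcap_{p \in W} \mathcal{S}_p(\mathbb{R})$ is exactly equal to $\mathcal{S}(\mathbb{R})$. In fact,
$$\mathcal{S}(\mathbb{R}) = \bigcap_{p \in W} \mathcal{S}_p(\mathbb{R}) \subset \cdots \subset \mathcal{S}_2(\mathbb{R}) \subset \mathcal{S}_1(\mathbb{R}) \subset \mathcal{S}_0(\mathbb{R}) = L^2(\mathbb{R}).$$
We will also prove that the topology on $\mathcal{S}(\mathbb{R})$ generated by the norms $\|\cdot\|_p$ coincides with the standard topology. Furthermore, the elements $(n+1)^{-p}\phi_n$ form an orthonormal basis of $\mathcal{S}_p(\mathbb{R})$, and
$$\sum_{n \in W} \|(n+1)^{-(p+1)} \phi_n\|_p^2 = \sum_{n \in W} (n+1)^{-2} < \infty,$$
showing that the inclusion map $\mathcal{S}_{p+1}(\mathbb{R}) \to \mathcal{S}_p(\mathbb{R})$ is Hilbert-Schmidt.
The topological vector space $\mathcal{S}(\mathbb{R})$ has topology generated by a complete metric [10], and has a countable dense subset given by all finite linear combinations of the vectors $\phi_n$ with rational coefficients.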
2.5. Coordinatization as a Sequence Space
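All of the results described above follow readily from the identification of $\mathcal{S}(\mathbb{R})$ with a space of sequences. Let $\{\phi_n\}_{n \in W}$ be the orthonormal basis of $L^2(\mathbb{R})$ mentioned above, where $W = \{0, 1, 2, \ldots\}$. Then we have the set $\mathbb{C}^W$; an element $a \in \mathbb{C}^W$ is a map $W \to \mathbb{C} : n \mapsto a_n$. So we shall often write such an element a as $(a_n)_{n \in W}$.
We have then the coordinatizing map
$$I : L^2(\mathbb{R}) \to \mathbb{C}^W : f \mapsto (\langle f, \phi_n \rangle)_{n \in W}.$$
For each $p \in W$ let $E_p$ be the subset of $\mathbb{C}^W$ consisting of all $(a_n)_{n \in W}$ such that
$$\sum_{n \in W} (n+1)^{2p} |a_n|^2 < \infty.$$
On $E_p$ define the inner-product $\langle \cdot, \cdot \rangle_p$ by
$$\langle a, b \rangle_p = \sum_{n \in W} (n+1)^{2p}\, a_n \overline{b_n}.$$
This makes $E_p$ a Hilbert space, essentially the Hilbert space $L^2(W, \mu_p)$, where $\mu_p$ is the measure on W given by $\mu_p(\{n\}) = (n+1)^{2p}$ for all $n \in W$.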
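The weighted norms on the sequence spaces are easy to experiment with. The following is a minimal sketch under stated assumptions: sequences are finitely supported Python lists, the function names are illustrative, and the orthonormal basis of $E_{p+1}$ is taken to be the sequences $(n+1)^{-(p+1)}\delta_n$ (an inference from the weight $(n+1)^{2p}$, not a formula quoted from the text). It computes the squared $E_p$ norm and a partial sum of the series whose convergence makes the inclusion $E_{p+1} \to E_p$ Hilbert-Schmidt.

```python
import math

def norm_p_squared(a, p):
    """Squared E_p norm, sum_n (n+1)^(2p) |a_n|^2, of a finitely supported sequence a."""
    return sum((n + 1) ** (2 * p) * abs(an) ** 2 for n, an in enumerate(a))

def hilbert_schmidt_partial(p, terms):
    """Partial sum of sum_n ||e_n||_p^2, where e_n = (n+1)^-(p+1) * delta_n.

    Each term equals (n+1)^(-2), so the full series sums to pi^2 / 6.
    """
    total = 0.0
    for n in range(terms):
        coeff = (n + 1.0) ** (-(p + 1))          # the only non-zero entry of e_n
        total += (n + 1.0) ** (2 * p) * coeff ** 2
    return total

if __name__ == "__main__":
    a = [1.0, 0.5j, -0.25]
    print([norm_p_squared(a, p) for p in range(3)])   # norms grow with p
    print(hilbert_schmidt_partial(2, 100_000), math.pi ** 2 / 6)
```

The growth of the norms with p reflects the inclusions $E_{p+1} \subset E_p$, and the second printed pair shows the Hilbert-Schmidt sum approaching $\pi^2/6$ independently of p.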
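The definition, Equation (10), for $\mathcal{S}_p(\mathbb{R})$ shows that it is the set of all $f \in L^2(\mathbb{R})$ for which I(f) belongs to $E_p$.
We will prove in Theorem 16 that I maps $\mathcal{S}(\mathbb{R})$ exactly onto $\bigcap_{p \in W} E_p$. This will establish essentially all of the facts mentioned above concerning the spaces $\mathcal{S}_p(\mathbb{R})$.
Note the chain of inclusions:
$$\bigcap_{p \in W} E_p \subset \cdots \subset E_2 \subset E_1 \subset E_0 = L^2(W, \mu_0).$$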
2.6. The Multi-Dimensional Setting
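In the multidimensional setting, the Schwartz space $\mathcal{S}(\mathbb{R}^d)$ consists of all infinitely differentiable functions f on $\mathbb{R}^d$ for which
$$\sup_{x \in \mathbb{R}^d} \left| x_1^{k_1} \cdots x_d^{k_d}\, \frac{\partial^{m_1 + \cdots + m_d} f(x)}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}} \right| < \infty$$
for all $(k_1, \ldots, k_d) \in W^d$ and $m = (m_1, \ldots, m_d) \in W^d$. For this setting, it is best to use some standard notation:
$$|k| = k_1 + \cdots + k_d, \qquad x^k = x_1^{k_1} \cdots x_d^{k_d}, \qquad D^m = \frac{\partial^{|m|}}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}.$$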
For the multi-dimensional case, we use the indexing set
W d whose elements are
d–tuples
j = (
j1,
…,
jd), with
j1,
…,
jd ∈
W, and counting measure
µ0 on
Wd. The sequence space is replaced by
; a typical element
, is a map
The orthonormal basis (
ϕn)
n∈W of
L2(
R) yields an orthonormal basis of
L2(
Rd) consisting of the vectors
The coordinatizing map
I is replaced by the map
where
Replace the operator
T by
In place of
B, we now have the bounded operator
Bd on
L2(
Rd) given by
Again, Td and Bd are inverses of each other on the linear subspace of L2(Rd) spanned by the vectors ϕj.
The space
Ep is now the subset of
consisting of all
for which
This is a Hilbert space with inner-product
Again we have the chain of spaces
with the inclusion
Ep+1 →
Ep being Hilbert-Schmidt.
To go back to functions on
Rd, define
to be the range of
Bd. Thus
is the set of all
f ∈
L2(
Rd) for which
The inner-product 〈·, ·〉
p comes back to an inner-product, also denoted 〈·, ·〉
p, on
and is given by
The intersection
equals
. Moreover, the topology on
is the smallest one generated by the inner-products obtained from 〈·, ·〉p, with p running over W.
3. Topological Vector Spaces
The Schwartz space is a topological vector space, i.e., it is a vector space equipped with a Hausdorff topology with respect to which the vector space operations (addition, and multiplication by scalar) are continuous. In this section we shall go through a few of the basic notions and results for topological vector spaces.
Let V be a real vector space. A vector topology τ on V is a topology such that addition V × V → V : (x, y) ⟼ x + y and scalar multiplication R × V → V : (t, x) ⟼ tx are continuous. If V is a complex vector space we require that C × V → V : (α, x) ⟼ αx be continuous.
It is useful to observe that when V is equipped with a vector topology, the translation maps
$$t_x : V \to V : y \mapsto y + x$$
are continuous, for every $x \in V$, and are hence also homeomorphisms since $t_x^{-1} = t_{-x}$.
A topological vector space is a vector space equipped with a Hausdorff vector topology. A local base of a vector topology τ is a family of open sets $\{U_\alpha\}_{\alpha \in I}$ containing 0 such that if W is any open set containing 0 then W contains some $U_\alpha$. If U is any open set and x any point in U then U − x is an open neighborhood of 0 and hence contains some $U_\alpha$, and so U itself contains a neighborhood $x + U_\alpha$ of x:
$$x + U_\alpha \subset U. \tag{31}$$
Doing this for each point x of U, we see that each open set is the union of translates of the local base sets Uα.
3.1. Local Convexity and the Minkowski Functional
A vector topology τ on V is locally convex if for any neighborhood W of 0 there is a convex open set B with 0 ∈ B ⊂ W. Thus, local convexity means that there is a local base of the topology τ consisting of convex sets. The principal consequence of having a convex local base is the Hahn-Banach theorem which guarantees that continuous linear functionals on subspaces of V extend to continuous linear functionals on all of V. In particular, if V ≠ {0} is locally convex then there exist non-zero continuous linear functionals on V.
Let B be a convex open neighborhood of 0. Continuity of $\mathbb{R} \times V \to V : (s, x) \mapsto sx$ at s = 0 shows that for each x the multiple sx lies in B if s is small enough, and so $t^{-1}x$ lies in B if t is large enough. The smallest value of t for which $t^{-1}x$ is just outside B is clearly a measure of how large x is relative to B. The Minkowski functional $\mu_B$ is the function on V given by
$$\mu_B(x) = \inf\{t > 0 : t^{-1}x \in B\}.$$
Note that $0 \le \mu_B(x) < \infty$. The definition of $\mu_B$ shows that $\mu_B(kx) = k\mu_B(x)$ for any $k \ge 0$. Convexity of B can be used to show that
$$\mu_B(x + y) \le \mu_B(x) + \mu_B(y).$$
If B is symmetric, i.e., B = −B, then µB(kx) = |k|µB(x) for all real k. If V is a complex vector space and B is balanced in the sense that αB = B for all complex numbers α with |α| = 1, then µB(kx) = |k|µB(x) for all complex k. Note that in general it could be possible that µB(x) is 0 without x being 0; this would happen if B contains the entire ray {tx : t ≥ 0}.
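To make the Minkowski functional concrete, here is a minimal sketch that evaluates $\inf\{t > 0 : t^{-1}x \in B\}$ by bisection. It assumes B is a convex open set containing 0 in $\mathbb{R}^2$, given by a membership test; the ellipse and all function names are hypothetical choices for illustration. Convexity guarantees that the set of admissible t is an interval unbounded above, which is what makes bisection valid.

```python
def in_ellipse(v):
    """Membership test for the open ellipse u^2/4 + w^2 < 1 (illustrative choice of B)."""
    u, w = v
    return u * u / 4.0 + w * w < 1.0

def minkowski(in_B, x, iters=60):
    """Approximate inf{t > 0 : x/t in B} for a convex open B containing 0."""
    if all(c == 0 for c in x):
        return 0.0
    scaled = lambda t: tuple(c / t for c in x)
    hi = 1.0
    while not in_B(scaled(hi)):      # grow t until x/t lies inside B
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):           # invariant: x/hi in B, and x/lo not in B once lo > 0
        mid = (lo + hi) / 2.0
        if in_B(scaled(mid)):
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    print(minkowski(in_ellipse, (2.0, 0.0)))   # boundary point on the long axis
    print(minkowski(in_ellipse, (0.0, 3.0)))
```

For this symmetric B the functional is the norm $\sqrt{u^2/4 + w^2}$, so homogeneity and the triangle inequality can be checked directly on sample points.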
3.2. Semi-Norms
A typical vector topology on V is specified by a semi-norm on V, i.e., a function $\mu : V \to \mathbb{R}$ such that
$$\mu(x + y) \le \mu(x) + \mu(y), \tag{32}$$
$$\mu(tx) = |t|\,\mu(x)$$
for all x, y ∈ V and t ∈ R (complex t if V is a complex vector space). Note that then, using t = 0, we have µ(0) = 0 and, using −x for y, we have µ(x) ≥ 0. For such a semi-norm, an open ball around x is the set
$$B_\mu(x; r) = \{y \in V : \mu(y - x) < r\},$$
and the topology $\tau_\mu$ consists of all sets which can be expressed as unions of open balls. These balls are convex and so the topology $\tau_\mu$ is locally convex. If µ is actually a norm, i.e., µ(x) is 0 only if x is 0, then $\tau_\mu$ is Hausdorff.
A consequence of the triangle inequality, Equation (32), is that a semi-norm µ is uniformly continuous with respect to the topology it generates. This follows from the inequality
$$|\mu(x) - \mu(y)| \le \mu(x - y),$$
which implies that µ, as a function on V, is continuous with respect to the topology $\tau_\mu$ it generates. Now suppose µ is continuous with respect to a vector topology τ. Then the open balls {y ∈ V : µ(y − x) < r} are open in the topology τ and so $\tau_\mu \subset \tau$.
3.3. Topologies Generated by Families of Topologies
Let {τα}α∈I be a collection of topologies on a space. It is natural and useful to consider the least upper bound topology τ, i.e., the smallest topology containing all sets of ∪α∈Iτα. In our setting, each τα is a vector topology on a vector space V.
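Theorem 1. The least upper bound topology τ of a collection {τα}α∈I of vector topologies is again a vector topology. If $\mathcal{B}_\alpha$ is a local base for τα, for each α ∈ I, then a local base for τ is obtained by taking all finite intersections of the form $B_{\alpha_1} \cap \cdots \cap B_{\alpha_n}$, with each $B_{\alpha_i} \in \mathcal{B}_{\alpha_i}$.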
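Proof. Let $\mathcal{B}$ be the collection of all sets which are of the form $B_{\alpha_1} \cap \cdots \cap B_{\alpha_n}$, with each $B_{\alpha_i}$ drawn from the given local base for $\tau_{\alpha_i}$.
Let τ′ be the collection of all sets which are unions of translates of sets in $\mathcal{B}$ (including the empty union). Our first objective is to show that τ′ is a topology on V. It is clear that τ′ is closed under unions and contains the empty set. We have to show that the intersection of two sets in τ′ is in τ′. To this end, it will suffice to prove the following:
$$\text{if } C_1, C_2 \in \mathcal{B} \text{ and } x \in (a + C_1) \cap (b + C_2), \text{ then } x + C \subset (a + C_1) \cap (b + C_2) \text{ for some } C \in \mathcal{B}. \tag{35}$$
Clearly, it suffices to consider finitely many topologies τα. Thus, consider vector topologies τ1, …, τn on V.
Let $\mathcal{N}$ be the collection of all sets of the form B1 ∩⋯∩ Bn with Bi in a local base for τi, for each i ∈ {1, …, n}. We can check that if $D, D' \in \mathcal{N}$ then there is an $E \in \mathcal{N}$ with E ⊂ D ∩ D′.
Working with Bi drawn from a given local base for τi, let z be a point in the intersection B1 ∩⋯∩ Bn. Then there exist sets $B_1', \ldots, B_n'$, with each $B_i'$ being in the local base for τi, such that $z + B_i' \subset B_i$ (this follows from our earlier observation Equation (31)). Consequently,
$$z + \bigcap_{i=1}^n B_i' \subset \bigcap_{i=1}^n B_i.$$
Now consider sets C1 and C2, both in $\mathcal{N}$. Consider a, b ∈ V and suppose x ∈ (a + C1) ∩ (b + C2). Then since x − a ∈ C1 there is a set $C_1' \in \mathcal{N}$ with $x - a + C_1' \subset C_1$; similarly, there is a $C_2' \in \mathcal{N}$ with $x - b + C_2' \subset C_2$. So $x + C_1' \subset a + C_1$ and $x + C_2' \subset b + C_2$. So
$$x \in x + C \subset (a + C_1) \cap (b + C_2),$$
where $C \in \mathcal{N}$ satisfies $C \subset C_1' \cap C_2'$. This establishes Equation (35), and shows that the intersection of two sets in τ′ is in τ′.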
Thus τ′ is a topology. The definition of τ′ makes it clear that τ′ contains each τα. Furthermore, if any topology σ contains each τα then all the sets of τ′ are also open relative to σ. Thus τ′ = τ, the topology generated by the topologies τα.
Observe that we have shown that if W ∈ τ contains 0 then W ⊃ B for some $B \in \mathcal{B}$.
Next we have to show that τ is a vector topology. The definition of τ shows that τ is translation invariant, i.e., translations are homeomorphisms. So, for addition, it will suffice to show that addition V × V → V : (x, y) ⟼ x + y is continuous at (0, 0). Let W ∈ τ contain 0. Then there is a $B \in \mathcal{B}$ with 0 ∈ B ⊂ W. Suppose B = B1 ∩⋯∩ Bn, where each Bi is in the given local base for τi. Since τi is a vector topology, there are open sets $D_i, D_i' \in \tau_i$, both containing 0, with $D_i + D_i' \subset B_i$. Then choose $C_i, C_i'$ in the local base for τi with $C_i \subset D_i$ and $C_i' \subset D_i'$. Then $C_i + C_i' \subset B_i$. Now let C = C1∩⋯∩Cn, and $C' = C_1' \cap \cdots \cap C_n'$. Then $C, C' \in \mathcal{B}$ and $C + C' \subset B \subset W$. Thus, addition is continuous at (0, 0).
Now consider the multiplication map R × V → V : (t, x) ⟼ tx. Let (s, y), (t, x) ∈ R × V. Then
$$sy - tx = (s - t)x + t(y - x) + (s - t)(y - x).$$
Suppose F ∈ τ contains tx. Then $F \supset tx + W'$ for some $W' \in \mathcal{B}$. Using continuity of the addition map V × V × V → V : (a, b, c) ⟼ a + b + c at (0, 0, 0), we can choose $W_1, W_2, W_3 \in \mathcal{B}$ with W1 + W2 + W3 ⊂ W′. Then we can choose $W \in \mathcal{B}$ such that $W \subset W_1 \cap W_2 \cap W_3$.
Suppose W = B1 ∩⋯∩ Bn, where each Bi is in the given local base for the vector topology τi. Then for s close enough to t, we have (s − t)x ∈ Bi for each i, and hence (s − t)x ∈ W. Similarly, if y is τ–close enough to x then t(y − x) ∈ W. Lastly, if s − t is close enough to 0 and y is close enough to x then (s − t)(y − x) ∈ W. So sy − tx ∈ W′, and so sy ∈ F, when s is close enough to t and y is τ–close enough to x. □
The above result makes it clear that if each τα has a convex local base then so does τ. Note also that if at least one τα is Hausdorff then so is τ.
A family of topologies {τα}α∈I is directed if for any α, β ∈ I there is a γ ∈ I such that τα ∪ τβ ⊂ τγ. In this case every open neighborhood of 0 in the generated topology contains an open neighborhood in one of the topologies τγ.
3.4. Topologies Generated by Families of Semi-Norms
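We are concerned mainly with the topology τ generated by a family of semi-norms $\{\mu_\alpha\}_{\alpha \in I}$; this is the smallest topology containing all sets of $\bigcup_{\alpha \in I} \tau_{\mu_\alpha}$. An open set in this topology is a union of translates of finite intersections of balls of the form $\{x \in V : \mu_\alpha(x) < r\}$. Thus, any open neighborhood of f contains a set of the form
$$\{g \in V : \mu_{\alpha_1}(g - f) < r_1\} \cap \cdots \cap \{g \in V : \mu_{\alpha_n}(g - f) < r_n\}.$$
This topology is Hausdorff if for any non-zero x ∈ V there is some semi-norm µα for which µα(x) is not zero.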
The description of the neighborhoods in the topology τ shows that a sequence fn converges to f with respect to τ if and only if µα(fn − f) → 0, as n → ∞, for all α ∈ I.
We will need to examine when two families of semi-norms give rise to the same topology:
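Theorem 2. Let τ be the topology on V generated by a family of semi-norms $M = \{\mu_i\}_{i \in I}$, and τ′ the topology generated by a family of semi-norms $M' = \{\mu'_j\}_{j \in J}$. Suppose each $\mu_i$ is bounded above by a linear combination of the $\mu'_j$. Then τ ⊂ τ′.
Proof. Let µ ∈ M. Then there exist $\mu'_1, \ldots, \mu'_n \in M'$, and real numbers $c_1, \ldots, c_n > 0$, such that
$$\mu \le c_1 \mu'_1 + \cdots + c_n \mu'_n.$$
Now consider any x, y ∈ V. Then
$$|\mu(x) - \mu(y)| \le \mu(x - y) \le \sum_{i=1}^n c_i\, \mu'_i(x - y).$$
So µ is continuous with respect to the topology generated by $\mu'_1, \ldots, \mu'_n$. Thus, $\tau_\mu \subset \tau'$. Since this is true for all µ ∈ M, we have τ ⊂ τ′. □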
3.5. Completeness
A sequence (xn)n∈N in a topological vector space V is Cauchy if for any neighborhood U of 0 in V, the difference xn − xm lies in U when n and m are large enough. The topological vector space V is complete if every Cauchy sequence converges.
Theorem 3. Let {τα}α∈I be a directed family of Hausdorff vector topologies on V, and τ the generated topology. If each τα is complete then so is τ.
Proof. Let (xn)n≥1 be a sequence in V, which is Cauchy with respect to τ. Then clearly it is Cauchy with respect to each τα. Let xα = limn→∞ xn, relative to τα. If τα ⊂ τγ then the sequence (xn)n≥1 also converges to xγ relative to the topology τα, and so xγ = xα. Consider α, β ∈ I, and choose γ ∈ I such that τα ∪ τβ ⊂ τγ. This shows that xα = xγ = xβ, i.e., all the limits are equal to each other. Let x denote the common value of this limit. We have to show that xn → x in the topology τ. Let W ∈ τ contain x. Since the family {τα}α∈I which generates τ is directed, it follows that there is a β ∈ I and a Bβ ∈ τβ with x ∈ Bβ ⊂ W. Since (xn)n≥1 converges to x with respect to τβ, it follows xn ∈ Bβ for large n. So xn → x with respect to τ. □
3.6. Metrizability
Suppose the topology
τ on the topological vector space
V is generated by a countable family of semi-norms
µ1,
µ2,
…. For any
x,
y ∈
V define
where
Then
d is a metric, it is translation invariant, and generates the topology
τ [
10].
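For a concrete feel for such a metric, the sketch below uses one standard construction, $d(x, y) = \sum_k 2^{-(k+1)} \min\{1, \mu_k(x - y)\}$. This particular formula is an assumption chosen for illustration and may differ from the one displayed in the text. The semi-norms are taken to be weighted sequence norms of the kind used for the spaces $E_p$, truncated to finitely many for the demonstration; all names are hypothetical.

```python
import math

def norm_p(a, p):
    """Weighted sequence norm sqrt(sum_n (n+1)^(2p) |a_n|^2) of a finite list a."""
    return math.sqrt(sum((n + 1) ** (2 * p) * abs(c) ** 2 for n, c in enumerate(a)))

def make_metric(seminorms):
    """Build d(x, y) = sum_k 2^-(k+1) * min(1, mu_k(x - y)) from a list of semi-norms."""
    def d(x, y):
        diff = [u - v for u, v in zip(x, y)]
        return sum(min(1.0, mu(diff)) / 2 ** (k + 1) for k, mu in enumerate(seminorms))
    return d

# Truncation to six semi-norms is purely for the demonstration.
seminorms = [lambda a, p=p: norm_p(a, p) for p in range(6)]
d = make_metric(seminorms)

if __name__ == "__main__":
    x, y = [1.0, 0.5, 0.25], [0.0, 0.0, 0.1]
    print(d(x, y), d(y, x), d(x, x))
```

Each summand $\min\{1, \mu_k(x - y)\}$ is a translation-invariant pseudo-metric, so the weighted sum inherits symmetry, the triangle inequality, and translation invariance.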
4. The Schwartz Space
Our objective in this section is to show that the Schwartz space is complete, in the sense that every Cauchy sequence converges. Recall that $\mathcal{S}(\mathbb{R})$ is the set of all $C^\infty$ functions f on R for which
$$\|f\|_{a,b} \overset{\text{def}}{=} p_{a,b}(f) = \sup_{x \in \mathbb{R}} |x^a D^b f(x)| < \infty$$
for all a, b ∈ W = {0, 1, 2, …}. The functions $p_{a,b}$ are semi-norms, with $\|\cdot\|_{0,0}$ being just the sup-norm. Thus the family of semi-norms given above specifies a Hausdorff vector topology on $\mathcal{S}(\mathbb{R})$. We will call this the Schwartz topology on $\mathcal{S}(\mathbb{R})$.
Theorem 4. The topology on $\mathcal{S}(\mathbb{R})$ generated by the family of semi-norms ║·║a,b for all a, b ∈ {0, 1, 2, …}, is complete.
Proof. Let $(f_n)_{n \ge 1}$ be a Cauchy sequence in $\mathcal{S}(\mathbb{R})$. Then this sequence is Cauchy in each of the semi-norms $\|\cdot\|_{a,b}$, and so each sequence of functions $x^a D^b f_n(x)$ is uniformly convergent. Let
$$g_b(x) = \lim_{n \to \infty} D^b f_n(x).$$
Let $f = g_0$. Using a Taylor theorem argument it follows that $g_b$ is $D^b f$. For instance, for b = 1, observe first that
$$f_n(x) = f_n(0) + \int_0^x f_n'(y)\,dy,$$
and so, letting n → ∞, we have
$$f(x) = f(0) + \int_0^x g_1(y)\,dy,$$
which implies that f′(x) exists and equals $g_1(x)$.
In this way, we have $x^a D^b f_n(x) \to x^a D^b f(x)$ pointwise. Note that our Cauchy hypothesis implies that the sequence of functions $x^a D^b f_n(x)$ is Cauchy in sup-norm, and so the convergence $x^a D^b f_n(x) \to x^a D^b f(x)$ is uniform. In particular, the sup-norm of $x^a D^b f(x)$ is finite, since it is the limit of a uniformly convergent sequence of bounded functions. Thus $f \in \mathcal{S}(\mathbb{R})$.
Finally, we have to check that $f_n$ converges to f in the topology of $\mathcal{S}(\mathbb{R})$. We have noted above that $x^a D^b f_n(x) \to x^a D^b f(x)$ uniformly. Thus $f_n \to f$ relative to the semi-norm $\|\cdot\|_{a,b}$. Since this holds for every a, b ∈ {0, 1, 2, 3, …}, we have $f_n \to f$ in the topology of $\mathcal{S}(\mathbb{R})$. □
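Now let’s take a quick look at the Schwartz space $\mathcal{S}(\mathbb{R}^d)$. First some notation. A multi-index a is an element of $\{0, 1, 2, \ldots\}^d$, i.e., it is a mapping
$$a : \{1, \ldots, d\} \to \{0, 1, 2, \ldots\} : j \mapsto a_j.$$
If a is a multi-index, we write |a| to mean the sum $a_1 + \cdots + a_d$, $x^a$ to mean the product $x_1^{a_1} \cdots x_d^{a_d}$, and $D^a$ to mean the differential operator $D_{x_1}^{a_1} \cdots D_{x_d}^{a_d}$. The space $\mathcal{S}(\mathbb{R}^d)$ consists of all $C^\infty$ functions f on $\mathbb{R}^d$ such that each function $x^a D^b f(x)$ is bounded. On $\mathcal{S}(\mathbb{R}^d)$ we have the semi-norms
$$\|f\|_{a,b} = \sup_{x \in \mathbb{R}^d} |x^a D^b f(x)|$$
for each pair of multi-indices a and b. The Schwartz topology on $\mathcal{S}(\mathbb{R}^d)$ is the smallest topology making each semi-norm $\|\cdot\|_{a,b}$ continuous. This makes $\mathcal{S}(\mathbb{R}^d)$ a topological vector space.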
The argument for the proof of the preceding theorem goes through with minor alterations and shows that:
Theorem 5. The topology on $\mathcal{S}(\mathbb{R}^d)$ generated by the family of semi-norms ║·║a,b for all $a, b \in \{0, 1, 2, \ldots\}^d$, is complete.
5. Hermite Polynomials, Creation and Annihilation Operators
We shall summarize the definition and basic properties of Hermite polynomials (our approach is essentially that of Hermite’s original [
11]). We repeat for convenience of reference much of the presentation in Section 2.1 of [
7].
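A central role is played by the Gaussian kernel
$$p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$
Properties of translates of p are obtained from
$$\frac{p(x - y)}{p(x)} = e^{xy - y^2/2}.$$
Expanding the right side in a Taylor series we have
$$e^{xy - y^2/2} = \sum_{n=0}^{\infty} \frac{y^n}{n!} H_n(x),$$
where the Taylor coefficients, denoted $H_n(x)$, are
$$H_n(x) = \frac{(-1)^n}{p(x)} \left(\frac{d}{dx}\right)^n p(x).$$
This is the n–th Hermite polynomial and is indeed an n–th degree polynomial in which $x^n$ has coefficient 1, facts which may be checked by induction.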
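The first few Hermite polynomials can be generated mechanically. The sketch below is illustrative only: it uses the three-term recurrence $H_{n+1}(x) = xH_n(x) - nH_{n-1}(x)$, a standard identity for the monic Hermite polynomials associated with the Gaussian weight $e^{-x^2/2}$ (it is equivalent to combining the creation and annihilation relations treated in this section). Polynomials are stored as coefficient lists with the constant term first, and the function names are hypothetical.

```python
def hermite_polys(n_max):
    """Coefficient lists (lowest degree first) of H_0, ..., H_{n_max}."""
    polys = [[1]]
    if n_max >= 1:
        polys.append([0, 1])
    for n in range(1, n_max):
        x_hn = [0] + polys[n]            # coefficients of x * H_n
        prev = polys[n - 1] + [0, 0]     # H_{n-1}, padded to the same length
        polys.append([c - n * d for c, d in zip(x_hn, prev)])
    return polys

def derivative(poly):
    """Coefficient list of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(poly)][1:]

if __name__ == "__main__":
    for n, poly in enumerate(hermite_polys(5)):
        print(n, poly)
```

The output lets one confirm by inspection that each $H_n$ has leading coefficient 1, and `derivative` can be used to check the relation $H_n' = nH_{n-1}$ for small n.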
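Completing the square in the exponent gives
$$\int_{\mathbb{R}} \frac{p(x - y)}{p(x)}\, \frac{p(x - z)}{p(x)}\, p(x)\,dx = e^{yz}.$$
Going over to the Taylor series and comparing the appropriate Taylor coefficients (differentiation with respect to y and z can be carried out under the integral) we have
$$\int_{\mathbb{R}} H_n(x) H_m(x)\, p(x)\,dx = n!\,\delta_{nm}.$$
Thus an orthonormal set of functions is given by
$$h_n(x) = \frac{1}{\sqrt{n!}}\, H_n(x).$$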
Because these are orthogonal polynomials, the n–th one being exactly of degree n, their span contains all polynomials. It can be shown that the span is in fact dense in L2(p(x)dx). Thus the polynomials above constitute an orthonormal basis of L2(p(x)dx).
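Next, consider the derivative of $H_n$:
$$H_n'(x) = xH_n(x) - H_{n+1}(x), \quad \text{i.e.,} \quad \left(-\frac{d}{dx} + x\right) H_n = H_{n+1}.$$
The operator
$$-\frac{d}{dx} + x$$
is called the creation operator in $L^2(\mathbb{R}; p(x)dx)$.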
Officially, we can take the creation operator to have domain consisting of all functions f which can be expanded in $L^2(p(x)dx)$ as $\sum_{n \ge 0} a_n h_n$, with each $a_n$ a complex number, and satisfying the condition $\sum_{n \ge 0} (n+1)|a_n|^2 < \infty$; the action of the operator on f yields the function $\sum_{n \ge 0} \sqrt{n+1}\, a_n h_{n+1}$. This makes the creation operator unitarily equivalent to a multiplication operator (in the sense discussed later in subsection A.5) and hence a closed operator (see A.1 for definition). For the type of smooth functions f we will mostly work with, the effect of the operator on f will in fact be given by application of $-\frac{d}{dx} + x$ to f.
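Next, from the fundamental generating relation Equation (41) we have:
$$\sum_{n=0}^{\infty} \frac{y^n}{n!} H_n'(x) = \frac{d}{dx}\, e^{xy - y^2/2} = y\, e^{xy - y^2/2} = \sum_{n=0}^{\infty} \frac{y^{n+1}}{n!} H_n(x).$$
Letting y = 0 allows us to equate the n = 0 terms, and then, successively, the higher order terms. From this we see that
$$H_n'(x) = nH_{n-1}(x),$$
where $H_{-1} = 0$. Thus:
$$\frac{d}{dx} h_n(x) = \sqrt{n}\, h_{n-1}(x).$$
The operator
$$\frac{d}{dx}$$
is the annihilation operator in $L^2(\mathbb{R}; p(x)dx)$. As with the creation operator, we may define it in a more specific way, as a closed operator on a specified domain.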
6. Hermite Functions, Creation and Annihilation Operators
In the preceding section we studied Hermite polynomials in the setting of the Gaussian space L2(R;p(x)dx). Let us translate the concepts and results back to the usual space L2(R; dx).
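To this end, consider the isomorphism:
$$U : L^2(\mathbb{R}; p(x)dx) \to L^2(\mathbb{R}; dx) : f \mapsto \sqrt{p}\, f.$$
Then the orthonormal basis polynomials $h_n$ go over to the functions $\phi_n$ given by
$$\phi_n(x) = (2\pi)^{-1/4} (n!)^{-1/2}\, e^{-x^2/4} H_n(x).$$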
The family {ϕn}n≥0 forms an orthonormal basis for L2(R, dx).
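Orthonormality of the first few Hermite functions can be checked numerically. This is a minimal sketch under stated assumptions: it takes $\phi_n(x) = (2\pi)^{-1/4}(n!)^{-1/2} e^{-x^2/4} H_n(x)$ with $H_n$ the monic Hermite polynomials for the weight $e^{-x^2/2}$, evaluates $H_n$ by the standard recurrence $H_{k+1} = xH_k - kH_{k-1}$, and approximates the $L^2(\mathbb{R}, dx)$ inner product by the trapezoidal rule on a truncated interval. The function names, interval, and step size are illustrative choices.

```python
import math

def hermite_function(n, x):
    """phi_n(x) = (2 pi)^(-1/4) (n!)^(-1/2) exp(-x^2/4) H_n(x)."""
    h_prev, h = 0.0, 1.0                 # H_{-1} = 0, H_0 = 1
    for k in range(n):
        h_prev, h = h, x * h - k * h_prev
    norm = math.sqrt(math.sqrt(2.0 * math.pi) * math.factorial(n))
    return h * math.exp(-x * x / 4.0) / norm

def inner_product(m, n, half_width=20.0, step=0.01):
    """Trapezoidal approximation of the integral of phi_m * phi_n over the real line."""
    count = int(round(2 * half_width / step))
    total = 0.0
    for i in range(count + 1):
        x = -half_width + i * step
        weight = 0.5 if i in (0, count) else 1.0
        total += weight * hermite_function(m, x) * hermite_function(n, x)
    return total * step

if __name__ == "__main__":
    for m in range(3):
        print([round(inner_product(m, n), 10) for n in range(3)])
```

The printed rows approximate the identity matrix; the real-valued product suffices here because the Hermite functions are real.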
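We now determine the annihilation and creation operators on $L^2(\mathbb{R}, dx)$. If $f \in L^2(\mathbb{R}, dx)$ is differentiable and has derivative f′ also in $L^2(\mathbb{R}, dx)$, we have:
$$\sqrt{p(x)}\, \frac{d}{dx}\!\left(\frac{f(x)}{\sqrt{p(x)}}\right) = f'(x) + \frac{x}{2} f(x).$$
So, on $L^2(\mathbb{R}, dx)$, the annihilation operator is
$$A = \frac{d}{dx} + \frac{x}{2},$$
which will satisfy
$$A\phi_n = \sqrt{n}\, \phi_{n-1},$$
where $\phi_{-1} = 0$. For the moment, we proceed by taking the domain of A to be the Schwartz space $\mathcal{S}(\mathbb{R})$.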
Thus the creation operator is
$$C = A^* = -\frac{d}{dx} + \frac{x}{2}.$$
The reason we have written A* is that, as is readily checked, we have the adjoint relation
$$\langle Af, g \rangle = \langle f, Cg \rangle \quad \text{for all } f, g \in \mathcal{S}(\mathbb{R}),$$
with the inner-product being the usual one on $L^2(\mathbb{R}, dx)$. Again, for the moment, we take the domain of C to be the Schwartz space $\mathcal{S}(\mathbb{R})$ (though, technically, in that case we should not write C as A∗, since the latter, if viewed as the L2–adjoint operator, has a larger domain).
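Observe also that
$$ACf = -f'' + \frac{x^2}{4} f + \frac{1}{2} f \quad \text{and} \quad CAf = -f'' + \frac{x^2}{4} f - \frac{1}{2} f,$$
which imply:
$$[A, C] = AC - CA = I.$$
Next observe that
$$CA\phi_n = \sqrt{n}\, C\phi_{n-1} = n\phi_n,$$
and so CA is called the number operator N:
$$N = CA = -\frac{d^2}{dx^2} + \frac{x^2}{4} - \frac{1}{2}.$$
As noted above in Equation (59), the number operator N has the eigenfunctions $\phi_n$:
$$N\phi_n = n\phi_n.$$
Integration by parts (see Lemma 10) shows that
$$\langle f', g \rangle = -\langle f, g' \rangle$$
for every $f, g \in \mathcal{S}(\mathbb{R})$, and so also
$$\langle Af, g \rangle = \langle f, Cg \rangle.$$
It follows that the operator N satisfies
$$\langle Nf, g \rangle = \langle f, Ng \rangle$$
for every $f, g \in \mathcal{S}(\mathbb{R})$.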
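The eigenvalue relation for the number operator can be sanity-checked with finite differences. The sketch below assumes the explicit forms $A = \frac{d}{dx} + \frac{x}{2}$ and $C = -\frac{d}{dx} + \frac{x}{2}$, so that $N = CA = -\frac{d^2}{dx^2} + \frac{x^2}{4} - \frac{1}{2}$, and the Hermite functions $\phi_n(x) = (2\pi)^{-1/4}(n!)^{-1/2}e^{-x^2/4}H_n(x)$; these forms are reconstructed here from the conventions of this section and should be read as assumptions. The names `phi` and `apply_N` are illustrative.

```python
import math

def phi(n, x):
    """Hermite function phi_n, with H_n computed from H_{k+1} = x H_k - k H_{k-1}."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, x * h - k * h_prev
    return h * math.exp(-x * x / 4.0) / math.sqrt(math.sqrt(2.0 * math.pi) * math.factorial(n))

def apply_N(f, x, h=1e-3):
    """Approximate (N f)(x) = -f''(x) + (x^2/4 - 1/2) f(x) using a central difference."""
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return -second + (x * x / 4.0 - 0.5) * f(x)

if __name__ == "__main__":
    for n in range(4):
        x = 1.0
        print(n, apply_N(lambda t: phi(n, t), x), n * phi(n, x))
```

The two printed columns should agree up to the discretization error of the central difference, which is of order $h^2$.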
Now consider the case of
Rd. For each
j ∈ {1,
…,
d}, there are creation, annihilation, and number operators:
These map
into itself and, as is readily verified, satisfy the commutation relations
Now let us be more specific about the precise definition of the creation and annihilation operators. The basis {
ϕn}
n≥0 for
L2(
R) yields an orthonormal basis
of
L2(
Rd) given by
. For convenience we say
ϕm=0 if some
mj <0. Given its effect on the orthonormal basis
the operator
Ck has the form:
where
for all
i ∈ {1,
…,
d} except when
i =
k, in which case
. The domain of
Ck is the set
given by
The operator
Ck is then officially defined by specifying its action on a typical element of its domain:
where
m′ is as before. The operator
Ck is essentially the composite of a multiplication operator and a bounded linear map taking
ϕm →
ϕm′ where
m′ is as defined above. (See subsection
A.5 for precise formulation of a multiplication operator.) Noting this, it can be readily checked that
Ck is a closed operator using the following argument: Let T be a bounded linear operator (here T denotes an arbitrary bounded operator, not the harmonic oscillator Hamiltonian) and $M_h$ a multiplication operator (any closed operator will do); we show that the composite $M_h T$ is a closed operator. Suppose $x_n \to x$. Since T is a bounded linear operator, $Tx_n \to Tx$. Now suppose also that $M_h(Tx_n) \to y$. Since $M_h$ is closed, it follows then that Tx is in the domain of $M_h$ and $y = M_h T x$.
The operators Ak and Nk are defined analogously.
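In coordinates with respect to the Hermite basis, the one-dimensional operators become simple matrices. The following is a minimal sketch assuming the actions $C\phi_n = \sqrt{n+1}\,\phi_{n+1}$ and $A\phi_n = \sqrt{n}\,\phi_{n-1}$, truncated to the span of $\phi_0, \ldots, \phi_{\text{size}-1}$; the truncation and the function names are illustrative choices, and the truncated matrices only approximate the unbounded operators.

```python
import math

def creation_matrix(size):
    """Matrix of C on span{phi_0, ..., phi_{size-1}}: entry [n+1][n] = sqrt(n+1)."""
    M = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        M[n + 1][n] = math.sqrt(n + 1)
    return M

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    size = 5
    C = creation_matrix(size)
    A = transpose(C)                     # A is the adjoint of C; entries are real
    CA = matmul(C, A)
    AC = matmul(A, C)
    print([round(CA[i][i], 12) for i in range(size)])              # diagonal of N = CA
    print([round(AC[i][i] - CA[i][i], 12) for i in range(size)])   # diagonal of [A, C]
```

The diagonal of CA reproduces the eigenvalues 0, 1, 2, … of the number operator. The commutator equals the identity except in the last diagonal entry, which is an artifact of cutting the basis off at a finite size.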
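Proposition 6. Let $\mathcal{L}_0$ be the vector subspace of $L^2(\mathbb{R}^d)$ spanned by the basis vectors $\{\phi_m\}_{m \in W^d}$. Then for k ∈ {1, 2, …, d}, $C_k|\mathcal{L}_0$ and $A_k|\mathcal{L}_0$ have closures given by $C_k$ and $A_k$, respectively (see subsection A.4 for the notion of closure).
Proof. We need to show that the graph of $C_k$, denoted $Gr(C_k)$, is equal to the closure of the graph of $C_k|\mathcal{L}_0$, i.e., to $\overline{Gr(C_k|\mathcal{L}_0)}$ (see subsection A.1 for the notion of graph). It is clear that $Gr(C_k|\mathcal{L}_0) \subset Gr(C_k)$. Using this and the fact that $C_k$ is a closed operator, we have
$$\overline{Gr(C_k|\mathcal{L}_0)} \subset Gr(C_k).$$
Going in the other direction, take $(f, C_k f) \in Gr(C_k)$. Now
$$f = \sum_{m \in W^d} a_m \phi_m,$$
where $a_m = \langle f, \phi_m \rangle$. Let $f_N$ be given by
$$f_N = \sum_{|m| \le N} a_m \phi_m.$$
Observe that $f_N \in \mathcal{L}_0$. Moreover
$$f_N \to f \quad \text{and} \quad C_k f_N \to C_k f$$
in $L^2(\mathbb{R}^d)$. Thus $(f_N, C_k f_N) \to (f, C_k f)$ and so we have $Gr(C_k) \subset \overline{Gr(C_k|\mathcal{L}_0)}$.
The proof for $A_k$ follows similarly. □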
Linking this new definition for
Ck with our earlier formulas
Equation (63) we have:
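Proposition 7. If $f \in \mathcal{S}(\mathbb{R}^d)$ then $C_k f = \left(-\frac{\partial}{\partial x_k} + \frac{x_k}{2}\right) f$ and $A_k f = \left(\frac{\partial}{\partial x_k} + \frac{x_k}{2}\right) f$.
Proof. Let $g = \left(-\frac{\partial}{\partial x_k} + \frac{x_k}{2}\right) f$. Since $f \in \mathcal{S}(\mathbb{R}^d)$, we have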
g ∈
L2(
Rd). So we can write
g as
where
aj=〈
g,
ϕj〉. Let us examine these
aj’s more closely. Observe
where
for all i ∈ {1, …, d} except when i = k, in which case
.
Bringing this information back to our expression for
g we see that
The second equality is obtained by letting m = j″ and noting that ϕj″ = 0 when
is −1. The proof follows similarly for Ak. □
7. Properties of the Functions in Sp(R)
Our aim here is to obtain a complete characterization of the functions in $\mathcal{S}_p(\mathbb{R})$. We will prove that $\mathcal{S}_p(\mathbb{R})$ consists of all square-integrable functions f for which all derivatives $f^{(k)}$ exist for k ∈ {1, 2, …, p} and
$$\sup_{x \in \mathbb{R}} |x^a f^{(b)}(x)| < \infty$$
for all a, b ∈ {0, 1, …, p − 1} with a + b ≤ p − 1.
A significant tool we will use is the Fourier transform:
This is meaningful whenever
f is in
L1(
R), but we will work mainly with
f in
. We will use the following standard facts:
maps
onto itself and satisfies the Plancherel identity:
if
then
For the purposes of this section it is necessary to be precise about domains. So we take now
A and
C to be closed operators in
L2(
R), with common domain
and
Moreover, define operators
C1 and
A1 on the common domain
and
We will prove below that C and C1 (and A and A1) are, in fact, equal.
For a function
we will use the notation
fN for the partial sum:
Observe the following about the derivatives
:
Lemma 8. If f is in the domain of C, then the sequence of derivatives $\{f_N'\}_{N \ge 0}$ is Cauchy in L2(R).
Since
, we know
tends to 0 as M goes to infinity. Thus
is Cauchy in L2(R). □
Lemma 9. If f is in the domain of C then f is, up to equality almost everywhere, bounded and continuous, and {fN} converges uniformly to f, i.e., ║f−fN║sup→0 as N→∞.
Proof. It is enough to show that $\|f_M - f_N\|_{\sup} \to 0$ as M, N → ∞. Note that
by
Equation (70). Since
f ∈
L2(
R) we have
as
M,
N →
∞ and by Lemma 8 we have that
as
M,
N →
∞ Therefore {
fN} converges uniformly to
f. □
Next we establish an integration-by-parts formula:
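Lemma 10. If f, g ∈ L2(R) are differentiable with derivatives also in L2(R) then
$$\int_{\mathbb{R}} f'(x) g(x)\,dx = -\int_{\mathbb{R}} f(x) g'(x)\,dx.$$
Proof. The derivative of fg, being f′g + fg′, is in L1. So the fundamental theorem of calculus applies to give:
$$\int_a^b \left(f'(x) g(x) + f(x) g'(x)\right) dx = f(b)g(b) - f(a)g(a)$$
for all real numbers a < b.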
Consequently, there exist
aN < −
N < N < bN with
Next we have the first step to showing that C1 equals C:
Lemma 11. If f is in the domain of C1 then f is in the domain of C and
$$Cf = C_1 f \quad \text{and} \quad Af = A_1 f.$$
Proof. Let f be in the domain of C1. Then we may assume that
f is differentiable and both
f and the derivative
f′ are in
L2(R). We have then
Because this sum is finite, it follows that
f is in the domain
of
C. Moreover,
The argument showing Af = A1f is similar. □
We can now prove:
Theorem 12. The operators C and C1 are equal, and the operators A and A1 are equal. Thus, a function f ∈ L2(R) is in the domain of C (which is the same as the domain of A) if and only if f is, up to equality almost everywhere, a differentiable function with derivative f′ also in L2(R) and with $\int_{\mathbb{R}} |x f(x)|^2\,dx < \infty$.
Proof. In view of Lemma 11, it will suffice to prove that
. Let
f ∈
. Then
This implies that the sequences {
C1fN}
N≥0 and {
A1fN}
N≥0 are Cauchy, where
fN is the partial sum
So the sequences of functions
and {
hN}
N≥0, where
are also
L2–Cauchy. Now, as shown in Lemma 9, we can take
f to be the uniformly convergent pointwise limit of the sequence of continuous functions
fN.
By Lemma 8, the sequence of derivatives
is Cauchy in
L2(R). Let
in
L2(R). Observe that
Now
by the Cauchy-Schwarz inequality. Since
as
N →
∞, we have
Because
converges to
f uniformly by Lemma 9, taking the limit as N → ∞ in
Equation (78) we obtain
Therefore
f′ =
g ∈
L2(
R). Lastly, we have, by Fatou’s Lemma:
because the sequence {
gN}
N≥0 is convergent. Thus we have established that
. □
Finally we can characterize the space Sp(R):
Theorem 13. Suppose f ∈ Sp(R), where p ≥ 1. Then f is (up to equality almost everywhere) a 2p times differentiable function and
$$\sup_{x \in \mathbb{R}} |x^a f^{(b)}(x)| < \infty$$
for every a, b ∈ {0, 1, 2, …} with a + b < 2p. Moreover, Sp(R) consists of all 2p times differentiable functions for which the functions $x \mapsto x^a f^{(b)}(x)$ are in L2(R) for every a, b ∈ {0, 1, 2, …} with a + b ≤ 2p.
Proof. Consider f ∈ S1(R). Then
In particular,
. Moreover,
From these expressions and
Equation (79) it is clear that
Cf and
Af both belong to
. Thus,
Similarly, we can check that if
f ∈
Sp(
R), where
p ≥ 2, then
Thus, inductively, we see that
(This really means that f is in the domain of each product operator B1 ···B2p.) Now the operators
and multiplication by x are simple linear combinations of A and C. So for any a, b ∈ {0, 1, 2, …} with a + b ≤ 2p we can write the operator
as a linear combination of operators B1…B2p with B1, …, B2p ∈ {C, A, I}.
Conversely, suppose f is 2p times differentiable and the functions $x \mapsto x^a f^{(b)}(x)$ are in L2(R) for every a, b ∈ {0, 1, 2, …} with a + b ≤ 2p. Then
f is in the domain of
C2p and so
Thus f ∈ Sp(R).
The preceding facts show that if
f ∈
Sp(
R) then for every
B1,
…,
B2p ∈ {
C,
A,
I}, the element
B1 ···
·B2p−1f is in the domain of
C, and so, in particular, is bounded. Thus,
for all
a,
b ∈ {0, 1, 2,
…} with
a +
b ≤ 2
p − 1. □
We do not carry out a similar study for
Sp(R
d), but from the discussions in the following sections, it will be clear that:
8. Inner-Products on S(R) from N
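For f ∈ L2(R), define
$$\|f\|_t^2 = \sum_{n \ge 0} (n+1)^{2t} |\langle f, \phi_n \rangle|^2$$
for every t > 0. More generally, define
$$\langle f, g \rangle_t = \sum_{n \ge 0} (n+1)^{2t} \langle f, \phi_n \rangle \langle \phi_n, g \rangle$$
for all f, g in the subspace of L2(R) consisting of functions F for which $\|F\|_t < \infty$.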
Theorem 14. Let f ∈ S(R). Then for every t > 0 we have $\|f\|_t < \infty$. Moreover, for every integer m ≥ 0, we also have
$$N^m f = \sum_{n \ge 0} n^m \langle f, \phi_n \rangle \phi_n,$$
where on the left $N^m$ is the differential operator N applied m times, and on the right the series is taken in the sense of L2(R, dx). Furthermore,
This result will be strengthened and a converse proved later.
Proof. Let
m ≥ 0 be an integer. Since
f ∈
S(
R), it is readily seen that
N f is also in
S(
R), and thus, inductively, so is
Nmf. Then we have
Thus we have proven the relation
An exactly similar argument shows
So if
t > 0, choosing any integer
m ≥
t we have
Observe that the series
is convergent in
L2(
R,
dx) since
So for any
g ∈
L2(
R,
dx) we have, by an argument similar to the calculations done above:
This proves the statement about Nmf. □
We have similar observations concerning $C^m f$ and $A^m f$. First observe that since C and A are operators involving $\frac{d}{dx}$ and x, they map S(R) into itself. Also,
for all
f,
g ∈
S(
R), as already noted. Using this, for
f ∈
S(
R), we have
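More generally, if $B_1, \ldots, B_k$ are such that each $B_i$ is either $A$ or $C$ then
$$B_1 \cdots B_k f = \sum_{n \ge \max\{0, -r\}} \theta_{n,k} \, \langle f, \phi_n \rangle \, \phi_{n+r},$$
where the integer $r$ is the excess number of $C$'s over the $A$'s in the sequence $B_1, \ldots, B_k$, and $\theta_{n,k}$ is a real number determined by $n$ and $k$. We do have the upper bound
$$\theta_{n,k}^2 \le (n+k)^k.$$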
Let us look at the case of $\mathbb{R}^d$. The functions $\phi_n$ generate an orthonormal basis by tensor products. In more detail, if $a \in W^d$ is a multi-index, define $\phi_a \in L^2(\mathbb{R}^d)$ by
$$\phi_a(x) = \phi_{a_1}(x_1) \cdots \phi_{a_d}(x_d).$$
Now, for each $t > 0$, and $f \in L^2(\mathbb{R}^d)$, define
and then define
for all $f, g$ in the subspace of $L^2(\mathbb{R}^d)$ consisting of functions $F$ for which $\|F\|_t < \infty$.
Let $T_d$ be the operator on $\mathcal{S}(\mathbb{R}^d)$ given by
Then, for every non-negative integer $m$, we have
The other results of this section also extend in a natural way to $\mathbb{R}^d$.
9. L2–Type Norms on S(R)
For integers $a, b \ge 0$, and $f \in \mathcal{S}(\mathbb{R})$, define
$$\|f\|_{a,b,2} = \|x^a D^b f\|_{L^2(\mathbb{R})}.$$
Recall the operators
and the norms
The purpose of this section is to prove the following:
Theorem 15. The system of semi-norms given by $\|f\|_{a,b,2}$ and the system given by the norms $\|f\|_m$ generate the same topology on $\mathcal{S}(\mathbb{R})$.
Proof. Let $a, b$ be non-negative integers. Then
where each $B_i$ is either $A$ or $C$, and $k = a + b$. Writing $c_n = \langle f, \phi_n \rangle$, we have
where
and, as noted earlier in Equation (90),
Thus $\|f\|_{a,b,2}$ is bounded above by a multiple of the norm $\|f\|_{a+b}$.
It follows that the topology generated by the semi-norms $\|\cdot\|_{a,b,2}$ is contained in the topology generated by the norms $\|\cdot\|_k$.
Now we show the converse inclusion. From
$$\|f\|_k = \|(N+1)^k f\|_{L^2}$$
and the expression of $N$ as a differential operator we see that $\|f\|_k$ is bounded above by a linear combination of $\|f\|_{a,b,2}$ for appropriate $a$ and $b$. It follows then that the topology generated by the norms $\|\cdot\|_k$ is contained in the topology generated by the semi-norms $\|\cdot\|_{a,b,2}$. □
Now consider $\mathbb{R}^d$. Let $a, b \in W^d$ be multi-indices, where $W = \{0, 1, 2, \ldots\}$. Then for $f \in \mathcal{S}(\mathbb{R}^d)$ define
$$\|f\|_{a,b,2} = \|x^a D^b f\|_{L^2(\mathbb{R}^d)}.$$
These specify semi-norms and they generate the same topology as the one generated by the norms $\|\cdot\|_m$, with $m \in W$. The argument is a straightforward modification of the one used above.
10. Equivalence of the Three Topologies
We will demonstrate that the topology generated by the family of norms $\|\cdot\|_k$, or, equivalently, by the semi-norms $\|\cdot\|_{a,b,2}$, is the same as the Schwartz topology on $\mathcal{S}(\mathbb{R})$.
Putting in $x^a D^b f(x)$ in place of $f(x)$ we then have
Next we bound the semi-norms $\|f\|_{a,b,2}$ by the semi-norms $\|f\|_{a,b}$. To this end, observe first
So for any integers $a, b \ge 0$, we have
Thus, the topology generated by the semi-norms $\|\cdot\|_{a,b,2}$ coincides with the Schwartz topology.
Now let us look at the situation for $\mathbb{R}^d$. The same result holds in this case and the arguments are similar. The appropriate Sobolev inequalities require using $(1 + |p|^2)^d$ instead of $1 + p^2$. For $f \in \mathcal{S}(\mathbb{R}^d)$, we have the Fourier transform given by
Again, this preserves the $L^2$ norm, and transforms derivatives into multiplications:
Repeated application of this shows that
where
$$\Delta = \sum_{j=1}^{d} \frac{\partial^2}{\partial x_j^2}$$
is the Laplacian. Iterating this gives, for each $r \in \{0, 1, 2, \ldots\}$ and $f \in \mathcal{S}(\mathbb{R}^d)$,
which in turn implies, by the Plancherel formula Equation (67), the identity:
Then we have, for any $m > d/4$,
where
The function $(1 + s)^n/(1 + s^n)$, for $s \ge 0$, attains a maximum value of $2^{n-1}$, and so we have the inequality $(1 + s)^{2m} \le 2^{2m-1}(1 + s^{2m})$, which leads to
This last quantity is clearly bounded above by a linear combination of $\|f\|_{0,b,2}$ for certain multi-indices $b$. Thus $\|f\|_{\sup}$ is bounded above by a linear combination of $\|f\|_{0,b,2}$ for certain multi-indices $b$. It follows that $\|x^a D^b f\|_{\sup}$ is bounded above by a linear combination of $\|f\|_{a',b',2}$ for certain multi-indices $a', b'$.
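The elementary inequality $(1 + s)^n \le 2^{n-1}(1 + s^n)$ used above can be checked numerically. The following is a small sketch that evaluates the ratio $(1 + s)^n/(1 + s^n)$ on a uniform grid; the function names and the grid parameters are illustrative choices, and a grid search is of course only a sanity check, not a proof.

```python
import math

def ratio(s, n):
    """Return (1 + s)^n / (1 + s^n) for s >= 0."""
    return (1.0 + s) ** n / (1.0 + s ** n)

def max_ratio_on_grid(n, s_max=50.0, steps=5000):
    """Maximum of ratio(s, n) over a uniform grid on [0, s_max].

    With the default parameters the grid contains s = 1.
    """
    grid = (s_max * i / steps for i in range(steps + 1))
    return max(ratio(s, n) for s in grid)

# Compare the grid maximum with 2^(n-1) for a few exponents.
for n in (1, 2, 3, 4, 8):
    print(n, max_ratio_on_grid(n), 2 ** (n - 1))
```

The maximum is attained at $s = 1$: the logarithmic derivative of the ratio is $n(1 - s^{n-1})/((1+s)(1+s^n))$, which is positive for $s < 1$ and negative for $s > 1$ when $n \ge 2$.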
For the inequality going the other way, the reasoning used above for Equation (97) generalizes readily, again with $(1 + x^2)$ replaced by $(1 + |x|^2)^d$. Thus, on $\mathcal{S}(\mathbb{R}^d)$ the topology generated by the family of semi-norms $\|\cdot\|_{a,b,2}$ coincides with the Schwartz topology.
Now we return to Equation (102) for some further observations. First note that
and so $\Delta^m$ consists of a sum of multiples of $(3d)^m$ terms each a product of $2m$ elements drawn from the set $\{A_1, C_1, \ldots, A_d, C_d\}$. Consequently, by Equation (95)
for some positive constant $c_{d,m}$. Combining this with Equation (102), we see that for $m > d/4$, there is a constant $k_{d,m}$ such that
holds for all $f \in \mathcal{S}(\mathbb{R}^d)$.
Now consider $f \in \mathcal{S}_p(\mathbb{R}^d)$, with $p > d/4$. Let
Then $f_N \to f$ in $L^2$ and so a subsequence $(f_{N_k})_{k \ge 1}$ converges pointwise almost everywhere to $f$. It follows then that the essential supremum $\|f\|_\infty$ is bounded above as follows:
Note that $f_N \to f$ also in the $\|\cdot\|_p$-norm. It follows then from Equation (104) that
holds for all $f \in \mathcal{S}_p(\mathbb{R}^d)$ with $p > d/4$. Replacing $f$ by the difference $f - f_N$ in Equation (105), we see that $f$ is the $L^\infty$-limit of a sequence of continuous functions which, being Cauchy in the sup-norm, has a continuous limit; thus $f$ is a.e. equal to a continuous function, and may thus be redefined to be continuous.
11. Identification of S(R) with a Sequence Space
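Suppose $a_0, a_1, \ldots$ form a sequence of complex numbers such that
$$\sum_{n \ge 0} (n+1)^{2m} |a_n|^2 < \infty \quad \text{for every integer } m \ge 0.$$
We will show that the sequence of functions given by
$$s_n = \sum_{j=0}^{n} a_j \phi_j$$
converges in the topology of $\mathcal{S}(\mathbb{R})$ to a function $f \in \mathcal{S}(\mathbb{R})$ for which $a_n = \langle f, \phi_n \rangle$ for every $n \ge 0$.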
All the hard work has already been done. From Equation (106) we see that $(s_n)_{n \ge 0}$ is Cauchy in each norm $\|\cdot\|_m$. So it is Cauchy in the Schwartz topology of $\mathcal{S}(\mathbb{R})$, and hence convergent to some $f \in \mathcal{S}(\mathbb{R})$. In particular, $s_n \to f$ in $L^2$. Taking inner-products with $\phi_j$ we see that $a_j = \langle f, \phi_j \rangle$.
Thus we have:
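Theorem 16. Let $W = \{0, 1, 2, \ldots\}$, and define $F : \mathcal{S}(\mathbb{R}) \to \mathbb{C}^W$ by requiring that $F(f)_n = \langle f, \phi_n \rangle$ for all $n \in W$. Then the image of $\mathcal{S}(\mathbb{R})$ under $F$ is the set of all $a \in \mathbb{C}^W$ for which $\sum_{n \ge 0} (n+1)^{2m} |a_n|^2 < \infty$ for every integer $m \ge 0$. Moreover, if $F(\mathcal{S}(\mathbb{R}))$ is equipped with the topology generated by the norms $\|\cdot\|_m$ then $F$ is a homeomorphism.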
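The summability condition on the coefficient sequence can be explored numerically. The following is a minimal sketch, assuming the condition takes the form $\sum_n (n+1)^{2m} |a_n|^2 < \infty$ for every integer $m \ge 0$; the function names and the two sample sequences are hypothetical choices made only for illustration, and partial sums can suggest but not prove convergence or divergence.

```python
import math

def weighted_partial_sum(a, m, n_terms):
    """Partial sum of sum_n (n+1)^(2m) |a(n)|^2 over n = 0, ..., n_terms - 1."""
    return sum((n + 1) ** (2 * m) * abs(a(n)) ** 2 for n in range(n_terms))

def fast(n):
    """Sample sequence decaying faster than any power of n."""
    return math.exp(-n)

def slow(n):
    """Sample sequence with only polynomial decay, a_n = (n+1)^(-2)."""
    return (n + 1) ** -2.0

# For `fast` the partial sums stabilise for every m; for `slow` they
# stabilise for m = 0, 1 but keep growing once m >= 2.
for m in (0, 1, 2, 3):
    print(m,
          weighted_partial_sum(fast, m, 200), weighted_partial_sum(fast, m, 400),
          weighted_partial_sum(slow, m, 200), weighted_partial_sum(slow, m, 400))
```

For `slow` and $m = 2$ every term equals $(n+1)^4 (n+1)^{-4} = 1$, so the partial sum over $N$ terms is $N$ and the series diverges. Under the assumed condition, a sequence with this decay therefore cannot be the coefficient sequence of a Schwartz function, whereas the exponentially decaying sequence passes the test for every $m$.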