1. Introduction
Stocks and bonds play a crucial role in investment decisions. However, stock and bond markets differ significantly. A stock market is a platform that enables companies to issue shares and have them traded on a stock exchange. By buying these shares and thus acquiring an ownership stake, investors can benefit from the future performance of the issuer. In contrast, a bond market is used for issuing and trading debt securities. Typically, an investor and a bond issuer enter into a contract that entitles the former to interest paid on a regular basis and to the full nominal amount at maturity. The bond issuer may be a government, a corporation, etc.
Besides, there are bonds that pay no intermediate interest but are traded at a deep discount, rendering profit at maturity when they are redeemed for their full face value. These bonds are called zero-coupon bonds. For a maturity $T > 0$, let $P(t,T)$ denote the price of a zero-coupon bond at time $t \in [0,T]$. It is known from the theory of interest rates (viz. Definition 10 in Section 2) that the instantaneous forward rate of a zero-coupon bond is given by
$$f(t,T) = -\frac{\partial \ln P(t,T)}{\partial T}. \tag{1}$$
Under appropriate assumptions, Heath, Jarrow and Morton [1] demonstrated that $f(t,T)$ can be modeled as a Markovian diffusion that evolves according to the following stochastic differential equation (SDE):
$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\cdot dW(t), \qquad f(0,T) = f^{M}(0,T), \tag{2}$$
where $W$ is a d-dimensional standard Brownian motion, $T \mapsto f^{M}(0,T)$ is the market instantaneous forward curve at time $t = 0$, $\alpha$ and $\sigma$ are 1- and d-dimensional stochastic processes adapted to the natural filtration of $W$ and "$\cdot$" is formally defined as the scalar product. Under suitable regularity assumptions, integrating Equation (2) over $[0,t]$ for $t \in [0,T]$, the instantaneous forward rate $f(t,T)$ can be represented as an Itô process:
$$f(t,T) = f(0,T) + \int_0^t \alpha(s,T)\,ds + \int_0^t \sigma(s,T)\cdot dW(s).$$
However, the dynamics in Equation (2) are not necessarily arbitrage-free. In the spirit of [2], we understand the presence of arbitrage as the existence of a self-financing strategy which yields a positive income in the future without investing a positive amount today (cf. Section 4). It was further proved in [3] that the existence of an equivalent martingale measure implies the absence of arbitrage opportunities. It was also shown that a unique equivalent martingale measure exists if the drift $\alpha$ depends on the volatility $\sigma$ (see ([2], Chapter 2) for details). In particular, the requirement that the evolution in Equation (2) is taking place under a risk-neutral measure $\mathbb{Q}$ necessitates the following representation:
$$\alpha(t,T) = \sigma(t,T)\cdot\int_t^T \sigma(t,s)\,ds.$$
Therefore, for all $t \in [0,T]$, we conclude that $f(t,T)$ is fully specified (as a stochastic process) as long as the volatility vector $\sigma$ is provided:
$$f(t,T) = f(0,T) + \int_0^t \sigma(u,T)\cdot\int_u^T \sigma(u,s)\,ds\,du + \int_0^t \sigma(s,T)\cdot dW(s).$$
Recalling Equation (1), Itô’s lemma furnishes the following SDE for the zero-coupon bond price $P(t,T)$ for all $t \in [0,T]$:
$$dP(t,T) = P(t,T)\Big[r(t)\,dt - \Big(\int_t^T \sigma(t,s)\,ds\Big)\cdot dW(t)\Big],$$
where $r(t)$ is the instantaneous short-term interest rate at t given by
$$r(t) = f(t,t) = f(0,t) + \int_0^t \sigma(u,t)\cdot\int_u^t \sigma(u,s)\,ds\,du + \int_0^t \sigma(s,t)\cdot dW(s).$$
In his dissertation [
4], Filipović proposed to extend this approach to an infinite-dimensional framework and consider the following stochastic
partial differential equation (SPDE):
with the so-called Musiela’s notation (cf. [
5])
,
being employed (see also [
6], Chapter 6). Here,
is an infinite-dimensional vector of (countably many) independent real-valued Brownian motions, the forward rate curve
is a random variable taking its values in a function space
and the drift
is given by the no-arbitrage condition
To facilitate the analytic treatment of Equation (
4), we put it into the framework of abstract operator SDEs with stochastic coefficients:
where
X is an
-valued random process,
is a random
-valued
-measurable initial value,
is a linear (possibly unbounded) operator,
is a standard cylindrical Wiener process over
,
,
and
,
are separable Hilbert spaces. Here and in the sequel,
denotes the space of Hilbert-Schmidt operators (cf. Definition A8). Letting
we can easily see that Equation (
4) is a special case of Equation (
5). A straightforward integration of Equation (
5) over
yields the integral formulation of Equation (
5):
The rest of the paper is structured as follows.
Section 2,
Section 3 and
Section 4 are meant to provide a concise review of the HJM framework and related results. With most of the results being scattered over a number of articles and monographs, our aim was to give a coherent presentation of this material with consistent terminology and notation to make the theory more accessible to a broad scientific and professional audience. In
Section 2, a brief summary of the interest rate theory is given, in particular, a relation between the price of a zero-coupon bond and its instantaneous forward rate from Equation (
1) is established. In
Section 3, we introduce the abstract HJM model first presented and studied by [
4,
6], discuss major advantages of this approach and record some known well-posedness results. In
Section 4, we summarize central properties of the abstract HJM model, in particular, we give sufficient conditions under which the abstract HJM model defines an arbitrage-free market, describe the long rates associated with the model, and give an example of a state space compatible with the instantaneous forward rate dynamics.
Section 5 contains the main contribution of this paper. A (semi-)implicit numerical scheme is developed and applied to Equation (
4). Using real data from the US Department of the Treasury [
7], the resulting non-parametric model is calibrated using a custom-developed least-squares-like procedure involving principal component analysis (PCA) for parameter estimation and dimension reduction. The simplicity of our approach is one of its main advantages: it does not rely on any sophisticated optimization techniques or third-party libraries, which makes it attractive and accessible to a wide audience. The model is then deployed to forecast future yield curve rates, estimate the mean future yield curve and compute prediction regions. Using the predictive power of the model as a benchmark measure in lieu of the mere calibration quality is another novelty of our contribution. To the best of our knowledge, this is the first time an infinite-dimensional model is applied to analyze the US Treasury’s daily yield curve rates. We refer to [
8,
9,
10] for the finite-dimensional treatment. In
Section 6, conclusions are presented.
Appendix A,
Appendix B and
Appendix C contain all seminal probabilistic and functional-analytic concepts and results [
6,
11,
12,
13,
14,
15] used in the article.
2. Interest Rate Theory
Before presenting the abstract Heath-Jarrow-Morton model in
Section 3, we give a concise summary of the theory of interest rates based on ([
2] pp. 2–18). The goal of this section is twofold. First, we want to give motivation for using the theory of interest rates in this paper. Second, we aim to prove relation (
1) between the price and the instantaneous forward rate of a zero-coupon bond in order to define the abstract HJM model afterwards.
As pointed out in ([
16], Chapter 25), multivariate (or multidimensional) models have numerous advantages over single-state short rate models, in particular, in describing the intrinsic volatility structure or explaining the inversion of the yield curve, etc. The HJM framework, being at “the far end of this [multivariate] spectrum” (cf. [
16], p. 388), is, probably, the most comprehensive one as it employs the forward rate curve as an infinite-dimensional state variable. We refer to the aforecited monograph as well as ([
17], pp. 242–245) for further discussions.
Application areas of the HJM framework are very diverse. They include, but are not limited to, modeling of interest futures [
18], risk management [
19], American option pricing [
20], credit risk modeling [
21], investment management [
22], etc.
2.1. Bank Account and Short Rate
Depositing a certain amount of money in a bank account is typically expected to guarantee its growth at some rate. We need appropriate definitions to express this concept in mathematical terms. Consider a bank account which represents a riskless investment with the profit accruing continuously.
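Definition 1 (Bank account). Let $B(t)$ be the value of a bank account at time $t \ge 0$. We assume that the bank account evolves according to the differential equation
$$dB(t) = r(t)\,B(t)\,dt,$$
where $r(t)$ is a positive function of time. Letting $B(0) = 1$, we have
$$B(t) = \exp\Big(\int_0^t r(s)\,ds\Big). \tag{8}$$
Definition 1 suggests that a unit amount invested at time $0$ has value (8) at time $t$, implying also that $B$ accrues at the short rate $r(t)$.
In fact, the first-order Taylor expansion of $B(t + \Delta t)$ is
$$B(t + \Delta t) \approx B(t)\,\big(1 + r(t)\,\Delta t\big),$$
whence
$$\frac{B(t + \Delta t) - B(t)}{B(t)} \approx r(t)\,\Delta t.$$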
Definition 2 (Numéraire). A numéraire is any positive non-dividend-paying asset.
Remark 1. The concept of bank account numéraire is important for relating amounts of currency available at different times. Assume for simplicity that $r$ is deterministic. If K units of currency are deposited into a bank account at time $0$, the amount of currency at time $t > 0$ is $K B(t)$. Analogously, the amount is $K B(T)$ at time $T > t$. If we want to have one unit of currency at time T, i.e., $K B(T) = 1$, we have to initially invest $K = 1/B(T)$. Hence, the value of the invested amount K at time t is
$$K B(t) = \frac{B(t)}{B(T)}.$$
Remark 1 motivates the following definition (cf. [2], Chapter 1).
Definition 3 (Stochastic discount factor). The amount at time t “equivalent” to one unit of currency payable at time T,
$$D(t,T) = \frac{B(t)}{B(T)} = \exp\Big(-\int_t^T r(s)\,ds\Big),$$
is referred to as stochastic discount factor.
2.2. Zero-Coupon Bonds and Spot Interest Rates
Definition 4 (Zero-coupon bond). A zero-coupon bond is a contract that entitles its holder to a (single) payment of one unit of currency at maturity T. Denoting the contract value at time $t < T$ by $P(t,T)$, we observe $P(T,T) = 1$ for all $T$.
Definition 5 (Time to maturity). The amount of time from the present moment of time t to the maturity time $T > t$ is referred to as time to maturity $T - t$.
Obviously, $T - t$ is well-defined if t and T are real numbers associated with some time instants. However, if t and T denote two dates, e.g., of format (month/day/year), we can define the amount of time between them in terms of the number of days. Moreover, the number of days between $t$ and $T$ is calculated according to the appropriate market convention, which specifies how these days are to be counted.
Definition 6 (Year fraction). Let $\tau(t,T)$ denote a time measure between t and T, which is usually referred to as the year fraction between the dates t and T. If t and T are less than one day apart (typically occurring when dealing with limiting quantities which involve times to maturity tending to zero), $\tau(t,T)$ is roughly given by $T - t$. A particular choice of $\tau$ is known as the day-count convention (see [2,23] for examples).
Additionally, we have to exploit another feature of zero-coupon bonds, called the compounding type, in order to switch between the price of a zero-coupon bond and its instantaneous forward rate.
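To make the year fraction of Definition 6 concrete, the following minimal sketch computes it under three commonly used day-count conventions. The function name `year_fraction`, the convention labels and the simplified 30/360 rule (end-of-February adjustments are ignored) are assumptions made for illustration and are not taken from the references cited above.

```python
from datetime import date


def year_fraction(start: date, end: date, convention: str = "ACT/365") -> float:
    """Year fraction between two dates under a simplified day-count convention."""
    if end < start:
        raise ValueError("end date must not precede start date")
    if convention == "ACT/365":
        # actual number of calendar days divided by 365
        return (end - start).days / 365.0
    if convention == "ACT/360":
        # actual number of calendar days divided by 360
        return (end - start).days / 360.0
    if convention == "30/360":
        # every month is treated as having 30 days (simplified bond basis)
        d1 = min(start.day, 30)
        d2 = min(end.day, 30) if d1 == 30 else end.day
        days = 360 * (end.year - start.year) + 30 * (end.month - start.month) + (d2 - d1)
        return days / 360.0
    raise ValueError(f"unknown convention: {convention}")


t, T = date(2018, 1, 1), date(2018, 7, 1)
fractions = {c: year_fraction(t, T, c) for c in ("ACT/365", "ACT/360", "30/360")}
```

The same pair of dates yields slightly different year fractions under different conventions, which is why the convention has to be fixed before prices are converted into rates.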
Definition 7 (Continuously-compounded spot interest rate). The (constant) rate
$$R(t,T) = -\frac{\ln P(t,T)}{\tau(t,T)}$$
at which an investment of amount $P(t,T)$ at time t accrues continuously to produce a unit amount of currency at maturity T is referred to as continuously-compounded spot interest rate.
Definition 8 (Simply-compounded spot interest rate). Given an initial capital of $P(t,T)$ at time t and assuming accruing takes place proportionally to the investment time, the (constant) rate
$$L(t,T) = \frac{1 - P(t,T)}{\tau(t,T)\,P(t,T)}$$
at which money needs to be invested to yield a single unit of currency at maturity T is called the simply-compounded spot interest rate.
2.3. Forward Rates
Forward rates are interest rates that can be locked in on a particular day for an investment in a future time period. They are characterized by three time instants: the time t at which the rate is considered, its expiry T and its maturity S, with $t \le T \le S$.
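The spot rates of Definitions 7 and 8 and the rate that can be locked in today for the period between T and S can all be computed from zero-coupon bond prices. The sketch below assumes the standard textbook formulas, with the year fraction passed in as a plain number; the function names and the bond prices are hypothetical and serve only as an illustration.

```python
import math


def continuous_spot_rate(price: float, tau: float) -> float:
    """Continuously-compounded spot rate: price * exp(rate * tau) = 1."""
    return -math.log(price) / tau


def simple_spot_rate(price: float, tau: float) -> float:
    """Simply-compounded spot rate: price * (1 + rate * tau) = 1."""
    return (1.0 - price) / (tau * price)


def simple_forward_rate(price_T: float, price_S: float, tau_TS: float) -> float:
    """Simply-compounded forward rate between T and S implied by two bond prices."""
    return (price_T / price_S - 1.0) / tau_TS


# Hypothetical prices of bonds maturing one and two years from now.
p_T, p_S = 0.98, 0.95
R = continuous_spot_rate(p_T, 1.0)
L_T = simple_spot_rate(p_T, 1.0)
L_S = simple_spot_rate(p_S, 2.0)
F = simple_forward_rate(p_T, p_S, 1.0)

# Flat curve P(t, T) = exp(-0.03 * (T - t)): for S close to T the forward
# rate approaches the (constant) instantaneous forward rate 0.03.
h = 1e-4
F_short = simple_forward_rate(math.exp(-0.03 * 1.0), math.exp(-0.03 * (1.0 + h)), h)
```

Investing at the spot rate until T and rolling over at the forward rate until S reproduces the spot investment until S, which is the no-arbitrage relation behind the forward rate.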
The forward rate can also be viewed through a typical forward rate agreement, a contract involving three time characteristics: the current time t, the expiry time $T > t$ and the maturity $S > T$. It enables the investor to benefit from the interest paid between T and S. Moreover, it is possible to fix the interest between T and S at rate K with simply-compounded contract rates, implying a payment of $N\tau(T,S)L(T,S)$ and a receipt of $N\tau(T,S)K$ at time S, with N denoting the contract nominal value. This renders the contract value at time S to be $N\tau(T,S)\big(K - L(T,S)\big)$. Recalling Definition 8, the former quantity can be rewritten as
$$N\Big[\tau(T,S)K - \frac{1}{P(T,S)} + 1\Big]. \tag{9}$$
Let $1/P(T,S)$ be the amount of currency held at time S. Its value at time T is obtained by multiplying this amount by the zero-coupon price $P(T,S)$, i.e.,
$$P(T,S)\,\frac{1}{P(T,S)} = 1,$$
so this term is equivalent to holding one unit of currency at time T. Besides, one unit of currency at time T is worth $P(t,T)$ units of currency at time t. Therefore, the amount $1/P(T,S)$ in S is equivalent to an amount of $P(t,T)$ in t. Now, consider the remaining terms in Equation (9). The worth of $\tau(T,S)K + 1$ at the maturity S at time t is
$$P(t,S)\,\tau(T,S)K + P(t,S).$$
Therefore, the total value of the contract at time t is
$$N\big[P(t,S)\,\tau(T,S)K - P(t,T) + P(t,S)\big]. \tag{10}$$
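Remark 2. The value of K rendering the contract in Equation (10) fair at time t, i.e., making its value zero, is unique.
Definition 9 (Simply-compounded forward interest rate). Given expiry $T > t$ and maturity $S > T$, the simply-compounded forward interest rate at time t reads as
$$F(t;T,S) = \frac{1}{\tau(T,S)}\Big(\frac{P(t,T)}{P(t,S)} - 1\Big).$$
As the maturity of the forward rate approaches its expiry, the concept of instantaneous forward rate emerges. If S is sufficiently close to T, then $\tau(T,S) \approx S - T$ and we have:
$$\lim_{S \to T^+} F(t;T,S) = -\lim_{S \to T^+} \frac{1}{P(t,S)}\,\frac{P(t,S) - P(t,T)}{S - T} = -\frac{1}{P(t,T)}\,\frac{\partial P(t,T)}{\partial T} = -\frac{\partial \ln P(t,T)}{\partial T}.$$
This motivates the following definition.
Definition 10 (Instantaneous forward interest rate). The instantaneous forward interest rate prevailing at time t for the maturity $T > t$ is denoted by $f(t,T)$ and given as
$$f(t,T) = \lim_{S \to T^+} F(t;T,S) = -\frac{\partial \ln P(t,T)}{\partial T}.$$
Remark 3. Using Musiela’s notation (viz. [5]), $f_t(x) = f(t, t + x)$ for $t \ge 0$ and $x \ge 0$. The space of functions $f_t$ is called the state space.
3. The Abstract Heath-Jarrow-Morton Model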
Filipović [
4] proposed to model the instantaneous forward rate of a zero-coupon bond via an infinite-dimensional SDE driven by the standard cylindrical Wiener process. In contrast to the standard finite-dimensional Heath-Jarrow-Morton (HJM) framework, the underlying bond market captures the entire (unobservable) infinite-dimensional interest rate term structure. Among many other advantages, the infinite-dimensional HJM model mitigates the so-called “maturity specific risk” typically present in finite-dimensional models. For a more detailed discussion of infinite-dimensional models, the reader may be referred to the monograph ([
6], Chapter 6.5). Also, the infinite-dimensional model gives a natural way to “extrapolate” beyond the original finite maturity grid. From the empirical point of view, the infinite-dimensional structure of the model not only helps to better mimic the observed dynamics by increasing the “number” of model parameters, but also makes it possible to capture non-trivial topological aspects of the dynamics, thus improving the predictive power and allowing for natural dimension reduction or regularization without resorting to
ad hoc penalties. Arguably, this justifies the increased price associated with the infinite-dimensional functional-analytic and probabilistic framework.
In this section, we introduce the abstract HJM model following the approach of [
4,
6]. The resulting model is a semilinear stochastic PDE (SPDE) driven by the standard cylindrical Wiener process (see
Appendix C). The well-posedness of these abstract SPDEs in the class of mild solutions is obtained using the abstract results from ([
11], Chapter 7) and discussed in
Section 3.1. The results are then applied to justify the well-posedness of the abstract HJM model to be introduced in
Section 3.2.
3.1. Existence and Uniqueness of Mild Solutions to Semilinear SPDEs
We consider the abstract SPDE
and investigate the existence and uniqueness of mild solutions. Depending on the context, the solution process will be denoted as
or
. Let
be a filtered probability space and let
. Let
and
denote the predictable
-algebras on
and
, respectively. We also assume that
,
are separable Hilbert spaces and
is a standard cylindrical Wiener process over
given in Definition A7. Therefore, using the notation of
Appendix C.1, we can let
and
. The spaces
,
,
induce corresponding Borel
-algebras
,
,
. (Recall that
denotes the space of Hilbert-Schmidt operators.) Lastly,
is a random
-valued
-measurable initial value. See
Appendix A and
Appendix C for a brief review of probabilistic concepts and results used in this section.
Assumption 1. Let A, F and B in Equation (11) satisfy: is measurable from to ,
is measurable from to ,
There exists a constant such that for all , , we have: A is the generator of a strongly continuous operator semigroup on .
Definition 11 (Mild Solution)
. An -valued predictable stochastic process withis referred to as a mild solution to Equation (11) if it satisfies the integral equation (well-defined on the strength of Proposition A9):-almost surely. Note that the predictability of implies it possesses a continuous modification, i.e., , which (in addition to squared-integrability) is essential for the latter integral to be well-defined.
Theorem 1 ([
11], Section 7.1)
. Under Assumption 1, there exists a unique (up to indistinguishability) -valued mild solution to Equation (11). Moreover, for any , there exists a positive constant C, which only depends on p and T, such that
3.2. Definition of Abstract HJM Model
Let
be a filtered probability space. Further, let
either be a compact interval of the form
for some positive
or the half-line
and let
be a standard cylindrical Wiener process over a separable Hilbert space
. Let
be a state space and
be a predictable
-algebra on
. We make the following assumption (cf. [
6], Chapter 6).
Assumption 2. Suppose:
is a separable Hilbert space consisting of continuous, real-valued functions . Additionally, we assume that for every the functionalsare continuous on . (The former assumption can also be encountered in statistics, for example, in the context of Reproducing Kernel Hilbert Spaces (RKHS)). The left shift semigroup defined viais a strongly continuous operator semigroup on generated by a (linear) operator A. is a measurable mapping from a non-empty subset of into and defined byfor all .
The crucial property of the space
is that its elements
are (locally) integrable functions so that
is well-defined. Indeed, the local integrability holds true as the linear functional
is bounded on
for each
since
where
is finite on the strength of the Banach-Steinhaus theorem.
The financial implication of the first part of Assumption 2 is the identity for the short interest rate. The bank account is then defined via . The semigroup given in the second part of Assumption 2 enables us to switch to Musiela’s notation , where . The third part of Assumption 2 is intimately related to the no-arbitrage principle and is needed to define the drift term in the HJM equation. Further, it is generally false that is an element of unless is an element of a proper subspace of .
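The role of the left shift semigroup in the switch to Musiela’s notation can be illustrated with a few lines of code. The sketch below assumes the standard definition of the left shift, under which the shifted curve at x equals the original curve at t + x; the forward curve itself is hypothetical and chosen only for illustration.

```python
import math


def left_shift(curve, t):
    """Return the left-shifted curve, i.e., the function x -> curve(t + x)."""
    if t < 0:
        raise ValueError("the left shift is only defined for t >= 0")
    return lambda x: curve(t + x)


def forward_curve(x):
    """Hypothetical smooth forward curve used purely for illustration."""
    return 0.02 + 0.01 * (1.0 - math.exp(-0.5 * x))


# Semigroup property: shifting by 1 and then by 2 equals shifting by 3.
shifted_twice = left_shift(left_shift(forward_curve, 1.0), 2.0)
shifted_once = left_shift(forward_curve, 3.0)

# Evaluating a shifted curve at x = 0 returns the original curve at the shift time.
value_at_zero = left_shift(forward_curve, 1.5)(0.0)
```

Evaluating the shifted curve at zero is exactly the point evaluation that the first part of Assumption 2 requires to be continuous.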
Assumption 3. Assume there exists a subspace of such that the binary operator ⋆ defined bymapping from into is bounded, i.e., there exists such that the following boundholds for all . The following estimate for
is known after ([
6], p. 167).
Proposition 1. Let the space satisfy Assumption 3. Then the mapping satisfies the estimatefor all Hilbert-Schmidt operators with the range contained in . In particular, the mapping is measurable from to . Now we can give a rigorous formulation of the abstract HJM model adopted from ([
6], Section 6.3). It is rooted in the standard HJM framework [
1], which describes a certain class of arbitrage-free finite-dimensional models for the instantaneous forward rate. We provide a more powerful framework, referred to as the abstract HJM model, which incorporates arbitrage-free infinite-dimensional models.
Definition 12 (The abstract HJM model). The abstract HJM model in a functional space is a pair of functions such that
λ is a measurable function from into ,
σ is a measurable function from into
so that there exists a non-empty set of -valued -measurable initial conditions , which admits a unique time-continuous mild -valued solution to the HJM equation: The mild formulation of Equation (
12) is given via:
Imposing typical Lipschitzianity and boundedness assumptions, as a direct corollary of Theorem 1, we obtain (cf. [
6], p. 169):
Proposition 2 (Sufficient existence condition)
. Let satisfy Assumption 3 with a closed subspace such that for . Let the operator σ map into for every . Assume σ is bounded and Lipschitzian for some constant and all . Then the pair is an abstract HJM model on . Furthermore, for any initial -valued forward curve , there exists a unique time-continuous mild solution to Equation (12) such that
4. Absence of Arbitrage
A fundamental economic assumption behind any adequate financial model is the absence of arbitrage opportunities in the associated financial market (cf. [
24], pp. 637–654). The absence of arbitrage means that it is impossible to invest nothing today and receive a positive amount in the future. We restrict ourselves to continuous-time economies (cf. [
25], pp. 381–408).
Consider a filtered probability space
with a right-continuous filtration
. Assume
non-dividend paying securities are traded from time 0 until time
T continuously, with their prices modeled by an adapted
-dimensional semimartingale
having non-negative components
. The asset indexed by 0 is a bank account. We refer to
Appendix B for a brief review of the martingale theory.
Definition 13 (Trading strategy)
. A trading strategy is a (-dimensional) process with locally bounded, predictable components . The value process associated with a strategy ϕ is defined byand the gains process associated with a strategy ϕ are given by The quantity corresponds to the number of units of k-th security held by an investor at time t. Predictability of suggests that is known directly before t.
Definition 14 (Self-financing trading strategy). We refer to a trading strategy ϕ as self-financing if and for .
Definition 15 (Equivalent martingale measure). A probability measure on is referred to as an equivalent martingale measure if it satisfies:
and are equivalent measures, i.e., if and only if for all ,
The Radon-Nikodym derivative belongs to ,
The “discounted asset price” process is an -martingale, i.e.,
for all and all .
Definition 16 (Locally equivalent martingale measure). We say that is a locally equivalent martingale measure if for all the restriction of to is equivalent to the restriction of to .
An arbitrage opportunity is defined by a self-financing strategy
such that
but
. It is known (cf. ([
3], pp. 215–260) and ([
26], pp. 17–19)) that the existence of an equivalent martingale measure implies the absence of arbitrage opportunities:
Theorem 2 (Fundamental theorem of asset pricing ([
6], Theorem 2.1))
. There are no arbitrage strategies if there exists a probability measure , equivalent to , such that the discounted asset price process is a local martingale under .
Chapter 3.1.1 of [
6] provides sufficient conditions for the abstract HJM model from
Section 3 to be free of arbitrage. These results are summarized below. Consider an abstract HJM model
with an initial condition
and denote by
the unique mild solution to the HJM equation. For brevity, denote
and
.
Theorem 3 (Conditions of no arbitrage for the abstract HJM model)
. Provided for all , then the market given by the abstract HJM model admits no arbitrage.
In its current form, the setup of Theorem 3 appears to be too general for practical use. Indeed, it would suffice to show the Novikov condition for
is satisfied in order to prove that
is a martingale (cf. [
6], Theorem 6.1) since every martingale is a local martingale.
4.1. Long Rates
Modeling the forward rate as a function on a bounded interval
is different from modeling it on the real half-line. Namely, we encounter an unexpected phenomenon that the long rate defined by the limit
never decreases. This phenomenon for long forward and zero-coupon rates was previously studied in [
27]. The intrinsic nature of this phenomenon is rooted in the fact that the (appropriately defined) discounted bond prices
are local martingales for all
(cf. [
6], p. 172). The following result is known after ([
6], Chapter 6.3.2).
Assumption 4. Every is a function such that the limit exists. The functional is an element of .
Theorem 4. If , then -almost surely.
4.2. Example of State Space
An example of a state space
was first proposed in ([
4], Chapter 3) and further investigated in ([
6], Chapter 3.3). Obviously, this space should fulfill the first and the second parts of Assumption 2 under which the abstract HJM model is well-defined, i.e., it should consist of real-valued continuous functions, for which the functionals
should be continuous on
and the strongly continuous semigroup
should exist on
. Parenthetically, we point out that, though the following presentation assumes
, our empirical example in
Section 5 will rather utilize a bounded interval
.
Definition 17 (Example of state space)
. Let be a monotonically increasing function such thatdenote by the spacewhere is the distributional derivative of f and If
has a weak derivative
, i.e., some function
such that
then there exists an absolutely continuous representative
f, still denoted by
f, such that
(cf. [
28], Section VIII.2). In particular, this implies that the trace
is well-defined.
Proposition 3 ([
6], Chapter 3.3)
. If w satisfies inequality (14), then the inner product space is a separable Hilbert space. The evaluation functional and the definite integration functional defined by , respectively, are continuous on for all positive x. Further, the operator semigroup on is strongly continuous.
5. Example: Modeling and Predicting US Treasuries
Over the past two decades, the problem of parameter estimation, also referred to as model calibration, for the classical
finite-dimensional HJM model [
1] (viz. Equation (
2)) has attracted major attention from various researchers in econometrics, quantitative finance, stochastic analysis, numerical analysis, etc. In the finite-dimensional context, estimation for the HJM model is typically performed using the method of least squares, the method of moments, maximum likelihood estimation and penalized versions thereof, stochastic filtering, kriging, parametric and nonparametric Bayesian inference, etc. See, e.g., [
29,
30,
31,
32,
33,
34] for relevant studies in both parametric and non-parametric settings. We also refer to “classic” papers [
8,
9,
10] as well as recent contributions [
35,
36] and references therein for closely related results for the Vasicek model [
37].
As for the abstract HJM model (viz. Equation (
4)) in the
infinite-dimensional context, with the exception of the numerical paper [
38], no results are known in the literature. With the primary focus on studying the convergence properties of the proposed numerical splitting schemes, ([
38], Section 2.3.1) presents a brief empirical study based on a dataset from the dissertation [
39]. As opposed to the finite-dimensional situation, infinite-dimensional parameter estimation or calibration constitutes a (possibly ill-posed) statistical inverse problem. To the best of our knowledge, no rigorous investigations of HJM model calibration in the context of statistical inverse problems are currently available in the literature. A promising angle of attack is, however, provided in [
40]. Moreover, no calibration and/or forecasting studies on the infinite-dimensional HJM model in connection with real-world data are available to date. In this section, we investigate a non-parametric calibration problem for the linear version of Equation (
4). To this end, we discretize the equation with an implicit/explicit “splitting scheme” and propose a PCA- and least-squares based parameter estimation/dimension reduction procedure to fit the discretized model. We illustrate our approach by applying our proposed scheme to a real-world dataset presented below.
Daily US Treasury yield curve rates are available from the US Department of the Treasury [
7]. For any given business day (i.e., all days excluding holidays and weekends), the yield curves relate the closing market bid yields on actively traded Treasury securities in the over-the-counter market to their time to maturity. Further details on the methodology used to calculate the yield rates are provided in [
7]. The time-to-maturity grid includes 1 mo, 2 mo (starting 16 October 2018), 3 mo(nths), 6 mo, 1 y(ea)r, 2 yr, 3 yr, 5 yr, 7 yr, 10 yr, 20 yr and 30 yr.
Figure 1 displays historical yield rates for the year 2018. Since the 2-month rates were missing for the dates between 1 January 2018 and 15 October 2018, linear interpolation was used to estimate these values.
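The gap filling for the missing 2-month rates can be sketched as follows. The text does not state along which axis the interpolation was carried out; the sketch assumes it is done across maturities, between the neighbouring 1-month and 3-month rates. The function name and the sample rates are hypothetical.

```python
def fill_missing_linear(maturities, rates):
    """Fill None entries in `rates` by linear interpolation in the maturity."""
    known = [(m, r) for m, r in zip(maturities, rates) if r is not None]
    filled = []
    for m, r in zip(maturities, rates):
        if r is not None:
            filled.append(r)
            continue
        # nearest observed maturities to the left and to the right of m
        left = max((p for p in known if p[0] < m), key=lambda p: p[0], default=None)
        right = min((p for p in known if p[0] > m), key=lambda p: p[0], default=None)
        if left is None or right is None:
            raise ValueError("cannot interpolate outside the observed maturities")
        weight = (m - left[0]) / (right[0] - left[0])
        filled.append(left[1] + weight * (right[1] - left[1]))
    return filled


# Maturities in months and hypothetical yield rates in percent for one business day.
maturities = [1, 2, 3, 6, 12]
rates = [1.29, None, 1.44, 1.61, 1.83]
completed = fill_missing_linear(maturities, rates)
```

With these sample values, the interpolated 2-month rate lies halfway between the 1-month and 3-month rates, since 2 is the midpoint of the two neighbouring maturities.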
Employing Musiela’s notation [
5], let
denote the yield rate of a US Treasury security traded at time
t with the time to maturity
x. According to ([
4], Definition 3.13),
can be expressed with the aid of the instantaneous forward rate
as
In particular, invoking the mean value theorem, the former equation suggests
where
stands for the short rate. Multiplying both sides of Equation (
15) by
x and differentiating in
x, we can solve for
, i.e.,
Unfortunately, no “natural” boundary condition is readily available for
. In contrast, since
can be assumed bounded in
, a homogeneous Dirichlet boundary condition emerges for
. Therefore, in the spirit of ([
4], Chapter 3.3), we will rather work with the (observable) quantity
referred to as the “integrated forward rate” since
This quantity, $Y$, satisfies the homogeneous Dirichlet boundary condition at $x = 0$.
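The passage between yield rates, the integrated forward rate and forward rates described above can be sketched numerically. The code assumes that the integrated forward rate is the product of the time to maturity x and the yield at x, as suggested by multiplying Equation (15) by x, and recovers the forward rate by finite differences. The grid and the yield curve are hypothetical, and the units are schematic.

```python
def integrated_forward_rate(x_grid, yields):
    """Integrated forward rate: time to maturity times yield, node by node."""
    return [x * y for x, y in zip(x_grid, yields)]


def forward_from_integrated(x_grid, Y):
    """Approximate the forward rate dY/dx by central differences (one-sided at the ends)."""
    n = len(x_grid)
    forward = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        forward.append((Y[hi] - Y[lo]) / (x_grid[hi] - x_grid[lo]))
    return forward


# Hypothetical yield curve, linear in the time to maturity x.
x_grid = [0.0, 1.0, 2.0, 3.0, 4.0]
yields = [0.02 + 0.001 * x for x in x_grid]
Y = integrated_forward_rate(x_grid, yields)
forward = forward_from_integrated(x_grid, Y)
```

For a yield curve that is linear in x, the integrated forward rate is quadratic, so the central differences reproduce the forward rate at interior nodes up to rounding, and the value at x = 0 vanishes as required by the boundary condition.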
Since the shortest time-to-maturity available in our dataset is
month,
will be approximated/extrapolated as
. Further, as
Figure 1 suggests, the
x-grid becomes very coarse for
. In particular, this means that the infinitesimal effects of the operator
A in the abstract HJM model in Equation (
13) will become negligible as the semigroup
will practically “collapse” to the identity operator, essentially reducing the model (for large
x) to the classic HJM model. Therefore, the maturities were truncated at 36 months. Lastly, the observed yield rates are interpolated onto
month using cubic splines.
Figure 2 displays the historical
Y-curves for the year 2018 over the truncated 36-month
x-horizon.
5.1. The Model
Consider the space
defined in
Section 4.2. The “spatio-”temporal evolution of the forward rates will be assumed to follow the linear autonomous HJM model from Equation (
4), i.e.,
with some initial datum
and
determined via the “no arbitrage” condition
where the evolution in Equation (
19) is taking place under the risk-neutral measure
. To assure the model is identifiable, similar to [
8,
10],
’s will be assumed orthogonal with respect to an appropriate inner product to be discussed later.
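Once the volatility curves are given on a grid, the drift follows from the no-arbitrage condition by a single cumulative integration. The sketch below assumes the standard HJM drift condition in Musiela’s parametrization, in which the drift at x is the sum over all volatility components of the volatility at x times its integral from 0 to x, and uses the trapezoidal rule. The function name and the sample volatility are hypothetical.

```python
def hjm_drift(x_grid, sigmas):
    """Drift implied by the (assumed) HJM no-arbitrage condition on a grid starting at 0.

    `sigmas` is a list of volatility curves, each given by its values on `x_grid`.
    """
    drift = [0.0] * len(x_grid)
    for sigma in sigmas:
        integral = 0.0  # running trapezoidal approximation of the integral of sigma over [0, x]
        for i in range(len(x_grid)):
            if i > 0:
                integral += 0.5 * (sigma[i] + sigma[i - 1]) * (x_grid[i] - x_grid[i - 1])
            drift[i] += sigma[i] * integral
    return drift


# A single constant volatility of 0.01: the drift should grow linearly in x.
x_grid = [0.5 * i for i in range(7)]
drift = hjm_drift(x_grid, [[0.01] * len(x_grid)])
```

For a constant volatility the condition reduces to a drift that is linear in the time to maturity, which provides a quick sanity check for the quadrature.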
Following the results of
Section 4, we can write
for a standard cylindrical Wiener process
with respect to the physical measure
, where
must satisfy the conditions of Theorem 3. See also ([
10],
Section 2). In the spirit of [
8], assuming an “affine specification” for the market price of risk
the former conditions are trivially satisfied. Thus, Equation (
19) can be equivalently expressed under the physical measure
:
with
Since the parameter vectors
and
are infinite-dimensional, the model (
20), though being linear and autonomous, is genuinely non-parametric from the statistical point of view.
Arguing similarly to ([
4], p. 42), the stochastic process
is expected to solve the SPDE:
where
is the initial value for the yield curve
at
and
denotes the short rate. We would like to draw the reader’s attention to the minus sign in front of the first term on the right-hand side of the first equation in (
21). Since the boundary condition is now prescribed at
, the information propagates from left to right, i.e., the abstract operator is now
subject to homogeneous Dirichlet boundary conditions at
.
On the strength of the Hille-Yosida theorem, the operator
A with the maximal domain
(note that the point evaluation is well-defined in view of the Sobolev embedding theorem) generates a strongly continuous semigroup
of contractions on
endowed with the standard scalar product. The semigroup
is now the right-shift semigroup given as
Assuming
is a scalar Itô process independent of the cylindrical Wiener process
, the concept of a mild solution can be developed similarly to
Section 3.1.
To make Equation (
21) useful for prediction purposes, we need to specify the evolution of
. To this end, we utilize the classical Vasiček’s short rate model [
37]:
where
are parameters,
is a given random variable independent of
and
is, in turn, independent of
and
. Since the evolution in Equation (
22) is taking place under the risk-neutral measure
, we perform a change of measure using Girsanov’s theorem assuming
Thus, under the physical measure
, the evolution in Equation (
22) assumes the form
with
and a standard Wiener process
. In the following, we will work with Equations (
21) and (
23) under the physical measures
and
. The key feature of Equations (
21) and (
23) is that they are autonomous, which is a crucial requirement if a model is to be used in forecasting.
Since our dataset is discrete, it is natural to discretize Equations (
21) and (
23) as well. To this end, we employ (semi-)implicit and explicit Euler & Maruyama schemes for time integration and a two-point finite-difference quotient approximation for
A. Selecting a time grid
and a “space” grid
with
,
,
,
, we obtain
and
Using standard results from numerical analysis of S(P)DEs, both numerical schemes can be proved to be unconditionally feasible and stable (i.e., for any
). The consistency also holds true, but is irrelevant for our purposes as
remain fixed. A numerical implementation in Matlab® is given in the
Supplementary Materials.
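To illustrate why an implicit time step combined with a two-point (upwind) difference quotient is stable for any step size, the following minimal Python sketch treats only the deterministic transport part, u_t = -u_x with u(t, 0) = 0. It is not the authors' Matlab implementation; the drift and noise terms of the full scheme are omitted, and the function name and the grid are hypothetical.

```python
def implicit_upwind_step(u, tau, h):
    """One implicit Euler step for u_t = -u_x with u(t, 0) = 0.

    `u` holds the values at the interior nodes x_1, ..., x_J of an
    equidistant grid with spacing h; the boundary node x_0 is kept at zero.
    The derivative is replaced by the backward quotient (u_j - u_{j-1}) / h.
    """
    lam = tau / h
    new_values = []
    left = 0.0  # homogeneous Dirichlet value at x_0
    for value in u:
        # Solve (1 + lam) * u_j^{n+1} - lam * u_{j-1}^{n+1} = u_j^n
        # by forward substitution, moving from left to right.
        left = (value + lam * left) / (1.0 + lam)
        new_values.append(left)
    return new_values


# Hypothetical usage: a bump is transported to the right and damped,
# and its maximum never grows, regardless of the ratio tau / h.
profile = [0.0, 1.0, 0.0, 0.0, 0.0]
for _ in range(3):
    profile = implicit_upwind_step(profile, tau=10.0, h=1.0)
```

Since each new value is a convex combination of the old value and the new left neighbor, the maximum norm cannot increase, which is the discrete counterpart of the contraction property of the right-shift semigroup.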
The historical yield rate dataset [
7] used in this paper contains
time periods for the year 2018. Restricting the times to maturity to 0 to 12 months, the equidistant
x-grid contains
nodes ranging from 0 to 32 months. Thus,
and
.
5.2. Parameter Estimation
Before the model in Equations (
24) and (
25) can be used for prediction purposes, the unknown functional parameter vector
and the scalar parameters
need to be estimated. Since
is infinite-dimensional, the underlying estimation problem is genuinely non-parametric and will be regularized through a type of PCA (Principal Component Analysis) truncation ([
6], Chapter 1.7.1). In the PDE optimization literature, this approach is commonly known as POD (Proper Orthogonal Decomposition) and used as a dimension reduction technique [
41].
We begin with Equation (
25). The maximum likelihood (ML) estimators for
and
are given as
(cf. [
42,
43,
44]). The ML estimates for the 2018 historic data are displayed in
Table 1 below.
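As a rough illustration of how such estimators arise, the Python sketch below fits an Euler-discretized Vasiček model by conditional Gaussian maximum likelihood, which reduces to ordinary least squares on consecutive observations. The parametrization dr = kappa (theta - r) dt + sigma dW and all function names are assumptions made for this example; the closed-form estimators of the cited references may differ in parametrization and in finite-sample details.

```python
import math
import random


def simulate_vasicek(r0, kappa, theta, sigma, tau, n_steps, rng):
    """Explicit Euler-Maruyama path of dr = kappa*(theta - r) dt + sigma dW."""
    path = [r0]
    for _ in range(n_steps):
        r = path[-1]
        dw = rng.gauss(0.0, math.sqrt(tau))  # Wiener increment over one step
        path.append(r + kappa * (theta - r) * tau + sigma * dw)
    return path


def fit_vasicek(path, tau):
    """Conditional ML fit of the Euler-discretized model.

    The scheme reads r_{n+1} = a + b * r_n + noise with
    a = kappa * theta * tau, b = 1 - kappa * tau and noise variance
    sigma^2 * tau, so the ML estimates of (a, b) coincide with OLS.
    """
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # ML variance estimate uses the divisor n (not n - 2).
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    kappa = (1.0 - b) / tau
    theta = a / (1.0 - b)
    sigma = math.sqrt(resid_var / tau)
    return kappa, theta, sigma


# Hypothetical usage with synthetic daily data (tau = 1/252 years).
rng = random.Random(2018)
synthetic = simulate_vasicek(0.02, 2.0, 0.03, 0.01, 1.0 / 252, 5000, rng)
kappa_hat, theta_hat, sigma_hat = fit_vasicek(synthetic, 1.0 / 252)
```

The volatility is typically recovered much more accurately than the mean-reversion speed, whose estimate is known to be noisy over short observation windows.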
Since no maximum likelihood estimator is known for
in Equation (
24), we will develop a custom least-squares-like method instead. To this end, we rewrite Equation (
24) as
with
According to Equation (
26),
’s are i.i.d. Gaussian elements of
. Since we are interested in predicting the yields
given as
, for regularization purposes, we will rather employ the stronger topology of
.
As mentioned before, to render
’s identifiable, we assume the latter are orthogonal in
. This implies
are orthogonal in
since
Also note that since for on the strength of Sobolev’s embedding theorem and Hölder’s inequality.
We proceed by forming the sample
with the discretized negative derivative operator
and compute the covariance operator in the scalar product of
Let
denote the resulting (bounded) covariance operator/matrix given as
with the sample mean element/vector
Performing spectral decomposition
for some “principal variances”
and unit
(cf. [
6], Chapter 1.7.1), we choose the first
“principal modes” responsible for, say, 99% of variation, i.e., we choose the smallest
such that
.
This truncation step, in addition to the
-topology employed, renders the procedure numerically and statistically well-posed.
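The truncation rule can be sketched in a few lines of Python. Given the "principal variances" from the spectral decomposition, the hypothetical helper below returns the smallest number of leading modes whose variances add up to at least the chosen share of the total; the function name and the sample numbers are illustrative only.

```python
def truncation_order(variances, level=0.99):
    """Smallest m such that the m largest variances explain >= level of the total."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    running = 0.0
    for m, variance in enumerate(ordered, start=1):
        running += variance
        if running >= level * total:
            return m
    return len(ordered)


# Hypothetical spectrum: the first two modes carry 8 of 10 units of variance.
m = truncation_order([6.0, 2.0, 1.0, 1.0], level=0.75)
```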
Figure 3 displays the six principal modes responsible for 99.32% of variation in
’s for the 2018 integrated forward rate data. Summarizing,
’s and, thus,
can now be estimated via
where the discrete negative antiderivative operator
is the (isometrically isomorphic) inverse of
. Since the covariance operator
is (in general) non-spherical, we expect our method to be consistent but not necessarily optimally efficient. The efficiency can be improved with a type of “backfitting” procedure, which is beyond the scope of this article.
Once
’s are estimated, we can estimate
’s using the least squares method. Computing the “design matrix”
and recalling the sample mean
from Equation (
28), the Gauss & Markov theorem yields an estimate for the “truncated”
vector:
(Note that we first estimated
and then solved for
.) Applied to the 2018 data, the estimates for the
“principal”
’s are given in
Table 2.
5.3. Prediction
Now that the estimates
and
are at hand, Equations (
24) and (
25) can be used to predict the dynamics of
and
and, thus, that of
. 1 January 2019 was used as
with the yield curve from 31 December 2018 (multiplied by
x) playing the role of the initial value. We generated
sample paths for the standard Wiener increments
and
. Here, we chose
which corresponds to predicting the integrated forward rates over a one-month time horizon. Further,
sample paths of the discretized solution processes
,
were computed by solving the difference Equations (
24) and (
25). The predicted yield curves were then formed as
.
To empirically assess the statistical performance of our method, we chose
1 February 2019 and compared our predicted curves with the historically observed one.
Figure 4 displays the observed yield rate curve (solid) and the estimated mean yield curve
(dotted). Further, the dashed curves represent the lower and upper 99% pointwise prediction bounds. Note that the overall confidence level of the resulting tube region is less than 99% since no Bonferroni correction was performed.
Figure 4 further contains 5 sample solution paths, which are all (essentially) contained in the prediction region. The estimated mean yield curve is also contained in the prediction region and does not substantially deviate from the observed curve. The prediction region is relatively tight suggesting good predictive power of our model.
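One way to form a mean curve and pointwise prediction bounds from simulated sample paths is sketched below in Python. It uses empirical quantiles at each grid node, which is an assumption of this example; the bounds in Figure 4 may have been computed differently (e.g., from Gaussian quantiles). The helper `bonferroni_level` merely shows how the pointwise level would have to be raised to obtain a simultaneous guarantee; as stated above, no such correction was applied in the paper. All names are hypothetical.

```python
import math


def _quantile(sorted_values, q):
    """Empirical quantile of sorted data with linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    i = int(math.floor(pos))
    j = min(i + 1, len(sorted_values) - 1)
    frac = pos - i
    return sorted_values[i] * (1.0 - frac) + sorted_values[j] * frac


def pointwise_band(paths, level=0.99):
    """Pointwise mean and central `level` prediction bounds.

    `paths` is a list of simulated curves, each a list of values on the
    same maturity grid. Returns (means, lower, upper), one value per node.
    """
    lo_q, hi_q = (1.0 - level) / 2.0, (1.0 + level) / 2.0
    means, lower, upper = [], [], []
    for column in zip(*paths):  # all simulated values at one grid node
        values = sorted(column)
        means.append(sum(values) / len(values))
        lower.append(_quantile(values, lo_q))
        upper.append(_quantile(values, hi_q))
    return means, lower, upper


def bonferroni_level(level, n_points):
    """Pointwise level needed for simultaneous coverage >= level over n_points."""
    return 1.0 - (1.0 - level) / n_points


# Hypothetical usage with 101 toy "curves" on a two-node grid.
toy_paths = [[float(k), 2.0 * k] for k in range(101)]
means, lower, upper = pointwise_band(toy_paths, level=0.9)
```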
The left panel of
Figure 5 displays the historical yield curve rates for January 2019 (22 business days), while the right panel gives the estimated mean yield curve rate for the same time window based on
replications. We conjecture that the slight overestimation of the yield curve for large maturities may be related to the non-decreasing behavior of the long rates (see
Section 4.1).
In sum, the presented model can be used for prediction and estimation as well as uncertainty quantification. Further potential applications include stochastic optimization and optimal stopping problems. The full set of Matlab® codes used to produce the results of
Section 5 is available at
https://github.com/mpokojovy/abstract.HJM.