1. Introduction
With the continuous development of science and technology, a large number of inverse problems [1,2,3] have appeared in fields such as weather forecasting, nondestructive material testing, inverse wave scattering, biological imaging and option pricing. The inverse volatility problem has received much attention in recent years due to its significant impact on the market value of options (see, e.g., [4,5,6,7,8,9,10,11]). In the real market, the volatility of the underlying asset price cannot be observed directly, but the market price of an option can be, and it provides information about the volatility of the underlying asset price. The associated forward operators are based on solutions to the corresponding Black–Scholes/Dupire equations (see [12] and Section 2 of this paper). In this paper, we investigate the following problem:
Problem P: For a stock option paying no dividends, it is well known that the price of a call option satisfies the following Black–Scholes equation.
Herein, the parameters S and K are the price of the underlying stock and the strike price; T, μ and r are, respectively, the date of expiry, the risk-neutral drift and the risk-free interest rate, which are assumed to be constants. The parameter σ represents the volatility coefficient to be identified.
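For the reader's convenience, we recall the classical Black–Scholes equation in standard notation. The symbol V below for the call price is our label, and this is the textbook statement of the equation; the paper's Equation (1) may place the risk-neutral drift μ in the convection term instead of r:

```latex
\frac{\partial V}{\partial t}
 + \frac{1}{2}\,\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}
 + r S \frac{\partial V}{\partial S} - r V = 0,
 \qquad 0 \le t < T,\ S > 0,
```

with the terminal payoff V(S, T) = max(S − K, 0) for a call option.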
Given the following additional condition:
in which the given market data indicate the market price of the stock at the observation time and the market price of the option with strike K at a given expiry time T. The inverse problem is to identify a pair of functions satisfying (1) and (2), respectively.
The problem P was first considered by Dupire in [12]. In that paper, Dupire applied the symmetry property of the transition probability density function to replace the option pricing inverse problem with an equation containing the parameters K and T, and proposed a formula for calculating the implied volatility. Although Dupire's formula is seriously ill-posed, his solution provided the foundation for the later understanding of this problem. Time-dependent and space-dependent volatility has been studied in [4,5]. For space-dependent volatility, the authors generate a broad class of non-Gaussian stochastic processes; by using the dual equation, the problem is transformed into an inverse coefficient problem with final observations, and some new uniqueness and stability theorems are established. In [6,7,8], an optimal control framework is applied to determine the implied volatility in option pricing; a rigorous mathematical analysis of the existence, necessary conditions and convergence for the inverse problem is carried out, and the authors propose a well-posed approximation algorithm to identify the coefficients. In recent years, the linearization method has been applied to the inverse volatility problem in option pricing. In [13,14,15], the linearization method is used to transform the problem P into an inverse source problem for a classical parabolic equation, from which the unknown volatility can be recovered. However, these studies have several deficiencies that need to be addressed. A significant one is the assumption that the original problem P is posed on an unbounded region; under this assumption, many scholars have carried out numerical simulations by applying artificial truncation. This approach has a potential issue: if the truncation interval is too large, the amount of computation increases, and if it is too small, the error increases. In practical applications, this approach may therefore fail to control the volatility risk. A number of papers are devoted to this problem; see, e.g., [16,17,18,19].
The purpose of this paper is to control the volatility risk of financial markets with greater precision and to remedy the deficiencies caused by the artificial truncation in previous studies. To achieve this, we transform the principal coefficient inversion problem P for a classical parabolic equation into a principal coefficient inversion problem for a degenerate parabolic equation on a bounded region via variable substitutions. For studies of inverse problems for degenerate parabolic equations, we refer the reader to [20,21,22]; for other topics concerning degenerate parabolic equations, such as null controllability, see [23,24,25]. In the present paper, we use the optimal control framework to provide a full convergence analysis for the degenerate problem. To the best of our knowledge, no other works investigate the convergence theory of the inverse volatility problem for a degenerate parabolic equation.
The structure of this paper is arranged as follows: In Section 2, we formulate the principal coefficient inverse problem for degenerate parabolic equations through some variable substitutions, and use the optimal control framework to transform problem P1 into an optimal control problem P2. In Section 3, we prove the existence of a minimizer of the optimal control problem. In Section 4, we show that the approximate solution of the optimal control problem converges to the true solution of the original problem. In Section 5, we use the gradient iteration method to perform some numerical experiments.
2. Problem Setting and Methodology
In this section, we introduce some variable substitutions to transform the problem P into a degenerate parabolic principal coefficient inversion problem P1. First, we apply Dupire's approach to problem (1) by setting
it is well known that G satisfies the following equation (see Appendix A for a detailed derivation):
Since G is defined as above, it is easy to see that G, as a function of K and T, is the fundamental solution of the conjugate problem of the above problem (see [12]), namely
Substituting into the above conjugate problem and integrating twice with respect to K over the indicated interval, with s fixed, we obtain in the limit:
Therefore, the problem (1) is transformed into the Cauchy problem for the following Dupire equation (see [4,5,6,12]):
From the above equation, Dupire obtained formulas for the local volatility under different strike prices and expiry dates, i.e.,
However, this formula is extremely sensitive to changes in the data: small changes in the options market quotes may cause great changes in the second-order partial derivative, so the formula is unstable.
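For orientation, we record the classical form of Dupire's local-volatility formula, writing C(K, T) for the observed call price surface; this is the standard textbook statement (with risk-free rate r and no dividends), not a verbatim reproduction of the paper's equation:

```latex
\sigma^2(K,T) \;=\;
\frac{\dfrac{\partial C}{\partial T} + r K \dfrac{\partial C}{\partial K}}
     {\dfrac{1}{2}\,K^2 \dfrac{\partial^2 C}{\partial K^2}}.
```

The second-order derivative in the denominator is precisely the term whose sensitivity to perturbations of the market quotes makes the formula unstable.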
For such a case, we take
where E is a non-zero constant, and u satisfies the following problem P1 (see Appendix A for a detailed derivation).
Problem P1: Consider the following degenerate parabolic equation
with the additional data
where
is a relatively small constant. The goal is to determine the unknown space-dependent volatility coefficient
and u satisfying (4) and (5). For simplicity, we denote
where
, and
are smooth functions given on
, and
is the unknown term satisfying
Equation (4) is a degenerate parabolic equation. Compared with classical parabolic equations, the boundary conditions of degenerate equations may vanish on the degenerate part of the boundary. In [3], whether or not boundary conditions should be prescribed is determined by the sign of the Fichera function. For general degenerate parabolic equations in non-divergence form, the Fichera function can be defined as follows:
where
satisfies
and
are taken as 1 and −1 at the two endpoints, respectively. Specifically, if the sign of the Fichera function is non-negative, then no boundary condition should be prescribed there. For Equation (4), it can easily be seen that this holds; therefore, no boundary conditions should be given on either degenerate boundary.
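To make the criterion concrete, consider the one-dimensional non-divergence model equation u_t = a(x) u_xx + b(x) u_x + c(x) u, where the coefficient names a, b, c and the normal n are our illustrative labels. At an endpoint x_0 where the equation degenerates, i.e., a(x_0) = 0, the Fichera function reduces (under one common sign convention for n) to

```latex
\Sigma(x_0) \;=\; \bigl[\, b(x_0) - a'(x_0) \,\bigr]\, n(x_0),
```

and no boundary condition is prescribed at x_0 whenever Σ(x_0) ≥ 0, consistent with the criterion of [3] quoted above.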
3. Optimal Control Framework
Before we consider the regularization of problem P1, we discuss the well-posedness of the direct problem and give some basic definitions, lemmas and estimates.
Definition 1. Define the solution space to be the closure of the given smooth function class under the following norm:
Definition 2. A function is called a weak solution of (4) if u satisfies the stated regularity and the following integration identity holds for a.e. time.
Theorem 1. For any given admissible coefficient, there exists a unique weak solution to (4), and it satisfies the following estimate:
Proof. Firstly, multiplying
u on both sides of (4) and integrating over
, we have
Integrating by parts, we obtain
Integrating the second term on the left side of equality (9) by parts, and using the Fichera condition and Grönwall's inequality, one has
where C is independent of T. □
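The Grönwall inequality invoked above is used in its standard integral form: if a nonnegative function y satisfies an integral bound with constants C_1, C_2 (our labels), then

```latex
y(t) \;\le\; C_1 + C_2 \int_0^t y(s)\,ds \quad (0 \le t \le T)
\qquad\Longrightarrow\qquad
y(t) \;\le\; C_1\, e^{C_2 t},
```

applied here with y(t) taken as the squared L²-norm of u(·, t), which yields the a priori estimate of Theorem 1.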
The problem P1 is ill-posed; that is, the solution of the problem does not depend continuously on the observed data. Therefore, we use the optimal control framework to convert the problem P1 into the following optimal control problem P2:
Problem P2: Find
such that
where u is the solution of (4) for the given coefficient, and α is the regularization parameter.
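Structurally, the functional in (10) is of Tikhonov type: a data-misfit term evaluated at the observation plus an α-weighted penalty on the coefficient. Schematically (the symbols J_α, u^obs and the norm ‖·‖_* are our placeholders; the exact norms are those specified in (10)),

```latex
J_\alpha(a) \;=\; \tfrac{1}{2}\,\bigl\| u(a;\cdot) - u^{\mathrm{obs}} \bigr\|^2
              \;+\; \tfrac{\alpha}{2}\,\bigl\| a \bigr\|_{\ast}^{2},
```

where u(a; ·) denotes the solution of (4) for the coefficient a and u^obs the observed data.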
We are now going to show the existence of minimizers of problem (10). Firstly, we assert that the functional possesses a continuity property in the following sense.
Lemma 1. For any sequence in which converges to some in as , we have
Proof. Step 1: Setting
and selecting the test function as
in (7), one can derive the following equality by integrating on
.
for any
. By the Fichera condition, Young’s inequality and the smoothness of
, we obtain
Since the sequence
is uniformly bounded in the space
, we can extract a subsequence denoted as
, such that
Step 2:
By taking
in (7), multiplying both sides by a function
with
, and integrating with respect to
, we have
Integrating by parts in the first term on the left side of (17), letting
in (17), and using (16), we obtain
Noticing that (18) is valid for any
with
we have
and
. Therefore,
by the definition of
.
Step 3: Prove as .
We rewrite (7) for
in the form
Taking
in (20), we have
Similar relations hold for
, namely
Subtracting (22) from (21), one has
Taking
in (7), we have
Similarly, by taking
we have
Substituting (24) and (25) into (23), after some manipulations, we derive
Integrating over the interval
for any
, we obtain
By the convergence of
and weak convergence of
, one can easily obtain
Combining (27) and (28) we have
Besides, by the
inequality, there holds
From (16), (29) and (30) we obtain
This completes the proof of Lemma 1. □
Theorem 2. There exists a minimizer of , i.e., Proof. It is obvious that
is non-negative and thus
has the greatest lower bound
. Let
be a minimizing sequence, i.e.,
□
Notice that
we deduce
where
C is independent of
n. From the boundedness of
and (31), we also have
So we can extract a subsequence, still denoted by
, such that
By using Sobolev's embedding theorem (see [1]), we obtain
It can be easily seen that
. So we get as
that
in
.
Moreover, from (35) we have
From Lemma 1 and the convergence of
, there exists a subsequence of
, still denoted by
, such that
By (35), (36) and (37), we obtain
4. Convergence
In this section, we consider whether the solution of the control problem (10) converges to the solution of the original problem (4). It should be noted that the optimal control problem P2 is non-convex, so a unique solution may not be obtained. In fact, the optimization technique is a classical tool for obtaining "generalized solutions" of inverse problems that do not possess a unique solution. However, we can derive the necessary conditions for an optimal solution, through which it can be proved that the optimal control problem is locally unique when T is relatively small. We do not elaborate on this here because the techniques are similar to [6]. We are more interested in the convergence. We assume the true solution
is achievable, namely, there exists
, such that
Observed data typically contain errors, so the noisy data can be recorded as
, assuming that the error level satisfies:
We define a forward operator
:
where
is the solution of Equation (
10) corresponding to
. For any
and
, the
derivative
satisfies the homogeneous initial conditions, and for any
, the following equality holds:
Lemma 2. For any and , such that , we have the following equalitywhereholds for any . Proof. Notice that
satisfies
Subtracting (7) from (45), and replacing
in Equation (
7) with
, we obtain
Combining (46) and (43), (44) can be obtained.
In order to get the convergence of the optimal solution, some source conditions need to be added to the direct problem. We introduce the following linear operator
:
where
is the solution of Equation (
10). From (43), it is easy to see that for
and
, the following equation holds:
where
represents the inner product in the
. Notice that
∇ is a linear operator, now we define
as its conjugate operator such as:
If , then . Here, we only use the weak form of . □
Theorem 3. Assume that there exists a function φ such that in the weak sense the following source condition holds, where the operator is defined by (47). Moreover, as δ and α tend to zero, the following estimates hold, where the first quantity is the minimum attained when w replaces the coefficient in (11), the second is the solution of the variational problem (7) at that choice, and C is independent of δ and α.
Proof. Because
is the minimum of (11), then
that is
From (53), one can deduce
Combining Equations (48) and (50), for the last term in (54), we have
The main idea now is to use the left side of inequality (54) to control .
For
, we can deduce from (44)
Integrating by parts, we obtain
By the triangle inequality, we have
Then from Equations (39), (40) and (58) and using Young’s inequality, we have
For
, from (39) and (40), using integration by parts, the triangle inequality and Young's inequality, we can obtain
Similarly, for
, we have
Combining (54), (55), and (58) to (64), we obtain
by (65), one has
Choosing
, we obtain
Then, by (67) and using the same inequality, we can also obtain (51).
This completes the proof of Theorem 3. □
5. Numerical Experiments
Based on the optimal control theory, we have analyzed the optimal control problem P2 theoretically. In this section, we use synthetic data to reconstruct the implied volatility by a gradient-type iteration method, and give some numerical examples. To this end, we apply the finite difference method to solve the forward problem (4). Since the explicit difference scheme is only conditionally stable, the Crank–Nicolson scheme, which is unconditionally stable, is employed to obtain the numerical solution. To construct more general test functions a and u, we add a source function to the right-hand side of Equation (4), i.e., we consider the more general equation:
Assume that the domain is divided into a uniform mesh with spatial step size h and time step size τ, where the grid points are defined in the usual way and M and N are two integers. The standard indexed notation is used for the finite difference approximation of u at the grid points.
The Crank–Nicolson scheme leads to the following difference equation for (68):
for all interior spatial indices and time levels.
Note that on the left boundary, Equation (68) degenerates into the following ordinary differential equation:
Likewise, on the right boundary, we obtain the following discretized equation:
The initial condition can be discretized by
This method is stable in the maximum norm without any restriction on the mesh parameters τ and h, and the truncation error is of order O(τ² + h²).
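As an illustration of the scheme (not the paper's exact Equation (68), whose coefficients are specific to the problem), the following sketch applies the Crank–Nicolson method to a model degenerate equation u_t = d(x) u_xx + f(x, t) on (0, 1) with d(x) = x(1 − x), so that d vanishes at both endpoints; there the difference equation automatically reduces to a discretization of the boundary ODE u_t = f, and no boundary conditions are imposed. The function names and the choice of d are our assumptions:

```python
import numpy as np

def crank_nicolson_degenerate(M, N, T, d, f, u0):
    """Crank-Nicolson scheme for the model degenerate equation
        u_t = d(x) u_xx + f(x, t)   on (0, 1) x (0, T],
    with d(0) = d(1) = 0.  No boundary conditions are imposed at the
    degenerate endpoints: since d vanishes there, the boundary rows of
    the scheme reduce to the ODE u_t = f, as discussed in the text."""
    x = np.linspace(0.0, 1.0, M + 1)
    h, tau = 1.0 / M, T / N
    r = 0.5 * tau * d(x) / h**2        # CN weight per node; r[0] = r[M] = 0

    # Assemble A u^{n+1} = B u^n + tau * f^{n+1/2} (dense for clarity)
    A = np.diag(1.0 + 2.0 * r)
    B = np.diag(1.0 - 2.0 * r)
    for i in range(1, M):
        A[i, i - 1] = A[i, i + 1] = -r[i]
        B[i, i - 1] = B[i, i + 1] = r[i]

    u = u0(x)
    for n in range(N):
        t_mid = (n + 0.5) * tau        # midpoint sampling of the source
        u = np.linalg.solve(A, B @ u + tau * f(x, t_mid))
    return x, u
```

Because the degenerate rows of A and B are the identity, the endpoint update reads u^{n+1} = u^n + τ f, i.e., a midpoint-rule integration of the boundary ODE, while the interior update is the usual unconditionally stable Crank–Nicolson step.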
For the inverse problem, we use a gradient-type iteration method to obtain the numerical solution. The key to this method is the computation of the derivative of the functional. Before describing the algorithm, we provide the following lemmas:
Lemma 3. Let a be the solution of the optimal control problem (10). Then there exists a triple of functions satisfying the following system and the stated identity for any admissible test function. This lemma gives the necessary conditions for a minimum of the optimal control problem, and the proof is similar to [6].
Lemma 4. The derivative of the functional in a given direction can be expressed as follows, where the two auxiliary functions are the solutions of (69) and (70) at the given coefficient. The procedure for the stable reconstruction of the solutions u and a can be stated as follows:
Step 1. Choose the initial iteration value . For simplicity, we choose .
Step 2. Solve the initial boundary value problem (4), and obtain the solution corresponding to .
Step 3. Solve the adjoint problem of (69) to get , where .
Step 4. Compute the
derivative
where
is a function determined by the following grid:
Then, the iterative value from step j to step j + 1 can be defined as:
Step 6.1. Let .
Step 6.2. Compute .
Step 6.3. If the condition holds, then we execute Step 6.4; otherwise, we modify the step size by the adjustment parameter and return to Step 6.2.
Step 6.4. Take the new iterate. If the stopping criterion is satisfied, then the iteration ceases; otherwise, we return to Step 2, taking the new iterate as the initial value of the next iteration, and follow the above steps until the stopping rule is met.
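The loop above can be sketched in code. The following is a minimal illustration of the gradient-type iteration with the step adjustment of Steps 6.1–6.3 and a residual-based stopping rule; for self-containedness, the (nonlinear) forward map is replaced by a linear stand-in F and the functional by a generic Tikhonov functional, so the names F, g, alpha, beta and eps are illustrative assumptions rather than the paper's notation:

```python
import numpy as np

def gradient_iteration(F, g, alpha=1e-2, beta0=1.0, shrink=0.5,
                       eps=1e-8, max_iter=500):
    """Gradient-type iteration with step adjustment, mirroring Steps 1-6:
    evaluate the forward map, form the gradient of the Tikhonov-type
    functional, shrink the trial step until the functional decreases,
    and stop once the residual (gradient norm) falls below eps."""
    a = np.zeros(F.shape[1])                    # Step 1: initial guess a_0 = 0
    J = lambda v: 0.5 * np.sum((F @ v - g) ** 2) + 0.5 * alpha * np.sum(v ** 2)
    for j in range(max_iter):
        grad = F.T @ (F @ a - g) + alpha * a    # Steps 2-4: gradient (adjoint)
        if np.linalg.norm(grad) < eps:          # stopping rule of Step 6.4
            break
        beta = beta0                            # Step 6.1: trial step size
        while J(a - beta * grad) >= J(a):       # Steps 6.2-6.3: shrink until descent
            beta *= shrink
            if beta < 1e-14:
                break
        a = a - beta * grad                     # update a_{j+1} = a_j - beta * grad
    return a, j
```

For a linear forward map, the limit is the Tikhonov-regularized solution (FᵀF + αI)⁻¹Fᵀg, which provides a convenient correctness check; in the actual algorithm, the gradient is supplied by Lemma 4 via the adjoint problem.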
We present two numerical experiments to test the stability of our algorithm. Since only one term is unknown and all the other data are known, we give only the unknown term in the following examples. For simplicity, in all experiments, the values of the basic parameters are
We use the symbol ϵ to denote the stopping parameter in the iteration procedure, and two further symbols to denote the absolute and relative L2-norm errors between the exact solution to be identified and the numerically reconstructed solution, i.e.,
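As a concrete illustration of these quantities, one may generate noisy data and evaluate the discrete L²-norm errors as follows. The multiplicative uniform-noise construction below is a common choice for synthetic tests and is an assumption here (the paper's exact formula for the perturbed data may differ), and the function names are ours:

```python
import numpy as np

def add_noise(g, delta, seed=None):
    """Perturb the exact data g by uniform multiplicative noise of level
    delta: g_delta = g * (1 + delta * xi), with xi uniform on [-1, 1)."""
    rng = np.random.default_rng(seed)
    return g * (1.0 + delta * (2.0 * rng.random(g.shape) - 1.0))

def l2_errors(a_exact, a_rec, h):
    """Absolute and relative discrete L2-norm errors between the exact
    coefficient a_exact and the reconstruction a_rec on a grid of step h."""
    abs_err = np.sqrt(h * np.sum((a_exact - a_rec) ** 2))
    rel_err = abs_err / np.sqrt(h * np.sum(a_exact ** 2))
    return abs_err, rel_err
```
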
Example 1. In the first numerical experiment, we take the following function. The additional condition is given below, and the noisy data is generated in the following form, where the parameter δ is the noise level. We present the results of the reconstructed non-linear function
after 100 iterations in
Figure 1, where the figure is obtained by
, and
, respectively. It can be seen from this figure that the main shape of the unknown function is recovered well. The initial guess of the unknown term is chosen as
. Judging by the number of iteration steps alone, one might expect that more iterations would be better than fewer. However,
Figure 1 shows that the difference between them is not significant. It can be observed from
Figure 1 that the result of
is only slightly better than that of
, which indicates that more iterations do not necessarily produce a markedly better result.
We also consider the case of noisy input data to test the stability of our algorithm. The reconstruction results of
from the noisy data
are shown in
Figure 2, where the noise level
is taken as 0.03, 0.05 and 0.08, respectively. In fact, the numerical solution achieves a satisfactory accuracy after only about 28 iterations (the error falls below the prescribed tolerance), and a satisfactory approximation is obtained even in the case of noisy data. The iteration number
j, the stopping parameter
, the accuracy error
and relative
-norm error
for various noisy level
are given in
Table 1. We can see from this table that after 12 iterations (denoted by
j) the stopping parameter
is
which is much less than
.
Remark 1. Compared with the exact case, we cannot perform endless iterations in the noisy case. Normally, we have to stop the iteration at the right time, or the computation will overflow. This is why the iteration number for the noisy case is much smaller than for the exact case. In fact, if we do not stop the iteration in time after reaching the stopping point, the residual ϵ may become smaller, but the error of the solution will grow larger and larger.
Example 2. In the second numerical experiment, we consider the following piecewise function. The additional condition is given below, and the noisy data is generated in the following form, where the parameter δ is the noise level. The initial guess of the source function is taken as
. In this case, we also present the results of the reconstructed unknown function
after 250 iterations in
Figure 3, where the figure is obtained by
and
, respectively. One can see from
Figure 3 that although the main shape of the unknown function is reconstructed after 250 iterations, the later stages of the iteration converge rather slowly. Generally speaking, it is quite difficult to reconstruct information about unknown functions near the boundary for parabolic equations. From these two examples, one can see that the unknown function is recovered well in the middle part of the region, while the main error appears near the boundary. In summary, the performance of our algorithm is satisfactory.
Analogously, we also consider the noisy case for two noise levels. The corresponding numerical results are shown in Figure 4. Finally, some computation parameters for the exact case and the noisy one are given in Table 2.