1. Introduction
The stochastic linear–quadratic (SLQ) optimal control problem plays an important role in modern control theory, because of the elegant structure of its solutions and its wide applications in engineering, finance, networks, etc. Moreover, SLQ optimal control problems can reasonably approximate some nonlinear stochastic optimal control problems. For the literature on SLQ optimal control problems, refer to Wonham [1], Bismut [2], Bensoussan [3], Peng [4], Chen et al. [5], Chen and Zhou [6], Chen and Yong [7], Ait Rami et al. [8], Tang [9], Yu [10], Tang [11], Sun et al. [12], Sun and Yong [13], and Sun et al. [14] for journal papers, and Davis [15], Anderson and Moore [16], Yong and Zhou [17], and Sun and Yong [18] for monographs.
In the above works, the stochastic systems are driven by Brownian motions. In practice, however, Brownian noise alone is often inadequate for modeling. For example, stochastic systems with Poisson jumps or Lévy jumps are particularly appropriate for describing large fluctuations in the stock market (Merton [19], Kou [20], Cont and Tankov [21], Oksendal and Sulem [22], Lim [23], Hanson [24]). Moreover, from a mathematical point of view, there exist essential differences between stochastic systems with and without jumps.
SLQ optimal control problems with Poisson jumps (SLQP optimal control problems) have also been studied by many authors. Tang and Hou [25] studied an optimal control problem of partially observed linear–quadratic stochastic systems with a Poisson process and obtained an explicit solution by the partially observed maximum principle. Wu and Wang [26] studied a kind of SLQP optimal control problem, in which the explicit form of the optimal controls is obtained from the solutions to a forward–backward stochastic differential equation with Poisson jumps (FBSDEP) and a generalized Riccati equation system. Hu and Oksendal [27] studied an SLQP optimal control problem with partial information. Meng [28] considered an SLQP optimal control problem with random coefficients. The state feedback representation of the open-loop optimal control was obtained via a matrix-valued backward stochastic Riccati equation with jumps (BSREJ), and its solvability in a special case was discussed. The solvability of BSREJ in the general case was studied in Zhang et al. [29]. Li et al. [30] introduced the concept of a relax compensator to describe indefinite BSREJ, and then investigated the solvability of BSREJ and gave the optimal control. Moon and Chung [31] studied the indefinite SLQP optimal control problem with random coefficients by a completion of squares approach.
Our interest in this paper is the closed-loop solvability of the SLQP optimal control problem, which, as far as we know, has not been studied in the literature. In 2014, the notions of open-loop and closed-loop solvability for SLQ optimal control problems were introduced in Sun and Yong [32], who concentrated on the LQ zero-sum stochastic differential game, of which the SLQ optimal control problem is the special case with only one player/controller. Sun et al. [12] further gave more detailed necessary and sufficient conditions for the open-loop and closed-loop solvability of SLQ optimal control problems. Sun and Yong [13] studied the open-loop and closed-loop solvability of SLQ optimal control problems over an infinite horizon and showed that the two notions are equivalent in that setting. For more details, please also refer to their book [18]. Li et al. [33] studied the SLQ optimal control problem of mean-field type and gave a characterization of the closed-loop optimal strategy. Lv [34,35] studied the closed-loop solvability of SLQ optimal control problems for systems governed by stochastic evolution equations (SEEs) and SEEs of mean-field type, respectively. Tang et al. [36] studied the open-loop and closed-loop solvability of the indefinite SLQP optimal control problem of mean-field type and its application in finance. Sun et al. [14] considered the indefinite SLQ optimal control problem with random coefficients and investigated the closed-loop representation of open-loop optimal controls.
Our work differs from the existing results in the following respects. (1) We consider an SLQP optimal control problem with deterministic coefficients in a general framework (problem (SLQP) in Section 2), where the weighting matrices in the cost functional are allowed to be indefinite. Moreover, cross-product terms in the control and state processes are present in the cost functional, and nonhomogeneous terms appear in both the controlled state equation and the cost functional. The model considered in this paper is a nontrivial generalization of those in [12,32]. (2) A characterization of the closed-loop solvability of the SLQP optimal control problem is obtained via the Riccati integral–differential equation (RIDE). For the SLQ optimal control problem without Poisson jumps, Sun et al. [12] first found two matrix-valued SDEs of
,
, then they applied Itô’s formula to find
. The solution to the related Riccati equation is defined as
. However, this method fails for our SLQP optimal control problem because of the Poisson jumps in the controlled system. In detail, when we take the inverse of the matrix-valued SDEP, as was done for the matrix-valued SDE in [12], and apply the Itô–Wentzell formula (Oksendal and Zhang [37]), terms such as
will appear in
. Because we do not have any restrictions on the coefficients in our system and
is the closed-loop optimal strategy that we are going to seek, there is no reason to arbitrarily presume that
is invertible. Following Lv [34,35], we overcome this difficulty by transforming the original problem (SLQP) into the problem of finding the open-loop optimal control of problem (SLQP) in Section 2. Thus, a Lyapunov integral–differential equation (25) is given first, and then the RIDE (29) is obtained. Note that the technique used in this paper also differs from that in Tang et al. [36], where a matrix minimum principle by Athans [38] was used to deal with the SLQP problem of mean-field type.
The rest of this paper is organized as follows. Section 2 presents some preliminaries and formulates the SLQP optimal control problem. In Section 3, characterizations of the closed-loop solvability of SLQP optimal control problems are presented and proved. Section 4 gives an example to illustrate the main result. Finally, Section 5 gives some concluding remarks.
2. Problem Formulation and Preliminaries
First, let us introduce some notation that will be used throughout this paper.
Let
be a constant, and
be a finite time horizon. Let
be the collection of all
matrices, and
be the collection of all
symmetric matrices. We let
I be the identity matrix with a suitable size. We use
to denote inner products in possibly different Hilbert spaces and
to denote the norm induced by these inner products. Let
and
be the transpose and range of a matrix
M, respectively. For
,
(respectively,
) implies that
is a positive semi-definite matrix (respectively, positive definite matrix). Let
denote the pseudo-inverse of a matrix
. If the inverse
of
exists, then the pseudo-inverse is equal to the inverse. See Penrose [
39] for the definition and some basic properties of the pseudo-inverse.
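As a small numerical illustration of the pseudo-inverse properties recalled above, the following sketch checks the four Penrose conditions for a singular symmetric matrix and confirms that the product of the pseudo-inverse with the matrix is an orthogonal projection (symmetric and idempotent). The rank-one matrix M = v vᵀ, the helper names, and the closed form M⁺ = v vᵀ / |v|⁴ (valid only in this rank-one case) are illustrative choices and are not taken from the paper.

```python
# Minimal sketch: Penrose conditions for a rank-one symmetric matrix M = v v^T.
# The closed form M^+ = v v^T / |v|^4 is specific to this rank-one example.

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def rank_one_pinv(v):
    """Pseudo-inverse of M = v v^T for a nonzero vector v."""
    norm_sq = sum(x * x for x in v)
    return [[x * y / norm_sq ** 2 for y in v] for x in v]

v = [1.0, 2.0]                               # hypothetical vector
M = [[x * y for y in v] for x in v]          # singular: det(M) = 0
M_pinv = rank_one_pinv(v)
P = matmul(M_pinv, M)                        # candidate orthogonal projection

penrose_ok = (
    close(matmul(matmul(M, M_pinv), M), M)                  # M M^+ M = M
    and close(matmul(matmul(M_pinv, M), M_pinv), M_pinv)    # M^+ M M^+ = M^+
    and close(transpose(matmul(M, M_pinv)), matmul(M, M_pinv))  # (M M^+)^T = M M^+
    and close(transpose(P), P)                              # (M^+ M)^T = M^+ M
)
projection_ok = close(matmul(P, P), P)                      # P^2 = P
```

Here M has no inverse, yet its pseudo-inverse is well defined; for an invertible matrix the same four conditions are satisfied by the ordinary inverse, which is why the two coincide in that case.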
For any Banach space H (for example, ) and , let be the space of all -integrable H-valued functions on , be the space of all continuous H-valued functions on , and be the space of Lebesgue measurable, essentially bounded functions from into H.
Let be a complete filtered probability space, where is the filtration generated by the following two mutually independent stochastic processes and augmented by all the -null sets in :
A standard one-dimensional Brownian motion .
A Poisson random measure N defined on , where is a nonempty open set and its Borel field is . The compensator of N is , satisfying , such that is a martingale for any ; is assumed to be a σ-finite measure on and is called the characteristic measure.
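To make the role of the compensator concrete, the following sketch simulates the simplest special case, a Poisson process with constant intensity, and checks by Monte Carlo that the compensated value N_T − lam·T has mean close to zero and variance close to lam·T. This tests only a consequence of the martingale property at a single time, not the property itself. The names `lam`, `T`, `n_paths`, and the seed are hypothetical and chosen for illustration.

```python
import random

def compensated_poisson_at(T, lam, rng):
    """Return N_T - lam * T for one path of a Poisson process with intensity lam.

    Jump times are built from independent exponential inter-arrival times.
    """
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(lam)   # waiting time until the next jump
        if t > T:
            break
        count += 1
    return count - lam * T

def sample_mean_and_var(T, lam, n_paths, seed=0):
    """Sample mean and unbiased sample variance of N_T - lam * T."""
    rng = random.Random(seed)
    xs = [compensated_poisson_at(T, lam, rng) for _ in range(n_paths)]
    mean = sum(xs) / n_paths
    var = sum((x - mean) ** 2 for x in xs) / (n_paths - 1)
    return mean, var

# Hypothetical parameters: horizon 1, intensity 2, 20000 simulated paths.
mean, var = sample_mean_and_var(T=1.0, lam=2.0, n_paths=20000)
```

In the general setting of the paper the intensity is replaced by the characteristic measure on the mark space, and the same centering is applied to the random measure on each Borel set of finite measure.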
For
, we introduce some notation for spaces of random variables and stochastic processes:
We consider the following controlled linear SDEP on
:
where
is the initial time and
is the given initial state;
,
,
,
are given deterministic matrix-valued functions of proper dimensions, and
are independent of
. The expressions
,
are
-progressively measurable processes and
is also random;
is the control process. We define the admissible control set:
The control process is called an admissible control.
Then we define the cost functional:
where
H is a symmetric matrix and
, and
are deterministic matrix-valued functions of proper dimensions that satisfy
. The expression
g is an
-measurable random variable,
is an
-progressively measurable process, and
is an
-predictable process.
In order to make the given (
1) and (
3) meaningful, we adopt some assumptions for coefficients as follows.
Hypothesis 1. The coefficients of the state Equation (1) satisfy the following:
Hypothesis 2. The weighting coefficients of the cost functional (3) satisfy the following:
For simplicity, we denote the above Hypothesis 1 and Hypothesis 2 as (H1) and (H2), respectively.
Under (H1) and (H2), for any given
and
, state Equation (
1) admits a unique adapted solution
and the cost functional is well-defined. Therefore, the following problem is meaningful.
Problem 1. (SLQP). For given initial pair , find a such that Any satisfying (4) is called an open-loop optimal control of problem (SLQP) for , the corresponding is called an open-loop optimal state, and is called an open-loop optimal pair. The map is called the value function of problem (SLQP). In particular, when , and g are all zero, we refer to the above problem as the problem (SLQP).
Definition 1. For given initial pair , if there exists a (unique) such that (4) holds, then we say that problem (SLQP) is (uniquely) open-loop solvable for .
Next, take
and
. For given initial pair
, let us consider the following equation (some time variables are usually omitted):
which admits a unique solution
, depending on the
and
;
is called a closed-loop strategy and the above Equation (
5) is called a closed-loop system of the original state Equation (
1) under
. We point out that
is independent of the initial state
. With the above solution
, we define
We now introduce the following definition.
Definition 2. If a closed-loop strategy pair satisfies the following inequality where on the left and on the right, then is called a closed-loop optimal strategy of problem (SLQP) on , and we say that problem (SLQP) is closed-loop solvable on .
We emphasize that the pair is required to be independent of the initial state . We have the following equivalence theorem.
Theorem 1. Let (H1) and (H2) hold and let . Then the following statements are equivalent:
- (i) is a closed-loop optimal strategy of problem (SLQP) on .
- (ii) For any given and , where on the right.
- (iii) For any given and ,
Proof. (i) ⇒ (ii). This follows directly from the definition of a closed-loop optimal strategy. (ii) ⇒ (iii). For given
and
,
is the adapted solution to the following SDEP:
Taking
, it is easy to see that
. Thus
From the existence and uniqueness of the adapted solution to the SDEP, we obtain that
Therefore, by (ii), one has
proving (iii). (iii) ⇒ (i). For a given
,
and
, let
. Taking
from the existence and uniqueness of the adapted solution to the SDEP, we know that
Therefore, by (iii), we have
□
For the closed-loop optimal strategy
of problem (SLQP) on
and the corresponding closed-loop optimal state
, we can define the outcome
From the third part of Theorem 1, we see that for any given initial state and , is an open-loop optimal control of problem (SLQP) for x. Therefore, if problem (SLQP) is closed-loop solvable on , it must also be open-loop solvable, and the outcome of the closed-loop optimal strategy is the open-loop optimal control for any .
The following result concerns the open-loop solvability of problem (SLQP) for a given initial state.
Proposition 1. Let (H1) and (H2) hold. For given initial pair , a control is an open-loop optimal control of problem (SLQP) if and only if the following hold:
- (i) The stationarity condition holds: where is the adapted solution to the following FBSDEP:
- (ii) The convexity condition holds: For any , where is the adapted solution to the following SDEP:
Proof. For any
and
, let
, thus
is the corresponding state that satisfies
Then
satisfies the following SDEP:
From the existence and uniqueness of the solution to SDEP, we know that
. Then
Applying Itô’s formula to
, we get
Therefore,
is an open-loop optimal pair of problem (SLQP) if and only if (
9) and (
11) hold. □
On the other hand, from the second part of Theorem 1, we can see that
being a closed-loop optimal strategy of problem (SLQP) is equivalent to
being an open-loop optimal control of the SLQP optimal control problems (
5) and (
6) with
, which we denote by
problem (SLQP). In particular, when
and
g are all zero, we refer to it as problem (SLQP)
. Similar to Proposition 1, we can give the following result.
Proposition 2. Let (H1) and (H2) hold. For any given , is an open-loop optimal control of problem (SLQP) if and only if the following stationarity condition holds: where is the adapted solution to the following FBSDEP: and the following convexity condition holds: For any , where is the adapted solution to the following SDEP:
3. Main Results
In this section, we will study the necessary and sufficient conditions for problem (SLQP) to be closed-loop solvable.
Making use of (
13), we may rewrite the BSDEP in (
14) and obtain
We first have the following result.
Theorem 2. Let (H1) and (H2) hold. If is an optimal closed-loop strategy of problem (SLQP) on , then is an optimal closed-loop strategy of problem (SLQP) on .
Proof. By Proposition 2, we can see that
is an optimal closed-loop strategy of problem (SLQP) on
if and only if for any
, the stationarity condition holds:
where
is the adapted solution to the following FBSDEP:
and the following convexity condition holds: For any
,
where
is the adapted solution to the following SDEP:
Since the FBSDEP (
17) admits a solution for each
and
is independent of
x, by subtracting the solutions corresponding to
x and 0, the latter from the former, we see that, for any
, the following FBSDEP:
and, in this case,
satisfies
It follows, again from Theorem 1 and Proposition 2, that is an optimal closed-loop strategy of problem (SLQP) on . □
To summarize the relationship between problem (SLQP), problem (SLQP)
, problem (SLQP)
, and problem (SLQP)
, we plot the following diagram in
Figure 1:
It is clear that when we want to study the necessary conditions for the closed-loop solvability of problem (SLQP), we can transform the original problem into the open-loop solvability of problem (SLQP)
where the open-loop optimal control is
. Thus, from Proposition 1, we know that the optimal system of problem (SLQP)
is the following FBSDEP:
and, in this case,
satisfies
In light of
, we assume that
where
is a matrix-valued differentiable function satisfying
. Applying Itô–Wentzell’s formula to (
21), we have
Comparing the diffusion coefficient of the above equation and of the second equation in (
19), we can see
Plugging (
21) and (
23) into (
20), we obtain
Now comparing the drift coefficient of (
22) and of the second equation in (
19), and using (21), (23), and (24), we have
Thus, we let
satisfy the following Lyapunov integral–differential equation:
Proposition 3. Let be the solution to (25). Then for any , we have
Proof. Let us consider problem (SLQP)
. For any
, the state equation and the cost functional are:
and
Applying Itô–Wentzell’s formula to
, we get
Putting the above equation into the cost functional, we have
From the previous analysis, we know that
is the open-loop optimal control for problem (SLQP)
, i.e.,
then
For simplicity, we give the following notations:
thus,
Choose the initial pair
and
,
,
. In this case,
is a deterministic function, thus
where
is the solution to the following ordinary differential equation (ODE):
It is easy to see that
thus
Thus, the first inequality in (
26) is obtained, and now let us prove the second equation. We take the initial pair
and
,
,
. In this case,
where
is the solution to the following ODE:
As we know, this is an inhomogeneous linear ODE, and we get for
,
Since
is arbitrary,
Dividing both sides by
h and letting
, we have
Since
is arbitrary,
This is the second equation in (
26). □
The following theorem is the main result in this paper, which characterizes the closed-loop solvability of problem (SLQP).
Theorem 3. Let (H1) and (H2) hold. Problem (SLQP) admits an optimal closed-loop strategy if and only if the following RIDE: admits a solution such that and the following BSDEP: admits a unique solution , which satisfies In this case, the closed-loop optimal strategy of problem (SLQP) admits the following representation: for some , . Further, the value function is given by
Proof. We first prove the necessity. Let
be a closed-loop optimal strategy of problem (SLQP) over
. The second equation of (
26) implies
Denoting
, since
and
is an orthogonal projection, we see that (
30) holds and
for some
. Consequently,
Plugging the above into the Lyapunov Equation (
25), we obtain the RIDE in (
29).
To determine
, we define
where
is the adapted solution to the FBSDEP (
17). Then
According to the stationarity condition (
16), we have
Since
and
is an orthogonal projection, we see that (
32) holds and
for some
. Consequently,
Therefore,
is the unique solution to the following BSDEP:
To prove the sufficiency, we take any
, and let
be the corresponding state process. Applying Itô–Wentzell’s formula to
and
, we have
and
Thus,
□
4. Example
In this section, we will give a simple example. Consider the following controlled linear SDE with Poisson jumps:
and the cost functional is defined to be
Then, for any closed-loop strategy pair
, we get the closed-loop system:
and the closed-loop cost functional
In this case,
is the unique adapted solution to BSDEP (31). Then, by Theorem 3, the closed-loop optimal strategy
admits the following representation:
for some
, where
is the solution to the following RIDE:
such that
Further, the value function
is given by
In this case, the closed-loop optimal control of this problem is
, where
admits the following representation:
We further make the following assumptions:
Hypothesis 3. The coefficients of the drift and diffusion terms of (37) are time-invariant constants, and the coefficient of the jump diffusion term depends only on the variable e, that is, is independent of t.
Hypothesis 4. The Poisson process N has jumps of unit size, i.e., .
Hypothesis 5. The weighting coefficients of the cost functional (38) satisfy the following:
Under (H4), the compensated Poisson process
is a martingale, where
and
is the intensity of the Poisson process
N. Then
Under (H4)–(H5), the RIDE (42) reduces to the following:
where
A,
B,
C,
D,
G,
Q,
R,
S, and
H are all constants.
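For a constant-coefficient scalar problem, an equation of this kind can be integrated numerically backward from the terminal time. The sketch below is illustrative only: it assumes a scalar model in which the jump coefficient g multiplies the state only and the Poisson process has constant intensity lam, so that the reduced equation is taken to be Ṗ + (2a + c² + lam·g²)P + q − (bP + cdP + s)²/(r + d²P) = 0 with P(T) = h. This assumed form, the lowercase names, the fourth-order Runge–Kutta scheme, and all numerical values are hypothetical and may differ from the reduced equation and the data used in the paper. The sketch also restricts itself to the regular case r + d²P > 0, so no pseudo-inverse is needed.

```python
import math

def riccati_rhs(P, a, b, c, d, g, lam, q, r, s):
    """Right-hand side dP/dt of the ASSUMED scalar reduced Riccati equation."""
    denom = r + d * d * P
    if denom <= 0.0:
        # Only the regular case is handled in this sketch.
        raise ValueError("r + d^2 * P must stay positive")
    cross = b * P + c * d * P + s
    return -((2.0 * a + c * c + lam * g * g) * P + q - cross * cross / denom)

def solve_scalar_ride(a, b, c, d, g, lam, q, r, s, h, T, n_steps=1000):
    """Integrate backward from P(T) = h with classical RK4; returns P on a uniform grid."""
    dt = T / n_steps
    args = (a, b, c, d, g, lam, q, r, s)
    P = [0.0] * (n_steps + 1)
    P[n_steps] = h                      # terminal condition
    for n in range(n_steps, 0, -1):     # march from t = T down to t = 0
        p = P[n]
        k1 = riccati_rhs(p, *args)
        k2 = riccati_rhs(p - 0.5 * dt * k1, *args)
        k3 = riccati_rhs(p - 0.5 * dt * k2, *args)
        k4 = riccati_rhs(p - dt * k3, *args)
        P[n - 1] = p - dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

# Hypothetical coefficients, chosen only to exercise the solver.
P_grid = solve_scalar_ride(a=0.1, b=1.0, c=0.2, d=0.1, g=0.3, lam=1.0,
                           q=1.0, r=1.0, s=0.0, h=1.0, T=1.0)
P_at_zero = P_grid[0]
```

Two sanity checks are available under the assumed form. With b = d = s = 0 the equation is linear and has the closed-form solution P(t) = (h + q/k)e^{k(T−t)} − q/k, where k = 2a + c² + lam·g². With a = c = d = g = s = 0, b = q = r = 1, and h = 0 it reduces to the classical deterministic case, whose solution is P(t) = tanh(T − t).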
When we assume that the coefficients are
,
,
,
,
,
,
,
,
, and
, respectively, the solution
of the above Riccati equation is shown in Figure 2.